Towards artificial general intelligence via a multimodal foundation model

Author

Listed:
  • Nanyi Fei

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods
    Renmin University of China)

  • Zhiwu Lu

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Yizhao Gao

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Guoxing Yang

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Yuqi Huo

    (Beijing Key Laboratory of Big Data Management and Analysis Methods
    Renmin University of China)

  • Jingyuan Wen

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Haoyu Lu

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Ruihua Song

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Xin Gao

    (King Abdullah University of Science and Technology)

  • Tao Xiang

    (University of Surrey)

  • Hao Sun

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods)

  • Ji-Rong Wen

    (Renmin University of China
    Beijing Key Laboratory of Big Data Management and Analysis Methods
    Renmin University of China)

Abstract

The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods have only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks. To achieve this goal, we propose to pre-train our foundation model by self-supervised learning with weak semantic correlation data crawled from the Internet, and show that promising results can be obtained on a wide range of downstream tasks. In particular, with the model-interpretability tools we developed, we demonstrate that our foundation model now possesses strong imagination ability. We believe that our work makes a transformative stride towards AGI, from our common practice of “weak or narrow AI” to that of “strong or generalized AI”.

Suggested Citation

  • Nanyi Fei & Zhiwu Lu & Yizhao Gao & Guoxing Yang & Yuqi Huo & Jingyuan Wen & Haoyu Lu & Ruihua Song & Xin Gao & Tao Xiang & Hao Sun & Ji-Rong Wen, 2022. "Towards artificial general intelligence via a multimodal foundation model," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-30761-2
    DOI: 10.1038/s41467-022-30761-2

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-30761-2
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. R. Quian Quiroga & L. Reddy & G. Kreiman & C. Koch & I. Fried, 2005. "Invariant visual representation by single neurons in the human brain," Nature, Nature, vol. 435(7045), pages 1102-1107, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rodrigo Quian Quiroga & Marta Boscaglia & Jacques Jonas & Hernan G. Rey & Xiaoqian Yan & Louis Maillard & Sophie Colnat-Coulbois & Laurent Koessler & Bruno Rossion, 2023. "Single neuron responses underlying face recognition in the human midfusiform face-selective cortex," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    2. Luca D. Kolibius & Frederic Roux & George Parish & Marije Wal & Mircea Plas & Ramesh Chelvarajah & Vijay Sawlani & David T. Rollings & Johannes D. Lang & Stephanie Gollwitzer & Katrin Walther & Rüdige, 2023. "Hippocampal neurons code individual episodic memories in humans," Nature Human Behaviour, Nature, vol. 7(11), pages 1968-1979, November.
    3. Jakub Kopal & Kuldeep Kumar & Kimia Shafighi & Karin Saltoun & Claudia Modenato & Clara A. Moreau & Guillaume Huguet & Martineau Jean-Louis & Charles-Olivier Martin & Zohra Saci & Nadine Younis & Elis, 2024. "Using rare genetic mutations to revisit structural brain asymmetry," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    4. Louis Kang & Taro Toyoizumi, 2024. "Distinguishing examples while building concepts in hippocampal and artificial networks," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    5. Ahalya Prabhakar & Todd Murphey, 2022. "Mechanical intelligence for learning embodied sensor-object relationships," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    6. Thomas P. Reber & Sina Mackay & Marcel Bausch & Marcel S. Kehl & Valeri Borger & Rainer Surges & Florian Mormann, 2023. "Single-neuron mechanisms of neural adaptation in the human temporal lobe," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    7. Henning Sprekeler & Christian Michaelis & Laurenz Wiskott, 2007. "Slowness: An Objective for Spike-Timing–Dependent Plasticity?," PLOS Computational Biology, Public Library of Science, vol. 3(6), pages 1-13, June.
    8. Sina Mackay & Thomas P. Reber & Marcel Bausch & Jan Boström & Christian E. Elger & Florian Mormann, 2024. "Concept and location neurons in the human brain provide the ‘what’ and ‘where’ in memory formation," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
    9. Jörn Diedrichsen & Nikolaus Kriegeskorte, 2017. "Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis," PLOS Computational Biology, Public Library of Science, vol. 13(4), pages 1-33, April.
    10. Chiara Gastaldi & Tilo Schwalger & Emanuela De Falco & Rodrigo Quian Quiroga & Wulfram Gerstner, 2021. "When shared concept cells support associations: Theory of overlapping memory engrams," PLOS Computational Biology, Public Library of Science, vol. 17(12), pages 1-44, December.
    11. Jongwoon Kim & Hengji Huang & Earl T. Gilbert & Kaiser C. Arndt & Daniel Fine English & Xiaoting Jia, 2024. "T-DOpE probes reveal sensitivity of hippocampal oscillations to cannabinoids in behaving mice," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    12. Umut Güçlü & Marcel A J van Gerven, 2014. "Unsupervised Feature Learning Improves Prediction of Human Brain Activity in Response to Natural Images," PLOS Computational Biology, Public Library of Science, vol. 10(8), pages 1-12, August.
    13. Martinez-Saito, Mario, 2022. "Discrete scaling and criticality in a chain of adaptive excitable integrators," Chaos, Solitons & Fractals, Elsevier, vol. 163(C).
    14. Dock H. Duncan & Dirk Moorselaar & Jan Theeuwes, 2023. "Pinging the brain to reveal the hidden attentional priority map using encephalography," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    15. David Balduzzi & Giulio Tononi, 2009. "Qualia: The Geometry of Integrated Information," PLOS Computational Biology, Public Library of Science, vol. 5(8), pages 1-24, August.
    16. Carlo Baldassi & Alireza Alemi-Neissi & Marino Pagan & James J DiCarlo & Riccardo Zecchina & Davide Zoccolan, 2013. "Shape Similarity, Better than Semantic Membership, Accounts for the Structure of Visual Object Representations in a Population of Monkey Inferotemporal Neurons," PLOS Computational Biology, Public Library of Science, vol. 9(8), pages 1-20, August.
    17. Shiva Farashahi & Alireza Soltani, 2021. "Computational mechanisms of distributed value representations and mixed learning strategies," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
