Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-52355-w.html

Network community detection via neural embeddings

Author

Listed:
  • Sadamori Kojaku

    (Binghamton University; Indiana University)

  • Filippo Radicchi

    (Indiana University)

  • Yong-Yeol Ahn

    (Indiana University)

  • Santo Fortunato

    (Indiana University)

Abstract

Recent advances in machine learning research have produced powerful neural graph embedding methods, which learn useful, low-dimensional vector representations of network data. These methods excel in graph machine learning tasks and are now widely adopted. However, how and why they work, and in particular how network structure is encoded in the embedding, remains largely unexplained. Here, we show that node2vec, a shallow, linear neural network, encodes communities into separable clusters better than random partitioning, down to the information-theoretic detectability limit for the stochastic block model. We show that this is due to the equivalence between the embedding learned by node2vec and the spectral embedding obtained from the eigenvectors of the symmetric normalized Laplacian matrix. Numerical simulations demonstrate that node2vec is capable of learning communities on sparse graphs generated by the stochastic block model, as well as on sparse degree-heterogeneous networks. Our results highlight the features of graph neural networks that enable them to separate communities in the embedding space.
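
The spectral side of the equivalence described in the abstract can be illustrated with a small sketch. It is not node2vec and not the authors' code. It samples a two-block stochastic block model, builds the symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}, and splits nodes by the sign of the eigenvector for the second-smallest eigenvalue. The function names (sample_sbm, normalized_laplacian_embedding, sign_split_accuracy) and the parameter values (100 nodes per block, p_in = 0.2, p_out = 0.02) are assumptions chosen for illustration. They describe a dense, easy case far from the detectability limit.

```python
import numpy as np


def sample_sbm(n_per_block, p_in, p_out, rng):
    """Sample an undirected two-block stochastic block model (illustrative helper)."""
    n = 2 * n_per_block
    labels = np.repeat([0, 1], n_per_block)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each edge once in the upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def normalized_laplacian_embedding(adj, dim):
    """Return the eigenvectors of L = I - D^{-1/2} A D^{-1/2} for the
    `dim` smallest eigenvalues after the first (trivial) one."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5  # leave isolated nodes at zero
    lap = np.eye(len(deg)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvecs[:, 1:dim + 1]


def sign_split_accuracy(vector, labels):
    """Fraction of nodes recovered by a sign split, up to label swapping."""
    predicted = (vector > 0).astype(int)
    agreement = np.mean(predicted == labels)
    return max(agreement, 1.0 - agreement)


rng = np.random.default_rng(0)
adj, labels = sample_sbm(n_per_block=100, p_in=0.2, p_out=0.02, rng=rng)
embedding = normalized_laplacian_embedding(adj, dim=1)
print("sign-split accuracy:", sign_split_accuracy(embedding[:, 0], labels))
```

Lowering p_in and raising p_out while keeping the graph sparse moves the sample toward the regime the abstract discusses, where the sign split is expected to degrade toward chance.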

Suggested Citation

  • Sadamori Kojaku & Filippo Radicchi & Yong-Yeol Ahn & Santo Fortunato, 2024. "Network community detection via neural embeddings," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-52355-w
    DOI: 10.1038/s41467-024-52355-w

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-52355-w
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-024-52355-w?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    2. Jia-Min Lu & Hui-Feng Wang & Qi-Hang Guo & Jian-Wei Wang & Tong-Tong Li & Ke-Xin Chen & Meng-Ting Zhang & Jian-Bo Chen & Qian-Nuan Shi & Yi Huang & Shao-Wen Shi & Guang-Yong Chen & Jian-Zhang Pan & Zh, 2024. "Roboticized AI-assisted microfluidic photocatalytic synthesis and screening up to 10,000 reactions per day," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    3. Ananthan Nambiar & Tobias Rubel & James McCaull & Jon deVries & Mark Bedau, 2021. "Dropping diversity of products of large US firms: Models and measures," Papers 2110.08367, arXiv.org.
    4. Jason Youn & Navneet Rai & Ilias Tagkopoulos, 2022. "Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    5. Wu, Lingfei & Kittur, Aniket & Youn, Hyejin & Milojević, Staša & Leahey, Erin & Fiore, Stephen M. & Ahn, Yong-Yeol, 2022. "Metrics and mechanisms: Measuring the unmeasurable in the science of science," Journal of Informetrics, Elsevier, vol. 16(2).
    6. Gordana Ispirova & Tome Eftimov & Barbara Koroušić Seljak, 2020. "P-NUT: Predicting NUTrient Content from Short Text Descriptions," Mathematics, MDPI, vol. 8(10), pages 1-21, October.
    7. Li, Heyang & Wu, Meijun & Wang, Yougui & Zeng, An, 2022. "Bibliographic coupling networks reveal the advantage of diversification in scientific projects," Journal of Informetrics, Elsevier, vol. 16(3).
    8. Zhang, Lin & Qi, Fan & Sivertsen, Gunnar & Liang, Liming & Campbell, David, 2023. "Gender differences in the patterns and consequences of changing specialization in scientific careers," SocArXiv ep5bx, Center for Open Science.
    9. Lin, Yiling & Evans, James A. & Wu, Lingfei, 2022. "New directions in science emerge from disconnection and discord," Journal of Informetrics, Elsevier, vol. 16(1).
    10. Zongrui Pei & Junqi Yin & Peter K. Liaw & Dierk Raabe, 2023. "Toward the design of ultrahigh-entropy alloys via mining six million texts," Nature Communications, Nature, vol. 14(1), pages 1-8, December.
    11. Le Song & Guilong Zhu & Xiao Yin, 2024. "Evaluating the wisdom of scholar crowds from the perspective of knowledge diffusion," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(9), pages 5103-5139, September.
    12. Shaoshuo Li & Baixing Chen & Hao Chen & Zhen Hua & Yang Shao & Heng Yin & Jianwei Wang, 2021. "Analysis of potential genetic biomarkers and molecular mechanism of smoking-related postmenopausal osteoporosis using weighted gene co-expression network analysis and machine learning," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-18, September.
    13. Jeong, Yoo Kyung & Xie, Qing & Yan, Erjia & Song, Min, 2020. "Examining drug and side effect relation using author–entity pair bipartite networks," Journal of Informetrics, Elsevier, vol. 14(1).
    14. Tokmachev, Andrey M., 2023. "Hidden scales in statistics of citation indicators," Journal of Informetrics, Elsevier, vol. 17(1).
    15. John Dagdelen & Alexander Dunn & Sanghoon Lee & Nicholas Walker & Andrew S. Rosen & Gerbrand Ceder & Kristin A. Persson & Anubhav Jain, 2024. "Structured information extraction from scientific text with large language models," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    16. Feng Shi & James Evans, 2023. "Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    17. Jarrahi, Mohammad Hossein & Askay, David & Eshraghi, Ali & Smith, Preston, 2023. "Artificial intelligence and knowledge management: A partnership between human and AI," Business Horizons, Elsevier, vol. 66(1), pages 87-99.
    18. Liu, Meijun & Jaiswal, Ajay & Bu, Yi & Min, Chao & Yang, Sijie & Liu, Zhibo & Acuña, Daniel & Ding, Ying, 2022. "Team formation and team impact: The balance between team freshness and repeat collaboration," Journal of Informetrics, Elsevier, vol. 16(4).
    19. Jianhong Luo & Minjuan Chai & Xuwei Pan, 2021. "Identification of Research Priorities during the COVID-19 Pandemic: Implications for Its Management," IJERPH, MDPI, vol. 18(24), pages 1-15, December.
    20. Martín de Diego, Isaac & González-Fernández, César & Fernández-Isabel, Alberto & Fernández, Rubén R. & Cabezas, Javier, 2021. "System for evaluating the reliability and novelty of medical scientific papers," Journal of Informetrics, Elsevier, vol. 15(4).
