Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i4p450-d504587.html

Frequent Itemset Mining and Multi-Layer Network-Based Analysis of RDF Databases

Author

Listed:
  • Gergely Honti

    (MTA-PE Complex Systems Monitoring Research Group, University of Pannonia, 8200 Veszprem, Hungary
    These authors contributed equally to this work.)

  • János Abonyi

    (MTA-PE Complex Systems Monitoring Research Group, University of Pannonia, 8200 Veszprem, Hungary
    These authors contributed equally to this work.)

Abstract

Triplestores, or resource description framework (RDF) stores, are purpose-built databases used to organise, store and share data with context. Knowledge extraction from a large amount of interconnected data requires effective tools and methods to address the complexity and the underlying structure of semantic information. We propose a method that generates an interpretable multilayered network from an RDF database. The method utilises frequent itemset mining (FIM) of the subjects, predicates and objects of the RDF data, and automatically extracts informative subsets of the database for the analysis. The results are used to form layers in an analysable multidimensional network. The methodology enables a consistent, transparent, multi-aspect-oriented knowledge extraction from the linked dataset. To demonstrate the usability and effectiveness of the methodology, we analyse how the science of sustainability and climate change is structured using the Microsoft Academic Knowledge Graph. In the case study, the FIM forms networks of disciplines to reveal the significant interdisciplinary science communities in sustainability and climate change. The constructed multilayer network then enables an analysis of the significant disciplines and interdisciplinary scientific areas. To demonstrate the proposed knowledge extraction process, we search for interdisciplinary science communities and then measure and rank their multidisciplinary effects. The analysis identifies discipline similarities, pinpointing the similarity between atmospheric science and meteorology as well as between geomorphology and oceanography. The results confirm that frequent itemset mining provides informative sampled subsets of RDF databases, which can be simultaneously analysed as layers of a multilayer network.
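
The pipeline described in the abstract (mine frequent itemsets from RDF triples, then use them to select subsets that become network layers) can be illustrated with a minimal Python sketch. This is one possible reading of the idea, not the authors' implementation: the toy triples, the `hasField` predicate, the brute-force counting used in place of a real FIM algorithm, and the rule "one layer per frequent itemset of size two or more" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy RDF-like triples (subject, predicate, object).
# All identifiers and values are invented for illustration.
TRIPLES = [
    ("paper1", "hasField", "meteorology"),
    ("paper1", "hasField", "atmospheric science"),
    ("paper1", "hasField", "climate change"),
    ("paper2", "hasField", "meteorology"),
    ("paper2", "hasField", "atmospheric science"),
    ("paper3", "hasField", "oceanography"),
    ("paper3", "hasField", "geomorphology"),
    ("paper3", "hasField", "climate change"),
    ("paper4", "hasField", "oceanography"),
    ("paper4", "hasField", "geomorphology"),
    ("paper5", "hasField", "meteorology"),
    ("paper5", "hasField", "climate change"),
]


def transactions_from_triples(triples, predicate):
    """Group the objects of one predicate by subject: one itemset per subject."""
    grouped = {}
    for s, p, o in triples:
        if p == predicate:
            grouped.setdefault(s, set()).add(o)
    return grouped


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force stand-in for FIM: count every subset up to max_size."""
    counts = Counter()
    for items in transactions.values():
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def build_layers(transactions, itemsets):
    """Assumed layering rule: one layer per frequent itemset of size >= 2.

    Each layer holds weighted co-occurrence edges between items, counted
    only over the transactions that contain the whole itemset.
    """
    layers = {}
    for itemset in itemsets:
        if len(itemset) < 2:
            continue
        edges = Counter()
        for items in transactions.values():
            if itemset <= items:
                for a, b in combinations(sorted(items), 2):
                    edges[(a, b)] += 1
        layers[itemset] = dict(edges)
    return layers


transactions = transactions_from_triples(TRIPLES, "hasField")
itemsets = frequent_itemsets(transactions, min_support=2)
layers = build_layers(transactions, itemsets)

for itemset, edges in layers.items():
    print(sorted(itemset), edges)
```

On real data the brute-force counter would be replaced by an Apriori or FP-growth implementation, and the triples would come from a SPARQL query against the triplestore rather than from a hard-coded list.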

Suggested Citation

  • Gergely Honti & János Abonyi, 2021. "Frequent Itemset Mining and Multi-Layer Network-Based Analysis of RDF Databases," Mathematics, MDPI, vol. 9(4), pages 1-17, February.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:4:p:450-:d:504587

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/4/450/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/4/450/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gergely Tibély & David Sousa-Rodrigues & Péter Pollner & Gergely Palla, 2016. "Comparing the Hierarchy of Keywords in On-Line News Portals," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-15, November.
    2. Yurij L. Katchanov & Yulia V. Markova, 2017. "The “space of physics journals”: topological structure and the Journal Impact Factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(1), pages 313-333, October.
    3. Alejandro Huertas Herrera & Mónica D. R. Toro-Manríquez & Cristian Lorenzo & María Vanessa Lencinas & Guillermo Martínez Pastur, 2023. "Perspectives on socio-ecological studies in the Northern and Southern Hemispheres," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    4. Hu, Fei & Zhao, Shangmei & Bing, Tao & Chang, Yiming, 2017. "Hierarchy in industrial structure: The cases of China and the USA," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 469(C), pages 871-882.
    5. Katchanov, Yurij L. & Markova, Yulia V. & Shmatko, Natalia A., 2019. "The distinction machine: Physics journals from the perspective of the Kolmogorov–Smirnov statistic," Journal of Informetrics, Elsevier, vol. 13(4).
    6. Zhang, Min & Wang, Xiaojuan & Jin, Lei & Song, Mei, 2021. "Cascade phenomenon in multilayer networks with dependence groups and hierarchical structure," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 581(C).
