Printed from https://ideas.repec.org/p/ant/wpaper/2015001.html

Classification over bipartite graphs through projection

Author

Listed:
  • STANKOVA, Marija
  • MARTENS, David
  • PROVOST, Foster

Abstract

Many large real-world datasets are naturally bipartite graphs; think, for example, of users rating movies or people visiting locations. Although some work exists on such bigraphs, no general network-oriented methodology has yet been proposed for node classification. In this paper we propose a three-stage classification framework that deals effectively with the typically very large size of such datasets. First, a weighting of the top nodes is defined. Second, the bigraph is projected into a unipartite (homogeneous) graph among the bottom nodes, where the edge weights are a function of the weights of the top nodes in the bigraph. Finally, relational learners/classifiers are applied to the resulting weighted unigraph. This general framework allows us to explore the design space by applying different choices for the three stages, introducing new alternatives, and mixing and matching to create new techniques. We present an empirical study of the predictive and run-time performance of different combinations of functions in the three stages over a large collection of bipartite datasets. Predictive performance differs clearly across design choices. Based on these results, we propose several specific combinations that show good accuracy and also scale easily and quickly to big datasets. A comparison with a linear SVM applied to the adjacency matrix of the bigraph shows the superiority of the network-oriented approach.
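
The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which functions are used at each stage, so every concrete choice here is an assumption made for illustration. Top nodes are weighted by inverse degree, the projected edge weight is the sum of the weights of the top nodes two bottom nodes share, and the classifier is a simple weighted vote over labeled neighbors. The function names (`top_node_weights`, `project`, `neighbor_vote`) and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def top_node_weights(bottom_to_tops):
    """Stage 1: weight each top node. Inverse degree is an assumed choice."""
    degree = defaultdict(int)
    for tops in bottom_to_tops.values():
        for top in tops:
            degree[top] += 1
    return {top: 1.0 / d for top, d in degree.items()}


def project(bottom_to_tops, weights):
    """Stage 2: link bottom nodes that share a top node.

    The edge weight is the sum of the weights of the shared top nodes
    (an assumed aggregation function).
    """
    top_to_bottoms = defaultdict(set)
    for bottom, tops in bottom_to_tops.items():
        for top in tops:
            top_to_bottoms[top].add(bottom)
    edges = defaultdict(float)
    for top, bottoms in top_to_bottoms.items():
        # Sorting gives each undirected edge a single canonical key (u, v).
        for u, v in combinations(sorted(bottoms), 2):
            edges[(u, v)] += weights[top]
    return dict(edges)


def neighbor_vote(edges, labels, node):
    """Stage 3: weighted average of the known 0/1 labels of a node's neighbors.

    Returns None when the node has no labeled neighbor.
    """
    total = 0.0
    positive = 0.0
    for (u, v), w in edges.items():
        if node == u:
            other = v
        elif node == v:
            other = u
        else:
            continue
        if other in labels:
            total += w
            positive += w * labels[other]
    return positive / total if total else None


# Toy bigraph: people (bottom nodes) visiting locations (top nodes).
bottom_to_tops = {
    "ann": {"cafe", "gym"},
    "bob": {"cafe", "gym"},
    "cat": {"cafe", "mall"},
    "dan": {"mall"},
}
labels = {"ann": 1, "dan": 0}  # known classes; bob and cat are unlabeled

weights = top_node_weights(bottom_to_tops)
edges = project(bottom_to_tops, weights)
scores = {n: neighbor_vote(edges, labels, n) for n in ("bob", "cat")}
print(scores)
```

In this toy example the popular location ("cafe") contributes less to each edge than the rarer ones, so "cat" is pulled more strongly toward "dan" than toward "ann". Swapping in a different weighting, aggregation, or relational classifier changes only the corresponding function, which is the kind of mixing and matching the abstract describes.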

Suggested Citation

  • STANKOVA, Marija & MARTENS, David & PROVOST, Foster, 2015. "Classification over bipartite graphs through projection," Working Papers 2015001, University of Antwerp, Faculty of Business and Economics.
  • Handle: RePEc:ant:wpaper:2015001

    Download full text from publisher

    File URL: https://repository.uantwerpen.be/docman/irua/07acff/c5909d64.pdf
    Download Restriction: no


    Citations


    Cited by:

    1. DE CNUDDE, Sofie & MOEYERSOMS, Julie & STANKOVA, Marija & TOBBACK, Ellen & JAVALY, Vinayak & MARTENS, David, 2015. "Who cares about your Facebook friends? Credit scoring for microfinance," Working Papers 2015018, University of Antwerp, Faculty of Business and Economics.
    2. TOBBACK, Ellen & MOEYERSOMS, Julie & STANKOVA, Marija & MARTENS, David, 2016. "Bankruptcy prediction for SMEs using relational data," Working Papers 2016004, University of Antwerp, Faculty of Business and Economics.
    3. TOBBACK, Ellen & MARTENS, David, 2017. "Retail credit scoring using fine-grained payment data," Working Papers 2017011, University of Antwerp, Faculty of Business and Economics.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lomi, Alessandro & Conaldi, Guido & Tonellato, Marco & Pallotti, Francesca, 2014. "Participation motifs and the emergence of organization in open productions," Structural Change and Economic Dynamics, Elsevier, vol. 29(C), pages 40-57.
    2. Blazquez-Soriano, Amparo & Ramos-Sandoval, Rosmery, 2022. "Information transfer as a tool to improve the resilience of farmers against the effects of climate change: The case of the Peruvian National Agrarian Innovation System," Agricultural Systems, Elsevier, vol. 200(C).
    3. Martin L. Weitzman, 2015. "A Voting Architecture for the Governance of Free-Driver Externalities, with Application to Geoengineering," Scandinavian Journal of Economics, Wiley Blackwell, vol. 117(4), pages 1049-1068, October.
    4. Wei Zhong, 2017. "Simulating influenza pandemic dynamics with public risk communication and individual responsive behavior," Computational and Mathematical Organization Theory, Springer, vol. 23(4), pages 475-495, December.
    5. Guo Weilong & Minca Andreea & Wang Li, 2016. "The topology of overlapping portfolio networks," Statistics & Risk Modeling, De Gruyter, vol. 33(3-4), pages 139-155, December.
    6. Thomas J. Sargent & John Stachurski, 2022. "Economic Networks: Theory and Computation," Papers 2203.11972, arXiv.org, revised Jul 2022.
    7. Derhami, Shahab & Smith, Alice E., 2017. "An integer programming approach for fuzzy rule-based classification systems," European Journal of Operational Research, Elsevier, vol. 256(3), pages 924-934.
    8. Bernd (B.) Heidergott & Jia-Ping Huang & Ines (I.) Lindner, 2018. "Naive Learning in Social Networks with Random Communication," Tinbergen Institute Discussion Papers 18-018/II, Tinbergen Institute.
    9. Johannes M. Bauer & Michael Latzer, 2016. "The economics of the Internet: an overview," Chapters, in: Johannes M. Bauer & Michael Latzer (ed.), Handbook on the Economics of the Internet, chapter 1, pages 3-20, Edward Elgar Publishing.
    10. Joanna Tyrowicz & Siri Terjesen & Jakub Mazurek, 2017. "All on board? New evidence on board gender diversity from a large panel of firms," GRAPE Working Papers 5, GRAPE Group for Research in Applied Economics.
    11. Ted Briscoe, 2008. "Language learning, power laws, and sexual selection," Mind & Society: Cognitive Studies in Economics and Social Sciences, Springer;Fondazione Rosselli, vol. 7(1), pages 65-76, June.
    12. Jorge Peña & Yannick Rochat, 2012. "Bipartite Graphs as Models of Population Structures in Evolutionary Multiplayer Games," PLOS ONE, Public Library of Science, vol. 7(9), pages 1-13, September.
    13. Kobayashi, Teruyoshi & Takaguchi, Taro, 2018. "Identifying relationship lending in the interbank market: A network approach," Journal of Banking & Finance, Elsevier, vol. 97(C), pages 20-36.
    14. Konstantinos Antoniadis & Kostas Zafiropoulos & Vasiliki Vrana, 2016. "A Method for Assessing the Performance of e-Government Twitter Accounts," Future Internet, MDPI, vol. 8(2), pages 1-18, April.
    15. Li, Yibei & Wang, Ximei & Djehiche, Boualem & Hu, Xiaoming, 2020. "Credit scoring by incorporating dynamic networked information," European Journal of Operational Research, Elsevier, vol. 286(3), pages 1103-1112.
    16. Maness, Michael & Cirillo, Cinzia, 2016. "An indirect latent informational conformity social influence choice model: Formulation and case study," Transportation Research Part B: Methodological, Elsevier, vol. 93(PA), pages 75-101.
    17. Bauer, Johannes M., 2014. "Platforms, systems competition, and innovation: Reassessing the foundations of communications policy," Telecommunications Policy, Elsevier, vol. 38(8), pages 662-673.
    18. Loterman, Gert & Brown, Iain & Martens, David & Mues, Christophe & Baesens, Bart, 2012. "Benchmarking regression algorithms for loss given default modeling," International Journal of Forecasting, Elsevier, vol. 28(1), pages 161-170.
    19. Julia Neidhardt & Nataliia Rümmele & Hannes Werthner, 0. "Predicting happiness: user interactions and sentiment analysis in an online travel forum," Information Technology & Tourism, Springer, vol. 0, pages 1-19.
    20. OKUBO Toshihiro & ONO Yukako & SAITO Yukiko, 2014. "Roles of Wholesalers in Transaction Networks," Discussion papers 14059, Research Institute of Economy, Trade and Industry (RIETI).

    More about this item

    Keywords

    Bipartite graphs; Two-mode networks; Affiliation networks; Node classification; Big data;

