
Flexible sampling large-scale social networks by self-adjustable random walk

Author

Listed:
  • Xu, Xiao-Ke
  • Zhu, Jonathan J.H.

Abstract

Online social networks (OSNs) have become an increasingly attractive gold mine for academic and commercial researchers. However, research on OSNs faces a number of difficult challenges. One bottleneck lies in the massive size and frequent unavailability of OSN population data. Sampling is perhaps the only feasible solution to these problems. How to draw samples that represent the underlying OSNs has remained a formidable task for a number of conceptual and methodological reasons. In particular, most empirically driven studies on network sampling are confined to simulated data or sub-graph data, which are fundamentally different from real, complete-graph OSNs. In the current study, we propose a flexible sampling method, called Self-Adjustable Random Walk (SARW), and test it against the population data of a real large-scale OSN. We evaluate the strengths of the sampling method in comparison with four prevailing methods: uniform, breadth-first search (BFS), random walk (RW), and revised RW (i.e., MHRW) sampling. We mix both induced-edge and external-edge information of sampled nodes in the same sampling process. Our results show that the SARW sampling method is able to generate unbiased samples of OSNs with maximal precision and minimal cost. The study contributes to the practice of OSN research by providing a much-needed sampling tool, to the methodological development of large-scale network sampling through comparative evaluations of existing sampling methods, and to the theoretical understanding of human networks by highlighting discrepancies and contradictions between existing knowledge/assumptions and large-scale real OSN data.
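
To make two of the baseline methods named in the abstract concrete, the following is a minimal Python sketch contrasting a simple random walk (RW) with a Metropolis-Hastings random walk (MHRW). It is not the authors' SARW method, whose details this record does not give. The toy graph, the function names, and the step count are all assumptions chosen for illustration. The sketch shows the well-known bias of RW toward high-degree nodes and how the MHRW acceptance rule corrects it toward uniform node coverage.

```python
import random
from collections import Counter


def random_walk(adj, start, steps, rng):
    """Simple random walk: move to a uniformly chosen neighbor each step."""
    node = start
    visited = [node]
    for _ in range(steps):
        node = rng.choice(adj[node])
        visited.append(node)
    return visited


def mh_random_walk(adj, start, steps, rng):
    """Metropolis-Hastings random walk targeting a uniform node distribution."""
    node = start
    visited = [node]
    for _ in range(steps):
        candidate = rng.choice(adj[node])
        # Accept with probability min(1, deg(node) / deg(candidate));
        # a rejected move keeps the walker in place and counts as a revisit.
        if rng.random() < len(adj[node]) / len(adj[candidate]):
            node = candidate
        visited.append(node)
    return visited


def visit_shares(visited):
    """Fraction of walk positions spent on each node."""
    counts = Counter(visited)
    total = len(visited)
    return {node: count / total for node, count in counts.items()}


# Hypothetical toy graph: hub "h" linked to four leaves, plus edges a-b and c-d
# so the graph contains triangles (connected and non-bipartite).
adj = {
    "h": ["a", "b", "c", "d"],
    "a": ["h", "b"],
    "b": ["h", "a"],
    "c": ["h", "d"],
    "d": ["h", "c"],
}

steps = 100_000  # assumed walk length, long enough for the shares to settle
rw_share = visit_shares(random_walk(adj, "a", steps, random.Random(0)))
mh_share = visit_shares(mh_random_walk(adj, "a", steps, random.Random(0)))

print("RW share of hub:  ", round(rw_share["h"], 3))
print("MHRW share of hub:", round(mh_share["h"], 3))
```

On this graph the hub has degree 4 out of a total degree of 12, so a plain RW spends about one third of its time there, while MHRW visits each of the five nodes about equally often. In practice RW samples are often re-weighted by inverse degree rather than corrected during the walk; which approach is cheaper depends on the network.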

Suggested Citation

  • Xu, Xiao-Ke & Zhu, Jonathan J.H., 2016. "Flexible sampling large-scale social networks by self-adjustable random walk," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 463(C), pages 356-365.
  • Handle: RePEc:eee:phsmap:v:463:y:2016:i:c:p:356-365
    DOI: 10.1016/j.physa.2016.07.055

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437116304861
    Download Restriction: Full text for ScienceDirect subscribers only. The journal offers the option of making the article available online on ScienceDirect for a fee of $3,000.




    Citations



    Cited by:

    1. Fuentes, Emilio Aced & Santini, Simone, 2021. "Network navigation with non-Lèvy superdiffusive random walks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 580(C).
    2. Xianer Ying & Mengshuang Pan & Xiner Chen & Yiyi Zhou & Jianhua Liu & Dazhi Li & Binghao Guo & Zihao Zhu, 2024. "Research on Virus Propagation Network Intrusion Detection Based on Graph Neural Network," Mathematics, MDPI, vol. 12(10), pages 1-11, May.
    3. Xu, Xiao-Ke & Wang, Xue & Xiao, Jing, 2018. "Inferring parent–child relationships by a node-remove centrality framework in online social networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 505(C), pages 222-232.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Blazquez-Soriano, Amparo & Ramos-Sandoval, Rosmery, 2022. "Information transfer as a tool to improve the resilience of farmers against the effects of climate change: The case of the Peruvian National Agrarian Innovation System," Agricultural Systems, Elsevier, vol. 200(C).
    2. Martin L. Weitzman, 2015. "A Voting Architecture for the Governance of Free-Driver Externalities, with Application to Geoengineering," Scandinavian Journal of Economics, Wiley Blackwell, vol. 117(4), pages 1049-1068, October.
    3. Wei Zhong, 2017. "Simulating influenza pandemic dynamics with public risk communication and individual responsive behavior," Computational and Mathematical Organization Theory, Springer, vol. 23(4), pages 475-495, December.
    4. Guo Weilong & Minca Andreea & Wang Li, 2016. "The topology of overlapping portfolio networks," Statistics & Risk Modeling, De Gruyter, vol. 33(3-4), pages 139-155, December.
    5. Thomas J. Sargent & John Stachurski, 2022. "Economic Networks: Theory and Computation," Papers 2203.11972, arXiv.org, revised Jul 2022.
    6. Bernd (B.) Heidergott & Jia-Ping Huang & Ines (I.) Lindner, 2018. "Naive Learning in Social Networks with Random Communication," Tinbergen Institute Discussion Papers 18-018/II, Tinbergen Institute.
    7. Johannes M. Bauer & Michael Latzer, 2016. "The economics of the Internet: an overview," Chapters, in: Johannes M. Bauer & Michael Latzer (ed.), Handbook on the Economics of the Internet, chapter 1, pages 3-20, Edward Elgar Publishing.
    8. Kobayashi, Teruyoshi & Takaguchi, Taro, 2018. "Identifying relationship lending in the interbank market: A network approach," Journal of Banking & Finance, Elsevier, vol. 97(C), pages 20-36.
    9. Konstantinos Antoniadis & Kostas Zafiropoulos & Vasiliki Vrana, 2016. "A Method for Assessing the Performance of e-Government Twitter Accounts," Future Internet, MDPI, vol. 8(2), pages 1-18, April.
    10. Maness, Michael & Cirillo, Cinzia, 2016. "An indirect latent informational conformity social influence choice model: Formulation and case study," Transportation Research Part B: Methodological, Elsevier, vol. 93(PA), pages 75-101.
    11. Bauer, Johannes M., 2014. "Platforms, systems competition, and innovation: Reassessing the foundations of communications policy," Telecommunications Policy, Elsevier, vol. 38(8), pages 662-673.
    12. Julia Neidhardt & Nataliia Rümmele & Hannes Werthner, 0. "Predicting happiness: user interactions and sentiment analysis in an online travel forum," Information Technology & Tourism, Springer, vol. 0, pages 1-19.
    13. OKUBO Toshihiro & ONO Yukako & SAITO Yukiko, 2014. "Roles of Wholesalers in Transaction Networks," Discussion papers 14059, Research Institute of Economy, Trade and Industry (RIETI).
    14. Glover, Dominic & Kim, Sung Kyu & Stone, Glenn Davis, 2020. "Golden Rice and technology adoption theory: A study of seed choice dynamics among rice growers in the Philippines," Technology in Society, Elsevier, vol. 60(C).
    15. Daron Acemoglu & Victor Chernozhukov & Iván Werning & Michael D. Whinston, 2021. "Optimal Targeted Lockdowns in a Multigroup SIR Model," American Economic Review: Insights, American Economic Association, vol. 3(4), pages 487-502, December.
    16. Mark Braverman & Jing Chen & Sampath Kannan, 2016. "Optimal Provision-After-Wait in Healthcare," Mathematics of Operations Research, INFORMS, vol. 41(1), pages 352-376, February.
    17. Lomi, Alessandro & Fonti, Fabio, 2012. "Networks in markets and the propensity of companies to collaborate: An empirical test of three mechanisms," Economics Letters, Elsevier, vol. 114(2), pages 216-220.
    18. Zhang, Xuxi & Liu, Xianping & Lewis, Frank L. & Wang, Xia, 2020. "Bipartite tracking consensus of nonlinear multi-agent systems," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
    19. Venkat Venkatasubramanian & Yu Luo, 2018. "How much income inequality is fair? Nash bargaining solution and its connection to entropy," Papers 1806.05262, arXiv.org.
    20. Bing Han & Liyan Yang, 2013. "Social Networks, Information Acquisition, and Asset Prices," Management Science, INFORMS, vol. 59(6), pages 1444-1457, June.
