Learning to Correlate Accounts Across Online Social Networks: An Embedding-Based Approach

Author

Listed:
  • Fan Zhou

    (School of Information and Software Engineering, University of Electronic Science and Technology of China, 610054 Chengdu, China)

  • Kunpeng Zhang

    (Decision, Operations & Information Technologies, University of Maryland, College Park, Maryland 20742)

  • Shuying Xie

    (JD.com, Inc., 101111 Beijing, China)

  • Xucheng Luo

    (School of Information and Software Engineering, University of Electronic Science and Technology of China, 610054 Chengdu, China)

Abstract

Cross-site account correlation correlates users who have multiple accounts but the same identity across online social networks (OSNs). Being able to identify cross-site users is important for a variety of applications in social networks, security, and electronic commerce, such as social link prediction and cross-domain recommendation. Because of either heterogeneous characteristics of platforms or some unobserved but intrinsic individual factors, the same individuals are likely to behave differently across OSNs, which accordingly causes many challenges for correlating accounts. Traditionally, account correlation is measured by analyzing user-generated content, such as writing style, rules of naming user accounts, or some existing metadata (e.g., account profile, account historical activities). Accounts can be correlated by de-anonymizing user behaviors, which is sometimes infeasible since such data are not often available. In this work, we propose a method, called ACCount eMbedding (ACCM), to go beyond text data and leverage semantics of network structures, a possibility that has not been well explored so far. ACCM aims to correlate accounts with high accuracy by exploiting the semantic information among accounts through random walks. It models and understands latent representations of accounts using an embedding framework similar to sequences of words in natural language models. It also learns a transformation matrix to project node representations into a common dimensional space for comparison. With evaluations on both real-world and synthetic data sets, we empirically demonstrate that ACCM provides performance improvement compared with several state-of-the-art baselines in correlating user accounts between OSNs.
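
The abstract names three ingredients: random walks over each network, embeddings learned from those walks as if they were word sequences, and a transformation matrix that projects node representations into a common space. The sketch below is not the authors' ACCM implementation; it only illustrates the walk-generation and projection steps under loud assumptions. The function names (`random_walks`, `fit_projection`, `match_accounts`) are hypothetical, the skip-gram training step is omitted, the embeddings are synthetic, and ordinary least squares over known matched ("anchor") accounts stands in for however the paper actually learns its transformation.

```python
import random
import numpy as np

def random_walks(adj, walk_len, walks_per_node, seed=0):
    """Uniform random walks; each walk acts as a 'sentence' of account ids."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            # Stop early only if the current node has no neighbours.
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

def fit_projection(src_anchor, dst_anchor):
    """Least-squares W minimising ||src_anchor @ W - dst_anchor||_F (an assumed stand-in)."""
    W, *_ = np.linalg.lstsq(src_anchor, dst_anchor, rcond=None)
    return W

def match_accounts(src_emb, dst_emb, W):
    """For each source account, index of the most cosine-similar target account."""
    proj = src_emb @ W
    proj = proj / np.linalg.norm(proj, axis=1, keepdims=True)
    dst = dst_emb / np.linalg.norm(dst_emb, axis=1, keepdims=True)
    return np.argmax(proj @ dst.T, axis=1)

# Toy network: walks over it would normally feed a skip-gram model (omitted here).
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walk_len=5, walks_per_node=2)

# Synthetic embeddings: the second network's space is a linear image of the first.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 8))            # 50 accounts, 8-dim embeddings on network 1
dst = src @ rng.normal(size=(8, 6))       # same accounts, 6-dim embeddings on network 2
W = fit_projection(src[:20], dst[:20])    # first 20 accounts are known anchor pairs
pred = match_accounts(src[20:], dst[20:], W)  # correlate the remaining 30 accounts
```

Because the synthetic target embeddings are an exact linear image of the source embeddings, the fitted projection recovers every held-out pair here; embeddings learned from real networks would be noisier, so matching accuracy would depend on the quality of the embeddings and on the number of anchors.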

Suggested Citation

  • Fan Zhou & Kunpeng Zhang & Shuying Xie & Xucheng Luo, 2020. "Learning to Correlate Accounts Across Online Social Networks: An Embedding-Based Approach," INFORMS Journal on Computing, INFORMS, vol. 32(3), pages 714-729, July.
  • Handle: RePEc:inm:orijoc:v:32:y:3:i:2020:p:714-729
    DOI: 10.1287/ijoc.2019.0911

    Download full text from publisher

    File URL: https://doi.org/10.1287/ijoc.2019.0911
    Download Restriction: no


    Citations


    Cited by:

    1. Yaxuan Ran & Jiani Liu & Yishi Zhang, 2023. "Integrating Users’ Contextual Engagements with Their General Preferences: An Interpretable Followee Recommendation Method," INFORMS Journal on Computing, INFORMS, vol. 35(3), pages 614-632, May.
    2. Xi Chen & Yan Liu & Cheng Zhang, 2022. "Distinguishing Homophily from Peer Influence Through Network Representation Learning," INFORMS Journal on Computing, INFORMS, vol. 34(4), pages 1958-1969, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fan Zhou & Kunpeng Zhang & Bangying Wu & Yi Yang & Harry Jiannan Wang, 2021. "Unifying Online and Offline Preference for Social Link Prediction," INFORMS Journal on Computing, INFORMS, vol. 33(4), pages 1400-1418, October.
    2. Li, Yongli & Luo, Peng & Fan, Zhi-ping & Chen, Kun & Liu, Jiaguo, 2017. "A utility-based link prediction method in social networks," European Journal of Operational Research, Elsevier, vol. 260(2), pages 693-705.
    3. Moradabadi, Behnaz & Meybodi, Mohammad Reza, 2016. "Link prediction based on temporal similarity metrics using continuous action set learning automata," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 460(C), pages 361-373.
    4. Xiaoye Cheng & Jingjing Zhang & Lu (Lucy) Yan, 2020. "Understanding the Impact of Individual Users’ Rating Characteristics on the Predictive Accuracy of Recommender Systems," INFORMS Journal on Computing, INFORMS, vol. 32(2), pages 303-320, April.
    5. Yi Qian & Hui Xie, 2013. "Drive More Effective Data-Based Innovations: Enhancing the Utility of Secure Databases," NBER Working Papers 19586, National Bureau of Economic Research, Inc.
    6. Douglas Castilho & Tharsis T. P. Souza & Soong Moon Kang & João Gama & André C. P. L. F. de Carvalho, 2021. "Forecasting Financial Market Structure from Network Features using Machine Learning," Papers 2110.11751, arXiv.org.
    7. Piotr Łukasiak & Jacek Błażewicz & Maciej Miłostan, 2010. "Some operations research methods for analyzing protein sequences and structures," Annals of Operations Research, Springer, vol. 175(1), pages 9-35, March.
    8. Morlok, Tina & Matt, Christian & Hess, Thomas, 2017. "Privatheitsforschung in den Wirtschaftswissenschaften: Entwicklung, Stand und Perspektiven," Working Papers 1/2017, University of Munich, Munich School of Management, Institute for Information Systems and New Media.
    9. Constantine N. Goulimis, 2007. "ASP, The Art and Science of Practice: Appeal to NP-Completeness Considered Harmful: Does the Fact That a Problem Is NP-Complete Tell Us Anything?," Interfaces, INFORMS, vol. 37(6), pages 584-586, December.
    10. Haibing Lu & Jaideep Vaidya & Vijayalakshmi Atluri & Yingjiu Li, 2015. "Statistical Database Auditing Without Query Denial Threat," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 20-34, February.
    11. Meghanath Macha & Natasha Zhang Foutz & Beibei Li & Anindya Ghose, 2024. "Personalized Privacy Preservation in Consumer Mobile Trajectories," Information Systems Research, INFORMS, vol. 35(1), pages 249-271, March.
    12. Feifei He & Chunhua Sun & Yezheng Liu, 2023. "What social characteristics enhance recommender systems? The effects of network embeddedness and preference heterogeneity," Electronic Commerce Research, Springer, vol. 23(3), pages 1807-1827, September.
    13. Moradabadi, Behnaz & Meybodi, Mohammad Reza, 2017. "A novel time series link prediction method: Learning automata approach," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 482(C), pages 422-432.
    14. Douglas Altner & Özlem Ergun, 2011. "Rapidly computing robust minimum capacity s-t cuts: a case study in solving a sequence of maximum flow problems," Annals of Operations Research, Springer, vol. 184(1), pages 3-26, April.
    15. Yi Qian & Hui Xie, 2015. "Drive More Effective Data-Based Innovations: Enhancing the Utility of Secure Databases," Management Science, INFORMS, vol. 61(3), pages 520-541, March.
    16. Xiao-Bai Li & Jialun Qin, 2017. "Anonymizing and Sharing Medical Text Records," Information Systems Research, INFORMS, vol. 28(2), pages 332-352, June.
    17. Steffen Rebennack & Marcus Oswald & Dirk Oliver Theis & Hanna Seitz & Gerhard Reinelt & Panos M. Pardalos, 2011. "A Branch and Cut solver for the maximum stable set problem," Journal of Combinatorial Optimization, Springer, vol. 21(4), pages 434-457, May.
    18. Aslan, Serpil & Kaya, Buket & Kaya, Mehmet, 2019. "Predicting potential links by using strengthened projections in evolving bipartite networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 525(C), pages 998-1011.
    19. Ashkan Eshghi & Ram D. Gopal & Hooman Hidaji & Raymond A. Patterson, 2023. "Now You See It, Now You Don’t: Obfuscation of Online Third-Party Information Sharing," INFORMS Journal on Computing, INFORMS, vol. 35(2), pages 286-303, March.
    20. Shaobo Li & Matthew J. Schneider & Yan Yu & Sachin Gupta, 2023. "Reidentification Risk in Panel Data: Protecting for k-Anonymity," Information Systems Research, INFORMS, vol. 34(3), pages 1066-1088, September.
