
WhatsApp usage patterns and prediction of demographic characteristics without access to message content

Authors

  • Avi Rosenfeld (Jerusalem College of Technology)
  • Sigal Sina (Bar-Ilan University)
  • David Sarne (Bar-Ilan University)
  • Or Avidov (Bar-Ilan University)
  • Sarit Kraus (Bar-Ilan University)

Abstract

Background: Social networks on the Internet have become ubiquitous applications that allow people to easily share text, pictures, and audio and video files. Popular networks include WhatsApp, Facebook, Reddit, and LinkedIn.

Objective: We present an extensive study of the usage of the WhatsApp social network, an Internet messaging application that is quickly replacing SMS (short message service) messaging. To better understand people’s use of the network, we provide an analysis of over 6 million encrypted messages from over 100 users, with the objective of building demographic prediction models that use activity data but not the content of these messages.

Methods: We performed extensive statistical and numerical analysis of the data and found significant differences in WhatsApp usage across people of different genders and ages. We also entered the data into the Weka and pROC data mining packages and studied models created from decision trees, Bayesian networks, and logistic regression algorithms.

Results: We found that different gender and age demographics had significantly different usage habits in almost all message and group attributes. We also noted differences in users’ group behavior and created prediction models for group usage, including the likelihood that a given group would have relatively more file attachments and whether a group would contain a larger number of participants, a higher frequency of activity, quicker response times, and shorter messages.

Conclusions: We were successful in quantifying and predicting a user’s gender and age demographic. Similarly, we were able to predict different types of group usage. All models were built without analyzing message content.

Contribution: The main contribution of this paper is the ability to predict user demographics without having access to users’ text content. We present a detailed discussion of the specific attributes that were contained in all predictive models and suggest possible applications based on these results.
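
Illustrative sketch (not the authors' code): the abstract describes models built from activity data rather than message text. The Python below shows, under stated assumptions, what that general idea can look like: per-user features are computed from message metadata only, and a small logistic regression is fitted by gradient descent. The MessageMeta fields, the three features, the labels 0 and 1, and the toy numbers are all hypothetical and synthetic; the paper itself used Weka and pROC, and its actual attributes are not listed on this page.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class MessageMeta:
    """Hypothetical metadata for one message; the text itself is never stored."""
    timestamp: float       # seconds since an arbitrary start
    length: int            # character count only
    has_attachment: bool


def user_features(messages):
    """Return [average length, attachment rate, average gap in minutes] for one user."""
    ordered = sorted(messages, key=lambda m: m.timestamp)
    gaps = [(b.timestamp - a.timestamp) / 60 for a, b in zip(ordered, ordered[1:])]
    return [
        mean(m.length for m in ordered),
        mean(1.0 if m.has_attachment else 0.0 for m in ordered),
        mean(gaps) if gaps else 0.0,
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    """Probability of label 1; the last weight is the bias term."""
    return sigmoid(sum(w * v for w, v in zip(weights, x + [1.0])))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent on log loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict(weights, xi) - yi
            for j, v in enumerate(xi + [1.0]):
                grad[j] += err * v
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


# Synthetic, pre-scaled feature rows (values in [0, 1]) with made-up labels.
toy_X = [[0.2, 0.1, 0.9], [0.3, 0.2, 0.8], [0.8, 0.7, 0.2], [0.9, 0.6, 0.1]]
toy_y = [0, 0, 1, 1]
toy_weights = train_logistic(toy_X, toy_y)
print(predict(toy_weights, [0.85, 0.65, 0.15]))
```

In practice the raw features from user_features would need scaling before training, and a held-out set would be needed to judge whether such a model predicts anything at all.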

Suggested Citation

  • Avi Rosenfeld & Sigal Sina & David Sarne & Or Avidov & Sarit Kraus, 2018. "WhatsApp usage patterns and prediction of demographic characteristics without access to message content," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 39(22), pages 647-670.
  • Handle: RePEc:dem:demres:v:39:y:2018:i:22
    DOI: 10.4054/DemRes.2018.39.22

    Download full text from publisher

    File URL: https://www.demographic-research.org/volumes/vol39/22/39-22.pdf
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tian Tian & Stijn Speelman, 2021. "Pursuing Development behind Heterogeneous Ideologies: Review of Six Evolving Themes and Narratives of Rural Planning in China," Sustainability, MDPI, vol. 13(17), pages 1-16, September.
    2. Liwen Vaughan, 2016. "Uncovering information from social media hyperlinks: An investigation of twitter," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(5), pages 1105-1120, May.
    3. Julia Neidhardt & Nataliia Rümmele & Hannes Werthner, 2017. "Predicting happiness: user interactions and sentiment analysis in an online travel forum," Information Technology & Tourism, Springer, vol. 17(1), pages 101-119, March.
    4. F. Schweitzer & D. Garcia, 2010. "An agent-based model of collective emotions in online communities," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 77(4), pages 533-545, October.
    5. Li, Xianghua & Wang, Zhen & Gao, Chao & Shi, Lei, 2017. "Reasoning human emotional responses from large-scale social and public media," Applied Mathematics and Computation, Elsevier, vol. 310(C), pages 182-193.
    6. Anupriya Khan & Satish Krishnan & Jithesh Arayankalam, 2022. "The Role of ICT Laws and National Culture in Determining ICT Diffusion and Well-Being: A Cross-Country Examination," Information Systems Frontiers, Springer, vol. 24(2), pages 415-440, April.
    7. Ibtesam AbdulAziz Bajri & Nada Abdulmajeed Lashkar, 2020. "Saudi Gender Emotional Expressions in Using Instagram," English Language Teaching, Canadian Center of Science and Education, vol. 13(5), pages 1-94, May.
    8. Roser Beneito-Montagut, 2017. "Emotions, Everyday Life, and the Social Web: Age, Gender, and Social Web Engagement Effects on Online Emotional Expression," Sociological Research Online, , vol. 22(4), pages 87-104, December.
    9. Julia Neidhardt & Nataliia Rümmele & Hannes Werthner, 0. "Predicting happiness: user interactions and sentiment analysis in an online travel forum," Information Technology & Tourism, Springer, vol. 0, pages 1-19.
    10. Jacqueline Ng Lane & Bruce Ankenman & Seyed Iravani, 2018. "Insight into Gender Differences in Higher Education: Evidence from Peer Reviews in an Introductory STEM Course," Service Science, INFORMS, vol. 10(4), pages 442-456, December.
    11. Setten, Eric & Chen, Steven, 2024. "Playing with emotions: Text analysis of emotional tones in gender-casted Children’s media," Journal of Business Research, Elsevier, vol. 175(C).
    12. Yulei Gavin Zhang & Mandy Yan Dang & Hsinchun Chen, 2020. "An Explorative Study on the Virtual World: Investigating the Avatar Gender and Avatar Age Differences in their Social Interactions for Help-Seeking," Information Systems Frontiers, Springer, vol. 22(4), pages 911-925, August.
    13. Chen, Aihui & Lu, Yaobin & Wang, Bin & Zhao, Ling & Li, Ming, 2013. "What drives content creation behavior on SNSs? A commitment perspective," Journal of Business Research, Elsevier, vol. 66(12), pages 2529-2535.
    14. Chmiel, Anna & Sobkowicz, Pawel & Sienkiewicz, Julian & Paltoglou, Georgios & Buckley, Kevan & Thelwall, Mike & Hołyst, Janusz A., 2011. "Negative emotions boost user activity at BBC forum," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 390(16), pages 2936-2944.
    15. Rahman, Shaikh Moksadur, 2020. "Relationship between Job Satisfaction and Turnover Intention: Evidence from Bangladesh," Asian Business Review, Asian Business Consortium, vol. 10(2), pages 99-108.
    16. Naveena Prakasam & Louisa Huxtable-Thomas, 2021. "Reddit: Affordances as an Enabler for Shifting Loyalties," Information Systems Frontiers, Springer, vol. 23(3), pages 723-751, June.
    17. Valeriy Makarov & Albert Bakhtizin, 2014. "The Estimation Of The Regions’ Efficiency Of The Russian Federation Including The Intellectual Capital, The Characteristics Of Readiness For Innovation, Level Of Well-Being, And Quality Of Life," Economy of region, Centre for Economic Security, Institute of Economics of Ural Branch of Russian Academy of Sciences, vol. 1(4), pages 9-30.
    18. Kristine Edgar Danielyan & Samvel Grigoriy Chailyan, 2019. "Delineation of Effectors Impact on The Human Brain Derived Phosphoribosylpyrophosphate Synthetase-1 Activity," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 24(1), pages 17918-17926, December.
    19. Chuan Wang & Yupeng Liu & Wen Hou & Chao Yu & Guorong Wang & Yuyan Zheng, 2021. "Reliability and availability modeling of Subsea Autonomous High Integrity Pressure Protection System with partial stroke test by Dynamic Bayesian," Journal of Risk and Reliability, , vol. 235(2), pages 268-281, April.
    20. Sana Sadiq & Khadija Anasse & Najib Slimani, 2022. "The impact of mobile phones on high school students: connecting the research dots," Technium Social Sciences Journal, Technium Science, vol. 30(1), pages 252-270, April.

    More about this item

    Keywords

    social network; demographics; social media; WhatsApp; usage prediction;

    JEL classification:

    • J1 - Labor and Demographic Economics - - Demographic Economics
    • Z0 - Other Special Topics - - General
