Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal

Authors

  • Iacus Stefano M.

    (Department of Economics, Management and Quantitative Methods, University of Milan, Via Conservatorio 7 - 20122, Milan, Italy.)

  • Porro Giuseppe

    (Department of Law, Economics and Culture, University of Insubria, Via Sant’Abbondio, 12 - 22100, Como, Italy.)

  • Salini Silvia

    (Department of Economics, Management and Quantitative Methods, University of Milan, Via Conservatorio 7 - 20122, Milan, Italy.)

  • Siletti Elena

    (Department of Economics, Management and Quantitative Methods, University of Milan, Via Conservatorio 7 - 20122, Milan, Italy.)

Abstract

With the increase of social media usage, a huge new source of data has become available. Despite the enthusiasm linked to this revolution, one of the main outstanding criticisms in using these data is selection bias. Indeed, the reference population is unknown. Nevertheless, many studies show evidence that these data constitute a valuable source because they are more timely and have finer spatial granularity. We propose to adjust statistics based on Twitter data by anchoring them to reliable official statistics through a weighted, space-time, small area estimation model. As a by-product, the proposed method also stabilizes the social media indicators, which is a welcome property required for official statistics. The method can be adapted whenever official statistics exist at the proper level of granularity and social media usage within the population is known. As an example, we adjust a subjective well-being indicator of “working conditions” in Italy, and combine it with relevant official statistics. The weights depend on broadband coverage and the Twitter rate at province level, while the analysis is performed at regional level. The resulting statistics are then compared with survey statistics on the “quality of job” at macro-economic regional level, showing evidence of similar paths.

Suggested Citation

  • Iacus Stefano M. & Porro Giuseppe & Salini Silvia & Siletti Elena, 2020. "Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal," Journal of Official Statistics, Sciendo, vol. 36(2), pages 315-338, June.
  • Handle: RePEc:vrs:offsta:v:36:y:2020:i:2:p:315-338:n:6
    DOI: 10.2478/jos-2020-0017

    Download full text from publisher

    File URL: https://doi.org/10.2478/jos-2020-0017
    Download Restriction: no

    References listed on IDEAS

    1. Marhuenda, Yolanda & Molina, Isabel & Morales, Domingo, 2013. "Small area estimation with spatio-temporal Fay–Herriot models," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 308-325.
    2. Braaksma, Barteld & Zeelenberg, Kees, 2015. "“Re-make/Re-model”: Should big data change the modelling paradigm in official statistics?," MPRA Paper 87741, University Library of Munich, Germany.
    3. repec:bla:istatr:v:83:y:2015:i:3:p:436-448 is not listed on IDEAS
    4. Sharon E Alajajian & Jake Ryland Williams & Andrew J Reagan & Stephen C Alajajian & Morgan R Frank & Lewis Mitchell & Jacob Lahne & Christopher M Danforth & Peter Sheridan Dodds, 2017. "The Lexicocalorimeter: Gauging public health through caloric input and output on social media," PLOS ONE, Public Library of Science, vol. 12(2), pages 1-25, February.
    5. Marc Fleurbaey, 2009. "Beyond GDP: The Quest for a Measure of Social Welfare," Journal of Economic Literature, American Economic Association, vol. 47(4), pages 1029-1075, December.
    6. Lynn M. R. Ybarra & Sharon L. Lohr, 2008. "Small area estimation when auxiliary information is measured with error," Biometrika, Biometrika Trust, vol. 95(4), pages 919-931.
    7. Stefano Marchetti & Caterina Giusti & Monica Pratesi, 2016. "The use of Twitter data to improve small area estimates of households’ share of food consumption expenditure in Italy [Die Nutzung von Twitter Daten um die Small Area Schätzungen vom Ausgabenanteil," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 10(2), pages 79-93, October.
    8. King, Gary & Pan, Jennifer & Roberts, Margaret E., 2013. "How Censorship in China Allows Government Criticism but Silences Collective Expression," American Political Science Review, Cambridge University Press, vol. 107(2), pages 326-343, May.
    9. Rajagopal, 2014. "The Human Factors," Palgrave Macmillan Books, in: Architecting Enterprise, chapter 9, pages 225-249, Palgrave Macmillan.
    10. King, Gary & Pan, Jennifer & Roberts, Margaret E., 2017. "How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument," American Political Science Review, Cambridge University Press, vol. 111(3), pages 484-501, August.
    11. Luigi Curini & Stefano Iacus & Luciano Canova, 2015. "Measuring Idiosyncratic Happiness Through the Analysis of Twitter: An Application to the Italian Case," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 121(2), pages 525-542, April.
    12. Daniel Kahneman & Alan B. Krueger, 2006. "Developments in the Measurement of Subjective Well-Being," Journal of Economic Perspectives, American Economic Association, vol. 20(1), pages 3-24, Winter.
    13. John Feddersen & Robert Metcalfe & Mark Wooden, 2016. "Subjective wellbeing: why weather matters," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(1), pages 203-228, January.
    14. Stefano Maria IACUS & Giuseppe PORRO & Silvia SALINI & Elena SILETTI, 2015. "Social Networks, Happiness and Health: From Sentiment Analysis to a Multidimensional Indicator of Subjective Well-Being," Departmental Working Papers 2015-20, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.

    Citations

    Cited by:

    1. Dean Fantazzini & Julia Pushchelenko & Alexey Mironenkov & Alexey Kurbatskii, 2021. "Forecasting Internal Migration in Russia Using Google Trends: Evidence from Moscow and Saint Petersburg," Forecasting, MDPI, vol. 3(4), pages 1-30, October.
    2. Rossouw, Stephanie & Greyling, Talita, 2020. "Big Data and Happiness," GLO Discussion Paper Series 634, Global Labor Organization (GLO).
    3. Silvia Facchinetti & Elena Siletti, 2022. "Well-being Indicators: a Review and Comparison in the Context of Italy," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 159(2), pages 523-547, January.
    4. Tiziana CARPI & Airo HINO & Stefano Maria IACUS & Giuseppe PORRO, 2022. "A Japanese Subjective Well-Being Indicator Based on Twitter Data [‘Collective Smile: Measuring Societal Happiness from Geolocated Images’]," Social Science Japan Journal, University of Tokyo and Oxford University Press, vol. 25(2), pages 273-296.
    5. Tiziana Carpi & Airo Hino & Stefano Maria Iacus & Giuseppe Porro, 2021. "Twitter Subjective Well-Being Indicator During COVID-19 Pandemic: A Cross-Country Comparative Study," Papers 2101.07695, arXiv.org.
    6. Federica Cugnata & Silvia Salini & Elena Siletti, 2021. "Deepening Well-Being Evaluation with Different Data Sources: A Bayesian Networks Approach," IJERPH, MDPI, vol. 18(15), pages 1-10, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. S. M. Iacus & G. Porro & S. Salini & E. Siletti, 2022. "An Italian Composite Subjective Well-Being Index: The Voice of Twitter Users from 2012 to 2017," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 161(2), pages 471-489, June.
    2. Federica Cugnata & Silvia Salini & Elena Siletti, 2021. "Deepening Well-Being Evaluation with Different Data Sources: A Bayesian Networks Approach," IJERPH, MDPI, vol. 18(15), pages 1-10, July.
    3. Silvia Facchinetti & Elena Siletti, 2022. "Well-being Indicators: a Review and Comparison in the Context of Italy," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 159(2), pages 523-547, January.
    4. Kedi Liu & Ranran Wang & Inge Schrijver & Rutger Hoekstra, 2024. "Can we project well-being? Towards integral well-being projections in climate models and beyond," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-11, December.
    5. Jan Pablo Burgard & Domingo Morales & Anna-Lena Wölwer, 2022. "Small area estimation of socioeconomic indicators for sampled and unsampled domains," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(2), pages 287-314, June.
    6. Erin Baggott Carter & Brett L. Carter, 2021. "Propaganda and Protest in Autocracies," Journal of Conflict Resolution, Peace Science Society (International), vol. 65(5), pages 919-949, May.
    7. Emilie Frenkiel & Anna Shpakovskaya, 2019. "The Evolution of Representative Claim-Making by the Chinese Communist Party: From Mao to Xi (1949–2019)," Politics and Governance, Cogitatio Press, vol. 7(3), pages 208-219.
    8. Bruno S. Frey, 2011. "Subjective Well-Being, Politics and Political Economy," Swiss Journal of Economics and Statistics (SJES), Swiss Society of Economics and Statistics (SSES), vol. 147(IV), pages 397-415, December.
    9. Akay, Alpaslan & Bargain, Olivier & Elsayed, Ahmed, 2020. "Global terror, well-being and political attitudes," European Economic Review, Elsevier, vol. 123(C).
    10. Zhang, Yinjunjie & Xu, Zhicheng Phil & Palma, Marco A., 2017. "Misclassification Errors of Subjective Well-being: A New Approach to Mapping Happiness," 2017 Annual Meeting, July 30-August 1, Chicago, Illinois 258553, Agricultural and Applied Economics Association.
    11. Benavent, Roberto & Morales, Domingo, 2016. "Multivariate Fay–Herriot models for small area estimation," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 372-390.
    12. Akay, Alpaslan & Bargain, Olivier & Elsayed, Ahmed, 2018. "Everybody's a Victim? Global Terror, Well-Being and Political Attitudes," Working Papers in Economics 733, University of Gothenburg, Department of Economics.
    13. Ferreira, Susana & Akay, Alpaslan & Brereton, Finbarr & Cuñado, Juncal & Martinsson, Peter & Moro, Mirko & Ningal, Tine F., 2013. "Life satisfaction and air quality in Europe," Ecological Economics, Elsevier, vol. 88(C), pages 1-10.
    14. Yonas Alem & Jonathan Colmer, 2015. "Consumption smoothing and the welfare cost of uncertainty," GRI Working Papers 118b, Grantham Research Institute on Climate Change and the Environment.
    15. Yonas Alem & Jonathan Colmer, 2015. "Consumption Smoothing and the Welfare Cost of Uncertainty," CEP Discussion Papers dp1369, Centre for Economic Performance, LSE.
    16. Knight, S.J. & Howley, P., 2017. "Can clean air make you happy? Examining the effect of nitrogen dioxide (NO2) on life satisfaction," Health, Econometrics and Data Group (HEDG) Working Papers 17/08, HEDG, c/o Department of Economics, University of York.
    17. Olivier Bargain, 2017. "Welfare analysis and redistributive policies," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 15(4), pages 393-419, December.
    18. Yonas Alem & Jonathan Colmer, 2015. "Consumption Smoothing and the Welfare Cost of Uncertainty," STICERD - Economic Organisation and Public Policy Discussion Papers Series 059, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    19. Yuta J. Masuda & Jason R. Williams & Heather Tallis, 2021. "Does Life Satisfaction Vary with Time and Income? Investigating the Relationship Among Free Time, Income, and Life Satisfaction," Journal of Happiness Studies, Springer, vol. 22(5), pages 2051-2073, June.
    20. Bahadır Dursun & Resul Cesur, 2016. "Transforming lives: the impact of compulsory schooling on hope and happiness," Journal of Population Economics, Springer;European Society for Population Economics, vol. 29(3), pages 911-956, July.
