Printed from https://ideas.repec.org/a/bla/jorssa/v185y2022is2ps408-s436.html

When survey science met web tracking: Presenting an error framework for metered data

Author

Listed:
  • Oriol J. Bosch
  • Melanie Revilla

Abstract

Metered data, also called web‐tracking data, are generally collected from a sample of participants who willingly install or configure, on their devices, technologies that track the digital traces left when people go online (e.g., URLs visited). Since metered data allow online behaviours to be observed unobtrusively, they have been proposed as a useful tool for understanding what people do online and what impact this might have on online and offline phenomena. It is crucial, nevertheless, to understand their limitations. Although some research has explored the potential errors of metered data, a systematic categorisation and conceptualisation of these errors is missing. Inspired by the Total Survey Error, we present a Total Error framework for digital traces collected with Meters (TEM). The TEM framework (1) describes the data generation and analysis process for metered data and (2) documents the sources of bias and variance that may arise at each step of this process. Using a case study, we also show how the TEM can be applied in practice to identify, quantify and reduce metered data errors. Results suggest that metered data might indeed be affected by the error sources identified in our framework and, to some extent, biased. This framework can help improve the quality of stand‐alone metered data research projects and foster the understanding of how and when survey and metered data can be combined.

Suggested Citation

  • Oriol J. Bosch & Melanie Revilla, 2022. "When survey science met web tracking: Presenting an error framework for metered data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 408-436, December.
  • Handle: RePEc:bla:jorssa:v:185:y:2022:i:s2:p:s408-s436
    DOI: 10.1111/rssa.12956


    References listed on IDEAS

    1. Jake M. Hofman & Duncan J. Watts & Susan Athey & Filiz Garip & Thomas L. Griffiths & Jon Kleinberg & Helen Margetts & Sendhil Mullainathan & Matthew J. Salganik & Simine Vazire & Alessandro Vespignani, 2021. "Integrating explanation and prediction in computational social science," Nature, Nature, vol. 595(7866), pages 181-188, July.
    2. D. L. Oberski & A. Kirchner & S. Eckman & F. Kreuter, 2017. "Evaluating the Quality of Survey and Administrative Data with Generalized Multitrait-Multimethod Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1477-1489, October.

    Citations

    Cited by:

    1. Keusch, Florian & Pankowska, Paulina & Cernat, Alexandru & Bach, Ruben L., 2023. "Do you have two minutes to talk about your data? Willingness to participate and nonparticipation bias in Facebook data donation," SocArXiv n9rx3, Center for Open Science.
    2. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. van Delden Arnout & van der Laan Jan & Prins Annemarie, 2018. "Detecting Reporting Errors in Data from Decentralised Autonomous Administrations with an Application to Hospital Data," Journal of Official Statistics, Sciendo, vol. 34(4), pages 863-888, December.
    2. Meyer, Bruce D. & Mittag, Nikolas, 2019. "Combining Administrative and Survey Data to Improve Income Measurement," IZA Discussion Papers 12266, Institute of Labor Economics (IZA).
    3. Nelson P. Rayl & Nitish R. Sinha, 2022. "Integrating Prediction and Attribution to Classify News," Finance and Economics Discussion Series 2022-042, Board of Governors of the Federal Reserve System (U.S.).
    4. Ogbonnaya, Ijeoma Nwabuzor & Keeney, Annie J., 2018. "A systematic review of the effectiveness of interagency and cross-system collaborations in the United States to improve child welfare outcomes," Children and Youth Services Review, Elsevier, vol. 94(C), pages 225-245.
    5. Evangelos Katsamakas, 2024. "Business models for the simulation hypothesis," Papers 2404.08991, arXiv.org.
    6. Stüber, Heiko & Grabka, Markus M. & Schnitzlein, Daniel D., 2023. "A tale of two data sets: comparing German administrative and survey data using wage inequality as an example," Journal for Labour Market Research, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany], vol. 57, pages 1-8.
    7. Ari Hyytinen & Petri Rouvinen & Mika Pajarinen & Joosua Virtanen, 2023. "Ex Ante Predictability of Rapid Growth: A Design Science Approach," Entrepreneurship Theory and Practice, vol. 47(6), pages 2465-2493, November.
    8. Miguel G. Folgado & Veronica Sanz, 2022. "Exploring the political pulse of a country using data science tools," Journal of Computational Social Science, Springer, vol. 5(1), pages 987-1000, May.
    9. Bosch Jover, Oriol & Revilla, Melanie, 2022. "When survey science met web tracking: presenting an error framework for metered data," LSE Research Online Documents on Economics 116431, London School of Economics and Political Science, LSE Library.
    10. Pina-Sánchez, Jose & Buil-Gil, David & Brunton-Smith, Ian & Cernat, Alexandru, 2021. "The impact of measurement error in models using police recorded crime rates," SocArXiv ydf4b, Center for Open Science.
    11. Meyer, Bruce D. & Mittag, Nikolas, 2017. "Using Linked Survey and Administrative Data to Better Measure Income: Implications for Poverty, Program Effectiveness and Holes in the Safety Net," IZA Discussion Papers 10943, Institute of Labor Economics (IZA).
    12. Isabelle Bonhoure & Anna Cigarini & Julián Vicens & Bàrbara Mitats & Josep Perelló, 2023. "Reformulating computational social science with citizen social science: the case of a community-based mental health care research," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    13. Elizabeth Dolan & James Goulding & Harry Marshall & Gavin Smith & Gavin Long & Laila J. Tata, 2023. "Assessing the value of integrating national longitudinal shopping data into respiratory disease forecasting models," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    14. Bruce D. Meyer & Nikolas Mittag, 2019. "Combining Administrative and Survey Data to Improve Income Measurement," NBER Working Papers 25738, National Bureau of Economic Research, Inc.
    15. Filippo Simini & Gianni Barlacchi & Massimiliano Luca & Luca Pappalardo, 2021. "A Deep Gravity model for mobility flows generation," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    16. Ahmed Abbasi & Jeffrey Parsons & Gautam Pant & Olivia R. Liu Sheng & Suprateek Sarker, 2024. "Pathways for Design Research on Artificial Intelligence," Information Systems Research, INFORMS, vol. 35(2), pages 441-459, June.
    17. Gessendorfer Jonathan & Beste Jonas & Drechsler Jörg & Sakshaug Joseph W., 2018. "Statistical Matching as a Supplement to Record Linkage: A Valuable Method to Tackle Nonconsent Bias?," Journal of Official Statistics, Sciendo, vol. 34(4), pages 909-933, December.
    18. Luis Ayala & Ana Pérez & Mercedes Prieto-Alaiz, 2022. "The impact of different data sources on the level and structure of income inequality," SERIEs: Journal of the Spanish Economic Association, Springer;Spanish Economic Association, vol. 13(3), pages 583-611, September.
    19. Islam, Towhidul & Meade, Nigel & Carson, Richard T. & Louviere, Jordan J. & Wang, Juan, 2022. "The usefulness of socio-demographic variables in predicting purchase decisions: Evidence from machine learning procedures," Journal of Business Research, Elsevier, vol. 151(C), pages 324-338.
    20. Gary Charness & Brian Jabarian & John List, 2023. "Generation Next: Experimentation with AI," Artefactual Field Experiments 00777, The Field Experiments Website.
