
Uncovering digital trace data biases: tracking undercoverage in web tracking data

Author

Listed:
  • Bosch Jover, Oriol
  • Sturgis, Patrick
  • Kuha, Jouni
  • Revilla, Melanie

Abstract

Digital trace data are an increasingly popular alternative to surveys and are often treated as the gold standard. This study critically assesses the use of web tracking data to study online media exposure. Specifically, we focus on a critical error source in this type of data, tracking undercoverage: researchers’ failure to capture data from all the devices and browsers that individuals use to go online. Using data from Spain, Portugal, and Italy, we explore undercoverage in online panels and simulate biases in online media exposure estimates. We show that undercoverage is highly prevalent in commercial panels, with more than 70% of participants affected. Additionally, the primary determinant of undercoverage is the type and number of devices used, rather than individuals’ characteristics. Moreover, through a simulation study, we demonstrate that web tracking estimates are often substantially biased. Methodologically, the paper showcases how auxiliary survey data can help study web tracking errors.

Suggested Citation

  • Bosch Jover, Oriol & Sturgis, Patrick & Kuha, Jouni & Revilla, Melanie, 2024. "Uncovering digital trace data biases: tracking undercoverage in web tracking data," LSE Research Online Documents on Economics 124537, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:124537

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/124537/
    File Function: Open access version.
    Download Restriction: no

    References listed on IDEAS

    1. Ian R. White, 2010. "simsum: Analyses of simulation studies including Monte Carlo error," Stata Journal, StataCorp LLC, vol. 10(3), pages 369-385, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. R. N. Rattihalli, 2023. "A Class of Multivariate Power Skew Symmetric Distributions: Properties and Inference for the Power-Parameter," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(2), pages 1356-1393, August.
    2. Bosch, Oriol J. & Sturgis, Patrick & Kuha, Jouni & Revilla, Melanie, 2023. "Uncovering digital trace data biases: tracking undercoverage in web tracking data," SocArXiv t2dbj, Center for Open Science.
    3. Kanabar, Ricky & Eibich, Peter & Plum, Alexander, 2025. "Returns to testosterone across men’s earnings distribution in the UK," ISER Working Paper Series 2025-02, Institute for Social and Economic Research.
    4. Warrington Nicole M. & Tilling Kate & Howe Laura D. & Paternoster Lavinia & Pennell Craig E. & Wu Yan Yan & Briollais Laurent, 2014. "Robustness of the linear mixed effects model to error distribution assumptions and the consequences for genome-wide association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(5), pages 567-587, October.
    5. Ng'ombe, John, 2019. "Economics of the Greenseeder Hand Planter, Discrete Choice Modeling, and On-Farm Field Experimentation," Thesis Commons jckt7, Center for Open Science.
    6. Edgar C. Merkle & Daniel Furr & Sophia Rabe-Hesketh, 2019. "Bayesian Comparison of Latent Variable Models: Conditional Versus Marginal Likelihoods," Psychometrika, Springer;The Psychometric Society, vol. 84(3), pages 802-829, September.
    7. Tim Morris & Ella Marley-Zagar, 2023. "Coding robust simulation studies in Stata," Biostatistics and Epidemiology Virtual Symposium 2023 01, Stata Users Group.
    8. Brian Gin & Nicholas Sim & Anders Skrondal & Sophia Rabe-Hesketh, 2020. "A Dyadic IRT Model," Psychometrika, Springer;The Psychometric Society, vol. 85(3), pages 815-836, September.

    More about this item

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

