IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0158120.html

How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

Author

Listed:
  • Brady T West
  • Joseph W Sakshaug
  • Guy Alain S Aurelien

Abstract

Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.
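The abstract's point that ignoring design features can bias estimates and misstate standard errors can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are invented (not SESTAT), the design is a simple two-stratum stratified random sample (far simpler than a real complex design), and the function names `weighted_mean` and `stratified_mean_se` are hypothetical helpers, not part of any survey package.

```python
from math import sqrt

def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w), with w the inverse selection probability."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

def stratified_mean_se(sample, pop_sizes):
    """Mean and standard error under stratified simple random sampling
    without replacement (textbook formula, assumed design).

    sample:    dict mapping stratum -> list of observed values
    pop_sizes: dict mapping stratum -> population count N_h
    """
    N = sum(pop_sizes.values())
    mean, var = 0.0, 0.0
    for h, ys in sample.items():
        n_h, N_h = len(ys), pop_sizes[h]
        ybar_h = sum(ys) / n_h
        s2_h = sum((y - ybar_h) ** 2 for y in ys) / (n_h - 1)  # within-stratum variance
        W_h = N_h / N                                           # stratum population share
        mean += W_h * ybar_h
        var += W_h ** 2 * (1 - n_h / N_h) * s2_h / n_h          # finite population correction
    return mean, sqrt(var)

# Invented toy data: stratum B is small in the population but oversampled.
sample = {"A": [10, 12, 11, 9], "B": [30, 28, 32, 30]}
pop_sizes = {"A": 900, "B": 100}

values, weights = [], []
for h, ys in sample.items():
    for y in ys:
        values.append(y)
        weights.append(pop_sizes[h] / len(ys))  # weight = N_h / n_h

naive = sum(values) / len(values)       # ignores weights: pulled toward stratum B
weighted = weighted_mean(values, weights)
strat_mean, strat_se = stratified_mean_se(sample, pop_sizes)

print("unweighted mean:", naive)
print("weighted mean:  ", weighted)
print("design-based mean and SE:", strat_mean, strat_se)
```

In this toy setup the unweighted mean overstates the population mean because the oversampled stratum has larger values, while the weighted and stratified estimates agree with each other. Real survey data with clustering and adjusted weights would call for dedicated survey software rather than these hand-written formulas.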

Suggested Citation

  • Brady T West & Joseph W Sakshaug & Guy Alain S Aurelien, 2016. "How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?," PLOS ONE, Public Library of Science, vol. 11(6), pages 1-29, June.
  • Handle: RePEc:plo:pone00:0158120
    DOI: 10.1371/journal.pone.0158120

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0158120
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0158120&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Little R.J., 2004. "To Model or Not To Model? Competing Modes of Inference for Finite Population Sampling," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 546-556, January.
    2. Brady T. West & Sean Esteban McCabe, 2012. "Incorporating complex sample design effects when only final survey weights are available," Stata Journal, StataCorp LP, vol. 12(4), pages 718-725, December.
    3. D. Pfeffermann & C. J. Skinner & D. J. Holmes & H. Goldstein & J. Rasbash, 1998. "Weighting for unequal selection probabilities in multilevel models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 23-40.
    4. Sakshaug, J.W. & West, B.T., 2014. "Important considerations when analyzing health survey data collected using a complex sample design," American Journal of Public Health, American Public Health Association, vol. 104(1), pages 15-16.

    Citations



    Cited by:

    1. West Brady T. & Sakshaug Joseph W. & Aurelien Guy Alain S., 2018. "Accounting for Complex Sampling in Survey Estimation: A Review of Current Software Tools," Journal of Official Statistics, Sciendo, vol. 34(3), pages 721-752, September.
    2. Brady T. West & Joseph W. Sakshaug, 2017. "The Need to Account for Complex Sampling Features when Analyzing Establishment Survey Data: An Illustration using the 2013 Business Research and Development and Innovation Survey (BRDIS)," Working Papers 17-62, Center for Economic Studies, U.S. Census Bureau.
    3. Kenneth Owusu Ansah & Nutifafa Eugene Yaw Dey & Abigail Esinam Adade & Pascal Agbadi, 2022. "Determinants of life satisfaction among Ghanaians aged 15 to 49 years: A further analysis of the 2017/2018 Multiple Cluster Indicator Survey," PLOS ONE, Public Library of Science, vol. 17(1), pages 1-18, January.
    4. Yasmin S. Cypel & Shira Maguen & Paul A. Bernhard & William J. Culpepper & Aaron I. Schneiderman, 2024. "Prevalence and Correlates of Food and/or Housing Instability among Men and Women Post-9/11 US Veterans," IJERPH, MDPI, vol. 21(3), pages 1-16, March.
    5. Rat für Sozial- und Wirtschaftsdaten RatSWD (ed.), 2023. "Erhebung und Nutzung unstrukturierter Daten in den Sozial-, Verhaltens- und Wirtschaftswissenschaften," RatSWD Output Series, German Data Forum (RatSWD), volume 7, number 7-2de.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. West Brady T. & Sakshaug Joseph W. & Aurelien Guy Alain S., 2018. "Accounting for Complex Sampling in Survey Estimation: A Review of Current Software Tools," Journal of Official Statistics, Sciendo, vol. 34(3), pages 721-752, September.
    2. Little Roderick J., 2013. "Discussion," Journal of Official Statistics, Sciendo, vol. 29(3), pages 363-366, June.
    3. Kunihama, T. & Herring, A.H. & Halpern, C.T. & Dunson, D.B., 2016. "Nonparametric Bayes modeling with sample survey weights," Statistics & Probability Letters, Elsevier, vol. 113(C), pages 41-48.
    4. Marivoet, Wim & De Herdt, Tom, 2017. "From figures to facts: making sense of socio-economic surveys in the Democratic Republic of the Congo (DRC)," IOB Analyses & Policy Briefs 23, Universiteit Antwerpen, Institute of Development Policy (IOB).
    5. Geoffrey Jones & Wesley O. Johnson, 2016. "A Bayesian Superpopulation Approach to Inference for Finite Populations Based on Imperfect Diagnostic Outcomes," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 21(2), pages 314-327, June.
    6. Hammon, Angelina & Zinn, Sabine, 2020. "Multiple imputation of binary multilevel missing not at random data," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 69(3), pages 547-564.
    7. J. Andrew Royle, 2009. "Analysis of Capture–Recapture Models with Individual Covariates Using Data Augmentation," Biometrics, The International Biometric Society, vol. 65(1), pages 267-274, March.
    8. Jungah Choi & Hyunsuk Han, 2023. "Understanding the Influence of Teacher-Student Relationship on Mathematics Achievement: Evidence From Korean Students," SAGE Open, vol. 13(4), pages 21582440231, November.
    9. Woodward, Albert & Das, Abhik & Raskin, Ira E. & Morgan-Lopez, Antonio A., 2006. "An exploratory analysis of treatment completion and client and organizational factors using hierarchical linear modeling," Evaluation and Program Planning, Elsevier, vol. 29(4), pages 335-351, November.
    10. Woojin Chung & Roeul Kim, 2020. "A Reversal of the Association between Education Level and Obesity Risk during Ageing: A Gender-Specific Longitudinal Study in South Korea," IJERPH, MDPI, vol. 17(18), pages 1-19, September.
    11. Carrington C. J. Shepherd & Holly D. Clifford & Francis Mitrou & Shannon M. Melody & Ellen J. Bennett & Fay H. Johnston & Luke D. Knibbs & Gavin Pereira & Janessa L. Pickering & Teck H. Teo & Lea-Ann , 2019. "The Contribution of Geogenic Particulate Matter to Lung Disease in Indigenous Children," IJERPH, MDPI, vol. 16(15), pages 1-12, July.
    12. David Kaplan & Chansoon Lee, 2018. "Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments," Evaluation Review, vol. 42(4), pages 423-457, August.
    13. Patricia Dörr & Jan Pablo Burgard, 2019. "Data-driven transformations and survey-weighting for linear mixed models," Research Papers in Economics 2019-16, University of Trier, Department of Economics.
    14. Bijak Jakub & Bryant Johan & Gołata Elżbieta & Smallwood Steve, 2021. "Preface," Journal of Official Statistics, Sciendo, vol. 37(3), pages 533-541, September.
    15. Shepherd, Carrington CJ & Li, Jianghong & Mitrou, Francis & Zubrick, Stephen R., 2012. "Socioeconomic disparities in the mental health of Indigenous children in Western Australia," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 12, pages 1-1.
    16. Jorge Walter & Daniel Z. Levin & J. Keith Murnighan, 2015. "Reconnection Choices: Selecting the Most Valuable (vs. Most Preferred) Dormant Ties," Organization Science, INFORMS, vol. 26(5), pages 1447-1465, October.
    17. Borgini, Riccardo & Bianco, Paola Del & Salvati, Nicola & Schmid, Timo & Tzavidis, Nikos, 2015. "Modelling the distribution of health related quality of life of advanced melanoma patients in a longitudinal multi-centre clinical trial using M-quantile random effects regression," Discussion Papers 2015/19, Free University Berlin, School of Business & Economics.
    18. Parcel Joshua D. & Schroeter John R. & Azzam Azzeddine M, 2017. "A Re-Examination of Multistage Economies in Hog Farming," Journal of Agricultural & Food Industrial Organization, De Gruyter, vol. 15(2), pages 1-15, December.
    19. Mary Ying-Fang Wang & Paul Tuss & Lihong Qi, 2019. "Augmented Weighted Estimators Dealing with Practical Positivity Violation to Causal inferences in a Random Coefficient Model," Psychometrika, Springer;The Psychometric Society, vol. 84(2), pages 447-467, June.
    20. Jennings, Jacky M. & Hensel, Devon J. & Tanner, Amanda E. & Reilly, Meredith L. & Ellen, Jonathan M., 2014. "Are social organizational factors independently associated with a current bacterial sexually transmitted infection among urban adolescents and young adults?," Social Science & Medicine, Elsevier, vol. 118(C), pages 52-60.
