Printed from https://ideas.repec.org/a/bla/jorssa/v185y2022i1p156-177.html

Multiple system estimation using covariates having missing values and measurement error: Estimating the size of the Māori population in New Zealand

Authors

Listed:
  • Peter G. M. van der Heijden
  • Maarten Cruyff
  • Paul A. Smith
  • Christine Bycroft
  • Patrick Graham
  • Nathaniel Matheson‐Dunning

Abstract

We investigate the use of two or more linked lists, for both population size estimation and the relationship between variables appearing on all or only some lists. This relationship is usually not fully known because some individuals appear in only some lists, and some are not in any list. These two problems have been solved simultaneously using the EM algorithm. We extend this approach to estimate the size of the indigenous Māori population in New Zealand, leading to several innovations: (1) the approach is extended to four lists (including the population census), where the reporting of Māori status differs between registers; (2) some individuals in one or more lists have missing ethnicity, and we adapt the approach to handle this additional missingness; (3) some lists cover subsets of the population by design. We discuss under which assumptions such structural undercoverage can be ignored and provide a general result; (4) we treat the Māori indicator in each list as a variable measured with error, and embed a latent class model in the multiple system estimation to estimate the population size of a latent variable, interpreted as the true Māori status. Finally, we discuss estimating the Māori population size from administrative data only. Supplementary materials for our article are available online.

Suggested Citation

  • Peter G. M. van der Heijden & Maarten Cruyff & Paul A. Smith & Christine Bycroft & Patrick Graham & Nathaniel Matheson‐Dunning, 2022. "Multiple system estimation using covariates having missing values and measurement error: Estimating the size of the Māori population in New Zealand," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 156-177, January.
  • Handle: RePEc:bla:jorssa:v:185:y:2022:i:1:p:156-177
    DOI: 10.1111/rssa.12731

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12731
    Download Restriction: no


    References listed on IDEAS

    1. Elena Stanghellini & Peter G. M. van der Heijden, 2004. "A Multiple-Record Systems Estimation Method that Takes Observed and Unobserved Heterogeneity into Account," Biometrics, The International Biometric Society, vol. 60(2), pages 510-516, June.
    2. David J. Hand, 2018. "Statistical challenges of administrative and transaction data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(3), pages 555-605, June.
    3. Di Cecco Davide & Di Zio Marco & Filipponi Danila & Rocchetti Irene, 2018. "Population Size Estimation Using Multiple Incomplete Lists with Overcoverage," Journal of Official Statistics, Sciendo, vol. 34(2), pages 557-572, June.
    4. Jason M. Sutherland & Carl James Schwarz & Louis-Paul Rivest, 2007. "Multilist Population Estimation with Incomplete and Partial Stratification," Biometrics, The International Biometric Society, vol. 63(3), pages 910-916, September.
    5. Ludi Simpson & Stephen Jivraj & James Warren, 2016. "The stability of ethnic identity in England and Wales 2001–2011," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(4), pages 1025-1049, October.
    6. Laura Boeschoten & Ton de Waal & Jeroen K. Vermunt, 2019. "Estimating the number of serious road injuries per vehicle type in the Netherlands by using multiple imputation of latent classes," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 182(4), pages 1463-1486, October.
    7. Boeschoten Laura & Oberski Daniel & de Waal Ton, 2017. "Estimating Classification Errors Under Edit Restrictions in Composite Survey-Register Data Using Multiple Imputation Latent Class Modelling (MILC)," Journal of Official Statistics, Sciendo, vol. 33(4), pages 921-962, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ton de Waal & Arnout van Delden & Sander Scholtus, 2020. "Multi‐source Statistics: Basic Situations and Methods," International Statistical Review, International Statistical Institute, vol. 88(1), pages 203-228, April.
    2. Albert Sabater & Gemma Catney, 2019. "Unpacking Summary Measures of Ethnic Residential Segregation Using an Age Group and Age Cohort Perspective," European Journal of Population, Springer;European Association for Population Studies, vol. 35(1), pages 161-189, February.
    3. Jonas F. Schenkel & Li‐Chun Zhang, 2022. "Adjusting misclassification using a second classifier with an external validation sample," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1882-1902, October.
    4. Stephanie Coffey, PhD. & Jaya Damineni & John Eltinge, PhD. & Anup Mathur, PhD. & Kayla Varela & Allison Zotti, 2023. "Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods," Working Papers 23-03, Center for Economic Studies, U.S. Census Bureau.
    5. James Jackson & Robin Mitra & Brian Francis & Iain Dove, 2022. "Using saturated count models for user‐friendly synthesis of large confidential administrative databases," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1613-1643, October.
    6. Danilo Fegatelli & Luca Tardella, 2013. "Improved inference on capture recapture models with behavioural effects," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(1), pages 45-66, March.
    7. Saville, Christopher W.N., 2020. "Mental health consequences of minority political positions: The case of brexit," Social Science & Medicine, Elsevier, vol. 258(C).
    8. Zhang Li-Chun, 2019. "A Note on Dual System Population Size Estimator," Journal of Official Statistics, Sciendo, vol. 35(1), pages 279-283, March.
    9. Francesco Bartolucci & Fulvia Pennoni, 2007. "A Class of Latent Markov Models for Capture–Recapture Data Allowing for Time, Heterogeneity, and Behavior Effects," Biometrics, The International Biometric Society, vol. 63(2), pages 568-578, June.
    10. Patrick Broman & Tahu Kukutai, 2021. "Fixed not fluid: European identification in the Aotearoa New Zealand census," Journal of Population Research, Springer, vol. 38(2), pages 103-138, June.
    11. Stephen Fienberg & Daniel Manrique-Vallier, 2009. "Integrated methodology for multiple systems estimation and record linkage using a missing data formulation," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 93(1), pages 49-60, March.
    12. Jamie C. Moore & Gabriele B. Durrant & Peter W. F. Smith, 2021. "Do coefficients of variation of response propensities approximate non‐response biases during survey data collection?," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 301-323, January.
    13. Thandrayen, Joanne & Wang, Yan, 2009. "A latent variable regression model for capture-recapture data," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2740-2746, May.
    14. Na You & Chang Xuan Mao, 2008. "Population Size Estimation in a Two-List Surveillance System with a Discrete Covariate," Biometrics, The International Biometric Society, vol. 64(2), pages 371-376, June.
    15. Ana Beatriz Galvão & James Mitchell, 2023. "Real‐Time Perceptions of Historical GDP Data Uncertainty," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 85(3), pages 457-481, June.
    16. Fiona Shalley & Kalinda Griffiths & Tom Wilson, 2023. "No Longer Indigenous," Population Research and Policy Review, Springer;Southern Demographic Association (SDA), vol. 42(4), pages 1-27, August.
    17. Lothian Jack & Holmberg Anders & Seyb Allyson, 2019. "An Evolutionary Schema for Using “it-is-what-it-is” Data in Official Statistics," Journal of Official Statistics, Sciendo, vol. 35(1), pages 137-165, March.
    18. Marušić Zrinka & Kožul Marijana & Brozović Ivana, 2020. "Measuring non-commercial tourism traffic in Croatia: Challenges of using administrative data," Croatian Review of Economic, Business and Social Statistics, Sciendo, vol. 6(2), pages 69-81, December.
    19. Di Cecco Davide & Di Zio Marco & Filipponi Danila & Rocchetti Irene, 2018. "Population Size Estimation Using Multiple Incomplete Lists with Overcoverage," Journal of Official Statistics, Sciendo, vol. 34(2), pages 557-572, June.
    20. Paul Labonne & Martin Weale, 2020. "Temporal disaggregation of overlapping noisy quarterly data: estimation of monthly output from UK value‐added tax data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(3), pages 1211-1230, June.
