Printed from https://ideas.repec.org/a/vrs/offsta/v32y2016i3p619-642n4.html

Accuracy of Mixed-Source Statistics as Affected by Classification Errors

Author

  • van Delden Arnout

    (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands.)

  • Scholtus Sander

    (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands.)

  • Burger Joep

    (Statistics Netherlands, Department of Process Development and Methodology, CBS-weg 11, P.O. Box 4481, 6401 CZ Heerlen, The Netherlands.)

Abstract

Publications in official statistics are increasingly based on a combination of sources. Although combining data sources may result in nearly complete coverage of the target population, the outcomes are not error-free. Estimating the effect of nonsampling errors on the accuracy of mixed-source statistics is crucial for decision making, but it is not straightforward. Here we simulate the effect of classification errors on the accuracy of turnover-level estimates in car-trade industries. We combine an audit sample, the dynamics in the business register, and expert knowledge to estimate a transition matrix of classification-error probabilities. Bias and variance of the turnover estimates caused by classification errors are estimated by a bootstrap resampling approach. In addition, we study the extent to which manual selective editing at the micro level can improve the accuracy. Our analyses reveal which industries do not meet preset quality criteria. Surprisingly, more selective editing can result in less accurate estimates for specific industries, and a fixed allocation of editing effort over industries is more effective than an allocation in proportion to the accuracy and population size of each industry. We discuss how to develop a practical method that can be implemented in production to estimate the accuracy of register-based estimates.
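
The bootstrap idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy turnover values, and the 3x3 transition matrix are all hypothetical, and the observed industry codes are simply treated as the true ones. In each replicate, every unit's code is redrawn from the row of the transition matrix that belongs to its starting code, turnover totals per industry are recomputed, and the spread of those totals gives rough bias and variance estimates.

```python
import random
from statistics import mean, variance


def industry_totals(codes, turnover, n_classes):
    """Sum turnover per industry code (codes are integers 0..n_classes-1)."""
    totals = [0.0] * n_classes
    for code, value in zip(codes, turnover):
        totals[code] += value
    return totals


def bootstrap_classification_error(codes, turnover, transition, n_boot=500, seed=0):
    """Estimate bias and variance of industry totals under classification errors.

    transition[i][j] is the assumed probability that a unit whose true
    industry is i is observed in industry j (each row sums to 1).
    """
    rng = random.Random(seed)
    n_classes = len(transition)
    classes = list(range(n_classes))
    base = industry_totals(codes, turnover, n_classes)

    replicates = []
    for _ in range(n_boot):
        # Redraw each unit's code from the row for its starting code.
        drawn = [rng.choices(classes, weights=transition[c])[0] for c in codes]
        replicates.append(industry_totals(drawn, turnover, n_classes))

    # Bias: mean replicate total minus the starting total, per industry.
    bias = [mean(r[k] for r in replicates) - base[k] for k in range(n_classes)]
    # Variance: sample variance of the replicate totals, per industry.
    var = [variance([r[k] for r in replicates]) for k in range(n_classes)]
    return base, bias, var


# Hypothetical toy data: six units, three industries.
codes = [0, 0, 0, 1, 1, 2]
turnover = [100.0, 250.0, 80.0, 400.0, 120.0, 60.0]
transition = [
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
]

base, bias, var = bootstrap_classification_error(codes, turnover, transition)
for k in range(len(base)):
    print(f"industry {k}: total={base[k]:.1f} bias={bias[k]:.2f} variance={var[k]:.2f}")
```

Because misclassification only moves turnover between industries, the estimated biases sum to approximately zero across industries; an industry gains what the others lose. Selective editing could be mimicked in such a sketch by replacing the transition rows of edited units with error-free rows, though how the paper does this is not stated in the abstract.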

Suggested Citation

  • van Delden Arnout & Scholtus Sander & Burger Joep, 2016. "Accuracy of Mixed-Source Statistics as Affected by Classification Errors," Journal of Official Statistics, Sciendo, vol. 32(3), pages 619-642, September.
  • Handle: RePEc:vrs:offsta:v:32:y:2016:i:3:p:619-642:n:4
    DOI: 10.1515/jos-2016-0032

    Download full text from publisher

    File URL: https://doi.org/10.1515/jos-2016-0032
    Download Restriction: no


    References listed on IDEAS

    1. Peter Hall & Tapabrata Maiti, 2006. "On parametric bootstrap methods for small area prediction," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(2), pages 221-238, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kubokawa, Tatsuya & Nagashima, Bui, 2012. "Parametric bootstrap methods for bias correction in linear mixed models," Journal of Multivariate Analysis, Elsevier, vol. 106(C), pages 1-16.
    2. González-Manteiga, W. & Lombardía, M.J. & Molina, I. & Morales, D. & Santamaría, L., 2008. "Analytic and bootstrap approximations of prediction errors under a multivariate Fay-Herriot model," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5242-5252, August.
    3. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    4. Andreea L. Erciulescu & Wayne A. Fuller, 2016. "Small Area Prediction Under Alternative Model Specifications," Statistics in Transition New Series, Polish Statistical Association, vol. 17(1), pages 9-24, March.
    5. Torabi, Mahmoud, 2012. "Small area estimation using survey weights under a nested error linear regression model with structural measurement error," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 52-60.
    6. G. Bertarelli & R. Chambers & N. Salvati, 2021. "Outlier robust small domain estimation via bias correction and robust bootstrapping," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 331-357, March.
    7. Arnab Bhattacharjee & Eduardo Castro & Taps Maiti & João Marques, 2014. "Endogenous spatial structure and delineation of submarkets: A new framework with application to housing markets," SEEC Discussion Papers 1403, Spatial Economics and Econometrics Centre, Heriot Watt University.
    8. Sanjoy Sinha & Abdus Sattar, 2015. "Inference in semi-parametric spline mixed models for longitudinal data," METRON, Springer;Sapienza Università di Roma, vol. 73(3), pages 377-395, December.
    9. Nicholas Longford, 2014. "Policy-related small-area estimation," Economics Working Papers 1427, Department of Economics and Business, Universitat Pompeu Fabra.
    10. Peter Hall & Tapabrata Maiti, 2008. "Non‐parametric inference for clustered binary and count data when only summary information is available," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(4), pages 725-738, September.
    11. Malay Ghosh & Tapabrata Maiti, 2008. "Empirical Bayes Confidence Intervals for Means of Natural Exponential Family‐Quadratic Variance Function Distributions with Application to Small Area Estimation," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 35(3), pages 484-495, September.
    12. Cristina Rueda & José Menéndez & Federico Gómez, 2010. "Small area estimators based on restricted mixed models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 19(3), pages 558-579, November.
    13. Molina, Isabel & Rao, J.N.K., 2009. "Small area estimation on poverty indicators," DES - Working Papers. Statistics and Econometrics. WS ws091505, Universidad Carlos III de Madrid. Departamento de Estadística.
    14. Sugasawa, Shonosuke & Kubokawa, Tatsuya, 2015. "Parametric transformed Fay–Herriot model for small area estimation," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 295-311.
    15. Katarzyna Reluga & María‐José Lombardía & Stefan Sperlich, 2023. "Simultaneous inference for linear mixed model parameters with an application to small area estimation," International Statistical Review, International Statistical Institute, vol. 91(2), pages 193-217, August.
    16. Torabi, Mahmoud & Rao, J.N.K., 2014. "On small area estimation under a sub-area level model," Journal of Multivariate Analysis, Elsevier, vol. 127(C), pages 36-55.
    17. María José Lombardía & Esther López-Vizcaíno & Cristina Rueda, 2021. "Selection model for domains across time: application to labour force survey by economic activities," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 228-254, March.
    18. Stefano Marchetti & Caterina Giusti & Nicola Salvati & Monica Pratesi, 2017. "Small area estimation based on M-quantile models in presence of outliers in auxiliary variables," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 26(4), pages 531-555, November.
    19. Tatsuya Kubokawa, 2010. "On Measuring Uncertainty of Small Area Estimators with Higher Order Accuracy," CIRJE F-Series CIRJE-F-754, CIRJE, Faculty of Economics, University of Tokyo.
