Accuracy of Mixed-Source Statistics as Affected by Classification Errors

Author

Listed:
  • van Delden Arnout

    (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands.)

  • Scholtus Sander

    (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands.)

  • Burger Joep

    (Statistics Netherlands, Department of Process Development and Methodology, CBS-weg 11, P.O. Box 4481, 6401 CZ Heerlen, The Netherlands.)

Abstract

Publications in official statistics are increasingly based on a combination of sources. Although combining data sources may result in nearly complete coverage of the target population, the outcomes are not error free. Estimating the effect of nonsampling errors on the accuracy of mixed-source statistics is crucial for decision making, but it is not straightforward. Here we simulate the effect of classification errors on the accuracy of turnover-level estimates in car-trade industries. We combine an audit sample, the dynamics in the business register, and expert knowledge to estimate a transition matrix of classification-error probabilities. Bias and variance of the turnover estimates caused by classification errors are estimated by a bootstrap resampling approach. In addition, we study the extent to which manual selective editing at micro level can improve the accuracy. Our analyses reveal which industries do not meet preset quality criteria. Surprisingly, more selective editing can result in less accurate estimates for specific industries, and a fixed allocation of editing effort over industries is more effective than an allocation in proportion with the accuracy and population size of each industry. We discuss how to develop a practical method that can be implemented in production to estimate the accuracy of register-based estimates.
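
The bootstrap idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the observed industry codes are treated as the true codes, that `P[i, j]` is the probability that a unit whose true class is `i` is observed in class `j`, and that every row of `P` sums to 1. The function name, argument names, and the data in the usage note are hypothetical.

```python
import numpy as np

def bootstrap_classification_error(turnover, observed_class, P, n_boot=1000, seed=0):
    """Sketch: bootstrap bias and variance of class-level turnover totals.

    Assumptions (not taken from the paper):
      - observed_class holds integer codes 0..K-1 and is treated as the truth,
      - P is a K x K transition matrix with rows that sum to 1.
    """
    turnover = np.asarray(turnover, dtype=float)
    observed_class = np.asarray(observed_class, dtype=int)
    P = np.asarray(P, dtype=float)
    rng = np.random.default_rng(seed)
    n_classes = P.shape[0]

    # Totals per class under the observed classification.
    observed_totals = np.bincount(observed_class, weights=turnover,
                                  minlength=n_classes)

    # Cumulative misclassification probabilities for each unit's row of P.
    cum = np.cumsum(P[observed_class], axis=1)

    replicates = np.empty((n_boot, n_classes))
    for b in range(n_boot):
        # Inverse-CDF draw of a new class code for every unit.
        u = rng.random(turnover.shape[0])
        drawn = (u[:, None] >= cum).sum(axis=1)
        # Guard against rounding in the cumulative sums.
        drawn = np.minimum(drawn, n_classes - 1)
        replicates[b] = np.bincount(drawn, weights=turnover,
                                    minlength=n_classes)

    # Bias: mean of the replicates minus the original totals.
    bias = replicates.mean(axis=0) - observed_totals
    # Variance: spread of the replicates around their own mean.
    variance = replicates.var(axis=0, ddof=1)
    return observed_totals, bias, variance
```

A call such as `bootstrap_classification_error([10, 20, 30, 40], [0, 0, 1, 1], [[0.9, 0.1], [0.2, 0.8]])` returns the class totals together with a bias and a variance estimate per class. Because each unit is only moved between classes, the bias estimates sum to zero over all classes in this sketch.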

Suggested Citation

  • van Delden Arnout & Scholtus Sander & Burger Joep, 2016. "Accuracy of Mixed-Source Statistics as Affected by Classification Errors," Journal of Official Statistics, Sciendo, vol. 32(3), pages 619-642, September.
  • Handle: RePEc:vrs:offsta:v:32:y:2016:i:3:p:619-642:n:4
    DOI: 10.1515/jos-2016-0032

    Download full text from publisher

    File URL: https://doi.org/10.1515/jos-2016-0032
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    2. repec:csb:stintr:v:17:y:2016:i:1:p:9-24 is not listed on IDEAS
    3. Erciulescu Andreea L. & Fuller Wayne A., 2016. "Small Area Prediction Under Alternative Model Specifications," Statistics in Transition New Series, Statistics Poland, vol. 17(1), pages 9-24, March.
    4. Peter Hall & Tapabrata Maiti, 2008. "Non‐parametric inference for clustered binary and count data when only summary information is available," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(4), pages 725-738, September.
    5. Cristina Rueda & José Menéndez & Federico Gómez, 2010. "Small area estimators based on restricted mixed models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 19(3), pages 558-579, November.
    6. Torabi, Mahmoud & Rao, J.N.K., 2014. "On small area estimation under a sub-area level model," Journal of Multivariate Analysis, Elsevier, vol. 127(C), pages 36-55.
    7. María José Lombardía & Esther López-Vizcaíno & Cristina Rueda, 2021. "Selection model for domains across time: application to labour force survey by economic activities," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 228-254, March.
    8. Tatsuya Kubokawa, 2010. "On Measuring Uncertainty of Small Area Estimators with Higher Order Accuracy," CIRJE F-Series CIRJE-F-754, CIRJE, Faculty of Economics, University of Tokyo.
    9. Flores-Agreda, Daniel & Cantoni, Eva, 2019. "Bootstrap estimation of uncertainty in prediction for generalized linear mixed models," Computational Statistics & Data Analysis, Elsevier, vol. 130(C), pages 1-17.
    10. Tzavidis, Nikos & Zhang, Li-Chun & Luna Hernandez, Angela & Schmid, Timo & Rojas-Perilla, Natalia, 2016. "From start to finish: A framework for the production of small area official statistics," Discussion Papers 2016/13, Free University Berlin, School of Business & Economics.
    11. M. Ugarte & A. Militino & T. Goicoa, 2009. "Benchmarked estimates in small areas using linear mixed models with restrictions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 18(2), pages 342-364, August.
    12. Masaki,Takaaki & Newhouse,David Locke & Silwal,Ani Rudra & Bedada,Adane & Engstrom,Ryan, 2020. "Small Area Estimation of Non-Monetary Poverty with Geospatial Data," Policy Research Working Paper Series 9383, The World Bank.
    13. Schmid, Timo & Tzavidis, Nikos & Münnich, Ralf & Chambers, Ray, 2015. "Outlier robust small area estimation under spatial correlation," Discussion Papers 2015/8, Free University Berlin, School of Business & Economics.
    14. Tomasz Żądło & Adam Chwila, 2024. "A step towards the integration of machine learning and small area estimation," Papers 2402.07521, arXiv.org.
    15. Marchetti, Stefano & Tzavidis, Nikos & Pratesi, Monica, 2012. "Non-parametric bootstrap mean squared error estimation for M-quantile estimators of small area averages, quantiles and poverty indicators," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2889-2902.
    16. Li, Huilin & Lahiri, P., 2010. "An adjusted maximum likelihood method for solving small area estimation problems," Journal of Multivariate Analysis, Elsevier, vol. 101(4), pages 882-892, April.
    17. Malay Ghosh, 2020. "Rejoinder," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 59-67, August.
    18. Nikos Tzavidis & Li‐Chun Zhang & Angela Luna & Timo Schmid & Natalia Rojas‐Perilla, 2018. "From start to finish: a framework for the production of small area official statistics," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(4), pages 927-979, October.
    19. María Bugallo & Domingo Morales & María Dolores Esteban & Maria Chiara Pagliarella, 2024. "Model-Based Estimation of Small Area Dissimilarity Indexes: An Application to Sex Occupational Segregation in Spain," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 174(2), pages 473-501, September.
    20. Reluga, Katarzyna & Lombardía, María-José & Sperlich, Stefan, 2024. "Bootstrap-based statistical inference for linear mixed effects under misspecifications," Computational Statistics & Data Analysis, Elsevier, vol. 199(C).
    21. Tatsuya Kubokawa, 2009. "Corrected Empirical Bayes Confidence Intervals in Nested Error Regression Models," CIRJE F-Series CIRJE-F-632, CIRJE, Faculty of Economics, University of Tokyo.
