Printed from https://ideas.repec.org/a/eee/csdana/v177y2023ics0167947322001499.html

Assessment of the effect of constraints in a new multivariate mixed method for statistical matching

Author

Listed:
  • Claramunt González, Juan
  • van Delden, Arnout
  • de Waal, Ton

Abstract

A Multivariate Mixed method for Statistical Matching (MMSM) is proposed. The MMSM is a predictive mean matching method for imputing values when integrating two datasets that are drawn from the same population, share no units, and measure several common and non-common variables. It accounts for the multivariate structure of the data by using multivariate Bayesian regression. The MMSM can also include auxiliary information from an additional dataset to improve the computation of intermediate values, and constraints to improve the selection of donors. The results of a simulation study show that including information from an auxiliary dataset leads to far better results, especially in terms of bias and percentage of correct imputations. The inclusion of constraints also increases the quality of the imputations, and hence of the statistical matching.
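
The two-step idea behind predictive mean matching in a statistical matching setting can be illustrated with a minimal sketch. It is not the MMSM itself: ordinary least squares stands in for the multivariate Bayesian regression, and neither the auxiliary dataset nor the donor constraints are modelled. The function name `pmm_match` and its arguments are hypothetical, chosen only for this illustration. The recipient file observes the common variables X only, while the donor file observes X together with the non-common variables Z.

```python
import numpy as np


def pmm_match(x_recipient, x_donor, z_donor):
    """Impute Z for recipient records by predictive mean matching.

    Simplified illustration (assumption: plain least squares, no
    auxiliary data, no constraints on donor selection).
    """
    x_recipient = np.asarray(x_recipient, dtype=float)
    x_donor = np.asarray(x_donor, dtype=float)
    z_donor = np.asarray(z_donor, dtype=float)
    if z_donor.ndim == 1:
        # Treat a single target variable as a one-column matrix.
        z_donor = z_donor[:, None]

    # Design matrices with an intercept column.
    xr = np.column_stack([np.ones(len(x_recipient)), x_recipient])
    xd = np.column_stack([np.ones(len(x_donor)), x_donor])

    # Step 1 (regression): fit Z on X using the donor file only.
    beta, *_ = np.linalg.lstsq(xd, z_donor, rcond=None)

    # Intermediate values: predicted Z for recipients and for donors.
    pred_r = xr @ beta
    pred_d = xd @ beta

    # Step 2 (matching): for each recipient, pick the donor whose
    # predicted values are closest in Euclidean distance.
    dist = np.linalg.norm(pred_r[:, None, :] - pred_d[None, :, :], axis=2)
    donor_idx = dist.argmin(axis=1)

    # Impute the donor's observed (not predicted) values.
    return z_donor[donor_idx], donor_idx


# Synthetic example: 5 recipients, 50 donors, 2 common and 2 non-common variables.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(5, 2))
x_b = rng.normal(size=(50, 2))
coef = np.array([[1.0, 0.5], [-0.3, 2.0]])
z_b = x_b @ coef + rng.normal(scale=0.1, size=(50, 2))
z_imputed, donors = pmm_match(x_a, x_b, z_b)
```

Because the imputed values are copied from observed donor records, every imputed row is a value combination that actually occurs in the donor file, which is the usual motivation for matching on predicted means rather than imputing the predictions directly.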

Suggested Citation

  • Claramunt González, Juan & van Delden, Arnout & de Waal, Ton, 2023. "Assessment of the effect of constraints in a new multivariate mixed method for statistical matching," Computational Statistics & Data Analysis, Elsevier, vol. 177(C).
  • Handle: RePEc:eee:csdana:v:177:y:2023:i:c:s0167947322001499
    DOI: 10.1016/j.csda.2022.107569

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947322001499
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Gerko Vink & Laurence E. Frank & Jeroen Pannekoek & Stef van Buuren, 2014. "Predictive mean matching imputation of semicontinuous variables," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 68(1), pages 61-90, February.
    2. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    3. Pier Luigi Conti & Daniela Marella & Andrea Neri, 2017. "Statistical matching and uncertainty analysis in combining household income and expenditure data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 26(3), pages 485-505, August.
    4. Rubin, Donald B, 1986. "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations," Journal of Business & Economic Statistics, American Statistical Association, vol. 4(1), pages 87-94, January.
    5. Moriarity, Chris & Scheuren, Fritz, 2003. "A Note on Rubin's Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 65-73, January.
    6. Pier Luigi Conti & Daniela Marella & Mauro Scanu, 2016. "Statistical Matching Analysis for Complex Survey Data With Applications," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1715-1725, October.
    7. Conti, Pier Luigi & Marella, Daniela & Scanu, Mauro, 2008. "Evaluation of matching noise for imputation techniques based on nonparametric local linear regression estimators," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 354-365, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahfock, Daniel & Pyne, Saumyadipta & McLachlan, Geoffrey J., 2022. "Statistical file-matching of non-Gaussian data: A game theoretic approach," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    2. Dalla Chiara, Elena & Menon, Martina & Perali, Federico, 2019. "An Integrated Database to Measure Living Standards," Journal of Official Statistics, Sciendo, vol. 35(3), pages 531-576, September.
    3. Gessendorfer, Jonathan & Beste, Jonas & Drechsler, Jörg & Sakshaug, Joseph W., 2018. "Statistical Matching as a Supplement to Record Linkage: A Valuable Method to Tackle Nonconsent Bias?," Journal of Official Statistics, Sciendo, vol. 34(4), pages 909-933, December.
    4. Andrea Cutillo & Mauro Scanu, 2020. "A Mixed Approach for Data Fusion of HBS and SILC," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 150(2), pages 411-437, July.
    5. Daniela Marella & Danny Pfeffermann, 2023. "Accounting for Non‐ignorable Sampling and Non‐response in Statistical Matching," International Statistical Review, International Statistical Institute, vol. 91(2), pages 269-293, August.
    6. Zahra Rezaei Ghahroodi, 2023. "Statistical matching of sample survey data: application to integrate Iranian time use and labour force surveys," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(3), pages 1023-1051, September.
    7. Norah Alyabs & Sy Han Chiou, 2022. "The Missing Indicator Approach for Accelerated Failure Time Model with Covariates Subject to Limits of Detection," Stats, MDPI, vol. 5(2), pages 1-13, May.
    8. Michael S. Rendall & Bonnie Ghosh-Dastidar & Margaret M. Weden & Zafar Nazarov, 2011. "Multiple Imputation for Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys," Working Papers WR-887-1, RAND Corporation.
    9. Joost R. van Ginkel, 2020. "Standardized Regression Coefficients and Newly Proposed Estimators for $R^2$ in Multiply Imputed Data," Psychometrika, Springer;The Psychometric Society, vol. 85(1), pages 185-205, March.
    10. Lamarche, Pierre, 2017. "Estimating consumption in the HFCS: Experimental results on the first wave of the HFCS," Statistics Paper Series 22, European Central Bank.
    11. Clinton P. McCully, 2013. "Integration of Micro and Macro Data on Consumer Income and Expenditures," BEA Working Papers 0101, Bureau of Economic Analysis.
    12. Gowri Gopalakrishna & Gerben ter Riet & Gerko Vink & Ineke Stoop & Jelte M Wicherts & Lex M Bouter, 2022. "Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-16, February.
    13. Nicklas Pettersson, 2013. "Bias reduction of finite population imputation by kernel methods," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 14(1), pages 139-160, March.
    14. Paolo Brunori & Pedro Salas-Rojo & Paolo Verme, 2022. "Estimating Inequality with Missing Incomes," Working Papers 616, ECINEQ, Society for the Study of Economic Inequality.
    15. Angelo Moretti & Natalie Shlomo, 2023. "Improving Probabilistic Record Linkage Using Statistical Prediction Models," International Statistical Review, International Statistical Institute, vol. 91(3), pages 368-394, December.
    16. Saeideh Kamgar & Florian Meinfelder & Ralf Münnich & Hamidreza Navvabpour, 2020. "Estimation within the new integrated system of household surveys in Germany," Statistical Papers, Springer, vol. 61(5), pages 2091-2117, October.
    17. Kristian Kleinke & Mark Stemmler & Jost Reinecke & Friedrich Lösel, 2011. "Efficient ways to impute incomplete panel data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 95(4), pages 351-373, December.
    18. Jana Emmenegger & Ralf Münnich & Jannik Schaller, 2022. "Evaluating Data Fusion Methods to Improve Income Modelling," Research Papers in Economics 2022-03, University of Trier, Department of Economics.
    19. Francesco D. d’Ovidio & Paola Perchinunno & Laura Antonucci, 2021. "Data Integration Techniques for the Identification of Poverty Profiles," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 156(2), pages 515-531, August.
    20. Okay Gunes, 2017. "Analysis of Households' Decision Using Full Demand Elasticity Estimates: an Estimation on Turkish Data," Documents de travail du Centre d'Economie de la Sorbonne 17017, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
