IDEAS home Printed from https://ideas.repec.org/p/iab/iabdpa/201816.html

R package hmi: a convenient tool for hierarchical multiple imputation and beyond

Author

Listed:
  • Speidel, Matthias

    (Institute for Employment Research (IAB), Nuremberg, Germany)

  • Drechsler, Jörg

    (Institute for Employment Research (IAB), Nuremberg, Germany)

  • Jolani, Shahab

    (Maastricht University)

Abstract

"Applications of multiple imputation have long outgrown the traditional context of dealing with item nonresponse in cross-sectional datasets. Nowadays multiple imputation is also applied to impute missing values in hierarchical datasets, address confidentiality concerns, combine data from different sources, or correct measurement errors in surveys. However, software developments did not keep up with these recent extensions. Most imputation software can only deal with item nonresponse in cross-sectional settings and extensions for hierarchical data - if available at all - are typically limited in scope. Furthermore, to our knowledge no software is currently available for dealing with measurement error using multiple imputation approaches. The R package hmi tries to close some of these gaps. It offers multiple imputation routines in hierarchical settings for many variable types (for example, nominal, ordinal, or continuous variables). It also provides imputation routines for interval data and handles a common measurement error problem in survey data: biased inferences due to implicit rounding of the reported values. The user-friendly setup, which only requires the data and optionally the specification of the analysis model of interest, makes the package especially attractive for users less familiar with the peculiarities of multiple imputation. The compatibility with the popular mice package ensures that the rich set of analysis and diagnostic tools and post-imputation commands available in mice can be used easily once the data have been imputed." (Author's abstract, IAB-Doku)

Suggested Citation

  • Speidel, Matthias & Drechsler, Jörg & Jolani, Shahab, 2018. "R package hmi: a convenient tool for hierarchical multiple imputation and beyond," IAB-Discussion Paper 201816, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
  • Handle: RePEc:iab:iabdpa:201816

    Download full text from publisher

    File URL: https://doku.iab.de/discussionpapers/2018/dp1618.pdf
    Download Restriction: no

    More about this item

    Keywords

    Federal Republic of Germany; data collection; errors; imputation methods; data fusion; linear model; multilevel analysis; software; IAB Household Panel

    JEL classification:

    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
