
R package hmi: a convenient tool for hierarchical multiple imputation and beyond

Author

Listed:
  • Speidel, Matthias

    (Institute for Employment Research (IAB), Nuremberg, Germany)

  • Drechsler, Jörg

    (Institute for Employment Research (IAB), Nuremberg, Germany)

  • Jolani, Shahab

    (Maastricht University)

Abstract

"Applications of multiple imputation have long outgrown the traditional context of dealing with item nonresponse in cross-sectional datasets. Nowadays multiple imputation is also applied to impute missing values in hierarchical datasets, address confidentiality concerns, combine data from different sources, or correct measurement errors in surveys. However, software developments did not keep up with these recent extensions. Most imputation software can only deal with item nonresponse in cross-sectional settings and extensions for hierarchical data - if available at all - are typically limited in scope. Furthermore, to our knowledge no software is currently available for dealing with measurement error using multiple imputation approaches. The R package hmi tries to close some of these gaps. It offers multiple imputation routines in hierarchical settings for many variable types (for example, nominal, ordinal, or continuous variables). It also provides imputation routines for interval data and handles a common measurement error problem in survey data: biased inferences due to implicit rounding of the reported values. The user-friendly setup, which only requires the data and optionally the specification of the analysis model of interest, makes the package especially attractive for users less familiar with the peculiarities of multiple imputation. The compatibility with the popular mice package ensures that the rich set of analysis and diagnostic tools and post-imputation commands available in mice can be used easily once the data have been imputed." (Author's abstract, IAB-Doku)

Suggested Citation

  • Speidel, Matthias & Drechsler, Jörg & Jolani, Shahab, 2018. "R package hmi: a convenient tool for hierarchical multiple imputation and beyond," IAB-Discussion Paper 201816, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
  • Handle: RePEc:iab:iabdpa:201816

    Download full text from publisher

    File URL: https://doku.iab.de/discussionpapers/2018/dp1618.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Patrick Royston, 2007. "Multiple imputation of missing values: further update of ice, with an emphasis on interval censoring," Stata Journal, StataCorp LP, vol. 7(4), pages 445-464, December.
    2. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    3. Hadfield, Jarrod D., 2010. "MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i02).
    4. Gartner, Hermann & Rässler, Susanne, 2005. "Analyzing the changing gender wage gap based on multiply imputed right censored wages," IAB-Discussion Paper 200505, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    5. repec:mpr:mprres:6195 is not listed on IDEAS
    6. Stephen P. Jenkins & Richard V. Burkhauser & Shuaizhang Feng & Jeff Larrimore, 2011. "Measuring inequality using censored data: a multiple‐imputation approach to estimation and inference," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 63-81, January.
    7. Schenker, Nathaniel & Raghunathan, Trivellore E. & Chiu, Pei-Lu & Makuc, Diane M. & Zhang, Guangyu & Cohen, Alan J., 2006. "Multiple Imputation of Missing Income Data in the National Health Interview Survey," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 924-933, September.
    8. Jörg Drechsler, 2015. "Multiple Imputation of Multilevel Missing Data—Rigor Versus Simplicity," Journal of Educational and Behavioral Statistics, vol. 40(1), pages 69-95, February.
    9. Jian Zhu & Trivellore E. Raghunathan, 2015. "Convergence Properties of a Sequential Regression Multiple Imputation Algorithm," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(511), pages 1112-1124, September.
    10. Jingchen Liu & Andrew Gelman & Jennifer Hill & Yu-Sung Su & Jonathan Kropko, 2014. "On the stationary distribution of iterative imputations," Biometrika, Biometrika Trust, vol. 101(1), pages 155-173.
    11. Jeff Larrimore & Richard Burkhauser & Shuaizhang Feng & Laura Zayatz, 2008. "Consistent Cell Means for Topcoded Incomes in the Public Use March CPS (1976-2007)," Working Papers 08-06, Center for Economic Studies, U.S. Census Bureau.
    12. Carpenter, James R. & Goldstein, Harvey & Kenward, Michael G., 2011. "REALCOM-IMPUTE Software for Multilevel Multiple Imputation with Mixed Response Types," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i05).
    13. Rubin, Donald B, 1986. "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations," Journal of Business & Economic Statistics, American Statistical Association, vol. 4(1), pages 87-94, January.
    14. Wickham, Hadley, 2007. "Reshaping Data with the reshape Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 21(i12).
    15. Gurprit Grover & Vinay K. Gupta, 2015. "Multiple imputation of censored survival data in the presence of missing covariates using restricted mean survival time," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(4), pages 817-827, April.
    16. H. Schneeweiss & J. Komlos & A. Ahmad, 2010. "Symmetric and asymmetric rounding: a review and some new results," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 94(3), pages 247-271, September.
    17. Reiter, Jerome P. & Raghunathan, Trivellore E., 2007. "The Multiple Adaptations of Multiple Imputation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 1462-1471, December.
    18. Su, Yu-Sung & Gelman, Andrew & Hill, Jennifer & Yajima, Masanao, 2011. "Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i02).
    19. Nowok, Beata & Raab, Gillian M. & Dibben, Chris, 2016. "synthpop: Bespoke Creation of Synthetic Data in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i11).
    20. Drechsler, Jörg & Kiesl, Hans, 2014. "Beat the heap - an imputation strategy for valid inferences from rounded income data," IAB-Discussion Paper 201402, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    21. S. Zinn & A. Würbach, 2016. "A statistical approach to address the problem of heaping in self-reported income data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(4), pages 682-703, March.
    22. Templ, Matthias & Meindl, Bernhard & Kowarik, Alexander & Dupriez, Olivier, 2017. "Simulation of Synthetic Complex Data: The R Package simPop," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 79(i10).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Simon Grund & Oliver Lüdtke & Alexander Robitzsch, 2023. "Handling Missing Data in Cross-Classified Multilevel Analyses: An Evaluation of Different Multiple Imputation Approaches," Journal of Educational and Behavioral Statistics, vol. 48(4), pages 454-489, August.
    2. Simon Grund & Oliver Lüdtke & Alexander Robitzsch, 2016. "Multiple Imputation of Multilevel Missing Data," SAGE Open, vol. 6(4), pages 21582440166, October.
    3. Tatjana Miljkovic & Ying-Ju Chen, 2021. "A new computational approach for estimation of the Gini index based on grouped data," Computational Statistics, Springer, vol. 36(3), pages 2289-2311, September.
    4. Saeideh Kamgar & Florian Meinfelder & Ralf Münnich & Hamidreza Navvabpour, 2020. "Estimation within the new integrated system of household surveys in Germany," Statistical Papers, Springer, vol. 61(5), pages 2091-2117, October.
    5. Rashid, S. & Mitra, R. & Steele, R.J., 2015. "Using mixtures of t densities to make inferences in the presence of missing data with a small number of multiply imputed data sets," Computational Statistics & Data Analysis, Elsevier, vol. 92(C), pages 84-96.
    6. repec:jss:jstsof:45:i01 is not listed on IDEAS
    7. Humera Razzak & Christian Heumann, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.
    8. Razzak Humera & Heumann Christian, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.
    9. Josse, Julie & Husson, François, 2016. "missMDA: A Package for Handling Missing Values in Multivariate Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 70(i01).
    10. Joost Ginkel & Pieter Kroonenberg, 2014. "Using Generalized Procrustes Analysis for Multiple Imputation in Principal Component Analysis," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 242-269, July.
    11. Norah Alyabs & Sy Han Chiou, 2022. "The Missing Indicator Approach for Accelerated Failure Time Model with Covariates Subject to Limits of Detection," Stats, MDPI, vol. 5(2), pages 1-13, May.
    12. Joost R. Ginkel, 2020. "Standardized Regression Coefficients and Newly Proposed Estimators for $R^2$ in Multiply Imputed Data," Psychometrika, Springer;The Psychometric Society, vol. 85(1), pages 185-205, March.
    13. Vladimir Hlasny & Paolo Verme, 2022. "The Impact of Top Incomes Biases on the Measurement of Inequality in the United States," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 84(4), pages 749-788, August.
    14. Gerko Vink & Laurence E. Frank & Jeroen Pannekoek & Stef Buuren, 2014. "Predictive mean matching imputation of semicontinuous variables," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 68(1), pages 61-90, February.
    15. Simon Grund & Oliver Lüdtke & Alexander Robitzsch, 2018. "Multiple Imputation of Missing Data at Level 2: A Comparison of Fully Conditional and Joint Modeling in Multilevel Designs," Journal of Educational and Behavioral Statistics, vol. 43(3), pages 316-353, June.
    16. Stephen P. Jenkins & Richard V. Burkhauser & Shuaizhang Feng & Jeff Larrimore, 2009. "Measuring Inequality Using Censored Data: A Multiple Imputation Approach," Discussion Papers of DIW Berlin 866, DIW Berlin, German Institute for Economic Research.
    17. repec:cup:judgdm:v:15:y:2020:i:5:p:798-806 is not listed on IDEAS
    18. Gowri Gopalakrishna & Gerben ter Riet & Gerko Vink & Ineke Stoop & Jelte M Wicherts & Lex M Bouter, 2022. "Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-16, February.
    19. Philip Armour & Richard V. Burkhauser & Jeff Larrimore, 2016. "Using The Pareto Distribution To Improve Estimates Of Topcoded Earnings," Economic Inquiry, Western Economic Association International, vol. 54(2), pages 1263-1273, April.
    20. Brunori, Paolo & Salas-Rojo, Pedro & Verme, Paolo, 2022. "Estimating Inequality with Missing Incomes," GLO Discussion Paper Series 1138, Global Labor Organization (GLO).
    21. Oya Kalaycioglu & Andrew Copas & Michael King & Rumana Z. Omar, 2016. "A comparison of multiple-imputation methods for handling missing data in repeated measurements observational studies," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(3), pages 683-706, June.
    22. Burns, Christopher & Prager, Daniel & Ghosh, Sujit & Goodwin, Barry, 2015. "Imputing for Missing Data in the ARMS Household Section: A Multivariate Imputation Approach," 2015 AAEA & WAEA Joint Annual Meeting, July 26-28, San Francisco, California 205291, Agricultural and Applied Economics Association.

    More about this item

    Keywords

    Bundesrepublik Deutschland ; Datengewinnung ; Fehler ; Imputationsverfahren ; Datenfusion ; lineares Modell ; Mehrebenenanalyse ; Software ; IAB-Haushaltspanel;

    JEL classification:

    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
