
Multiple Imputation Using Gaussian Copulas

Author

Listed:
  • Florian M. Hollenbach
  • Iavor Bojinov
  • Shahryar Minhas
  • Nils W. Metternich
  • Michael D. Ward
  • Alexander Volfovsky

Abstract

Missing observations are pervasive throughout empirical research, especially in the social sciences. Despite multiple approaches to dealing adequately with missing data, many scholars still fail to address this vital issue. In this article, we present a simple-to-use method for generating multiple imputations (MIs) using a Gaussian copula. The Gaussian copula for MI allows scholars to attain estimation results that have good coverage and small bias. The use of copulas to model the dependence among variables will enable researchers to construct valid joint distributions of the data, even without knowledge of the actual underlying marginal distributions. MIs are then generated by drawing observations from the resulting posterior joint distribution and replacing the missing values. Using simulated and observational data from published social science research, we compare imputation via Gaussian copulas with two other widely used imputation methods: multiple imputation via chained equations and Amelia II. Our results suggest that the Gaussian copula approach has a slightly smaller bias, higher coverage rates, and narrower confidence intervals compared to the other methods. This is especially true when the variables with missing data are not normally distributed. These results, combined with theoretical guarantees and ease of use, suggest that the approach examined provides an attractive alternative for applied researchers undertaking MIs.
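
The imputation step summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified plug-in version that assumes rank-based normal scores for the margins, a latent correlation matrix estimated from complete rows only (rather than drawn from a posterior), and back-transformation through empirical quantiles of the observed values. The names `normal_scores` and `copula_impute` are hypothetical and chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()
_to_z = np.vectorize(_STD.inv_cdf, otypes=[float])  # uniform -> normal score
_to_u = np.vectorize(_STD.cdf, otypes=[float])      # normal score -> uniform


def normal_scores(col):
    """Rank-transform the observed entries of one column to normal scores.

    Missing entries (NaN) stay NaN. No marginal distribution is assumed.
    """
    z = np.full(col.shape, np.nan)
    obs = ~np.isnan(col)
    ranks = np.argsort(np.argsort(col[obs])) + 1  # 1..n_obs, ties broken arbitrarily
    z[obs] = _to_z(ranks / (obs.sum() + 1.0))
    return z


def copula_impute(X, n_imputations=5, seed=0):
    """Return a list of completed copies of X (hypothetical helper, sketch only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    Z = np.column_stack([normal_scores(X[:, j]) for j in range(p)])

    # Assumption: enough complete rows exist to estimate the latent correlation.
    complete = ~np.isnan(Z).any(axis=1)
    R = np.corrcoef(Z[complete], rowvar=False)

    completed = []
    for _ in range(n_imputations):
        Xi = X.copy()
        for i in np.flatnonzero(~complete):
            m = np.flatnonzero(np.isnan(Z[i]))   # missing coordinates
            o = np.flatnonzero(~np.isnan(Z[i]))  # observed coordinates
            mu, S = np.zeros(m.size), R[np.ix_(m, m)]
            if o.size:
                # Conditional normal: B = R_mo R_oo^{-1}
                B = np.linalg.solve(R[np.ix_(o, o)], R[np.ix_(o, m)]).T
                mu = B @ Z[i, o]
                S = S - B @ R[np.ix_(o, m)]
            u = _to_u(rng.multivariate_normal(mu, S))
            for j, uj in zip(m, u):
                col = X[:, j]
                # Map back to the data scale via the observed empirical quantiles.
                Xi[i, j] = np.quantile(col[~np.isnan(col)], uj)
        completed.append(Xi)
    return completed
```

Calling `copula_impute(X, n_imputations=5)` on an array with `NaN` entries returns five completed data sets; the analysis model would then be fit to each one and the estimates combined. Because the correlation matrix is held fixed across imputations here, this sketch understates imputation uncertainty relative to a fully Bayesian copula model.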

Suggested Citation

  • Florian M. Hollenbach & Iavor Bojinov & Shahryar Minhas & Nils W. Metternich & Michael D. Ward & Alexander Volfovsky, 2021. "Multiple Imputation Using Gaussian Copulas," Sociological Methods & Research, vol. 50(3), pages 1259-1283, August.
  • Handle: RePEc:sae:somere:v:50:y:2021:i:3:p:1259-1283
    DOI: 10.1177/0049124118799381

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0049124118799381
    Download Restriction: no


    References listed on IDEAS

    1. Yucel, Recai M., 2011. "State of the Multiple Imputation Software," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i01).
    2. King, Gary & Honaker, James & Joseph, Anne & Scheve, Kenneth, 2001. "Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation," American Political Science Review, Cambridge University Press, vol. 95(1), pages 49-69, March.
    3. F. Di Lascio & Simone Giannerini & Alessandra Reale, 2015. "Exploring copulas for the imputation of complex dependent data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(1), pages 159-175, March.
    4. Kropko, Jonathan & Goodrich, Ben & Gelman, Andrew & Hill, Jennifer, 2014. "Multiple Imputation for Continuous and Categorical Data: Comparing Joint Multivariate Normal and Conditional Approaches," Political Analysis, Cambridge University Press, vol. 22(4), pages 497-519.
    5. Fabrizia Mealli & Donald B. Rubin, 2015. "Clarifying missing at random and related definitions, and implications when coupled with exchangeability," Biometrika, Biometrika Trust, vol. 102(4), pages 995-1000.
    6. Michael W. Robbins & Sujit K. Ghosh & Joshua D. Habiger, 2013. "Imputation in High-Dimensional Economic Data as Applied to the Agricultural Resource Management Survey," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(501), pages 81-95, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Josse, Julie & Husson, François, 2016. "missMDA: A Package for Handling Missing Values in Multivariate Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 70(i01).
    2. Schalk Burger & Searle Silverman & Gary van Vuuren, 2018. "Deriving Correlation Matrices for Missing Financial Time-Series Data," International Journal of Economics and Finance, Canadian Center of Science and Education, vol. 10(10), pages 105-105, October.
    3. Sophia Rabe-Hesketh & Anders Skrondal, 2023. "Ignoring Non-ignorable Missingness," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 31-50, March.
    4. Burns, Christopher & Prager, Daniel & Ghosh, Sujit & Goodwin, Barry, 2015. "Imputing for Missing Data in the ARMS Household Section: A Multivariate Imputation Approach," 2015 AAEA & WAEA Joint Annual Meeting, July 26-28, San Francisco, California 205291, Agricultural and Applied Economics Association.
    5. Scott Gehlbach & Konstantin Sonin & Ekaterina Zhuravskaya, 2010. "Businessman Candidates," American Journal of Political Science, John Wiley & Sons, vol. 54(3), pages 718-736, July.
    6. Matthew Blackwell & James Honaker & Gary King, 2017. "A Unified Approach to Measurement Error and Missing Data: Overview and Applications," Sociological Methods & Research, vol. 46(3), pages 303-341, August.
    7. Cohen, Joseph N, 2010. "Neoliberalism’s relationship with economic growth in the developing world: Was it the power of the market or the resolution of financial crisis?," MPRA Paper 24527, University Library of Munich, Germany.
    8. Sergei Guriev & Daniel Treisman, 2020. "The Popularity of Authoritarian Leaders: A cross-national investigation," SciencePo Working papers Main hal-03878626, HAL.
    9. Chen, Andrew Y. & McCoy, Jack, 2024. "Missing values handling for machine learning portfolios," Journal of Financial Economics, Elsevier, vol. 155(C).
    10. Sebastian Barfort & Nikolaj Harmon & Frederik Hjorth & Asmus Leth Olsen, 2015. "Dishonesty and Selection into Public Service in Denmark: Who Runs the World’s Least Corrupt Public Sector?," Discussion Papers 15-12, University of Copenhagen. Department of Economics.
    11. Alessandro Bitetto & Paola Cerchiello & Charilaos Mertzanis, 2021. "A data-driven approach to measuring epidemiological susceptibility risk around the world," DEM Working Papers Series 200, University of Pavia, Department of Economics and Management.
    12. Michael Mousseau, 2012. "The Democratic Peace Unraveled: It’s the Economy," Koç University-TUSIAD Economic Research Forum Working Papers 1207, Koc University-TUSIAD Economic Research Forum.
    13. Marcel Lubbers & Peer Scheepers, 2005. "Political versus Instrumental Euro-scepticism," European Union Politics, vol. 6(2), pages 223-242, June.
    14. Osterloh, Steffen & Heinemann, Friedrich, 2013. "The political economy of corporate tax harmonization — Why do European politicians (dis)like minimum tax rates?," European Journal of Political Economy, Elsevier, vol. 29(C), pages 18-37.
    15. Zhong, Hua & Hu, Wuyang, 2015. "Farmers’ Willingness to Engage in Best Management Practices: an Application of Multiple Imputation," 2015 Annual Meeting, January 31-February 3, 2015, Atlanta, Georgia 196962, Southern Agricultural Economics Association.
    16. Seiler, Christian & Heumann, Christian, 2013. "Microdata imputations and macrodata implications: Evidence from the Ifo Business Survey," Economic Modelling, Elsevier, vol. 35(C), pages 722-733.
    17. Bekkouche, Yasmine & Cagé, Julia & Dewitte, Edgard, 2022. "The heterogeneous price of a vote: Evidence from multiparty systems, 1993–2017," Journal of Public Economics, Elsevier, vol. 206(C).
    18. Haeussler, Carolin & Sauermann, Henry, 2013. "Credit where credit is due? The impact of project contributions and social factors on authorship and inventorship," Research Policy, Elsevier, vol. 42(3), pages 688-703.
    19. F. Marta L. Lascio & Simone Giannerini, 2019. "Clustering dependent observations with copula functions," Statistical Papers, Springer, vol. 60(1), pages 35-51, February.
    20. Stocké, Volker & Stark, Tobias, 2005. "Stichprobenverzerrung durch Item-Nonresponse in der international vergleichenden Politikwissenschaft," Sonderforschungsbereich 504 Publications 05-43, Sonderforschungsbereich 504, Universität Mannheim;Sonderforschungsbereich 504, University of Mannheim.
