Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations
References listed on IDEAS
- Shannon E. Ellis & Jeffrey T. Leek, 2018. "How to Share Data for Collaboration," The American Statistician, Taylor & Francis Journals, vol. 72(1), pages 53-57, January.
- van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
- Honaker, James & King, Gary & Blackwell, Matthew, 2011. "Amelia II: A Program for Missing Data," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i07).
- Grolemund, Garrett & Wickham, Hadley, 2011. "Dates and Times Made Easy with lubridate," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i03).
- Josse, Julie & Husson, François, 2016. "missMDA: A Package for Handling Missing Values in Multivariate Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 70(i01).
- Kowarik, Alexander & Templ, Matthias, 2016. "Imputation with the R Package VIM," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i07).
- Lê, Sébastien & Josse, Julie & Husson, François, 2008. "FactoMineR: An R Package for Multivariate Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i01).
- Cheng, Xiaoyue & Cook, Dianne & Hofmann, Heike, 2015. "Visually Exploring Missing Values in Multivariable Data Using a Graphical User Interface," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 68(i06).
Citations
Cited by:
- Shen‐Ming Lee & Truong‐Nhat Le & Phuoc‐Loc Tran & Chin‐Shang Li, 2022. "Investigating the association of a sensitive attribute with a random variable using the Christofides generalised randomised response design and Bayesian methods," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1471-1502, November.
- Maldonado-Salguero, Patricia & Bueso-Sánchez, María Carmen & Molina-García, Ángel & Sánchez-Lozano, Juan Miguel, 2022. "Spatio-temporal dynamic clustering modeling for solar irradiance resource assessment," Renewable Energy, Elsevier, vol. 200(C), pages 344-359.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Maria Lucia Parrella & Giuseppina Albano & Michele La Rocca & Cira Perna, 2019. "Reconstructing missing data sequences in multivariate time series: an application to environmental data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(2), pages 359-383, June.
- Kowarik, Alexander & Templ, Matthias, 2016. "Imputation with the R Package VIM," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i07).
- Lara Lopez & Fernando L. Vázquez & Ángela J. Torres & Patricia Otero & Vanessa Blanco & Olga Díaz & Mario Páramo, 2020. "Long-Term Effects of a Cognitive Behavioral Conference Call Intervention on Depression in Non-Professional Caregivers," IJERPH, MDPI, vol. 17(22), pages 1-24, November.
- Nicklas Pettersson, 2013. "Bias reduction of finite population imputation by kernel methods," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 14(1), pages 139-160, March.
- Jiang, Wei & Josse, Julie & Lavielle, Marc, 2020. "Logistic regression with missing covariates—Parameter estimation, model selection and prediction within a joint-modeling framework," Computational Statistics & Data Analysis, Elsevier, vol. 145(C).
- Ettie M. Lipner & Joshua French & Carleton R. Bern & Katherine Walton-Day & David Knox & Michael Strong & D. Rebecca Prevots & James L. Crooks, 2020. "Nontuberculous Mycobacterial Disease and Molybdenum in Colorado Watersheds," IJERPH, MDPI, vol. 17(11), pages 1-15, May.
- Cheng, Xiaoyue & Cook, Dianne & Hofmann, Heike, 2015. "Visually Exploring Missing Values in Multivariable Data Using a Graphical User Interface," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 68(i06).
- Ahmad R. Alsaber & Jiazhu Pan & Adeeba Al-Hurban, 2021. "Handling Complex Missing Data Using Random Forest Approach for an Air Quality Monitoring Dataset: A Case Study of Kuwait Environmental Data (2012 to 2018)," IJERPH, MDPI, vol. 18(3), pages 1-25, February.
- Pépin, Antonin & Morel, Kevin & van der Werf, Hayo M.G., 2021. "Conventionalised vs. agroecological practices on organic vegetable farms: Investigating the influence of farm structure in a bifurcation perspective," Agricultural Systems, Elsevier, vol. 190(C).
- Nengsih Titin Agustin & Bertrand Frédéric & Maumy-Bertrand Myriam & Meyer Nicolas, 2019. "Determining the number of components in PLS regression on incomplete data set," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 18(6), pages 1-28, December.
- Josse, Julie & Husson, François, 2016. "missMDA: A Package for Handling Missing Values in Multivariate Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 70(i01).
- Schalk Burger & Searle Silverman & Gary van Vuuren, 2018. "Deriving Correlation Matrices for Missing Financial Time-Series Data," International Journal of Economics and Finance, Canadian Center of Science and Education, vol. 10(10), pages 105-105, October.
- World Bank & Organisation for Economic Co-operation and Development, 2017. "A Step Ahead," World Bank Publications - Books, The World Bank Group, number 27527.
- Adel Bosch & Steven F. Koch, 2021. "Individual and Household Debt: Does Imputation Choice Matter?," Working Papers 202141, University of Pretoria, Department of Economics.
- Michael Greenacre & Patrick J. F Groenen & Trevor Hastie & Alfonso Iodice d’Enza & Angelos Markos & Elena Tuzhilina, 2023. "Principal component analysis," Economics Working Papers 1856, Department of Economics and Business, Universitat Pompeu Fabra.
- Parashmoni Borah & Suhasini Hazarika & Amit Prakash, 2022. "Assessing the state of homogeneity, variability and trends in the rainfall time series from 1969 to 2017 and its significance for groundwater in north-east India," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 111(1), pages 585-617, March.
- Jan Kluge & Sarah Lappöhn & Kerstin Plank, 2023. "Predictors of TFP growth in European countries," Empirica, Springer;Austrian Institute for Economic Research;Austrian Economic Association, vol. 50(1), pages 109-140, February.
- Henry Webel & Lili Niu & Annelaura Bach Nielsen & Marie Locard-Paulet & Matthias Mann & Lars Juhl Jensen & Simon Rasmussen, 2024. "Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
- Schoemaker, Nikita K. & Juffer, Femmie & Rippe, Ralph C.A. & Vermeer, Harriet J. & Stoltenborgh, Marije & Jagersma, Gabrine J. & Maras, Athanasios & Alink, Lenneke R.A., 2020. "Positive parenting in foster care: Testing the effectiveness of a video-feedback intervention program on foster parents’ behavior and attitudes," Children and Youth Services Review, Elsevier, vol. 110(C).
- Thelma Dede Baddoo & Zhijia Li & Samuel Nii Odai & Kenneth Rodolphe Chabi Boni & Isaac Kwesi Nooni & Samuel Ato Andam-Akorful, 2021. "Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation," IJERPH, MDPI, vol. 18(16), pages 1-26, August.
More about this item
Keywords
workflow; statistical computing; data science; data visualization; tidyverse; data pipeline
JEL classification:
- C10 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - General
- C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
- C22 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Time-Series Models; Dynamic Quantile Regressions; Dynamic Treatment Effect Models; Diffusion Processes
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:msh:ebswps:2018-14. See general information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact Professor Xibin Zhang. General contact details of provider: https://edirc.repec.org/data/dxmonau.html