Predicting missing values: a comparative study on non-parametric approaches for imputation
DOI: 10.1007/s00180-019-00900-3
References listed on IDEAS
- van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
- Strobl, Carolin & Boulesteix, Anne-Laure & Augustin, Thomas, 2007. "Unbiased split selection for classification trees based on the Gini Index," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 483-501, September.
- Konietschke, F. & Harrar, S.W. & Lange, K. & Brunner, E., 2012. "Ranking procedures for matched pairs with missing data — Asymptotic theory and a small sample approximation," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1090-1102.
- Xu, Li-Wen & Yang, Fang-Qin & Abula, Aji’erguli & Qin, Shuang, 2013. "A parametric bootstrap approach for two-way ANOVA in presence of possible interactions with unequal variances," Journal of Multivariate Analysis, Elsevier, vol. 115(C), pages 172-180.
- Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
- Konietschke, Frank & Bathke, Arne C. & Harrar, Solomon W. & Pauly, Markus, 2015. "Parametric and nonparametric bootstrap methods for general MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 291-301.
- Claudio Conversano & Roberta Siciliano, 2009. "Incremental Tree-Based Missing Data Imputation with Lexicographic Ordering," Journal of Classification, Springer;The Classification Society, vol. 26(3), pages 361-379, December.
Citations
Cited by:
- Christoph Stach & Clémentine Gritti & Julia Bräcker & Michael Behringer & Bernhard Mitschang, 2022. "Protecting Sensitive Data in the Information Age: State of the Art and Future Prospects," Future Internet, MDPI, vol. 14(11), pages 1-43, October.
- Mohamed Lamine Sidibé & Roland Yonaba & Fowé Tazen & Héla Karoui & Ousmane Koanda & Babacar Lèye & Harinaivo Anderson Andrianisa & Harouna Karambiri, 2023. "Understanding the COVID-19 pandemic prevalence in Africa through optimal feature selection and clustering: evidence from a statistical perspective," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 25(11), pages 13565-13593, November.
- Yang, Yadong & Shahbeik, Hossein & Shafizadeh, Alireza & Masoudnia, Nima & Rafiee, Shahin & Zhang, Yijia & Pan, Junting & Tabatabaei, Meisam & Aghbashlo, Mortaza, 2022. "Biomass microwave pyrolysis characterization by machine learning for sustainable rural biorefineries," Renewable Energy, Elsevier, vol. 201(P2), pages 70-86.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Friedrich, Sarah & Pauly, Markus, 2018. "MATS: Inference for potentially singular and heteroscedastic MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 166-179.
- Huang Lin & Merete Eggesbø & Shyamal Das Peddada, 2022. "Linear and nonlinear correlation estimators unveil undescribed taxa interactions in microbiome data," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
- Ali B. Barlas & Seda Guler Mert & Berk Orkun Isa & Alvaro Ortiz & Tomasa Rodrigo & Baris Soybilgen & Ege Yazgan, 2024. "Big data financial transactions and GDP nowcasting: The case of Turkey," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(2), pages 227-248, March.
- Milica Maricic & Jose A. Egea & Veljko Jeremic, 2019. "A Hybrid Enhanced Scatter Search—Composite I-Distance Indicator (eSS-CIDI) Optimization Approach for Determining Weights Within Composite Indicators," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 144(2), pages 497-537, July.
- Hapfelmeier, A. & Ulm, K., 2014. "Variable selection by Random Forests using data with missing values," Computational Statistics & Data Analysis, Elsevier, vol. 80(C), pages 129-139.
- Lukasz Struski & Marek Śmieja & Jacek Tabor, 2020. "Pointed Subspace Approach to Incomplete Data," Journal of Classification, Springer;The Classification Society, vol. 37(1), pages 42-57, April.
- Mondal, Anjana & Sattler, Paavo & Kumar, Somesh, 2023. "Testing against ordered alternatives in a two-way model without interaction under heteroscedasticity," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
- Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
- Noémi Kreif & Richard Grieve & Iván Díaz & David Harrison, 2015. "Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury," Health Economics, John Wiley & Sons, Ltd., vol. 24(9), pages 1213-1228, September.
- Abhilash Bandam & Eedris Busari & Chloi Syranidou & Jochen Linssen & Detlef Stolten, 2022. "Classification of Building Types in Germany: A Data-Driven Modeling Approach," Data, MDPI, vol. 7(4), pages 1-23, April.
- Boonstra Philip S. & Little Roderick J.A. & West Brady T. & Andridge Rebecca R. & Alvarado-Leiton Fernanda, 2021. "A Simulation Study of Diagnostics for Selection Bias," Journal of Official Statistics, Sciendo, vol. 37(3), pages 751-769, September.
- Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
- Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
- Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
- Liangyuan Hu & Lihua Li, 2022. "Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series," IJERPH, MDPI, vol. 19(23), pages 1-13, December.
- Norah Alyabs & Sy Han Chiou, 2022. "The Missing Indicator Approach for Accelerated Failure Time Model with Covariates Subject to Limits of Detection," Stats, MDPI, vol. 5(2), pages 1-13, May.
- Feldkircher, Martin, 2014. "The determinants of vulnerability to the global financial crisis 2008 to 2009: Credit growth and other sources of risk," Journal of International Money and Finance, Elsevier, vol. 43(C), pages 19-49.
- Feldkircher, Martin, 2012. "The determinants of vulnerability to the global financial crisis 2008 to 2009: Credit growth and other sources of risk," BOFIT Discussion Papers 26/2012, Bank of Finland Institute for Emerging Economies (BOFIT).
- Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
- Tim Voigt & Martin Kohlhase & Oliver Nelles, 2021. "Incremental DoE and Modeling Methodology with Gaussian Process Regression: An Industrially Applicable Approach to Incorporate Expert Knowledge," Mathematics, MDPI, vol. 9(19), pages 1-26, October.
- Eunsil Seok & Akhgar Ghassabian & Yuyan Wang & Mengling Liu, 2024. "Statistical Methods for Modeling Exposure Variables Subject to Limit of Detection," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 16(2), pages 435-458, July.
More about this item
Keywords
Random forest; Stochastic gradient tree boosting; Resampling; MICE
Handle: RePEc:spr:compst:v:34:y:2019:i:4:d:10.1007_s00180-019-00900-3