Benchmark for filter methods for feature selection in high-dimensional classification data
DOI: 10.1016/j.csda.2019.106839
References listed on IDEAS
- Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).
- Smyth Gordon K, 2004. "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 3(1), pages 1-28, February.
- Simon, Noah & Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2011. "Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 39(i05).
- Yindalon Aphinyanaphongs & Lawrence D. Fu & Zhiguo Li & Eric R. Peskin & Efstratios Efstathiadis & Constantin F. Aliferis & Alexander Statnikov, 2014. "A comprehensive empirical comparison of modern supervised classification and feature selection methods for text categorization," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(10), pages 1964-1987, October.
- Karatzoglou, Alexandros & Smola, Alexandros & Hornik, Kurt & Zeileis, Achim, 2004. "kernlab - An S4 Package for Kernel Methods in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 11(i09).
Citations
Cited by:
- Hapfelmeier, Alexander & Hornung, Roman & Haller, Bernhard, 2023. "Efficient permutation testing of variable importance measures by the example of random forests," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
- Cappozzo, Andrea & Greselin, Francesca & Murphy, Thomas Brendan, 2021. "Robust variable selection for model-based learning in presence of adulteration," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
- Tang, Wenjun & Wang, Hao & Lee, Xian-Long & Yang, Hong-Tzer, 2022. "Machine learning approach to uncovering residential energy consumption patterns based on socioeconomic and smart meter data," Energy, Elsevier, vol. 240(C).
- Krarti, Moncef & Aldubyan, Mohammad, 2021. "Review analysis of COVID-19 impact on electricity demand for residential buildings," Renewable and Sustainable Energy Reviews, Elsevier, vol. 143(C).
- Manuel Oviedo-de la Fuente & Carlos Cabo & Celestino Ordóñez & Javier Roca-Pardiñas, 2021. "A Distance Correlation Approach for Optimum Multiscale Selection in 3D Point Cloud Classification," Mathematics, MDPI, vol. 9(12), pages 1-19, June.
- Florian Pargent & Florian Pfisterer & Janek Thomas & Bernd Bischl, 2022. "Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features," Computational Statistics, Springer, vol. 37(5), pages 2671-2692, November.
- van Zyl, Corne & Ye, Xianming & Naidoo, Raj, 2024. "Harnessing eXplainable artificial intelligence for feature selection in time series energy forecasting: A comparative analysis of Grad-CAM and SHAP," Applied Energy, Elsevier, vol. 353(PA).
- Dhivya Elavarasan & Durai Raj Vincent P M & Kathiravan Srinivasan & Chuan-Yu Chang, 2020. "A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling," Agriculture, MDPI, vol. 10(9), pages 1-27, September.
- Wen-Kuo Chen & Dalianus Riantama & Long-Sheng Chen, 2020. "Using a Text Mining Approach to Hear Voices of Customers from Social Media toward the Fast-Food Restaurant Industry," Sustainability, MDPI, vol. 13(1), pages 1-17, December.
- Fatemeh Moodi & Amir Jahangard-Rafsanjani & Sajad Zarifzadeh, 2023. "Feature selection and regression methods for stock price prediction using technical indicators," Papers 2310.09903, arXiv.org, revised Nov 2023.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Samir K. Safi & Sheema Gul, 2024. "An Enhanced Tree Ensemble for Classification in the Presence of Extreme Class Imbalance," Mathematics, MDPI, vol. 12(20), pages 1-17, October.
- Fitzpatrick, Trevor & Mues, Christophe, 2021. "How can lenders prosper? Comparing machine learning approaches to identify profitable peer-to-peer loan investments," European Journal of Operational Research, Elsevier, vol. 294(2), pages 711-722.
- Schratz, Patrick & Muenchow, Jannes & Iturritxa, Eugenia & Richter, Jakob & Brenning, Alexander, 2019. "Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data," Ecological Modelling, Elsevier, vol. 406(C), pages 109-120.
- Aaron C Ericsson & J Wade Davis & William Spollen & Nathan Bivens & Scott Givan & Catherine E Hagan & Mark McIntosh & Craig L Franklin, 2015. "Effects of Vendor and Genetic Background on the Composition of the Fecal Microbiota of Inbred Mice," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-19, February.
- Backer, David & Billing, Trey, 2024. "Forecasting the prevalence of child acute malnutrition using environmental and conflict conditions as leading indicators," World Development, Elsevier, vol. 176(C).
- Tsukioka, Yasutomo & Yanagi, Junya & Takada, Teruko, 2018. "Investor sentiment extracted from internet stock message boards and IPO puzzles," International Review of Economics & Finance, Elsevier, vol. 56(C), pages 205-217.
- Mariana Oliveira & Luís Torgo & Vítor Santos Costa, 2021. "Evaluation Procedures for Forecasting with Spatiotemporal Data," Mathematics, MDPI, vol. 9(6), pages 1-27, March.
- Daniel J. Luckett & Eric B. Laber & Samer S. El‐Kamary & Cheng Fan & Ravi Jhaveri & Charles M. Perou & Fatma M. Shebl & Michael R. Kosorok, 2021. "Receiver operating characteristic curves and confidence bands for support vector machines," Biometrics, The International Biometric Society, vol. 77(4), pages 1422-1430, December.
- Hossain, Ahmed & Beyene, Joseph & Willan, Andrew R. & Hu, Pingzhao, 2009. "A flexible approximate likelihood ratio test for detecting differential expression in microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 53(10), pages 3685-3695, August.
- Soave, David & Lawless, Jerald F., 2023. "Regularized regression for two phase failure time studies," Computational Statistics & Data Analysis, Elsevier, vol. 182(C).
- Grabisch, Michel & Kojadinovic, Ivan & Meyer, Patrick, 2008. "A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package," European Journal of Operational Research, Elsevier, vol. 186(2), pages 766-785, April.
- Michel Grabisch & Ivan Kojadinovic & Patrick Meyer, 2008. "A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-00187175, HAL.
- Michel Grabisch & Ivan Kojadinovic & Patrick Meyer, 2008. "A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package," Post-Print halshs-00187175, HAL.
- Hua Xin & Yuhlong Lio & Hsien-Ching Chen & Tzong-Ru Tsai, 2024. "Zero-Inflated Binary Classification Model with Elastic Net Regularization," Mathematics, MDPI, vol. 12(19), pages 1-17, September.
- Xiaohong Li & Guy N Brock & Eric C Rouchka & Nigel G F Cooper & Dongfeng Wu & Timothy E O’Toole & Ryan S Gill & Abdallah M Eteleeb & Liz O’Brien & Shesh N Rai, 2017. "A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-22, May.
- Zemin Zheng & Jie Zhang & Yang Li, 2022. "L0-Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
- Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
- Joel Podgorski & Oliver Kracht & Luis Araguas-Araguas & Stefan Terzer-Wassmuth & Jodie Miller & Ralf Straub & Rolf Kipfer & Michael Berg, 2024. "Groundwater vulnerability to pollution in Africa’s Sahel region," Nature Sustainability, Nature, vol. 7(5), pages 558-567, May.
- Kerr Kathleen F., 2012. "Optimality Criteria for the Design of 2-Color Microarray Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(1), pages 1-9, January.
- Bellotti, Anthony & Brigo, Damiano & Gambetti, Paolo & Vrins, Frédéric, 2021. "Forecasting recovery rates on non-performing loans with machine learning," International Journal of Forecasting, Elsevier, vol. 37(1), pages 428-444.
- Bellotti, Anthony & Brigo, Damiano & Gambetti, Paolo & Vrins, Frédéric, 2020. "Forecasting recovery rates on non-performing loans with machine learning," LIDAM Reprints LFIN 2020002, Université catholique de Louvain, Louvain Finance (LFIN).
- Bellotti, Anthony & Brigo, Damiano & Gambetti, Paolo & Vrins, Frédéric, 2020. "Forecasting recovery rates on non-performing loans with machine learning," LIDAM Discussion Papers LFIN 2020002, Université catholique de Louvain, Louvain Finance (LFIN).
- Ambroise Jérôme & Bearzatto Bertrand & Robert Annie & Macq Benoit & Gala Jean-Luc, 2012. "Combining Multiple Laser Scans of Spotted Microarrays by Means of a Two-Way ANOVA Model," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-20, February.
- Simon Bussy & Mokhtar Z. Alaya & Anne‐Sophie Jannot & Agathe Guilloux, 2022. "Binacox: automatic cut‐point detection in high‐dimensional Cox model with applications in genetics," Biometrics, The International Biometric Society, vol. 78(4), pages 1414-1426, December.
More about this item
Keywords
Feature selection; Filter methods; High-dimensional data; Benchmark
RePEc handle: RePEc:eee:csdana:v:143:y:2020:i:c:s016794731930194x