Printed from https://ideas.repec.org/p/hal/journl/hal-04848056.html

Multivariate filter methods for feature selection with the γ-metric

Author

Listed:
  • Nicolas Ngo

    (AMU - Aix Marseille Université, INSERM - Institut National de la Santé et de la Recherche Médicale, SESSTIM - U1252 INSERM - Aix Marseille Univ - UMR 259 IRD - Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale - IRD - Institut de Recherche pour le Développement - AMU - Aix Marseille Université - INSERM - Institut National de la Santé et de la Recherche Médicale, ISSPAM - Institut des sciences de la santé publique [Marseille])

  • Pierre Michel

    (AMU - Aix Marseille Université, CNRS - Centre National de la Recherche Scientifique, AMSE - Aix-Marseille Sciences Economiques - EHESS - École des hautes études en sciences sociales - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique)

  • Roch Giorgi

    (AMU - Aix Marseille Université, APHM - Assistance Publique - Hôpitaux de Marseille, INSERM - Institut National de la Santé et de la Recherche Médicale, SESSTIM - U1252 INSERM - Aix Marseille Univ - UMR 259 IRD - Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale - IRD - Institut de Recherche pour le Développement - AMU - Aix Marseille Université - INSERM - Institut National de la Santé et de la Recherche Médicale, ISSPAM - Institut des sciences de la santé publique [Marseille], TIMONE - Hôpital de la Timone [CHU - APHM], BiosTIC - Biostatistique et technologies de l'information et de la communication (BioSTIC) - [Hôpital de la Timone - APHM] - APHM - Assistance Publique - Hôpitaux de Marseille - TIMONE - Hôpital de la Timone [CHU - APHM], IRD [Occitanie] - Institut de Recherche pour le Développement)

Abstract

Background: The γ-metric value is generally used as the importance score of a feature (or a set of features) in a classification context. This study aimed to go further by creating a new methodology for multivariate feature selection for classification, in which the γ-metric is associated with a specific search direction (and therefore a specific stopping criterion). As three search directions are used, we effectively created three distinct methods.

Methods: We assessed the performance of our new methodology through a simulation study, comparing the three methods against more conventional ones. Classification performance indicators, number of selected features, stability and execution time were used to evaluate the performance of the methods. We also evaluated how well the proposed methodology selected relevant features for the detection of atrial fibrillation (AF), which is a cardiac arrhythmia.

Results: In both the simulation study and the AF detection task, our methods were able to select informative features and maintain a good level of predictive performance; however, in cases of strong correlation and large datasets, the γ-metric-based methods were less effective at excluding non-informative features.

Conclusions: The results highlighted that the forward search direction combines well with the γ-metric as an evaluation function. However, with the backward search direction, the feature selection algorithm could fall into a local optimum and can be improved.
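The pairing of an evaluation function with a forward search direction and a stopping criterion can be illustrated with a minimal sketch. This record does not define the γ-metric, so the code below swaps in a generic stand-in score (the trace of the within-class scatter inverse times the between-class scatter); the names `scatter_ratio`, `forward_select` and the `min_gain` threshold are hypothetical and are not taken from the paper, whose stopping criterion may differ.

```python
import numpy as np

def scatter_ratio(X, y):
    """Stand-in multivariate filter score (NOT the gamma-metric):
    trace(Sw^-1 Sb), larger when the classes are better separated."""
    p = X.shape[1]
    mean_all = X.mean(axis=0)
    Sb = np.zeros((p, p))  # between-class scatter
    Sw = np.zeros((p, p))  # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean_all)[:, None]
        Sb += len(Xc) * diff @ diff.T
        centered = Xc - Xc.mean(axis=0)
        Sw += centered.T @ centered
    return float(np.trace(np.linalg.solve(Sw, Sb)))

def forward_select(X, y, score=scatter_ratio, min_gain=0.1):
    """Greedy forward search: add the feature with the largest score gain,
    stop when the best gain falls below min_gain (assumed stopping rule)."""
    selected, remaining = [], list(range(X.shape[1]))
    current = 0.0  # score of the empty feature set, by convention
    while remaining:
        gains = {j: score(X[:, selected + [j]], y) - current for j in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected

# Toy data (assumed): features 0 and 1 are informative, 2 to 4 are pure noise.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n)
X = rng.normal(size=(2 * n, 5))
X[y == 1, :2] += 2.0  # shift the class-1 mean on the informative features

selected = forward_select(X, y)
print(sorted(selected))
```

Because the score is evaluated on the whole candidate subset rather than on each feature separately, the search is multivariate in the sense used by the abstract. A backward variant would start from all features and remove one at a time, which is where a greedy search can more easily settle in a local optimum.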

Suggested Citation

  • Nicolas Ngo & Pierre Michel & Roch Giorgi, 2024. "Multivariate filter methods for feature selection with the γ-metric," Post-Print hal-04848056, HAL.
  • Handle: RePEc:hal:journl:hal-04848056
    DOI: 10.1186/s12874-024-02426-9
    Note: View the original document on HAL open archive server: https://hal.science/hal-04848056v1

    Download full text from publisher

    File URL: https://hal.science/hal-04848056v1/document
    Download Restriction: no

    File URL: https://libkey.io/10.1186/s12874-024-02426-9?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guirong Li & Jiajia Xu & Liying Li & Zhaolei Shi & Hongmei Yi & James Chu & Elena Kardanova & Yanyan Li & Prashant Loyalka & Scott Rozelle, 2020. "The Impacts of Highly Resourced Vocational Schools on Student Outcomes in China," China & World Economy, Institute of World Economics and Politics, Chinese Academy of Social Sciences, vol. 28(6), pages 125-150, November.
    2. Paschalis Arvanitidis & Athina Economou & Christos Kollias, 2016. "Terrorism’s effects on social capital in European countries," Public Choice, Springer, vol. 169(3), pages 231-250, December.
    3. Inyoung Park & Jieon Lee & Jungwoo Nam & Yuri Jo & Daeho Lee, 2022. "Which networking strategy improves ICT startup companies' technical efficiency?," Managerial and Decision Economics, John Wiley & Sons, Ltd., vol. 43(6), pages 2434-2443, September.
    4. Brown, Joe & Hamoudi, Amar & Jeuland, Marc & Turrini, Gina, 2017. "Seeing, believing, and behaving: Heterogeneous effects of an information intervention on household water treatment," Journal of Environmental Economics and Management, Elsevier, vol. 86(C), pages 141-159.
    5. David Pérez-Mesa & Ángel S. Marrero, 2024. "Adult health and inequality of opportunity in Spain," Working Papers 671, ECINEQ, Society for the Study of Economic Inequality.
    6. Lannes, Laurence, 2015. "Improving health worker performance: The patient-perspective from a PBF program in Rwanda," Social Science & Medicine, Elsevier, vol. 138(C), pages 1-11.
    7. Patrick S. Ward & Valerien O. Pede, 2015. "Capturing social network effects in technology adoption: the spatial diffusion of hybrid rice in Bangladesh," Australian Journal of Agricultural and Resource Economics, Australian Agricultural and Resource Economics Society, vol. 59(2), pages 225-241, April.
    8. Jesus Felipe & Arnelyn Abdon & Utsav Kumar, 2012. "Tracking the Middle-income Trap: What Is It, Who Is in It, and Why?," Economics Working Paper Archive wp_715, Levy Economics Institute.
    9. Juan M Villa, 2016. "A harmonised proxy means test for Kenya’s National Safety Net programme," Global Development Institute Working Paper Series 032016, GDI, The University of Manchester.
    10. Clark Gray & Richard Bilsborrow, 2013. "Environmental Influences on Human Migration in Rural Ecuador," Demography, Springer;Population Association of America (PAA), vol. 50(4), pages 1217-1241, August.
    11. Jae Min Lee & Jonghee Lee & Kyoung Tae Kim, 2020. "Consumer Financial Well-Being: Knowledge is Not Enough," Journal of Family and Economic Issues, Springer, vol. 41(2), pages 218-228, June.
    12. Han, Linsong & Li, Xun & Xu, Gang, 2022. "Anti-corruption and poverty alleviation: Evidence from China," Journal of Economic Behavior & Organization, Elsevier, vol. 203(C), pages 150-172.
    13. Esposito, Lucio & Villaseñor, Adrián, 2017. "Relative deprivation: Measurement issues and predictive role for body image dissatisfaction," Social Science & Medicine, Elsevier, vol. 192(C), pages 49-57.
    14. Yang Yixin & Lü Xin & Ma Jian & Qiao Han, 2014. "A Robust Factor Analysis Model for Dichotomous Data," Journal of Systems Science and Information, De Gruyter, vol. 2(5), pages 437-450, October.
    15. Bessonova, Evguenia & Gonchar, Ksenia, 2019. "How the innovation-competition link is shaped by technology distance in a high-barrier catch-up economy," Technovation, Elsevier, vol. 86, pages 15-32.
    16. Christopoulos, Dimitris K. & McAdam, Peter, 2019. "Efficiency, Inefficiency, And The Mena Frontier," Macroeconomic Dynamics, Cambridge University Press, vol. 23(2), pages 489-521, March.
    17. Dong, Fengxia & Mitchell, Paul D. & Hurley, Terrance M. & Frisvold, George B., 2012. "Quantifying Farmer Adoption Intensity for Weed Resistance Management Practices and Its Determinants," 2012 Annual Meeting, August 12-14, 2012, Seattle, Washington 125194, Agricultural and Applied Economics Association.
    18. Lucio Esposito & Sunil Mitra Kumar & Adrián Villaseñor, 2020. "The importance of being earliest: birth order and educational outcomes along the socioeconomic ladder in Mexico," Journal of Population Economics, Springer;European Society for Population Economics, vol. 33(3), pages 1069-1099, July.
    19. Hongyan Liu & Yaojiang Shi & Emma Auden & Scott Rozelle, 2018. "Anxiety in Rural Chinese Children and Adolescents: Comparisons across Provinces and among Subgroups," IJERPH, MDPI, vol. 15(10), pages 1-14, September.
    20. Junaid Ahmed & Mazhar Mughal & Stephan Klasen, 2018. "Great Expectations? Remittances and Asset Accumulation in Pakistan," Journal of International Development, John Wiley & Sons, Ltd., vol. 30(3), pages 507-532, April.
