Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search

Author

Listed:
  • Miguel Garcia-Torres

    (Universidad Pablo de Olavide)

Abstract

In feature selection, growing dimensionality and increasingly complex feature interactions make the problem challenging. Furthermore, searching for an optimal subset of features in a high-dimensional feature space is known to be an $\mathcal{NP}$-hard problem. To improve the efficiency and effectiveness of the search, feature grouping has emerged as a way to reduce the search space by clustering features according to a given measure. In this work we propose to reduce the search space by applying a greedy algorithm called Multivariate Greedy Predominant Groups Generator (MGPGG). MGPGG extends the Greedy Predominant Groups Generator (GPGG) algorithm by taking into account interactions among three or more features. For this purpose, MGPGG uses the Multivariate Symmetrical Uncertainty (MSU) to group features that share information about the class label. We also propose a Scatter Search strategy that integrates MGPGG to find small subsets of features with high predictive power. The proposed algorithm, called Multivariate Predominant Group-based Scatter Search (MPGSS), is tested on high-dimensional data from the biomedical and text-mining fields and compared with state-of-the-art feature selection strategies. Results show that MPGSS is competitive, since it finds small subsets of features while preserving the high predictive performance of the resulting classification models.
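
The abstract does not state the MSU formula. As a minimal sketch, assuming the definition commonly used in the literature, MSU(X_1, ..., X_n) = n/(n-1) * [1 - H(X_1, ..., X_n) / (H(X_1) + ... + H(X_n))], the measure can be computed for discrete variables as follows. The function names `entropy` and `msu` and the toy XOR data are illustrative assumptions, not taken from the article, and the code does not reproduce MGPGG or MPGSS themselves.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more equal-length discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def msu(*columns):
    """Multivariate symmetrical uncertainty of n >= 2 discrete variables.

    Assumed definition (not given in the abstract):
        MSU = n / (n - 1) * (1 - H(X_1..X_n) / sum_i H(X_i))
    For n = 2 this reduces to the usual symmetrical uncertainty.
    """
    n = len(columns)
    if n < 2:
        raise ValueError("MSU needs at least two variables")
    marginal_sum = sum(entropy(col) for col in columns)
    if marginal_sum == 0:
        return 0.0  # every variable is constant, so nothing is shared
    return (n / (n - 1)) * (1 - entropy(*columns) / marginal_sum)


# Toy data: the class label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

pairwise = msu(f1, label)       # each feature alone says nothing about the label
three_way = msu(f1, f2, label)  # together the features determine the label
```

On this toy data the pairwise value is zero while the three-variable value is about 0.5, which illustrates why a measure over three or more variables can detect shared information about the class label that a purely pairwise measure misses.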

Suggested Citation

  • Miguel Garcia-Torres, 2025. "Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search," Journal of Heuristics, Springer, vol. 31(1), pages 1-33, March.
  • Handle: RePEc:spr:joheur:v:31:y:2025:i:1:d:10.1007_s10732-025-09550-9
    DOI: 10.1007/s10732-025-09550-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10732-025-09550-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
