Authors
- Sebastián Alberto Grillo
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay)
- José Luis Vázquez Noguera
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay)
- Julio César Mello Román
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay; Facultad Politécnica, Universidad Nacional de Asunción, San Lorenzo 111421, Paraguay; Facultad de Ciencias Exactas y Tecnológicas, Universidad Nacional de Concepción, Concepción 010123, Paraguay)
- Miguel García-Torres
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay; Data Science and Big Data Lab, Universidad Pablo de Olavide, 41013 Seville, Spain)
- Jacques Facon
(Department of Computer and Electronics, Universidade Federal do Espírito Santo, São Mateus 29932-540, Brazil)
- Diego P. Pinto-Roa
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay; Facultad Politécnica, Universidad Nacional de Asunción, San Lorenzo 111421, Paraguay; Facultad de Ciencias Exactas y Tecnológicas, Universidad Nacional de Concepción, Concepción 010123, Paraguay)
- Luis Salgueiro Romero
(Signal Theory and Communications Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain)
- Francisco Gómez-Vela
(Data Science and Big Data Lab, Universidad Pablo de Olavide, 41013 Seville, Spain)
- Laura Raquel Bareiro Paniagua
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay)
- Deysi Natalia Leguizamon Correa
(Computer Engineer Department, Universidad Americana, Asunción 1206, Paraguay)
Abstract
In feature selection, redundancy is a major concern because removing redundant data is closely tied to dimensionality reduction. Despite this connection, few works study redundancy from a theoretical standpoint. In this work, we analyze the effect of redundant features on the performance of classification models. Our contributions are threefold: we (i) develop a theoretical framework to analyze feature construction and selection, (ii) show that certain properly defined features are redundant yet make the data linearly separable, and (iii) propose a formal criterion to validate feature construction methods. Experimental results suggest that a large number of redundant features can reduce the classification error. These results imply that it is not enough to analyze features solely with criteria that measure the amount of information the features provide.
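As an informal illustration of point (ii), the sketch below uses the classic XOR problem. It is not the authors' construction or framework: the dataset, the product feature x1*x2, and the helper names (train_perceptron, augment) are assumptions chosen for this example. The product feature is redundant in the loose sense that it is a deterministic function of the two original features and so carries no new information, yet a plain perceptron can separate the classes only once it is added.

```python
# Minimal sketch (illustrative assumption, not the paper's method):
# a feature that is a function of existing features can still make
# a non-separable problem linearly separable.

def train_perceptron(rows, labels, epochs=1000):
    """Train a perceptron; return (weights, bias, converged)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(rows, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score > 0 else 0
            if pred != y:
                sign = 1 if y == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, x)]
                b += sign
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the data are linearly separated.
            return w, b, True
    return w, b, False


def augment(row):
    """Append the hypothetical redundant feature x1 * x2."""
    x1, x2 = row
    return (x1, x2, x1 * x2)


# XOR: not linearly separable in the original two features.
raw_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

_, _, separable_raw = train_perceptron(raw_rows, labels)
_, _, separable_aug = train_perceptron([augment(r) for r in raw_rows], labels)

print("separable with original features:", separable_raw)
print("separable with redundant feature:", separable_aug)
```

The perceptron is used here only as a simple test of linear separability: it finishes an error-free pass if and only if the current weights separate the training points, which cannot happen for XOR in its original two features.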
Suggested Citation
Sebastián Alberto Grillo & José Luis Vázquez Noguera & Julio César Mello Román & Miguel García-Torres & Jacques Facon & Diego P. Pinto-Roa & Luis Salgueiro Romero & Francisco Gómez-Vela & Laura Raquel Bareiro Paniagua & Deysi Natalia Leguizamon Correa, 2021.
"Redundancy Is Not Necessarily Detrimental in Classification Problems,"
Mathematics, MDPI, vol. 9(22), pages 1-22, November.
Handle:
RePEc:gam:jmathe:v:9:y:2021:i:22:p:2899-:d:679141