Printed from https://ideas.repec.org/a/eee/csdana/v131y2019icp12-36.html

Ensemble decision forest of RBF networks via hybrid feature clustering approach for high-dimensional data classification

Authors

  • Abpeykar, Shadi
  • Ghatee, Mehdi
  • Zare, Hadi

Abstract

Classification of high-dimensional data is challenging because of the curse of dimensionality, the heavy computational burden, and the declining precision of algorithms. To mitigate these effects, feature selection approaches that determine an efficient subset of features are used during processing. However, most of these techniques yield only a single subset of non-redundant features that contains the best ones. Alternatively, clustering approaches can find the most informative clusters of features instead of generating just one subset. The so-called Hybrid Feature Clustering (HFC) method is capable of maximizing classification accuracy while keeping the number of redundant features in each cluster low. The patterns of each cluster are classified by a neural tree whose nodes are Radial Basis Function (RBF) networks. Within each neural tree, a hierarchical approach is proposed to transfer the knowledge held in the synaptic weights from a parent RBF node to each of its children. A gating network is applied to the forest of these neural trees to aggregate their results. An assessment of classification accuracy and computational complexity on high-dimensional datasets shows that the proposed solution outperforms state-of-the-art classifiers. Furthermore, the computational complexity and the convergence of the method are proven theoretically, and a robustness analysis under noisy conditions is conducted.
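
The overall shape of the pipeline described in the abstract (one classifier per feature cluster, combined by a gate) can be illustrated with a minimal sketch. It is not the authors' method: the feature groups are hard-coded rather than found by HFC, each "expert" is a single flat RBF network rather than a neural tree, and the gate is a fixed softmax over training accuracies rather than a trained gating network. All function names, the `gamma` width parameter, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Gaussian RBF activations exp(-gamma * ||x - c||^2) for every center c."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def fit_rbf_classifier(X, y, n_classes, n_centers, gamma, rng):
    """Use random training points as centers; fit output weights by least squares."""
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    Phi = rbf_features(X, centers, gamma)
    targets = np.eye(n_classes)[y]          # one-hot class targets
    weights, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return centers, weights

def predict_scores(model, X, gamma):
    """Class scores of one RBF expert, shape (n_samples, n_classes)."""
    centers, weights = model
    return rbf_features(X, centers, gamma) @ weights

def gate(scores_per_expert, gate_logits):
    """Softmax-weighted average of expert scores (input-independent gate)."""
    w = np.exp(gate_logits - gate_logits.max())
    w /= w.sum()
    # Contract the expert axis: (E,) with (E, N, C) -> (N, C)
    return np.tensordot(w, scores_per_expert, axes=1)

def demo(seed=0, gamma=0.5):
    """Train one RBF expert per (hypothetical) feature group and gate them."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(200, 6))
    y = (X[:, 0] + X[:, 3] > 0).astype(int)
    # Assumption: stand-in for the feature clusters that HFC would produce.
    feature_groups = [[0, 1, 2], [3, 4, 5]]
    models = [fit_rbf_classifier(X[:, g], y, 2, 20, gamma, rng)
              for g in feature_groups]
    scores = np.stack([predict_scores(m, X[:, g], gamma)
                       for m, g in zip(models, feature_groups)])
    # Assumption: gate logits are each expert's own training accuracy.
    logits = np.array([(s.argmax(axis=1) == y).mean() for s in scores])
    combined = gate(scores, logits)
    return (combined.argmax(axis=1) == y).mean()
```

Calling `demo()` returns the training accuracy of the gated ensemble on the synthetic data. Giving each expert only its own feature group mirrors the idea of classifying the patterns of each cluster separately before aggregation.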

Suggested Citation

  • Abpeykar, Shadi & Ghatee, Mehdi & Zare, Hadi, 2019. "Ensemble decision forest of RBF networks via hybrid feature clustering approach for high-dimensional data classification," Computational Statistics & Data Analysis, Elsevier, vol. 131(C), pages 12-36.
  • Handle: RePEc:eee:csdana:v:131:y:2019:i:c:p:12-36
    DOI: 10.1016/j.csda.2018.08.015

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947318301981
    Download Restriction: Full text for ScienceDirect subscribers only.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu, Lean & Zhang, Xiaoming, 2021. "Can small sample dataset be used for efficient internet loan credit risk assessment? Evidence from online peer to peer lending," Finance Research Letters, Elsevier, vol. 38(C).
    2. Federico Palacios-González & Rosa M. García-Fernández, 2020. "A faster algorithm to estimate multiresolution densities," Computational Statistics, Springer, vol. 35(3), pages 1207-1230, September.
    3. He, Yan-Lin & Wang, Ping-Jiang & Zhang, Ming-Qing & Zhu, Qun-Xiong & Xu, Yuan, 2018. "A novel and effective nonlinear interpolation virtual sample generation method for enhancing energy prediction and analysis on small data problem: A case study of Ethylene industry," Energy, Elsevier, vol. 147(C), pages 418-427.
    4. David Juárez-Varón & Victoria Tur-Viñes & Alejandro Rabasa-Dolado & Kristina Polotskaya, 2020. "An Adaptive Machine Learning Methodology Applied to Neuromarketing Analysis: Prediction of Consumer Behaviour Regarding the Key Elements of the Packaging Design of an Educational Toy," Social Sciences, MDPI, vol. 9(9), pages 1-23, September.
    5. Li, Song & Tso, Geoffrey K.F. & Long, Lufan, 2017. "Powered embarrassing parallel MCMC sampling in Bayesian inference, a weighted average intuition," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 11-20.
    6. Pin Wang & Yongming Li & Bohan Chen & Xianling Hu & Jin Yan & Yu Xia & Jie Yang, 2017. "Proportional Hybrid Mechanism for Population Based Feature Selection Algorithm," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(05), pages 1309-1338, September.
    7. Kristof Lommers & Ouns El Harzli & Jack Kim, 2021. "Confronting Machine Learning With Financial Research," Papers 2103.00366, arXiv.org, revised Mar 2021.
    8. Pierre Michel & Nicolas Ngo & Jean-François Pons & Stéphane Delliaux & Roch Giorgi, 2021. "A filter approach for feature selection in classification: application to automatic atrial fibrillation detection in electrocardiogram recordings," Post-Print hal-03222439, HAL.
    9. Xianlong Zhang & Fei Zhang & Hsiang-te Kung & Ping Shi & Ayinuer Yushanjiang & Shidan Zhu, 2018. "Estimation of the Fe and Cu Contents of the Surface Water in the Ebinur Lake Basin Based on LIBS and a Machine Learning Algorithm," IJERPH, MDPI, vol. 15(11), pages 1-20, October.
