Printed from https://ideas.repec.org/a/gam/jmathe/v8y2020i10p1846-d431712.html

Simultaneous Feature Selection and Classification for Data-Adaptive Kernel-Penalized SVM

Author

Listed:
  • Xin Liu

    (School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China)

  • Bangxin Zhao

    (Department of Statistical and Actuarial Sciences, University of Western Ontario, London, ON N6A 5B7, Canada)

  • Wenqing He

    (Department of Statistical and Actuarial Sciences, University of Western Ontario, London, ON N6A 5B7, Canada)

Abstract

Simultaneous feature selection and classification have been explored in the literature as an extension of the support vector machine (SVM), typically by adding penalty terms directly to the loss function. However, it is the kernel function that controls the performance of the SVM, and imbalanced data degrade that performance. In this paper, we examine a new method of simultaneous feature selection and binary classification. Rather than penalizing the standard loss function of the SVM, the method adds a penalty directly to the data-adaptive kernel function: the kernel functions of the SVM are first conformally transformed, and an SVM classifier is then refitted on the sparse set of selected features. Both convex and non-convex penalties, such as the least absolute shrinkage and selection operator (LASSO), the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP), are explored, and the oracle property of the estimator is established accordingly. Because the estimated coefficients have no analytic form, an iterative optimization procedure is applied. Numerical comparisons show that the proposed method outperforms the competitors considered when data are imbalanced, and it performs similarly to the competitors when data are balanced. The method can be easily applied to medical images from different platforms.
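
The three penalties named in the abstract have standard closed forms in the literature. As an illustration only, the sketch below writes them in plain Python for a single coefficient. The function names and the default shape parameters (a = 3.7 for SCAD, gamma = 3.0 for MCP) are assumptions made here, not values taken from the paper, and the code does not reproduce the paper's kernel-penalized estimation procedure.

```python
def lasso_penalty(t, lam):
    """LASSO penalty: lam * |t| (convex)."""
    return lam * abs(t)


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty in its standard piecewise form (non-convex).

    The default a = 3.7 is a commonly used value, assumed here.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    x = abs(t)
    if x <= lam:
        # Near zero the penalty coincides with LASSO.
        return lam * x
    if x <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty in its standard form (non-convex).

    The default gamma = 3.0 is an illustrative assumption.
    """
    if gamma <= 1:
        raise ValueError("MCP requires gamma > 1")
    x = abs(t)
    if x <= gamma * lam:
        return lam * x - x * x / (2 * gamma)
    # Constant beyond gamma * lam.
    return gamma * lam * lam / 2


if __name__ == "__main__":
    # Compare the three penalties on a few coefficient magnitudes.
    for t in (0.5, 2.0, 10.0):
        print(t, lasso_penalty(t, 1.0), scad_penalty(t, 1.0), mcp_penalty(t, 1.0))
```

Unlike LASSO, whose penalty keeps growing with the coefficient, SCAD and MCP level off for large coefficients, which is the behaviour usually associated with the oracle property mentioned above.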

Suggested Citation

  • Xin Liu & Bangxin Zhao & Wenqing He, 2020. "Simultaneous Feature Selection and Classification for Data-Adaptive Kernel-Penalized SVM," Mathematics, MDPI, vol. 8(10), pages 1-22, October.
  • Handle: RePEc:gam:jmathe:v:8:y:2020:i:10:p:1846-:d:431712

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/8/10/1846/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/8/10/1846/
    Download Restriction: no


    Citations


    Cited by:

    1. Junya Tang & Kuo-Yi Lin & Li Li, 2022. "Using Domain Adaptation for Incremental SVM Classification of Drift Data," Mathematics, MDPI, vol. 10(19), pages 1-17, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rahul Ghosal & Arnab Maity & Timothy Clark & Stefano B. Longo, 2020. "Variable selection in functional linear concurrent regression," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(3), pages 565-587, June.
    2. Xiang Zhang & Yichao Wu & Lan Wang & Runze Li, 2016. "Variable selection for support vector machines in moderately high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 53-76, January.
    3. Eun Ryung Lee & Hohsuk Noh & Byeong U. Park, 2014. "Model Selection via Bayesian Information Criterion for Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(505), pages 216-229, March.
    4. Runmin Shi & Faming Liang & Qifan Song & Ye Luo & Malay Ghosh, 2018. "A Blockwise Consistency Method for Parameter Estimation of Complex Models," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 179-223, December.
    5. Jiang, He & Tao, Changqi & Dong, Yao & Xiong, Ren, 2021. "Robust low-rank multiple kernel learning with compound regularization," European Journal of Operational Research, Elsevier, vol. 295(2), pages 634-647.
    6. Gabriel E Hoffman & Benjamin A Logsdon & Jason G Mezey, 2013. "PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
    7. Benjamin G. Stokell & Rajen D. Shah & Ryan J. Tibshirani, 2021. "Modelling high‐dimensional categorical data using nonconvex fusion penalties," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 579-611, July.
    8. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    9. Zak-Szatkowska, Malgorzata & Bogdan, Malgorzata, 2011. "Modified versions of the Bayesian Information Criterion for sparse Generalized Linear Models," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2908-2924, November.
    10. Gaorong Li & Liugen Xue & Heng Lian, 2012. "SCAD-penalised generalised additive models with non-polynomial dimensionality," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(3), pages 681-697.
    11. Xiaotong Shen & Wei Pan & Yunzhang Zhu & Hui Zhou, 2013. "On constrained and regularized high-dimensional regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(5), pages 807-832, October.
    12. Shan Luo & Zehua Chen, 2014. "Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1229-1240, September.
    13. Lian, Heng & Du, Pang & Li, YuanZhang & Liang, Hua, 2014. "Partially linear structure identification in generalized additive models with NP-dimensionality," Computational Statistics & Data Analysis, Elsevier, vol. 80(C), pages 197-208.
    14. Bartosz Uniejewski, 2024. "Regularization for electricity price forecasting," Papers 2404.03968, arXiv.org.
    15. Tang, Yanlin & Song, Xinyuan & Wang, Huixia Judy & Zhu, Zhongyi, 2013. "Variable selection in high-dimensional quantile varying coefficient models," Journal of Multivariate Analysis, Elsevier, vol. 122(C), pages 115-132.
    16. Yunxiao Chen & Xiaoou Li & Jingchen Liu & Zhiliang Ying, 2017. "Regularized Latent Class Analysis with Application in Cognitive Diagnosis," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 660-692, September.
    17. Li, Xinyi & Wang, Li & Nettleton, Dan, 2019. "Sparse model identification and learning for ultra-high-dimensional additive partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 204-228.
    18. Zhaoliang Wang & Liugen Xue & Gaorong Li & Fei Lu, 2019. "Spline estimator for ultra-high dimensional partially linear varying coefficient models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(3), pages 657-677, June.
    19. Zhang, Ting & Wang, Lei, 2020. "Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    20. Chenchen Ma & Jing Ouyang & Gongjun Xu, 2023. "Learning Latent and Hierarchical Structures in Cognitive Diagnosis Models," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 175-207, March.
