
A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data

Author

Listed:
  • Suresh Dara

    (B.V. Raju Institute of Technology)

  • Haider Banka

    (Indian Institute of Technology (ISM))

  • Chandra Sekhara Rao Annavarapu

    (Indian Institute of Technology (ISM))

Abstract

Feature selection in high-dimensional data, particularly gene expression data, is one of the challenging tasks in bioinformatics owing to the curse of dimensionality, data redundancy, and noisy values. In gene expression data, insignificant features cause poor classification; feature selection therefore reduces the feature subset and improves classification accuracy. Existing feature selection algorithms for gene expression data (such as filter-based, wrapper-based, and hybrid methods) achieve poor accuracy, whereas a few methods take too long to converge to acceptable results. For example, NSGA-II needs, on average, over 10,000 generations to converge in the search space, which increases computational time. The proposed rough-set-based hybrid binary PSO algorithm uses a heuristic-based fast processing strategy to reduce the crude domain features by statistical elimination of redundant features; the remaining features are then discretized into a binary table, known in rough set theory as the distinction table. This distinction table is later used as input to evaluate and optimize the objective functions, i.e., to generate reducts in rough set theory. The proposed hybrid binary PSO is then used to tune the objective functions and to choose the most important features (i.e., the reduct). The fitness function is designed so that it reduces the cardinality of the feature subset while also improving classification performance. Results on three existing benchmark datasets from the literature (colon cancer, lymphoma, and leukemia) demonstrate the effectiveness of the proposed method.
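
The abstract does not state the exact formulas, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' implementation. It builds a distinction table from an already discretized binary data matrix, scores a feature subset with a weighted sum of coverage and sparsity, and searches with a standard sigmoid-transfer binary PSO. The function names (`distinction_table`, `fitness`, `binary_pso`), the weight `alpha`, the PSO parameters `w`, `c1`, `c2`, and the toy data are all hypothetical, and the statistical pre-filtering step is omitted.

```python
import numpy as np

def distinction_table(X, y):
    """One row per pair of objects from different classes, one column per
    feature; an entry is 1 when the feature separates the pair."""
    rows = [X[i] != X[j]
            for i in range(len(y)) for j in range(i + 1, len(y))
            if y[i] != y[j]]
    return np.array(rows, dtype=int)

def fitness(mask, table, alpha=0.9):
    """Assumed fitness: alpha * (share of pairs discerned by the subset)
    + (1 - alpha) * (share of features left out)."""
    if not mask.any():
        return 0.0  # an empty subset discerns nothing
    covered = table[:, mask.astype(bool)].any(axis=1).mean()
    sparsity = 1.0 - mask.sum() / mask.size
    return alpha * covered + (1.0 - alpha) * sparsity

def binary_pso(table, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain binary PSO with a sigmoid transfer function (illustrative
    parameter values, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    n_feat = table.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, n_feat))
    vel = rng.uniform(-1.0, 1.0, size=(n_particles, n_feat))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, table) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)  # keep the sigmoid away from saturation
        # Each bit is set to 1 with probability sigmoid(velocity).
        pos = (rng.random(pos.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p, table) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest, float(pbest_fit.max())

# Toy discretized data: 4 samples, 3 binary features, 2 classes (made up).
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 0],
              [1, 1, 0]])
y = np.array([0, 0, 1, 1])

table = distinction_table(X, y)
best, best_fit = binary_pso(table)
print("selected feature mask:", best, "fitness:", best_fit)
```

In this sketch a subset that discerns every pair of differently labelled samples with fewer features scores higher, which mirrors the stated goal of reducing cardinality without losing discriminating power. A classifier-based accuracy term could replace or complement the coverage term.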

Suggested Citation

  • Suresh Dara & Haider Banka & Chandra Sekhara Rao Annavarapu, 2017. "A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data," Annals of Data Science, Springer, vol. 4(3), pages 341-360, September.
  • Handle: RePEc:spr:aodasc:v:4:y:2017:i:3:d:10.1007_s40745-017-0106-3
    DOI: 10.1007/s40745-017-0106-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-017-0106-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s40745-017-0106-3?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Igor Fedotenkov, 2020. "A Review of More than One Hundred Pareto-Tail Index Estimators," Statistica, Department of Statistics, University of Bologna, vol. 80(3), pages 245-299.
    2. Modalsli, Jørgen, 2011. "Inequality and growth in the very long run: inferring inequality from data on social groups," Memorandum 11/2011, Oslo University, Department of Economics.
    3. Nijkamp, Peter & Poot, Jacques, 2015. "Cultural Diversity: A Matter of Measurement," IZA Discussion Papers 8782, Institute of Labor Economics (IZA).
    4. Stephen Davies & Peter L. Ormosi, 2014. "The economic impact of cartels and anti-cartel enforcement," Working Paper series, University of East Anglia, Centre for Competition Policy (CCP) 2013-07v2, Centre for Competition Policy, University of East Anglia, Norwich, UK.
    5. Zhu, Yongjun & Yan, Erjia, 2017. "Examining academic ranking and inequality in library and information science through faculty hiring networks," Journal of Informetrics, Elsevier, vol. 11(2), pages 641-654.
    6. Csörgö, Miklós & Zitikis, Ricardas, 1997. "On the rate of strong consistency of Lorenz curves," Statistics & Probability Letters, Elsevier, vol. 34(2), pages 113-121, June.
    7. Maurizio d’Amato, 2007. "Comparing Rough Set Theory with Multiple Regression Analysis as Automated Valuation Methodologies," International Real Estate Review, Global Social Science Institute, vol. 10(2), pages 42-65.
    8. Johan Fellman, 2021. "Empirical Analyses of Income: Finland (2009) and Australia (1967-1968)," Journal of Statistical and Econometric Methods, SCIENPRESS Ltd, vol. 10(1), pages 1-3.
    9. Salvatore Barbagallo & Simona Consoli & Nello Pappalardo & Salvatore Greco & Santo Zimbone, 2006. "Discovering Reservoir Operating Rules by a Rough Set Approach," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 20(1), pages 19-36, February.
    10. WANG, Zuxiang & SMYTH, Russell & NG, Yew-Kwang, 2009. "A new ordered family of Lorenz curves with an application to measuring income inequality and poverty in rural China," China Economic Review, Elsevier, vol. 20(2), pages 218-235, June.
    11. Ugo Panizza, 1999. "Desigualdad del ingreso y crecimiento económico: elementos de juicio de datos de USA," Research Department Publications 4179, Inter-American Development Bank, Research Department.
    12. Alessandro Scuderi & Luisa Sturiale & Giuseppe Timpanaro & Agata Matarazzo & Silvia Zingale & Paolo Guarnaccia, 2022. "A Model to Support Sustainable Resource Management in the “Etna River Valleys” Biosphere Reserve: The Dominance-Based Rough Set Approach," Sustainability, MDPI, vol. 14(9), pages 1-19, April.
    13. James B. Mcdonald & Jeff Sorensen & Patrick A. Turley, 2013. "Skewness And Kurtosis Properties Of Income Distribution Models," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 59(2), pages 360-374, June.
    14. Anthony B Atkinson, 2015. "Top incomes in East Africa before and after independence," Working Papers halshs-02654566, HAL.
    15. Si-Hui Dong & Hui-Cheng Zhou & Hai-Jun Xu, 2004. "A Forecast Model of Hydrologic Single Element Medium and Long-Period Based on Rough Set Theory," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 18(5), pages 483-495, October.
    16. Huimin Xu & Shougeng Hu & Xi Li, 2023. "Urban Distribution and Evolution of the Yangtze River Economic Belt from the Perspectives of Urban Area and Night-Time Light," Land, MDPI, vol. 12(2), pages 1-21, January.
    17. Jianguo Di & Wenge Liu & Jiaqi Sun & Dianfeng Zhang, 2025. "Market Potential Evaluation of Photovoltaic Technologies in the Context of Future Architectural Trends," Sustainability, MDPI, vol. 17(3), pages 1-27, January.
    18. Yves Tillé, 2016. "The legacy of Corrado Gini in survey sampling and inequality theory," METRON, Springer;Sapienza Università di Roma, vol. 74(2), pages 167-176, August.
    19. Fontanari, Andrea & Cirillo, Pasquale & Oosterlee, Cornelis W., 2018. "From Concentration Profiles to Concentration Maps. New tools for the study of loss distributions," Insurance: Mathematics and Economics, Elsevier, vol. 78(C), pages 13-29.
    20. Daranrat Jaitiang & Wen-Chi Huang & Shang-Ho Yang, 2021. "Does Income Inequality Exist among Urban Farmers? A Demonstration of Lorenz Curves from Northern Thailand," Sustainability, MDPI, vol. 13(9), pages 1-16, May.
