
A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data

Author

Listed:
  • Suresh Dara

    (B.V. Raju Institute of Technology)

  • Haider Banka

    (Indian Institute of Technology (ISM))

  • Chandra Sekhara Rao Annavarapu

    (Indian Institute of Technology (ISM))

Abstract

Feature selection in high-dimensional data, particularly gene expression data, is one of the challenging tasks in bioinformatics because of the curse of dimensionality, data redundancy, and noisy values. In gene expression data, insignificant features cause poor classification; feature selection reduces the feature subset and thereby improves classification accuracy. Feature selection algorithms for gene expression data (such as filter-based, wrapper-based, and hybrid methods) can yield poor accuracy, whereas a few methods take too long to converge to an acceptable result. For example, NSGA-II needs, on average, over 10,000 generations to converge in the search space, which increases computational time. This paper proposes a rough-based hybrid binary PSO algorithm. It uses a heuristic-based fast processing strategy to reduce the crude domain features by statistically eliminating redundant features; the remaining features are then discretized into a binary table, known in rough set theory as the distinction table. The distinction table is then used as input to evaluate and optimize the objective functions, i.e., to generate reducts in rough set theory. The proposed hybrid binary PSO is then used to tune the objective functions and choose the most important features (i.e., the reduct). The fitness function is designed to reduce the cardinality of the feature subset while also improving classification performance. Results on three existing benchmark datasets from the literature (colon cancer, lymphoma, and leukemia) demonstrate the effectiveness of the proposed method.
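
The pipeline described in the abstract (distinction table, fitness function, binary PSO search) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the weighted-sum fitness with weight `alpha`, the PSO parameters (`w`, `c1`, `c2`, velocity clamp), and the toy data are all assumptions made for illustration, and the statistical pre-filtering and discretization steps are skipped because the toy features are already binary.

```python
import math
import random
from itertools import combinations

def distinction_table(rows, labels):
    """One binary row per pair of objects with different class labels;
    entry k is 1 when feature k takes different values in the pair."""
    table = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            table.append([int(a != b) for a, b in zip(rows[i], rows[j])])
    return table

def fitness(mask, table, alpha=0.9):
    """Assumed weighted sum: reward covering many object pairs with few features."""
    n = len(mask)
    covered = sum(any(m and d for m, d in zip(mask, row)) for row in table)
    coverage = covered / len(table)      # fraction of pairs distinguished
    sparsity = (n - sum(mask)) / n       # fraction of features dropped
    return alpha * coverage + (1 - alpha) * sparsity

def binary_pso(table, n_features, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard sigmoid-based binary PSO; each particle is a 0/1 feature mask."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, table) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for k in range(n_features):
                v = (w * vel[i][k]
                     + c1 * rng.random() * (pbest[i][k] - pos[i][k])
                     + c2 * rng.random() * (gbest[k] - pos[i][k]))
                v = max(-4.0, min(4.0, v))   # clamp to keep the sigmoid responsive
                vel[i][k] = v
                # The sigmoid of the velocity is the probability that the bit is 1.
                pos[i][k] = 1 if rng.random() < 1 / (1 + math.exp(-v)) else 0
            f = fitness(pos[i], table)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy data (hypothetical): 4 samples, 3 discretized features, 2 classes.
# Feature 0 alone separates the classes, so {feature 0} is a reduct.
rows = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]]
labels = [0, 0, 1, 1]
table = distinction_table(rows, labels)
best, best_fit = binary_pso(table, n_features=3)
print(table, best, best_fit)
```

In this sketch a mask that covers every row of the distinction table distinguishes all pairs of samples from different classes, and the sparsity term pushes the search toward the smallest such mask, which mirrors the abstract's goal of reducing feature cardinality without losing classification performance.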

Suggested Citation

  • Suresh Dara & Haider Banka & Chandra Sekhara Rao Annavarapu, 2017. "A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data," Annals of Data Science, Springer, vol. 4(3), pages 341-360, September.
  • Handle: RePEc:spr:aodasc:v:4:y:2017:i:3:d:10.1007_s40745-017-0106-3
    DOI: 10.1007/s40745-017-0106-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-017-0106-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Pawlak, Zdzislaw, 1997. "Rough set approach to knowledge-based decision support," European Journal of Operational Research, Elsevier, vol. 99(1), pages 48-57, May.
    2. Gastwirth, Joseph L, 1972. "The Estimation of the Lorenz Curve and Gini Index," The Review of Economics and Statistics, MIT Press, vol. 54(3), pages 306-316, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaofeng Lv & Gupeng Zhang & Xinkuo Xu & Qinghai Li, 2017. "Bootstrap-calibrated empirical likelihood confidence intervals for the difference between two Gini indexes," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 15(2), pages 195-216, June.
    2. Mak, Brenda & Munakata, Toshinori, 2002. "Rule extraction from expert heuristics: A comparative study of rough sets with neural networks and ID3," European Journal of Operational Research, Elsevier, vol. 136(1), pages 212-229, January.
    3. Renaud, J. & Thibault, J. & Lanouette, R. & Kiss, L.N. & Zaras, K. & Fonteix, C., 2007. "Comparison of two multicriteria decision aid methods: Net Flow and Rough Set Methods in a high yield pulping process," European Journal of Operational Research, Elsevier, vol. 177(3), pages 1418-1432, March.
    4. Igor Fedotenkov, 2020. "A Review of More than One Hundred Pareto-Tail Index Estimators," Statistica, Department of Statistics, University of Bologna, vol. 80(3), pages 245-299.
    5. Clarke, Philip & Van Ourti, Tom, 2010. "Calculating the concentration index when income is grouped," Journal of Health Economics, Elsevier, vol. 29(1), pages 151-157, January.
    6. Chotikapanich, Duangkamon & Griffiths, William E, 2002. "Estimating Lorenz Curves Using a Dirichlet Distribution," Journal of Business & Economic Statistics, American Statistical Association, vol. 20(2), pages 290-295, April.
    7. Korb, Penelope & Blank, Steven C. & Erickson, Kenneth W., 2004. "Profit patterns in the U.S. and the West, 1992 and 1997: What county-level data reveal," 2004 Annual Meeting, June 30-July 2, 2004, Honolulu, Hawaii 291740, Western Agricultural Economics Association.
    8. Xiaofeng Lv & Gupeng Zhang & Guangyu Ren, 2017. "Gini index estimation for lifetime data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(2), pages 275-304, April.
    9. Michael C. Lovell, 1999. "Inequality Within and Among Nations," Journal of Income Distribution, Ad libros publications inc., vol. 8(1), pages 1-1, June.
    10. Modalsli, Jørgen, 2011. "Inequality and growth in the very long run: inferring inequality from data on social groups," Memorandum 11/2011, Oslo University, Department of Economics.
    11. Fernandez del Pozo, J. A. & Bielza, C. & Gomez, M., 2005. "A list-based compact representation for large decision tables management," European Journal of Operational Research, Elsevier, vol. 160(3), pages 638-662, February.
    12. Pavlína Hejduková & Lucie Kureková, 2017. "Income Inequality And Selected Methods Of Its Measurement With The Use Of Practical Data For International Comparison," International Journal of Economic Sciences, International Institute of Social and Economic Sciences, vol. 6(2), pages 68-81, November.
    13. Stéphane Mussard, 2006. "La décomposition des mesures d’inégalité en sources de revenu : l’indice de Gini et les généralisations," Cahiers de recherche 06-05, Departement d'économique de l'École de gestion à l'Université de Sherbrooke.
    14. Pinkovskiy, Maxim L., 2013. "World welfare is rising: Estimation using nonparametric bounds on welfare measures," Journal of Public Economics, Elsevier, vol. 97(C), pages 176-195.
    15. Gift Dumedah & Nadine Schuurman, 2008. "Minimizing the effects of inaccurate sediment description in borehole data using rough sets and transition probability," Journal of Geographical Systems, Springer, vol. 10(3), pages 291-315, September.
    16. Suryakant Yadav, 2021. "Progress of Inequality in Age at Death in India: Role of Adult Mortality," European Journal of Population, Springer;European Association for Population Studies, vol. 37(3), pages 523-550, July.
    17. Nijkamp, Peter & Poot, Jacques, 2015. "Cultural Diversity: A Matter of Measurement," IZA Discussion Papers 8782, Institute of Labor Economics (IZA).
    18. Trincado, E. & Vindel, J.M., 2024. "A viability index for comparing the binominal return-risk of solar radiation," Renewable Energy, Elsevier, vol. 220(C).
    19. Tom Van Ourti & Philip Clarke, 2008. "The Bias of the Gini Coefficient due to Grouping," Tinbergen Institute Discussion Papers 08-095/3, Tinbergen Institute.
    20. Vladimir Hlasny, 2021. "Parametric representation of the top of income distributions: Options, historical evidence, and model selection," Journal of Economic Surveys, Wiley Blackwell, vol. 35(4), pages 1217-1256, September.
