Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i12p1423-d577564.html

Fuzzy Clustering Methods with Rényi Relative Entropy and Cluster Size

Author

Listed:
  • Javier Bonilla

    (Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain;
    Comisión Nacional del Mercado de Valores, 28006 Madrid, Spain)

  • Daniel Vélez

    (Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain)

  • Javier Montero

    (Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain)

  • J. Tinguaro Rodríguez

    (Department of Statistics and Operations Research, Universidad Complutense de Madrid, 28040 Madrid, Spain)

Abstract

Over the last two decades, information entropy measures have been applied in fuzzy clustering to regularize solutions by avoiding partitions with excessively overlapping clusters. Following this idea, relative entropy (divergence) measures have also been applied, particularly so that this kind of entropy-based regularization can take into account, and interact with, cluster size variables. Since Rényi divergence generalizes several other divergence measures, its application in fuzzy clustering seems promising for devising more general and potentially more effective methods. However, previous works using Rényi entropy or Rényi divergence in fuzzy clustering have either not considered cluster sizes (thus applying regularization in terms of entropy, not divergence) or have employed divergence without a regularization purpose. The main contribution of this work is therefore a new regularization term, based on the Rényi relative entropy between membership degrees and observation ratios per cluster, that penalizes overlapping solutions in fuzzy clustering analysis. Specifically, this Rényi divergence-based term is added to the variance-based Fuzzy C-means objective function when cluster sizes are allowed. This leads to two new fuzzy clustering methods with Rényi divergence-based regularization, the second extending the first by using a Gaussian kernel metric instead of the Euclidean distance. Iterative expressions for these methods are derived through the explicit application of Lagrange multipliers. An interesting feature of these expressions is that the proposed methods seem to use a greater amount of information in the updating steps for membership degrees and observation ratios per cluster. Finally, an extensive computational study shows the feasibility and comparatively good performance of the proposed methods.
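
The abstract names two ingredients without stating formulas: a Rényi divergence between membership degrees and observation ratios per cluster, and a Gaussian kernel metric in place of the Euclidean distance. The sketch below only illustrates the textbook definitions of both quantities; it is not the paper's regularization term or update rule, which are not given on this page. The function names, the example vectors, and the kernel parameterization exp(-||x - v||^2 / sigma^2) are assumptions made for illustration.

```python
import math


def renyi_divergence(p, q, alpha):
    """Textbook Rényi divergence D_alpha(p || q) for discrete distributions.

    Assumes p and q are equal-length sequences of strictly positive
    probabilities that each sum to 1, and alpha > 0. The case alpha == 1
    is handled as its limit, the Kullback-Leibler divergence.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if math.isclose(alpha, 1.0):
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)


def gaussian_kernel_sq_distance(x, v, sigma):
    """Squared kernel-induced distance 2 * (1 - K(x, v)).

    K(x, v) = exp(-||x - v||^2 / sigma^2) is one common Gaussian kernel
    parameterization (an assumption here, not taken from the paper).
    """
    sq_euclid = sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    return 2.0 * (1.0 - math.exp(-sq_euclid / sigma ** 2))


# Hypothetical data: membership degrees of one observation over three
# clusters, compared with the observation ratios of those clusters.
memberships = [0.7, 0.2, 0.1]
cluster_ratios = [0.5, 0.3, 0.2]
for alpha in (0.5, 1.0, 2.0):
    print(alpha, renyi_divergence(memberships, cluster_ratios, alpha))

# Kernel-induced distance between a point and a hypothetical centroid.
print(gaussian_kernel_sq_distance([0.0, 0.0], [1.0, 1.0], sigma=1.0))
```

The divergence is zero when the membership vector equals the cluster ratios and grows as they differ, while the kernel-induced squared distance is bounded above by 2, which makes it less sensitive to distant observations than the squared Euclidean distance.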

Suggested Citation

  • Javier Bonilla & Daniel Vélez & Javier Montero & J. Tinguaro Rodríguez, 2021. "Fuzzy Clustering Methods with Rényi Relative Entropy and Cluster Size," Mathematics, MDPI, vol. 9(12), pages 1-27, June.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:12:p:1423-:d:577564

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/12/1423/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/12/1423/
    Download Restriction: no

    References listed on IDEAS

    1. Stephen Johnson, 1967. "Hierarchical clustering schemes," Psychometrika, Springer;The Psychometric Society, vol. 32(3), pages 241-254, September.
    2. Amo, A. & Montero, J. & Biging, G. & Cutello, V., 2004. "Fuzzy classification systems," European Journal of Operational Research, Elsevier, vol. 156(2), pages 495-507, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Katarzyna Hampel & Paulina Ucieklak-Jez & Agnieszka Bem, 2021. "Health System Responsiveness in the Light of the Euro Health Consumer Index," European Research Studies Journal, European Research Studies Journal, vol. 0(4B), pages 659-667.
    2. Kim, Junyung & Shah, Asad Ullah Amin & Kang, Hyun Gook, 2020. "Dynamic risk assessment with bayesian network and clustering analysis," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    3. David G Mets & Michael S Brainard, 2018. "An automated approach to the quantitation of vocalizations and vocal learning in the songbird," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-29, August.
    4. Noah E. Friedkin, 1984. "Structural Cohesion and Equivalence Explanations of Social Homogeneity," Sociological Methods & Research, vol. 12(3), pages 235-261, February.
    5. David Matesanz Gomez & Guillermo J. Ortega & Benno Torgler, 2011. "Measuring globalization: A hierarchical network approach," CREMA Working Paper Series 2011-11, Center for Research in Economics, Management and the Arts (CREMA).
    6. Balepur, Prashant Narayan, 1998. "Impacts of Computer-Mediated Communication on Travel and Communication Patterns: The Davis Community Network Study," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt6cb1f85c, Institute of Transportation Studies, UC Berkeley.
    7. Lisa Price, 2001. "Demystifying farmers' entomological and pest management knowledge: A methodology for assessing the impacts on knowledge from IPM-FFS and NES interventions," Agriculture and Human Values, Springer;The Agriculture, Food, & Human Values Society (AFHVS), vol. 18(2), pages 153-176, June.
    8. Elisa Frutos-Bernal & Ángel Martín del Rey & Irene Mariñas-Collado & María Teresa Santos-Martín, 2022. "An Analysis of Travel Patterns in Barcelona Metro Using Tucker3 Decomposition," Mathematics, MDPI, vol. 10(7), pages 1-17, March.
    9. Geert Soete & Wayne DeSarbo & J. Carroll, 1985. "Optimal variable weighting for hierarchical clustering: An alternating least-squares algorithm," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 173-192, December.
    10. Teh, Boon Kin & Goo, Yik Wen & Lian, Tong Wei & Ong, Wei Guang & Choi, Wen Ting & Damodaran, Mridula & Cheong, Siew Ann, 2015. "The Chinese Correction of February 2007: How financial hierarchies change in a market crash," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 424(C), pages 225-241.
    11. Yoshio Takane & Forrest Young & Jan Leeuw, 1977. "Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features," Psychometrika, Springer;The Psychometric Society, vol. 42(1), pages 7-67, March.
    12. Wentao Qu & Xianchao Xiu & Huangyue Chen & Lingchen Kong, 2023. "A Survey on High-Dimensional Subspace Clustering," Mathematics, MDPI, vol. 11(2), pages 1-39, January.
    13. Taggart, J. H., 1999. "MNC subsidiary performance, risk, and corporate expectations," International Business Review, Elsevier, vol. 8(2), pages 233-255, April.
    14. Sorin Alexandru Ungureanu & Diana Andreea Mandricel & Bogdan Ioan Coculescu & Ionica Oncioiu, 2020. "Prevention in Dental Medicine. Case Studies and Explanations Regarding the Cost-Benefit Ratio," Academic Journal of Economic Studies, Faculty of Finance, Banking and Accountancy Bucharest, "Dimitrie Cantemir" Christian University Bucharest, vol. 6(2), pages 135-147, June.
    15. Fang, Yixin & Wang, Junhui, 2011. "Penalized cluster analysis with applications to family data," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2128-2136, June.
    16. Xingyin Duan & Xiaobo Wu & Jie Ge & Li Deng & Liang Shen & Jingwen Xu & Xiaoying Xu & Qin He & Yixin Chen & Xuesong Gao & Bing Li, 2024. "A Novel Hierarchical Clustering Sequential Forward Feature Selection Method for Paddy Rice Agriculture Mapping Based on Time-Series Images," Agriculture, MDPI, vol. 14(9), pages 1-20, August.
    17. Simon Blanchard & Wayne DeSarbo, 2013. "A New Zero-Inflated Negative Binomial Methodology for Latent Category Identification," Psychometrika, Springer;The Psychometric Society, vol. 78(2), pages 322-340, April.
    18. Satoru Yokoyama & Atsuho Nakayama & Akinori Okada, 2009. "One-mode three-way overlapping cluster analysis," Computational Statistics, Springer, vol. 24(1), pages 165-179, February.
    19. Vincent S. Tseng & Hsieh-Hui Yu & Shih-Chiang Yang, 2009. "Efficient mining of multilevel gene association rules from microarray and gene ontology," Information Systems Frontiers, Springer, vol. 11(4), pages 433-447, September.
    20. Thomas J. Lampoltshammer & Valerie Albrecht & Corinna Raith, 2021. "Teaching Digital Sustainability in Higher Education from a Transdisciplinary Perspective," Sustainability, MDPI, vol. 13(21), pages 1-21, October.
