IDEAS home Printed from https://ideas.repec.org/a/spr/annopr/v339y2024i3d10.1007_s10479-022-04642-2.html
   My bibliography  Save this article

Orthogonal nonnegative matrix factorization problems for clustering: A new formulation and a competitive algorithm

Author

Listed:
  • Ja’far Dehghanpour

    (Sharif University of Technology)

  • Nezam Mahdavi-Amiri

    (Sharif University of Technology)

Abstract

Orthogonal Nonnegative Matrix Factorization (ONMF) with orthogonality constraints on a matrix has been found to provide better clustering results over existing clustering problems. Because of the orthogonality constraint, this optimization problem is difficult to solve. Many of the existing constraint-preserving methods deal directly with the constraints using different techniques such as matrix decomposition or computing exponential matrices. Here, we propose an alternative formulation of the ONMF problem which converts the orthogonality constraints into non-convex constraints. To handle the non-convex constraints, a penalty function is applied. The penalized problem is a smooth nonlinear programming problem with quadratic (convex) constraints that can be solved by a proper optimization method. We first make use of an optimization method with two gradient projection steps and then apply a post-processing technique to construct a partition of the clustering problem. Comparative performance analysis of our proposed approach with other available clustering methods on randomly generated test problems and hard synthetic data-sets shows the outperformance of our approach, in terms of the obtained misclassification error rate and the Rand index.

Suggested Citation

  • Ja’far Dehghanpour & Nezam Mahdavi-Amiri, 2024. "Orthogonal nonnegative matrix factorization problems for clustering: A new formulation and a competitive algorithm," Annals of Operations Research, Springer, vol. 339(3), pages 1481-1497, August.
  • Handle: RePEc:spr:annopr:v:339:y:2024:i:3:d:10.1007_s10479-022-04642-2
    DOI: 10.1007/s10479-022-04642-2
    as

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-022-04642-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10479-022-04642-2?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Rajiv D. Banker & Hsihui Chang & Zhiqiang Zheng, 2017. "On the use of super-efficiency procedures for ranking efficient units and identifying outliers," Annals of Operations Research, Springer, vol. 250(1), pages 21-35, March.
    2. Sebastián Moreno & Jordi Pereira & Wilfredo Yushimito, 2020. "A hybrid K-means and integer programming method for commercial territory design: a case study in meat distribution," Annals of Operations Research, Springer, vol. 286(1), pages 87-117, March.
    3. Zengchang Qin & Tao Wan & Hanqing Zhao, 2017. "Hybrid clustering of data and vague concepts based on labels semantics," Annals of Operations Research, Springer, vol. 256(2), pages 393-416, September.
    4. Ron Shefi & Marc Teboulle, 2016. "On the rate of convergence of the proximal alternating linearized minimization algorithm for convex problems," EURO Journal on Computational Optimization, Springer;EURO - The Association of European Operational Research Societies, vol. 4(1), pages 27-46, February.
    5. Lian Duan & Lida Xu & Ying Liu & Jun Lee, 2009. "Cluster-based outlier detection," Annals of Operations Research, Springer, vol. 168(1), pages 151-168, April.
    6. Ali Tosyali & Jinho Kim & Jeongsub Choi & Yunyi Kang & Myong K. Jeong, 2020. "New node anomaly detection algorithm based on nonnegative matrix factorization for directed citation networks," Annals of Operations Research, Springer, vol. 288(1), pages 457-474, May.
    7. Derya Dinler & Mustafa Kemal Tural & Nur Evin Ozdemirel, 2020. "Centroid based Tree-Structured Data Clustering Using Vertex/Edge Overlap and Graph Edit Distance," Annals of Operations Research, Springer, vol. 289(1), pages 85-122, June.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ali Tosyali & Jinho Kim & Jeongsub Choi & Yunyi Kang & Myong K. Jeong, 2020. "New node anomaly detection algorithm based on nonnegative matrix factorization for directed citation networks," Annals of Operations Research, Springer, vol. 288(1), pages 457-474, May.
    2. Behnam Tavakkol & Myong K. Jeong & Susan L. Albin, 2021. "Validity indices for clusters of uncertain data objects," Annals of Operations Research, Springer, vol. 303(1), pages 321-357, August.
    3. Jaeseung Baek & Myong K. Jeong & Elsayed A. Elsayed, 2024. "Spatial randomness-based anomaly detection approach for monitoring local variations in multimode surface topography," Annals of Operations Research, Springer, vol. 341(1), pages 173-195, October.
    4. Pontus Mattsson & Jonas Månsson & Christian Andersson & Fredrik Bonander, 2018. "A bootstrapped Malmquist index applied to Swedish district courts," European Journal of Law and Economics, Springer, vol. 46(1), pages 109-139, August.
    5. Yi Yu & Jaeseung Baek & Ali Tosyali & Myong K. Jeong, 2024. "Robust asymmetric non-negative matrix factorization for clustering nodes in directed networks," Annals of Operations Research, Springer, vol. 341(1), pages 245-265, October.
    6. Zervopoulos, Panagiotis & Emrouznejad, Ali & Sklavos, Sokratis, 2019. "A Bayesian approach for correcting bias of data envelopment analysis estimators," MPRA Paper 91886, University Library of Munich, Germany.
    7. Philippe Bernard & Najat El Mekkaoui De Freitas & Bertrand B. Maillet, 2022. "A financial fraud detection indicator for investors: an IDeA," Annals of Operations Research, Springer, vol. 313(2), pages 809-832, June.
    8. Marek Śmieja & Magdalena Wiercioch, 2017. "Constrained clustering with a complex cluster structure," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 493-518, September.
    9. Aielli, Gian Piero & Caporin, Massimiliano, 2013. "Fast clustering of GARCH processes via Gaussian mixture models," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 94(C), pages 205-222.
    10. Arnab Adhikari & Adrija Majumdar & Gaurav Gupta & Arnab Bisi, 2020. "An innovative super-efficiency data envelopment analysis, semi-variance, and Shannon-entropy-based methodology for player selection: evidence from cricket," Annals of Operations Research, Springer, vol. 284(1), pages 1-32, January.
    11. Hae-Sang Park & Jeonghwa Lee & Chi-Hyuck Jun, 2014. "Clustering noise-included data by controlling decision errors," Annals of Operations Research, Springer, vol. 216(1), pages 129-144, May.
    12. Rakin Abrar & Showmitra Kumar Sarkar & Kashfia Tasnim Nishtha & Swapan Talukdar & Shahfahad & Atiqur Rahman & Abu Reza Md Towfiqul Islam & Amir Mosavi, 2022. "Assessing the Spatial Mapping of Heat Vulnerability under Urban Heat Island (UHI) Effect in the Dhaka Metropolitan Area," Sustainability, MDPI, vol. 14(9), pages 1-24, April.
    13. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "A Block Inertial Bregman Proximal Algorithm for Nonsmooth Nonconvex Problems with Application to Symmetric Nonnegative Matrix Tri-Factorization," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 234-258, July.
    14. Khezrimotlagh, Dariush & Cook, Wade D. & Zhu, Joe, 2020. "A nonparametric framework to detect outliers in estimating production frontiers," European Journal of Operational Research, Elsevier, vol. 286(1), pages 375-388.
    15. Chen, Xin & Zhou, Wenjia, 2023. "Performance evaluation of aquavoltaics in China: Retrospect and prospect," Renewable and Sustainable Energy Reviews, Elsevier, vol. 173(C).
    16. Zhou, Lin & Zhen, Lu & Baldacci, Roberto & Boschetti, Marco & Dai, Ying & Lim, Andrew, 2021. "A Heuristic Algorithm for solving a large-scale real-world territory design problem," Omega, Elsevier, vol. 103(C).
    17. Li, Jiaxin & Peng, Jiachao & Shuai, Chuanmin & Wang, Zihan & Huang, Fubin & Khayyam, Muhammad, 2022. "Does the solar PV program enhance the social empowerment of China's rural poor?," Energy, Elsevier, vol. 253(C).
    18. Roy Cerqueti & Mario Maggi & Jessica Riccioni, 2024. "Statistical methods for decision support systems in finance: how Benford’s law predicts financial risk," Annals of Operations Research, Springer, vol. 342(3), pages 1445-1469, November.
    19. Farnè, Matteo & Vouldis, Angelos T., 2018. "A methodology for automised outlier detection in high-dimensional datasets: an application to euro area banks' supervisory data," Working Paper Series 2171, European Central Bank.
    20. Vladimír Holý & Karel Šafr, 2018. "Are economically advanced countries more efficient in basic and applied research?," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 26(4), pages 933-950, December.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:annopr:v:339:y:2024:i:3:d:10.1007_s10479-022-04642-2. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.