
Improved Initialization of the EM Algorithm for Mixture Model Parameter Estimation

Authors

  • Branislav Panić

    (Faculty of Mechanical Engineering, University of Ljubljana, Aškerčeva ulica 6, 1000 Ljubljana, Slovenia)

  • Jernej Klemenc

    (Faculty of Mechanical Engineering, University of Ljubljana, Aškerčeva ulica 6, 1000 Ljubljana, Slovenia)

  • Marko Nagode

    (Faculty of Mechanical Engineering, University of Ljubljana, Aškerčeva ulica 6, 1000 Ljubljana, Slovenia)

Abstract

A commonly used tool for estimating the parameters of a mixture model is the Expectation–Maximization (EM) algorithm, an iterative procedure that can serve as a maximum-likelihood estimator. The EM algorithm has well-documented drawbacks, such as the need for good initial values and the possibility of being trapped in local optima. Nevertheless, because of its appealing properties, EM plays an important role in estimating the parameters of mixture models. To overcome these initialization problems, in this paper we propose the Rough-Enhanced-Bayes mixture estimation (REBMIX) algorithm as a more effective initialization algorithm. Three different strategies are derived for dealing with the unknown number of components in the mixture model. These strategies are thoroughly tested on artificial datasets, density-estimation datasets and image-segmentation problems and compared with state-of-the-art initialization methods for the EM algorithm. Our proposal shows promising results in terms of clustering and density-estimation performance as well as computational efficiency. All the improvements are implemented in the rebmix R package.
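
The sensitivity to initial values mentioned in the abstract can be illustrated with a minimal sketch. This is a plain EM loop for a one-dimensional Gaussian mixture, not the REBMIX algorithm and not the rebmix package; the function names, the synthetic data and the two starting points are assumptions made only for illustration. One start is reasonable, the other is deliberately degenerate (both component means identical), which leaves EM at a stationary point with a lower log-likelihood.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_gaussian_mixture(data, init_means, n_iter=200, min_var=1e-6):
    """Plain EM for a 1-D Gaussian mixture (illustrative, hypothetical helper).

    Starts from equal weights, the given means and the overall sample variance.
    """
    k, n = len(init_means), len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [overall_var] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / n
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / nc
            variances[c] = max(var, min_var)  # guard against collapse
    return weights, means, variances, log_likelihood(data, weights, means, variances)


# Synthetic data (an assumption): two well-separated normal clusters.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])

# A reasonable start versus a degenerate start with identical means.
_, good_means, _, good_ll = em_gaussian_mixture(data, [-1.0, 6.0])
_, bad_means, _, bad_ll = em_gaussian_mixture(data, [2.5, 2.5])

print("good start:", sorted(good_means), good_ll)
print("bad start: ", sorted(bad_means), bad_ll)
```

With identical starting means, both components receive the same posterior probabilities at every iteration, so they never separate. An initialization procedure is meant to supply starting values that avoid this kind of outcome.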

Suggested Citation

  • Branislav Panić & Jernej Klemenc & Marko Nagode, 2020. "Improved Initialization of the EM Algorithm for Mixture Model Parameter Estimation," Mathematics, MDPI, vol. 8(3), pages 1-29, March.
  • Handle: RePEc:gam:jmathe:v:8:y:2020:i:3:p:373-:d:329636

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/8/3/373/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/8/3/373/
    Download Restriction: no

    References listed on IDEAS

    1. McLachlan, Geoffrey J. & Krishnan, Thriyambakam & Ng, See Ket, 2004. "The EM Algorithm," Papers 2004,24, Humboldt University of Berlin, Center for Applied Statistics and Economics (CASE).
    2. Franko, Mitja & Nagode, Marko, 2015. "Probability density function of the equivalent stress amplitude using statistical transformation," Reliability Engineering and System Safety, Elsevier, vol. 134(C), pages 118-125.

    Citations

    Cited by:

    1. Senthil Kumar Jagatheesaperumal & Varun Prakash Rajamohan & Abdul Khader Jilani Saudagar & Abdullah AlTameem & Muhammad Sajjad & Khan Muhammad, 2023. "MoMo: Mouse-Based Motion Planning for Optimized Grasping to Declutter Objects Using a Mobile Robotic Manipulator," Mathematics, MDPI, vol. 11(20), pages 1-25, October.
    2. Yingkui Jiao & Zhiwei Li & Junchao Zhu & Bin Xue & Baofeng Zhang, 2022. "ABIDE: A Novel Scheme for Ultrasonic Echo Estimation by Combining CEEMD-SSWT Method with EM Algorithm," Sustainability, MDPI, vol. 14(4), pages 1-21, February.
    3. Nagode, Marko & Oman, Simon & Klemenc, Jernej & Panić, Branislav, 2023. "Gumbel mixture modelling for multiple failure data," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    4. Branislav Panić & Marko Nagode & Jernej Klemenc & Simon Oman, 2022. "On Methods for Merging Mixture Model Components Suitable for Unsupervised Image Segmentation Tasks," Mathematics, MDPI, vol. 10(22), pages 1-22, November.
    5. Ben Wu & Subhadip Pal & Jian Kang & Ying Guo, 2022. "Rejoinder to discussions of “distributional independent component analysis for diverse neuroimaging modalities”," Biometrics, The International Biometric Society, vol. 78(3), pages 1122-1126, September.
    6. Yinan Li & Kai-Tai Fang & Ping He & Heng Peng, 2022. "Representative Points from a Mixture of Two Normal Distributions," Mathematics, MDPI, vol. 10(21), pages 1-28, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Żyromski, Andrzej & Szulczewski, Wiesław & Biniak-Pieróg, Małgorzata & Jakubowski, Wojciech, 2016. "The estimation of basket willow (Salix viminalis) yield – New approach. Part I: Background and statistical description," Renewable and Sustainable Energy Reviews, Elsevier, vol. 65(C), pages 1118-1126.
    2. Ke-Hai Yuan & Kentaro Hayashi, 2005. "On muthén’s maximum likelihood for two-level covariance structure models," Psychometrika, Springer;The Psychometric Society, vol. 70(1), pages 147-167, March.
    3. Ringle, Christian M., 2006. "Segmentation for path models and unobserved heterogeneity: The finite mixture partial least squares approach," MPRA Paper 10734, University Library of Munich, Germany.
    4. Saeedeh Eskandari & Mahdis Amiri & Nitheshnirmal Sãdhasivam & Hamid Reza Pourghasemi, 2020. "Comparison of new individual and hybrid machine learning algorithms for modeling and mapping fire hazard: a supplementary analysis of fire hazard in different counties of Golestan Province in Iran," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 104(1), pages 305-327, October.
    5. Qunqiang Feng & Hosam Mahmoud & Alois Panholzer, 2008. "Limit laws for the Randić index of random binary tree models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 60(2), pages 319-343, June.
    6. Jakubowski, Wojciech & Szulczewski, Wiesław & Żyromski, Andrzej & Biniak-Pieróg, Małgorzata, 2016. "The estimation of basket willow (Salix viminalis) yield – New approach, Part II: Theoretical model and its practical application," Renewable and Sustainable Energy Reviews, Elsevier, vol. 66(C), pages 843-851.
    7. James C. Spall, 2012. "Cyclic Seesaw Process for Optimization and Identification," Journal of Optimization Theory and Applications, Springer, vol. 154(1), pages 187-208, July.
    8. Nagode, Marko & Oman, Simon & Klemenc, Jernej & Panić, Branislav, 2023. "Gumbel mixture modelling for multiple failure data," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    9. Ke-Hai Yuan & Peter Bentler, 2004. "On the asymptotic distributions of two statistics for two-level covariance structure models within the class of elliptical distributions," Psychometrika, Springer;The Psychometric Society, vol. 69(3), pages 437-457, September.
    10. Christophe Genolini & Bruno Falissard, 2010. "KmL: k-means for longitudinal data," Computational Statistics, Springer, vol. 25(2), pages 317-328, June.
    11. Guillaume Horny, 2009. "Inference in mixed proportional hazard models with K random effects," Statistical Papers, Springer, vol. 50(3), pages 481-499, June.
    12. Orozco-Garcia, Carolina & Schmeiser, Hato, 2015. "How sensitive is the pricing of lookback and interest rate guarantees when changing the modelling assumptions?," Insurance: Mathematics and Economics, Elsevier, vol. 65(C), pages 77-93.
    13. Feinerer, Ingo & Hornik, Kurt & Meyer, David, 2008. "Text Mining Infrastructure in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i05).
    14. Fordellone, Mario & Vichi, Maurizio, 2020. "Finding groups in structural equation modeling through the partial least squares algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
