Printed from https://ideas.repec.org/a/eee/csdana/v159y2021ics0167947320302504.html

A new class of stochastic EM algorithms. Escaping local maxima and handling intractable sampling

Author

Listed:
  • Allassonnière, Stéphanie
  • Chevallier, Juliette

Abstract

The expectation–maximization (EM) algorithm is a powerful computational technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. The convergence of SAEM toward critical points of the observed likelihood has been proved, and its numerical efficiency has been demonstrated. However, sampling from the posterior distribution may be intractable or computationally expensive. Moreover, despite its appealing features, the point to which the algorithm converges can depend strongly on its starting point. Sampling from an approximation of the posterior distribution in the expectation phase of SAEM addresses both issues. This new procedure, referred to as approximated-SAEM, is proved to converge toward critical points of the observed likelihood. Experiments on synthetic and real data highlight the performance of this algorithm in comparison with SAEM and, when feasible, EM.
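
To make the mechanics described in the abstract concrete, here is a minimal sketch of SAEM on a toy model, not the authors' implementation. Everything specific in it is an assumption chosen for illustration: the model (a two-component Gaussian mixture with unit variances and equal weights, where only the two means are estimated), the function name saem_two_means, the step-size schedule, and the use of a tempered posterior as the "approximation" sampled in the simulation step. With temperature=None the labels are drawn from the exact posterior (plain SAEM); passing a schedule that decreases to 1 flattens the sampling distribution in early iterations.

```python
import numpy as np

def saem_two_means(y, n_iter=200, burn_in=50, temperature=None, seed=0):
    """Toy SAEM: estimate the two means of an equal-weight, unit-variance
    Gaussian mixture. `temperature` is a hypothetical callable k -> T_k >= 1;
    None means sampling from the exact posterior."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    mu = np.array([y.min(), y.max()])        # crude starting point
    s_count = np.full(2, len(y) / 2.0)       # running estimate of per-class counts
    s_sum = mu * s_count                     # running estimate of per-class sums
    for k in range(n_iter):
        # Simulation step: draw labels from the (possibly tempered) posterior.
        T = 1.0 if temperature is None else float(temperature(k))
        log_p = -0.5 * (y[:, None] - mu[None, :]) ** 2 / T
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(log_p)
        p /= p.sum(axis=1, keepdims=True)
        z = (rng.random(len(y)) < p[:, 1]).astype(int)
        # Sufficient statistics of the completed data (y, z).
        onehot = np.eye(2)[z]
        count = onehot.sum(axis=0)
        total = onehot.T @ y
        # Stochastic approximation step with decreasing step sizes.
        gamma = 1.0 if k < burn_in else 1.0 / (k - burn_in + 1)
        s_count += gamma * (count - s_count)
        s_sum += gamma * (total - s_sum)
        # Maximization step: closed form for this toy model.
        mu = s_sum / np.maximum(s_count, 1e-12)
    return mu

# Synthetic data with true means -2 and 3 (values chosen for the demo).
data_rng = np.random.default_rng(42)
y_demo = np.concatenate([data_rng.normal(-2.0, 1.0, 250),
                         data_rng.normal(3.0, 1.0, 250)])
mu_exact = saem_two_means(y_demo)
mu_tempered = saem_two_means(y_demo, temperature=lambda k: 1.0 + 5.0 * 0.9 ** k)
print(np.sort(mu_exact), np.sort(mu_tempered))
```

The step sizes are held at 1 during a burn-in phase and then decrease as 1/k, a common choice for stochastic approximation schemes; the temperature schedule is equally arbitrary and only needs to tend to 1 for the sampling distribution to approach the exact posterior.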

Suggested Citation

  • Allassonnière, Stéphanie & Chevallier, Juliette, 2021. "A new class of stochastic EM algorithms. Escaping local maxima and handling intractable sampling," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
  • Handle: RePEc:eee:csdana:v:159:y:2021:i:c:s0167947320302504
    DOI: 10.1016/j.csda.2020.107159

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947320302504
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. David M. Blei & Alp Kucukelbir & Jon D. McAuliffe, 2017. "Variational Inference: A Review for Statisticians," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 859-877, April.
    2. Ravi Varadhan & Christophe Roland, 2008. "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 35(2), pages 335-353, June.
    3. Biernacki, Christophe & Celeux, Gilles & Govaert, Gerard, 2003. "Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 561-575, January.
    4. Xiao‐Li Meng & David Van Dyk, 1997. "The EM Algorithm—an Old Folk‐song Sung to a Fast New Tune," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 59(3), pages 511-567.
    5. Glenn Milligan, 1980. "An examination of the effect of six types of error perturbation on fifteen clustering algorithms," Psychometrika, Springer;The Psychometric Society, vol. 45(3), pages 325-342, September.
    6. Umberto Picchini & Adeline Samson, 2018. "Coupling stochastic EM and approximate Bayesian computation for parameter inference in state-space models," Computational Statistics, Springer, vol. 33(1), pages 179-212, March.
    7. repec:dau:papers:123456789/5724 is not listed on IDEAS
    8. Jank, Wolfgang, 2005. "Quasi-Monte Carlo sampling to improve the efficiency of Monte Carlo EM," Computational Statistics & Data Analysis, Elsevier, vol. 48(4), pages 685-701, April.
    9. Samson, Adeline & Lavielle, Marc & Mentre, France, 2006. "Extension of the SAEM algorithm to left-censored data in nonlinear mixed-effects model: Application to HIV dynamics model," Computational Statistics & Data Analysis, Elsevier, vol. 51(3), pages 1562-1574, December.
    10. J. G. Booth & J. P. Hobert, 1999. "Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(1), pages 265-285.
    11. Kuhn, E. & Lavielle, M., 2005. "Maximum likelihood estimation in nonlinear mixed effects models," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 1020-1038, June.

    Citations

    Cited by:

    1. Jochen Ranger & Christoph König & Benjamin W. Domingue & Jörg-Tobias Kuhn & Andreas Frey, 2024. "A Multidimensional Partially Compensatory Response Time Model on Basis of the Log-Normal Distribution," Journal of Educational and Behavioral Statistics, vol. 49(3), pages 431-464, June.
    2. Saâdaoui, Foued, 2023. "Randomized extrapolation for accelerating EM-type fixed-point algorithms," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    3. Gámiz, María Luz & Mammen, Enno & Martínez-Miranda, María Dolores & Nielsen, Jens Perch, 2022. "Missing link survival analysis with applications to available pandemic data," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Trevezas, S. & Malefaki, S. & Cournède, P.-H., 2014. "Parameter estimation via stochastic variants of the ECM algorithm with applications to plant growth modeling," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 82-99.
    2. Gonzalez, Jorge & Tuerlinckx, Francis & De Boeck, Paul & Cools, Ronald, 2006. "Numerical integration in logistic-normal models," Computational Statistics & Data Analysis, Elsevier, vol. 51(3), pages 1535-1548, December.
    3. Marc Lavielle & Adeline Samson & Ana Karina Fermin & France Mentré, 2011. "Maximum Likelihood Estimation of Long-Term HIV Dynamic Models and Antiviral Response," Biometrics, The International Biometric Society, vol. 67(1), pages 250-259, March.
    4. Tomarchio, Salvatore D. & Punzo, Antonio & Bagnato, Luca, 2020. "Two new matrix-variate distributions with application in model-based clustering," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
    5. Larissa A. Matos & Víctor H. Lachos & Tsung-I Lin & Luis M. Castro, 2019. "Heavy-tailed longitudinal regression models for censored data: a robust parametric approach," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 844-878, September.
    6. Riccardo Rastelli & Michael Fop, 2020. "A stochastic block model for interaction lengths," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 485-512, June.
    7. Spark C. Tseung & Ian Weng Chan & Tsz Chai Fung & Andrei L. Badescu & X. Sheldon Lin, 2022. "A Posteriori Risk Classification and Ratemaking with Random Effects in the Mixture-of-Experts Model," Papers 2209.15212, arXiv.org.
    8. Christian E. Galarza & Luis M. Castro & Francisco Louzada & Victor H. Lachos, 2020. "Quantile regression for nonlinear mixed effects models: a likelihood based perspective," Statistical Papers, Springer, vol. 61(3), pages 1281-1307, June.
    9. Wan-Lun Wang & Luis M. Castro & Wan-Chen Hsieh & Tsung-I Lin, 2021. "Mixtures of factor analyzers with covariates for modeling multiply censored dependent variables," Statistical Papers, Springer, vol. 62(5), pages 2119-2145, October.
    10. Li Cai, 2010. "High-dimensional Exploratory Item Factor Analysis by A Metropolis–Hastings Robbins–Monro Algorithm," Psychometrika, Springer;The Psychometric Society, vol. 75(1), pages 33-57, March.
    11. Wang, Jing, 2007. "EM algorithms for nonlinear mixed effects models," Computational Statistics & Data Analysis, Elsevier, vol. 51(6), pages 3244-3256, March.
    12. Celine Marielle Laffont & Marc Vandemeulebroecke & Didier Concordet, 2014. "Multivariate Analysis of Longitudinal Ordinal Data With Mixed Effects Models, With Application to Clinical Outcomes in Osteoarthritis," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 955-966, September.
    13. Wan-Lun Wang & Tsung-I Lin, 2022. "Robust clustering of multiply censored data via mixtures of t factor analyzers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 22-53, March.
    14. Shu Yang & Jae Kwang Kim, 2016. "Likelihood-based Inference with Missing Data Under Missing-at-Random," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(2), pages 436-454, June.
    15. Kim, Junyung & Shah, Asad Ullah Amin & Kang, Hyun Gook, 2020. "Dynamic risk assessment with bayesian network and clustering analysis," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    16. Goethner, Maximilian & Hornuf, Lars & Regner, Tobias, 2021. "Protecting investors in equity crowdfunding: An empirical analysis of the small investor protection act," Technological Forecasting and Social Change, Elsevier, vol. 162(C).
    17. Adrian O’Hagan & Arthur White, 2019. "Improved model-based clustering performance using Bayesian initialization averaging," Computational Statistics, Springer, vol. 34(1), pages 201-231, March.
    18. Hemant Kulkarni & Jayabrata Biswas & Kiranmoy Das, 2019. "A joint quantile regression model for multiple longitudinal outcomes," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 103(4), pages 453-473, December.
    19. John C. McCabe-Dansted & Arkadii Slinko, 2006. "Exploratory Analysis of Similarities Between Social Choice Rules," Group Decision and Negotiation, Springer, vol. 15(1), pages 77-107, January.
    20. Ali Abdelzadeh, 2014. "The Impact of Political Conviction on the Relation Between Winning or Losing and Political Dissatisfaction," SAGE Open, vol. 4(2), pages 21582440145, May.
