
Is EM really necessary here? Examples where it seems simpler not to use EM

Author

Listed:
  • Iain L. MacDonald

    (University of Cape Town)

Abstract

If one is to judge by counts of citations of the fundamental paper (Dempster et al. in JRSSB 39: 1–38, 1977), EM algorithms are a runaway success. But it is surprisingly easy to find published applications of EM that are unnecessary, in the sense that there are simpler methods available that will solve the relevant estimation problems. In particular, such problems can often be solved by the simple expedient of submitting the observed-data likelihood (or log-likelihood) to a general-purpose routine for unconstrained optimization. This can dispense with the need to derive and code (or modify) the E and M steps, a process which can sometimes be laborious or error-prone. Here, I discuss six such applications of EM in some detail, and in an appendix describe briefly some others that have already appeared in the literature. Whether these are atypical of applications of EM seems an open question, although one that may be difficult to answer; this question is of relevance to current practice, but may also be of historical interest. But it is clear that there are problems traditionally solved by EM (e.g. the fitting of finite mixtures of distributions) that can also be solved by other means. It is suggested that, before going to the effort of devising an EM algorithm to use on a new problem, the researcher should consider whether other methods (e.g. direct numerical maximization or an MM algorithm of some other kind) may be either simpler to implement or more efficient.
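
The direct approach described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the two-component Poisson mixture, the simulated data, the starting values and all function names below are assumptions chosen for illustration. The observed-data log-likelihood is written in terms of unconstrained working parameters (logit of the mixing weight, logs of the rates) and passed to a general-purpose minimizer. A small hand-written Nelder-Mead routine stands in for a library optimizer so that the example needs only the Python standard library.

```python
import math
import random


def log_sigmoid(t):
    """Numerically stable log(1 / (1 + exp(-t)))."""
    return -math.log1p(math.exp(-t)) if t >= 0 else t - math.log1p(math.exp(t))


def neg_log_lik(theta, data):
    """Negative observed-data log-likelihood, two-component Poisson mixture.

    theta = (logit weight, log rate 1, log rate 2) is unconstrained, so no
    bounds need to be passed to the optimizer.
    """
    log_w, log_1mw = log_sigmoid(theta[0]), log_sigmoid(-theta[0])
    try:
        lam1, lam2 = math.exp(theta[1]), math.exp(theta[2])
    except OverflowError:
        return math.inf
    total = 0.0
    for x in data:
        a = log_w + x * theta[1] - lam1 - math.lgamma(x + 1)
        b = log_1mw + x * theta[2] - lam2 - math.lgamma(x + 1)
        m = max(a, b)  # log-sum-exp of the two component terms
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return -total


def nelder_mead(f, x0, step=0.5, tol=1e-9, max_iter=2000):
    """Minimal Nelder-Mead simplex minimizer (stand-in for a library routine)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if j == i else 0.0) for j, x in enumerate(x0)] for i in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=values.__getitem__)
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if values[-1] - values[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)  # reflection
        fr = f(xr)
        if fr < values[0]:
            xe = along(2.0)  # expansion
            fe = f(xe)
            simplex[-1], values[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        else:
            xc = along(0.5) if fr < values[-1] else along(-0.5)  # contraction
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:
                best = simplex[0]  # shrink every other vertex toward the best
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    i = min(range(n + 1), key=values.__getitem__)
    return simplex[i], values[i]


def rpois(lam, rng):
    """Poisson draw by Knuth's multiplication method (fine for small rates)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(1)
# Simulated data (an assumption for illustration): weight 0.4 on rate 2, 0.6 on rate 10.
data = [rpois(2.0 if rng.random() < 0.4 else 10.0, rng) for _ in range(500)]

mean = sum(data) / len(data)
start = [0.0, math.log(0.5 * mean), math.log(1.5 * mean)]
start_nll = neg_log_lik(start, data)
theta_hat, fit_nll = nelder_mead(lambda th: neg_log_lik(th, data), start)

# Back-transform the working parameters to the natural scale.
w_hat = 1.0 / (1.0 + math.exp(-theta_hat[0]))
lam1_hat, lam2_hat = math.exp(theta_hat[1]), math.exp(theta_hat[2])
print(f"weight={w_hat:.3f} rates=({lam1_hat:.3f}, {lam2_hat:.3f}) -logL={fit_nll:.3f}")
```

No E or M step is derived here; the only model-specific code is the likelihood function itself. In practice one would probably replace `nelder_mead` with an established routine (for example `scipy.optimize.minimize` in Python) and try several starting values, since mixture likelihoods can have more than one local maximum.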

Suggested Citation

  • Iain L. MacDonald, 2021. "Is EM really necessary here? Examples where it seems simpler not to use EM," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(4), pages 629-647, December.
  • Handle: RePEc:spr:alstar:v:105:y:2021:i:4:d:10.1007_s10182-021-00392-x
    DOI: 10.1007/s10182-021-00392-x


