
Approximate nonparametric maximum likelihood for mixture models: A convex optimization approach to fitting arbitrary multivariate mixing distributions

Author

Listed:
  • Feng, Long
  • Dicker, Lee H.

Abstract

Nonparametric maximum likelihood (NPML) for mixture models is a technique for estimating mixing distributions that has a long and rich history in statistics going back to the 1950s, and is closely related to empirical Bayes methods. Historically, NPML-based methods have been considered relatively impractical because of computational and theoretical obstacles. However, recent work on approximate NPML methods suggests that these methods may have great promise for a variety of modern applications. Building on this recent work, a class of flexible, scalable, and easy-to-implement approximate NPML methods is studied for problems with multivariate mixing distributions. Concrete guidance on implementing these methods is provided, with theoretical and empirical support; topics covered include identifying the support set of the mixing distribution and comparing algorithms (across a variety of metrics) for solving the simple convex optimization problem at the core of the approximate NPML problem. Additionally, three diverse real data applications are studied to illustrate the methods’ performance: (i) a baseball data analysis (a classical example for empirical Bayes methods), (ii) high-dimensional microarray classification, and (iii) online prediction of blood-glucose density for diabetes patients. Among other things, the empirical results demonstrate the relative effectiveness of using multivariate (as opposed to univariate) mixing distributions for NPML-based approaches.
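
To make the "simple convex optimization problem" mentioned in the abstract concrete, here is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes Gaussian observations X_i | theta_i ~ N(theta_i, sigma^2 I) with known sigma, fixes a finite support set for the mixing distribution (here simply the observations themselves, one possible choice), and maximizes the mixture log-likelihood over the weights on that support using plain EM updates. EM is only one of several solvers that could be used for this problem. The function names `approx_npmle_weights` and `posterior_means` are hypothetical.

```python
import numpy as np

def approx_npmle_weights(X, grid, sigma=1.0, n_iter=500):
    """EM updates for mixing weights on a fixed support grid (illustrative).

    X:    (n, d) observations, assumed X_i | theta_i ~ N(theta_i, sigma^2 I).
    grid: (m, d) candidate support points for the mixing distribution.
    Returns weights w of shape (m,) and the row-scaled likelihood matrix L (n, m).
    """
    X = np.asarray(X, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Squared distances between every observation and every support point.
    sq = ((X[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    # Gaussian log-likelihood up to a constant; subtracting each row's max
    # rescales rows for numerical stability and leaves the EM updates unchanged.
    logL = -0.5 * sq / sigma**2
    logL -= logL.max(axis=1, keepdims=True)
    L = np.exp(logL)
    n, m = L.shape
    w = np.full(m, 1.0 / m)  # start from uniform weights
    for _ in range(n_iter):
        mix = L @ w                      # mixture likelihood of each observation
        w = w * (L.T @ (1.0 / mix)) / n  # EM step; weights stay on the simplex
    return w, L

def posterior_means(w, L, grid):
    """Empirical Bayes estimate of each theta_i under the fitted weights."""
    post = L * w
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid

# Simulated bivariate example (assumed data, for illustration only).
rng = np.random.default_rng(0)
theta = rng.choice([-2.0, 2.0], size=(200, 2))
X = theta + rng.normal(size=(200, 2))
grid = X.copy()  # assumption: use the observed points as the support set
w, L = approx_npmle_weights(X, grid)
est = posterior_means(w, L, grid)
```

Because the support is fixed, the objective is concave in the weights and the feasible set is the probability simplex, so any convex solver could replace the EM loop; EM is used here only because it needs nothing beyond NumPy.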

Suggested Citation

  • Feng, Long & Dicker, Lee H., 2018. "Approximate nonparametric maximum likelihood for mixture models: A convex optimization approach to fitting arbitrary multivariate mixing distributions," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 80-91.
  • Handle: RePEc:eee:csdana:v:122:y:2018:i:c:p:80-91
    DOI: 10.1016/j.csda.2018.01.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947318300070
    Download Restriction: Full text for ScienceDirect subscribers only.



    Citations



    Cited by:

    1. Wang, Yihe & Zhao, Sihai Dave, 2021. "A nonparametric empirical Bayes approach to large-scale multivariate regression," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    2. Park, Hoyoung & Baek, Seungchul & Park, Junyong, 2022. "High-dimensional linear discriminant analysis using nonparametric methods," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    3. Huiqin Xin & Sihai Dave Zhao, 2023. "A compound decision approach to covariance matrix estimation," Biometrics, The International Biometric Society, vol. 79(2), pages 1201-1212, June.
    4. Srikanth Jagabathula & Lakshminarayanan Subramanian & Ashwin Venkataraman, 2020. "A Conditional Gradient Approach for Nonparametric Estimation of Mixing Distributions," Management Science, INFORMS, vol. 66(8), pages 3635-3656, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jiaying Gu & Roger Koenker, 2018. "Nonparametric maximum likelihood methods for binary response models with random coefficients," Papers 1811.03329, arXiv.org, revised Jan 2020.
    2. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    3. Sihai Dave Zhao, 2017. "Integrative genetic risk prediction using non-parametric empirical Bayes classification," Biometrics, The International Biometric Society, vol. 73(2), pages 582-592, June.
    4. Michael Gilraine & Jiaying Gu & Robert McMillan, 2020. "A New Method for Estimating Teacher Value-Added," NBER Working Papers 27094, National Bureau of Economic Research, Inc.
    5. Mike Gilraine & Jiaying Gu & Robert McMillan, 2022. "A Nonparametric Approach for Studying Teacher Impacts," Working Papers tecipa-716, University of Toronto, Department of Economics.
    6. Jiafeng Chen, 2022. "Empirical Bayes When Estimation Precision Predicts Parameters," Papers 2212.14444, arXiv.org, revised Apr 2024.
    7. Li Tan & Cory Koedel, 2019. "The Effects of Differential Income Replacement and Mortality on U.S. Social Security Redistribution," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 613-637, October.
    8. Michael Gilraine & Jiaying Gu & Robert McMillan, 2021. "A Nonparametric Method for Estimating Teacher Value-Added," Working Papers tecipa-689, University of Toronto, Department of Economics.
    9. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg-Møller, 2020. "Robust Empirical Bayes Confidence Intervals," Papers 2004.03448, arXiv.org, revised May 2022.
    10. Park, Junyong, 2018. "Simultaneous estimation based on empirical likelihood and general maximum likelihood estimation," Computational Statistics & Data Analysis, Elsevier, vol. 117(C), pages 19-31.
    11. Jiaying Gu & Roger Koenker, 2018. "Nonparametric maximum likelihood methods for binary response models with random coefficients," CeMMAP working papers CWP65/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Roger Koenker, 2017. "Bayesian deconvolution: an R vinaigrette," CeMMAP working papers 38/17, Institute for Fiscal Studies.
    13. Stéphane Bonhomme & Martin Weidner, 2019. "Posterior average effects," CeMMAP working papers CWP43/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    14. Jiaying Gu & Roger Koenker, 2014. "Unobserved heterogeneity in income dynamics: an empirical Bayes perspective," CeMMAP working papers 43/14, Institute for Fiscal Studies.
    15. Fox, Jeremy T. & Kim, Kyoo il & Yang, Chenyu, 2016. "A simple nonparametric approach to estimating the distribution of random coefficients in structural models," Journal of Econometrics, Elsevier, vol. 195(2), pages 236-254.
    16. Li, Xiaoou & Chen, Yunxiao & Chen, Xi & Liu, Jingchen & Ying, Zhiliang, 2021. "Optimal stopping and worker selection in crowdsourcing: an adaptive sequential probability ratio test framework," LSE Research Online Documents on Economics 100873, London School of Economics and Political Science, LSE Library.
    17. Jiaying Gu & Roger Koenker, 2017. "Rebayes: an R package for empirical bayes mixture methods," CeMMAP working papers CWP37/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Liu, Laura & Moon, Hyungsik Roger & Schorfheide, Frank, 2021. "Panel forecasts of country-level Covid-19 infections," Journal of Econometrics, Elsevier, vol. 220(1), pages 2-22.
    19. Jiaying Gu & Roger Koenker & Stanislav Volgushev, 2017. "Testing for homogeneity in mixture models," CeMMAP working papers CWP39/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    20. Alberto Abadie & Maximilian Kasy, 2019. "Choosing Among Regularized Estimators in Empirical Economics: The Risk of Machine Learning," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 743-762, December.
