Factor analysis parameter estimation from incomplete data

Author

Listed:
  • Roberts, W.J.J.

Abstract

An expectation–maximization (EM) algorithm for factor analysis parameter estimation when observations are missing is developed. In contrast to existing EM algorithms for this problem, the algorithm here is developed assuming the missing observations are not part of the complete data in the EM formulation. The resulting algorithm provides increased computational efficiency through sparse matrix operations. The algorithm is demonstrated on two sparse, high-dimensional data sets that are prohibitively large for existing algorithms: the Netflix movie recommendation data set and the Yahoo! musical item data set. The resulting factor models are applied to predict missing values using conditional mean estimation, achieving root mean square errors of 0.9001 and 24.08 on the Netflix and Yahoo! data sets, respectively.
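
As an illustration of the conditional mean estimation mentioned in the abstract, here is a minimal sketch. It assumes the standard factor model x = mu + Lambda z + e, with z ~ N(0, I) and e ~ N(0, diag(psi)). The function name `predict_missing` and all parameter names are hypothetical; this is the textbook Gaussian conditional mean, not necessarily the paper's implementation. Only the rows of Lambda that correspond to observed entries enter the linear solve, so the cost per observation depends on the number of observed entries and the number of factors.

```python
import numpy as np

def predict_missing(x, observed, mu, Lambda, psi):
    """Fill missing entries of one observation with their conditional mean.

    x        : (p,) data vector; entries where observed is False are ignored
    observed : (p,) boolean mask of observed entries
    mu       : (p,) mean vector (assumed already estimated)
    Lambda   : (p, k) factor loading matrix (assumed already estimated)
    psi      : (p,) diagonal of the noise covariance (assumed already estimated)
    """
    o = observed
    m = ~observed
    Lo = Lambda[o]                       # loadings of observed entries, (n_obs, k)
    w = Lo / psi[o][:, None]             # Psi_o^{-1} Lambda_o
    k = Lambda.shape[1]
    A = np.eye(k) + Lo.T @ w             # I + Lambda_o^T Psi_o^{-1} Lambda_o, (k, k)
    # Posterior mean of the factors given the observed entries.
    z_hat = np.linalg.solve(A, w.T @ (x[o] - mu[o]))
    x_hat = x.copy()
    x_hat[m] = mu[m] + Lambda[m] @ z_hat  # conditional mean of the missing entries
    return x_hat
```

A call such as `predict_missing(x, mask, mu, Lambda, psi)` returns a copy of `x` with the observed entries unchanged and the missing ones replaced by their predicted values.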

Suggested Citation

  • Roberts, W.J.J., 2014. "Factor analysis parameter estimation from incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 61-66.
  • Handle: RePEc:eee:csdana:v:70:y:2014:i:c:p:61-66
    DOI: 10.1016/j.csda.2013.08.018

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947313003150
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Donald Rubin & Dorothy Thayer, 1982. "EM algorithms for ML factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 47(1), pages 69-76, March.

    Citations

    Cited by:

    1. Forzani, Liliana & García Arancibia, Rodrigo & Llop, Pamela & Tomassi, Diego, 2018. "Supervised dimension reduction for ordinal predictors," Computational Statistics & Data Analysis, Elsevier, vol. 125(C), pages 136-155.
    2. Nikolaos Zirogiannis & Yorghos Tripodis, 2018. "Dynamic factor analysis for short panels: estimating performance trajectories for water utilities," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(1), pages 131-150, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sentana, Enrique & Calzolari, Giorgio & Fiorentini, Gabriele, 2008. "Indirect estimation of large conditionally heteroskedastic factor models, with an application to the Dow 30 stocks," Journal of Econometrics, Elsevier, vol. 146(1), pages 10-25, September.
    2. Matteo Barigozzi & Matteo Luciani, 2019. "Quasi Maximum Likelihood Estimation and Inference of Large Approximate Dynamic Factor Models via the EM algorithm," Papers 1910.03821, arXiv.org, revised Sep 2024.
    3. Zirogiannis, Nikolaos & Tripodis, Yorghos, 2013. "A Generalized Dynamic Factor Model for Panel Data: Estimation with a Two-Cycle Conditional Expectation-Maximization Algorithm," Working Paper Series 142752, University of Massachusetts, Amherst, Department of Resource Economics.
    4. Dorota Toczydlowska & Gareth W. Peters & Man Chung Fung & Pavel V. Shevchenko, 2017. "Stochastic Period and Cohort Effect State-Space Mortality Models Incorporating Demographic Factors via Probabilistic Robust Principal Components," Risks, MDPI, vol. 5(3), pages 1-77, July.
    5. Avellán, Guillermo & González-Astudillo, Manuel & Salcedo, Juan José, 2020. "A Streamlined Procedure to Construct a Macroeconomic Uncertainty Index with an Application to the Ecuadorian Economy," MPRA Paper 102593, University Library of Munich, Germany.
    6. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2012. "The directional identification problem in Bayesian factor analysis: An ex-post approach," Kiel Working Papers 1799, Kiel Institute for the World Economy (IfW Kiel).
    7. Chen, Derek H. C. & Gawande, Kishore, 2007. "Underlying dimensions of knowledge assessment : factor analysis of the knowledge assessment methodology data," Policy Research Working Paper Series 4216, The World Bank.
    8. Bodnar, Taras & Reiß, Markus, 2016. "Exact and asymptotic tests on a factor model in low and large dimensions with applications," Journal of Multivariate Analysis, Elsevier, vol. 150(C), pages 125-151.
    9. Blum Yuna & Houée-Bigot Magalie & Causeur David, 2016. "Sparse factor model for co-expression networks with an application using prior biological knowledge," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(3), pages 253-272, June.
    10. Zhou, Lin & Tang, Yayong, 2021. "Linearly preconditioned nonlinear conjugate gradient acceleration of the PX-EM algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    11. Bai, Jushan, 2024. "Likelihood approach to dynamic panel models with interactive effects," Journal of Econometrics, Elsevier, vol. 240(1).
    12. Friguet, Chloé & Causeur, David, 2011. "Estimation of the proportion of true null hypotheses in high-dimensional data under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2665-2676, September.
    13. Taehun Lee & Li Cai, 2012. "Alternative Multiple Imputation Inference for Mean and Covariance Structure Modeling," Journal of Educational and Behavioral Statistics, vol. 37(6), pages 675-702, December.
    14. Kim, Jiwhan & Nam, Changi & Ryu, Min Ho, 2020. "IPTV vs. emerging video services: Dilemma of telcos to upgrade the broadband," Telecommunications Policy, Elsevier, vol. 44(4).
    15. Jin, Shaobo & Moustaki, Irini & Yang-Wallentin, Fan, 2018. "Approximated penalized maximum likelihood for exploratory factor analysis: an orthogonal case," LSE Research Online Documents on Economics 88118, London School of Economics and Political Science, LSE Library.
    16. James, Jonathan, 2018. "Estimation of factor structured covariance mixed logit models," Journal of Choice Modelling, Elsevier, vol. 28(C), pages 41-55.
    17. Martín Almuzara & Dante Amengual & Enrique Sentana, 2019. "Normality tests for latent variables," Quantitative Economics, Econometric Society, vol. 10(3), pages 981-1017, July.
    18. Matteo Barigozzi & Daniele Massacci, 2022. "Modelling Large Dimensional Datasets with Markov Switching Factor Models," Papers 2210.09828, arXiv.org, revised Dec 2024.
    19. Shaoxin Wang & Hu Yang & Chaoli Yao, 2019. "On the penalized maximum likelihood estimation of high-dimensional approximate factor model," Computational Statistics, Springer, vol. 34(2), pages 819-846, June.
    20. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2014. "Bayesian analysis of dynamic factor models: An ex-post approach towards the rotation problem," Kiel Working Papers 1902, Kiel Institute for the World Economy (IfW Kiel).
