Printed from https://ideas.repec.org/p/osf/osfxxx/9twbk_v1.html

Regularized multigroup exploratory approximate factor analysis for easy analysis of complex data

Authors

  • Van Deun, Katrijn
  • Lê, Trà T.
  • Malinowski, Jakub
  • Mols, Floortje
  • Schoormans, Dounya

Abstract

Exploring multigroup data for similarities and differences in the measurement model is a substantial part of the research conducted in the behavioral and social sciences. Examples include studying the measurement invariance of psychological scales over age or ethnic groups and comparing symptom correlations between different psychological disorders. Multigroup exploratory factor analysis is often the method of choice. However, currently available methods are restrictive in their use. First, they cannot handle complex data with small sample sizes relative to the number of variables, even though high-dimension, low-sample-size data are increasingly common as a result of digitalization (e.g., word counts obtained by text mining of online messages, or omics data). Second, existing software is often arduous to use. Here, we propose a regularized exploratory approximate factor analysis method that addresses these issues by building on a strong computational framework. The resulting method yields solutions that are constrained to show simple structure and similarity of the loadings over groups when supported by the data. The only required input is the data and the number of factors. In a simulation study, we show that the method considerably outperforms existing methods, including in the low-dimensional setting; publicly available genomics data on different psychopathologies are used to illustrate that the method works in the ultrahigh-dimensional setting. An implementation of the method in the R language for statistical computing is publicly available on GitHub, including the code used to conduct the simulation study and to perform the analyses of the three empirical data sets.
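
The abstract does not spell out the penalty terms used. As a purely illustrative sketch, and not the authors' objective function, the two constraints it names (simple structure, and similarity of loadings over groups) are commonly encoded by a lasso-type penalty on the loadings plus a fused-type penalty on between-group differences. All function names and weights below (`lasso_penalty`, `fused_penalty`, `total_penalty`, `lam_sparse`, `lam_fused`) are hypothetical.

```python
# Illustrative only: a generic sparse + fused penalty on per-group loading
# matrices. This is an assumption for exposition, NOT the paper's method.

def lasso_penalty(loadings):
    """Sum of absolute loadings; shrinking it pushes loadings toward zero."""
    return sum(abs(x) for row in loadings for x in row)


def fused_penalty(loadings_a, loadings_b):
    """Sum of absolute elementwise differences between two groups' loadings."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(loadings_a, loadings_b)
        for a, b in zip(row_a, row_b)
    )


def total_penalty(groups, lam_sparse, lam_fused):
    """Weighted sparsity penalty over all groups plus fused penalty over all pairs."""
    sparse = sum(lasso_penalty(g) for g in groups)
    fused = sum(
        fused_penalty(groups[i], groups[j])
        for i in range(len(groups))
        for j in range(i + 1, len(groups))
    )
    return lam_sparse * sparse + lam_fused * fused


# Two groups, two variables, two factors (rows are variables).
group_1 = [[1.0, 0.0], [0.0, 0.5]]
group_2 = [[1.0, 0.0], [0.0, -0.5]]  # differs from group_1 in one loading

print(total_penalty([group_1, group_2], lam_sparse=0.5, lam_fused=2.0))
```

In a sketch like this, a larger `lam_sparse` favors loading matrices with many zeros, and a larger `lam_fused` favors loadings that are equal across groups; how such weights are chosen in the actual method is not stated in the abstract.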

Suggested Citation

  • Van Deun, Katrijn & Lê, Trà T. & Malinowski, Jakub & Mols, Floortje & Schoormans, Dounya, 2025. "Regularized multigroup exploratory approximate factor analysis for easy analysis of complex data," OSF Preprints 9twbk_v1, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:9twbk_v1
    DOI: 10.31219/osf.io/9twbk_v1

    Download full text from publisher

    File URL: https://osf.io/download/67c99db54558244421fd75ff/
    Download Restriction: no


    References listed on IDEAS

    1. Norman Cliff, 1966. "Orthogonal rotation to congruence," Psychometrika, Springer;The Psychometric Society, vol. 31(1), pages 33-42, March.
    2. Guerra Urzola, Rosember & Van Deun, Katrijn & Vera, J. C. & Sijtsma, K., 2021. "A guide for sparse PCA : Model comparison and applications," Other publications TiSEM 4d35b931-7f49-444b-b92f-a, Tilburg University, School of Economics and Management.
    3. Rosember Guerra-Urzola & Katrijn Van Deun & Juan C. Vera & Klaas Sijtsma, 2021. "A Guide for Sparse PCA: Model Comparison and Applications," Psychometrika, Springer;The Psychometric Society, vol. 86(4), pages 893-919, December.
    4. Chamberlain, Gary & Rothschild, Michael, 1983. "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, Econometric Society, vol. 51(5), pages 1281-1304, September.
    5. Bai, Jushan & Ng, Serena, 2023. "Approximate factor models with weaker loadings," Journal of Econometrics, Elsevier, vol. 235(2), pages 1893-1916.
    6. la Grange, Anthony & le Roux, Niël & Gardner-Lubbe, Sugnet, 2009. "BiplotGUI: Interactive Biplots in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 30(i12).
    7. Kohei Adachi & Nickolay T. Trendafilov, 2016. "Sparse principal component analysis subject to prespecified cardinality of loadings," Computational Statistics, Springer, vol. 31(4), pages 1403-1427, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rosember Guerra-Urzola & Niek C. Schipper & Anya Tonne & Klaas Sijtsma & Juan C. Vera & Katrijn Van Deun, 2023. "Sparsifying the least-squares approach to PCA: comparison of lasso and cardinality constraint," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 269-286, March.
    2. Diego Fresoli & Pilar Poncela & Esther Ruiz, 2024. "Dealing with idiosyncratic cross-correlation when constructing confidence regions for PC factors," Papers 2407.06883, arXiv.org.
    3. Jianqing Fan & Yuling Yan & Yuheng Zheng, 2024. "When can weak latent factors be statistically inferred?," Papers 2407.03616, arXiv.org, revised Sep 2024.
    4. Matteo Barigozzi & Marc Hallin, 2024. "The Dynamic, the Static, and the Weak Factor Models and the Analysis of High-Dimensional Time Series," Working Papers ECARES 2024-14, ULB -- Universite Libre de Bruxelles.
    5. Michael Greenacre & Patrick J. F. Groenen & Trevor Hastie & Alfonso Iodice d’Enza & Angelos Markos & Elena Tuzhilina, 2023. "Principal component analysis," Economics Working Papers 1856, Department of Economics and Business, Universitat Pompeu Fabra.
    6. Tomohiro Ando & Ruey S. Tsay, 2009. "Model selection for generalized linear models with factor‐augmented predictors," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 25(3), pages 207-235, May.
    7. Cavit Pakel & Neil Shephard & Kevin Sheppard & Robert F. Engle, 2021. "Fitting Vast Dimensional Time-Varying Covariance Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(3), pages 652-668, July.
    8. Ard H.J. den Reijer, 2005. "Forecasting Dutch GDP using Large Scale Factor Models," DNB Working Papers 028, Netherlands Central Bank, Research Department.
    9. Matteo Barigozzi & Antonio M. Conti & Matteo Luciani, 2014. "Do Euro Area Countries Respond Asymmetrically to the Common Monetary Policy?," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 76(5), pages 693-714, October.
    10. Peñaranda, Francisco & Sentana, Enrique, 2016. "Duality in mean-variance frontiers with conditioning information," Journal of Empirical Finance, Elsevier, vol. 38(PB), pages 762-785.
    11. Paul Viefers & Ferdinand Fichtner & Simon Junker & Maximilian Podstawski, 2014. "Filtering German Economic Conditions from a Large Dataset: The New DIW Economic Barometer," Discussion Papers of DIW Berlin 1414, DIW Berlin, German Institute for Economic Research.
    12. Maysam Khodayari Gharanchaei & Prabhu Prasad Panda & Xilin Chen, 2024. "Quantitative Investment Diversification Strategies via Various Risk Models," Papers 2407.01550, arXiv.org.
    13. Bakalli, Gaetan & Guerrier, Stéphane & Scaillet, Olivier, 2023. "A penalized two-pass regression to predict stock returns with time-varying risk premia," Journal of Econometrics, Elsevier, vol. 237(2).
    14. David Havrlant & Peter Tóth & Julia Wörz, 2016. "On the optimal number of indicators – nowcasting GDP growth in CESEE," Focus on European Economic Integration, Oesterreichische Nationalbank (Austrian Central Bank), issue 4, pages 54-72.
    15. Catherine Doz & Domenico Giannone & Lucrezia Reichlin, 2012. "A Quasi–Maximum Likelihood Approach for Large, Approximate Dynamic Factor Models," The Review of Economics and Statistics, MIT Press, vol. 94(4), pages 1014-1024, November.
    16. Khan, M. Ali & Sun, Yeneng, 2001. "Asymptotic Arbitrage and the APT with or without Measure-Theoretic Structures," Journal of Economic Theory, Elsevier, vol. 101(1), pages 222-251, November.
    17. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    18. Rachida Ouysse, 2013. "Forecasting using a large number of predictors: Bayesian model averaging versus principal components regression," Discussion Papers 2013-04, School of Economics, The University of New South Wales.
    19. Mario Forni & Luca Gambetti & Luca Sala, 2014. "No News in Business Cycles," Economic Journal, Royal Economic Society, vol. 124(581), pages 1168-1191, December.
    20. Yuefeng Han & Rong Chen & Dan Yang & Cun-Hui Zhang, 2020. "Tensor Factor Model Estimation by Iterative Projection," Papers 2006.02611, arXiv.org, revised Jul 2024.
