
A semiparametric latent factor model for large scale temporal data with heteroscedasticity

Authors

  • Zhang, Lyuou
  • Zhou, Wen
  • Wang, Haonan

Abstract

Large scale temporal data have flourished in a vast array of applications, and their sophisticated structures, especially heteroscedasticity among subjects combined with inter- and intra-temporal dependence, have fueled a great demand for new statistical models. In this paper, we consider a flexible model, incorporating covariate information, for large scale temporal data with subject-specific heteroscedasticity. Formally, the model employs latent semiparametric factors to account simultaneously for the subject-specific heteroscedasticity and the contemporaneous and/or serial correlations. The subject-specific heteroscedasticity is modeled as the product of the unobserved factor process and the subject’s covariate effect, which is further characterized via additive models. For estimation, we propose a two-step procedure. First, the latent factor process and the nonparametric loading are recovered through projection-based methods; second, the regression components are estimated by approaches motivated by generalized least squares. By carefully examining the non-asymptotic rates for recovering the factor process and its loading, we show the consistency and efficiency of the estimated regression coefficients in the absence of prior knowledge of the latent factor process and the subject’s covariate effect. The statistical guarantees remain valid even for a finite number of time points, which makes our method particularly appealing when the subjects significantly outnumber the observation time points. Comprehensive simulations demonstrate the finite sample performance of our method and corroborate the theoretical findings. Finally, we apply our method to a data set of air quality and energy consumption collected at 129 monitoring sites in the United States in 2015.
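
The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a simplified single-factor model y_it = x_it'β + g(z_i) f_t + e_it, a polynomial sieve basis for the loading g, and homoscedastic idiosyncratic noise. The function names (`simulate_panel`, `two_step_estimate`), the basis degree, and all simulation settings are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_panel(n=300, T=10, seed=0):
    """Simulate the ASSUMED model y_it = x_it' beta + g(z_i) f_t + e_it."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])
    z = rng.uniform(-1.0, 1.0, size=n)      # subject-level covariate
    g = 1.0 + z + z**2                      # loading as a smooth function of z
    f = rng.normal(size=T)                  # latent factor process
    x = rng.normal(size=(n, T, 2))          # regression covariates
    y = x @ beta + np.outer(g, f) + 0.5 * rng.normal(size=(n, T))
    return y, x, z, beta

def two_step_estimate(y, x, z, degree=3):
    """Step 1: projection + PCA for factor and loading. Step 2: GLS for beta."""
    n, T, p = x.shape

    # Pilot OLS that ignores the factor structure, to obtain residuals.
    beta0, *_ = np.linalg.lstsq(x.reshape(n * T, p), y.reshape(-1), rcond=None)
    resid = y - x @ beta0                                # n x T

    # Step 1: project residuals onto a polynomial basis in z, then take
    # the leading singular vector of the projected residual matrix.
    basis = np.vander(z, degree + 1, increasing=True)    # n x (degree + 1)
    proj_resid = basis @ (np.linalg.pinv(basis) @ resid)
    _, _, vt = np.linalg.svd(proj_resid, full_matrices=False)
    f_hat = np.sqrt(T) * vt[0]              # normalized so that f'f / T = 1
    g_hat = proj_resid @ f_hat / T          # loading, identified up to sign

    # Step 2: GLS across subjects with Sigma = g g' + sigma2 * I.
    # Sigma^{-1} is applied via the Sherman-Morrison identity.
    sigma2 = np.mean((resid - np.outer(g_hat, f_hat)) ** 2)
    gx = np.einsum("i,itp->tp", g_hat, x)                # g'X_t for each t
    wx = (x - g_hat[:, None, None] * gx[None] / (sigma2 + g_hat @ g_hat)) / sigma2
    lhs = np.einsum("itp,itq->pq", x, wx)                # sum_t X_t' W X_t
    rhs = np.einsum("itp,it->p", wx, y)                  # sum_t X_t' W y_t
    beta_hat = np.linalg.solve(lhs, rhs)
    return beta_hat, g_hat, f_hat

y, x, z, beta_true = simulate_panel()
beta_hat, g_hat, f_hat = two_step_estimate(y, x, z)
print("true beta:", beta_true, "GLS estimate:", beta_hat)
```

In this sketch the factor and loading are identified only up to a common sign, which does not affect the GLS step because the weighting matrix depends on the outer product of the estimated loading with itself.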

Suggested Citation

  • Zhang, Lyuou & Zhou, Wen & Wang, Haonan, 2021. "A semiparametric latent factor model for large scale temporal data with heteroscedasticity," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
  • Handle: RePEc:eee:jmvana:v:186:y:2021:i:c:s0047259x21000646
    DOI: 10.1016/j.jmva.2021.104786

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21000646
    Download Restriction: Full text for ScienceDirect subscribers only



    Cited by:

    1. Yang, Shuquan & Ling, Nengxiang, 2023. "Robust projected principal component analysis for large-dimensional semiparametric factor modeling," Journal of Multivariate Analysis, Elsevier, vol. 195(C).

