
Scaled PCA: A New Approach to Dimension Reduction

Authors
  • Dashan Huang

    (Lee Kong Chian School of Business, Singapore Management University)

  • Fuwei Jiang

    (School of Finance, Central University of Finance and Economics)

  • Kunpeng Li

    (International School of Economics and Management, Capital University of Economics and Business)

  • Guoshi Tong

    (Fanhai International School of Finance, Fudan University)

  • Guofu Zhou

    (Olin Business School, Washington University in St. Louis)

Abstract

This paper proposes a novel supervised learning technique for forecasting: scaled principal component analysis (sPCA). sPCA improves on traditional principal component analysis (PCA) by scaling each predictor by its predictive slope on the forecast target. Unlike PCA, which maximizes the common variation of the predictors, sPCA assigns more weight to predictors with stronger forecasting power. In a general factor framework, we show that, under appropriate conditions on the data, the sPCA forecast beats the PCA forecast; when these conditions break down, extensive simulations indicate that sPCA still has a good chance of outperforming PCA. A real-data example on macroeconomic forecasting shows that sPCA generally performs better.
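
The abstract describes the procedure only in words, so the following is a minimal sketch of the two steps it names (scale each predictor by its predictive slope, then extract principal components), not the authors' implementation. The function name `spca_factors`, its parameters, the standardization of predictors, the use of univariate OLS slopes, and the one-step-ahead default horizon are all assumptions made for illustration.

```python
import numpy as np


def spca_factors(X, y, n_factors=1, horizon=1):
    """Hypothetical sPCA sketch.

    X: (T, N) array of predictors, y: (T,) target series.
    Returns (factors, slopes): factors is (T, n_factors), slopes is (N,).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumption: predictors are standardized before scaling.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Step 1: predictive slope of each predictor, from a univariate OLS
    # regression of y_{t+horizon} on the predictor at time t.
    Z_lag = Z[:-horizon]
    y_lead = y[horizon:]
    Zc = Z_lag - Z_lag.mean(axis=0)
    yc = y_lead - y_lead.mean()
    slopes = (Zc.T @ yc) / (Zc ** 2).sum(axis=0)

    # Step 2: scale each predictor by its slope, then run ordinary PCA
    # (via SVD) on the scaled panel.
    scaled = Z * slopes
    scaled = scaled - scaled.mean(axis=0)
    U, S, _ = np.linalg.svd(scaled, full_matrices=False)
    factors = U[:, :n_factors] * S[:n_factors]
    return factors, slopes


# Illustrative use on simulated data (all values are made up).
rng = np.random.default_rng(0)
T, N = 200, 10
signal = rng.normal(size=T)
X_demo = rng.normal(size=(T, N))
X_demo[:, 0] += signal                      # only the first predictor carries signal
y_demo = np.zeros(T)
y_demo[1:] = 0.5 * signal[:-1] + 0.1 * rng.normal(size=T - 1)

factors_demo, slopes_demo = spca_factors(X_demo, y_demo, n_factors=1)
print(factors_demo.shape, np.round(slopes_demo, 2))
```

In this sketch a predictor with a near-zero slope is shrunk toward zero before the PCA step, so it contributes little to the extracted factor; plain PCA would instead weight it by its share of common variation.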

Suggested Citation

  • Dashan Huang & Fuwei Jiang & Kunpeng Li & Guoshi Tong & Guofu Zhou, 2022. "Scaled PCA: A New Approach to Dimension Reduction," CEMA Working Papers 678, China Economics and Management Academy, Central University of Finance and Economics.
  • Handle: RePEc:cuf:wpaper:678

    Download full text from publisher

    File URL: https://down.aefweb.net/WorkingPapers/w678.pdf
    Download Restriction: no
    Citations


    Cited by:

    1. Kuppenheimer, Gregory & Shelly, Stuart & Strauss, Jack, 2023. "Can machine learning identify sector-level financial ratios that predict sector returns?," Finance Research Letters, Elsevier, vol. 57(C).
    2. Liu, Shan & Li, Ziwei, 2023. "Macroeconomic attention and oil futures volatility prediction," Finance Research Letters, Elsevier, vol. 57(C).
    3. Lu, Xinjie & Ma, Feng & Wang, Tianyang & Wen, Fenghua, 2023. "International stock market volatility: A data-rich environment based on oil shocks," Journal of Economic Behavior & Organization, Elsevier, vol. 214(C), pages 184-215.
    4. Fang, Puyi & Gao, Zhaoxing & Tsay, Ruey S., 2023. "Supervised kernel principal component analysis for forecasting," Finance Research Letters, Elsevier, vol. 58(PA).
    5. Jixiang, Zhang & Feng, Ma, 2024. "Video apps user engagement and stock market volatility: Evidence from China," Finance Research Letters, Elsevier, vol. 64(C).
    6. Rajveer Jat & Daanish Padha, 2024. "Kernel Three Pass Regression Filter," Papers 2405.07292, arXiv.org, revised Jun 2024.
    7. Zhikai Zhang & Yaojie Zhang & Yudong Wang & Qunwei Wang, 2024. "The predictability of carbon futures volatility: New evidence from the spillovers of fossil energy futures returns," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 44(4), pages 557-584, April.
    8. Lu, Fei & Ma, Feng & Hu, Shiyang, 2024. "Does energy consumption play a key role? Re-evaluating the energy consumption-economic growth nexus from GDP growth rates forecasting," Energy Economics, Elsevier, vol. 129(C).
    9. Shuo-Chieh Huang & Ruey S. Tsay, 2024. "Time Series Forecasting with Many Predictors," Mathematics, MDPI, vol. 12(15), pages 1-20, July.
    10. Weijia Peng & Chun Yao, 2023. "Sector-level equity returns predictability with machine learning and market contagion measure," Empirical Economics, Springer, vol. 65(4), pages 1761-1798, October.
    11. Shulin Shen & Yiyi Zhao & Jindong Pang, 2024. "Local Housing Market Sentiments and Returns: Evidence from China," The Journal of Real Estate Finance and Economics, Springer, vol. 68(3), pages 488-522, April.
    12. Liang, Chao & Wang, Lu & Duong, Duy, 2024. "More attention and better volatility forecast accuracy: How does war attention affect stock volatility predictability?," Journal of Economic Behavior & Organization, Elsevier, vol. 218(C), pages 1-19.
    13. Lu, Fei & Ma, Feng & Guo, Qiang, 2023. "Less is more? New evidence from stock market volatility predictability," International Review of Financial Analysis, Elsevier, vol. 89(C).
    14. Wang, Jiashun & Wang, Jiqian & Ma, Feng, 2024. "International commodity market and stock volatility predictability: Evidence from G7 countries," International Review of Economics & Finance, Elsevier, vol. 90(C), pages 62-71.
    15. Chen, Andrew Y. & McCoy, Jack, 2024. "Missing values handling for machine learning portfolios," Journal of Financial Economics, Elsevier, vol. 155(C).
    16. Huang, Dashan & Jiang, Fuwei & Li, Kunpeng & Tong, Guoshi & Zhou, Guofu, 2023. "Are bond returns predictable with real-time macro data?," Journal of Econometrics, Elsevier, vol. 237(2).
    17. Tan, Xilong & Tao, Yubo, 2023. "Trend-based forecast of cryptocurrency returns," Economic Modelling, Elsevier, vol. 124(C).
    18. Lu, Xinjie & Lang, Qiaoqi, 2023. "Categorial economic policy uncertainty indices or Twitter-based uncertainty indices? Evidence from Chinese stock market," Finance Research Letters, Elsevier, vol. 55(PB).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhaoxing Gao & Ruey S. Tsay, 2023. "Supervised Dynamic PCA: Linear Dynamic Forecasting with Many Predictors," Papers 2307.07689, arXiv.org.
    2. Alain-Philippe Fortin & Patrick Gagliardini & O. Scaillet, 2022. "Eigenvalue tests for the number of latent factors in short panels," Swiss Finance Institute Research Paper Series 22-81, Swiss Finance Institute.
    3. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2019. "A diagnostic criterion for approximate factor structure," Journal of Econometrics, Elsevier, vol. 212(2), pages 503-521.
    4. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," Working Papers 2019-4, University of Hawaii Economic Research Organization, University of Hawaii at Manoa.
    5. Stefano Giglio & Dacheng Xiu, 2017. "Inference on Risk Premia in the Presence of Omitted Factors," NBER Working Papers 23527, National Bureau of Economic Research, Inc.
    6. Huang, Dashan & Li, Jiangyuan & Wang, Liyao, 2021. "Are disagreements agreeable? Evidence from information aggregation," Journal of Financial Economics, Elsevier, vol. 141(1), pages 83-101.
    7. Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," CERGE-EI Working Papers wp677, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    8. Oleg Rytchkov & Xun Zhong, 2020. "Information Aggregation and P-Hacking," Management Science, INFORMS, vol. 66(4), pages 1605-1626, April.
    9. Clarke, Charles, 2022. "The level, slope, and curve factor model for stocks," Journal of Financial Economics, Elsevier, vol. 143(1), pages 159-187.
    10. Yuan Liao & Xinjie Ma & Andreas Neuhierl & Zhentao Shi, 2023. "Economic Forecasts Using Many Noises," Papers 2312.05593, arXiv.org, revised Dec 2023.
    11. Vigo Pereira, Caio, 2021. "Portfolio efficiency with high-dimensional data as conditioning information," International Review of Financial Analysis, Elsevier, vol. 77(C).
    12. Ma, Tian & Leong, Wen Jun & Jiang, Fuwei, 2023. "A latent factor model for the Chinese stock market," International Review of Financial Analysis, Elsevier, vol. 87(C).
    13. Fan, Jianqing & Xue, Lingzhou & Yao, Jiawei, 2017. "Sufficient forecasting using factor models," Journal of Econometrics, Elsevier, vol. 201(2), pages 292-306.
    14. Cakici, Nusret & Shahzad, Syed Jawad Hussain & Będowska-Sójka, Barbara & Zaremba, Adam, 2024. "Machine learning and the cross-section of cryptocurrency returns," International Review of Financial Analysis, Elsevier, vol. 94(C).
    15. Xi Dong & Yan Li & David E. Rapach & Guofu Zhou, 2022. "Anomalies and the Expected Market Return," Journal of Finance, American Finance Association, vol. 77(1), pages 639-681, February.
    16. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    17. Gu, Shihao & Kelly, Bryan & Xiu, Dacheng, 2021. "Autoencoder asset pricing models," Journal of Econometrics, Elsevier, vol. 222(1), pages 429-450.
    18. Francisco Peñaranda & Enrique Sentana, 2024. "Portfolio management with big data," Working Papers wp2024_2411, CEMFI.
    19. Shi, Qi, 2023. "The RP-PCA factors and stock return predictability: An aligned approach," The North American Journal of Economics and Finance, Elsevier, vol. 64(C).
    20. Doron Avramov & Si Cheng & Lior Metzker, 2023. "Machine Learning vs. Economic Restrictions: Evidence from Stock Return Predictability," Management Science, INFORMS, vol. 69(5), pages 2587-2619, May.
