
Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model

Author

Listed:
  • Le Hoang Anh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Dang Thanh Vu

    (Research Center, AISeed Inc., Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Seungmin Oh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Gwang-Hyun Yu

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Nguyen Bui Ngoc Han

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Hyoung-Gook Kim

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Jin-Sul Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Jin-Young Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

Abstract

Transformer-based time series forecasting models use patch tokens to capture temporal patterns and variate tokens to learn dependencies among covariates. While patch tokens inherently facilitate self-supervised learning, variate tokens are better suited to linear forecasters because they help mitigate distribution drift. However, the use of variate tokens precludes masked pretraining, since masking an entire series is impractical. To close this gap, we propose LSPatch-T (Long–Short Patch Transfer), a framework that transfers knowledge from short-length patch tokens into full-length variate tokens. A key design choice is that we selectively transfer only a portion of the Transformer encoder, preserving the linear design of the downstream model. Additionally, we introduce a robust frequency loss to maintain consistency across different temporal ranges. Experimental results show that our approach outperforms Transformer-based baselines (Transformer, Informer, Crossformer, Autoformer, PatchTST, iTransformer) on three public datasets (ETT, Exchange, Weather), a promising step toward more general time series forecasting models.
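As a rough illustration of the two ideas in the abstract, the sketch below (PyTorch) shows how a shape-matched subset of a pretrained patch-token encoder layer might be copied into a variate-token linear forecaster, together with a frequency-domain consistency loss. The module names (PatchEncoder, VariateLinearForecaster), the single-block downstream design, and the smooth-L1 log-magnitude loss are assumptions for illustration only, not the paper's exact architecture or loss.

```python
# Hypothetical sketch of partial transfer from a patch-token Transformer encoder
# into a variate-token linear forecaster, plus a frequency-domain loss.
# Names and the transfer rule are illustrative assumptions, not the paper's code.

import torch
import torch.nn as nn


class PatchEncoder(nn.Module):
    """Pretrained upstream encoder over short patch tokens (self-supervised)."""
    def __init__(self, d_model=128, n_layers=3, n_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, patch_tokens):              # (batch, n_patches, d_model)
        return self.encoder(patch_tokens)


class VariateLinearForecaster(nn.Module):
    """Downstream model: one token per variate, one encoder block, linear head."""
    def __init__(self, seq_len, pred_len, d_model=128, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(seq_len, d_model)  # full-length series -> variate token
        self.block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, pred_len)

    def forward(self, x):                         # x: (batch, n_variates, seq_len)
        return self.head(self.block(self.embed(x)))   # (batch, n_variates, pred_len)


def transfer_partial(pretrained: PatchEncoder, downstream: VariateLinearForecaster,
                     layer_idx: int = 0):
    """Copy one pretrained encoder layer into the downstream block,
    keeping only parameters whose names and shapes match."""
    src = pretrained.encoder.layers[layer_idx].state_dict()
    dst = downstream.block.state_dict()
    dst.update({k: v for k, v in src.items() if k in dst and v.shape == dst[k].shape})
    downstream.block.load_state_dict(dst)


def frequency_loss(pred, target, eps=1e-8):
    """Illustrative 'robust' frequency loss: smooth-L1 between log FFT magnitudes."""
    pf = torch.fft.rfft(pred, dim=-1).abs()
    tf = torch.fft.rfft(target, dim=-1).abs()
    return nn.functional.smooth_l1_loss(torch.log(pf + eps), torch.log(tf + eps))
```

In this sketch, transfer_partial keeps only shape-matched parameters, which is one simple way to realize a "partial" transfer while leaving the rest of the downstream model lightweight; the paper's own layer-selection rule and loss weighting may differ.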

Suggested Citation

  • Le Hoang Anh & Dang Thanh Vu & Seungmin Oh & Gwang-Hyun Yu & Nguyen Bui Ngoc Han & Hyoung-Gook Kim & Jin-Sul Kim & Jin-Young Kim, 2024. "Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model," Energies, MDPI, vol. 17(24), pages 1-18, December.
  • Handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/
    Download Restriction: no

    References listed on IDEAS

    1. Salinas, David & Flunkert, Valentin & Gasthaus, Jan & Januschowski, Tim, 2020. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks," International Journal of Forecasting, Elsevier, vol. 36(3), pages 1181-1191.
    2. Le Hoang Anh & Gwang-Hyun Yu & Dang Thanh Vu & Hyoung-Gook Kim & Jin-Young Kim, 2023. "DelayNet: Enhancing Temporal Feature Extraction for Electronic Consumption Forecasting with Delayed Dilated Convolution," Energies, MDPI, vol. 16(22), pages 1-18, November.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    2. Andreas Lenk & Marcus Vogt & Christoph Herrmann, 2024. "An Approach to Predicting Energy Demand Within Automobile Production Using the Temporal Fusion Transformer Model," Energies, MDPI, vol. 18(1), pages 1-34, December.
    3. Ying Shu & Chengfu Ding & Lingbing Tao & Chentao Hu & Zhixin Tie, 2023. "Air Pollution Prediction Based on Discrete Wavelets and Deep Learning," Sustainability, MDPI, vol. 15(9), pages 1-19, April.
    4. Wang, Shengjie & Kang, Yanfei & Petropoulos, Fotios, 2024. "Combining probabilistic forecasts of intermittent demand," European Journal of Operational Research, Elsevier, vol. 315(3), pages 1038-1048.
    5. Pesantez, Jorge E. & Li, Binbin & Lee, Christopher & Zhao, Zhizhen & Butala, Mark & Stillwell, Ashlynn S., 2023. "A Comparison Study of Predictive Models for Electricity Demand in a Diverse Urban Environment," Energy, Elsevier, vol. 283(C).
    6. Wen, Honglin & Pinson, Pierre & Gu, Jie & Jin, Zhijian, 2024. "Wind energy forecasting with missing values within a fully conditional specification framework," International Journal of Forecasting, Elsevier, vol. 40(1), pages 77-95.
    7. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    8. Jayesh Thaker & Robert Höller, 2022. "A Comparative Study of Time Series Forecasting of Solar Energy Based on Irradiance Classification," Energies, MDPI, vol. 15(8), pages 1-26, April.
    9. Liu, Chen & Wang, Chao & Tran, Minh-Ngoc & Kohn, Robert, 2025. "A long short-term memory enhanced realized conditional heteroskedasticity model," Economic Modelling, Elsevier, vol. 142(C).
    10. Kandaswamy Paramasivan & Brinda Subramani & Nandan Sudarsanam, 2022. "Counterfactual analysis of the impact of the first two waves of the COVID-19 pandemic on the reporting and registration of missing people in India," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-14, December.
    11. Sergio Consoli & Luca Tiozzo Pezzoli & Elisa Tosetti, 2022. "Neural forecasting of the Italian sovereign bond market with economic news," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 197-224, December.
    12. Wellens, Arnoud P. & Boute, Robert N. & Udenio, Maximiliano, 2024. "Simplifying tree-based methods for retail sales forecasting with explanatory variables," European Journal of Operational Research, Elsevier, vol. 314(2), pages 523-539.
    13. de Rezende, Rafael & Egert, Katharina & Marin, Ignacio & Thompson, Guilherme, 2022. "A white-boxed ISSM approach to estimate uncertainty distributions of Walmart sales," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1460-1467.
    14. Heming Chen, 2025. "Can optimal diversification beat the naive 1/N strategy in a highly correlated market? Empirical evidence from cryptocurrencies," Papers 2501.12841, arXiv.org.
    15. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Papers 2311.16333, arXiv.org, revised Apr 2024.
    16. Semenoglou, Artemios-Anargyros & Spiliotis, Evangelos & Makridakis, Spyros & Assimakopoulos, Vassilios, 2021. "Investigating the accuracy of cross-learning time series forecasting methods," International Journal of Forecasting, Elsevier, vol. 37(3), pages 1072-1084.
    17. Kandaswamy Paramasivan & Rahul Subburaj & Saish Jaiswal & Nandan Sudarsanam, 2022. "Empirical evidence of the impact of mobility on property crimes during the first two waves of the COVID-19 pandemic," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-14, December.
    18. Elham M. Al-Ali & Yassine Hajji & Yahia Said & Manel Hleili & Amal M. Alanzi & Ali H. Laatar & Mohamed Atri, 2023. "Solar Energy Production Forecasting Based on a Hybrid CNN-LSTM-Transformer Model," Mathematics, MDPI, vol. 11(3), pages 1-19, January.
    19. Lin, Jiahe & Michailidis, George, 2024. "A multi-task encoder-dual-decoder framework for mixed frequency data prediction," International Journal of Forecasting, Elsevier, vol. 40(3), pages 942-957.
    20. Ana Lazcano & Miguel A. Jaramillo-Morán & Julio E. Sandubete, 2024. "Back to Basics: The Power of the Multilayer Perceptron in Financial Time Series Forecasting," Mathematics, MDPI, vol. 12(12), pages 1-18, June.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.