Hybrid ARDL-MIDAS-Transformer time-series regressions for multi-topic crypto market sentiment driven by price and technology factors

Author

Listed:
  • Ioannis Chalkiadakis

    (Heriot-Watt University
    CNRS/UAR 3611)

  • Gareth W. Peters

    (University of California Santa Barbara)

  • Matthew Ames

    (ResilientML)

Abstract

This paper develops a novel hybrid Autoregressive Distributed Lag Mixed Data Sampling (ARDL-MIDAS) model that integrates both deep neural network multi-head attention Transformer mechanisms and a number of covariates, including sophisticated stochastic text time-series features, into a mixed-frequency time-series regression model with long memory structure. In doing so, we demonstrate how the resulting class of ARDL-MIDAS-Transformer models allows one to maintain the interpretability of time-series models whilst exploiting deep neural network attention architectures. The latter may be used for higher-order interaction analysis or, as in our use case, for the design of Instrumental Variables to reduce bias in the estimation of the infinite-lag ARDL-MIDAS model. Our approach produces an accurate, interpretable forecasting framework that allows one to forecast end-of-day sentiment intra-daily, with readily attainable time-series regressors. In this regard, we conduct a statistical time-series analysis on mixed data frequencies to discover and study the relationships between sentiment from our custom stochastic text time-series sentiment framework, alternative popular sentiment extraction frameworks (BERT and VADER), and technology factors, as well as to investigate the role that price discovery plays in retail cryptocurrency investors’ sentiment (crypto sentiment). This is an interesting time-series modelling challenge, as it involves time-series regression models in which the response process and the regression covariates are observed at different time scales. Specifically, a detailed real-data study is conducted in which we explore the relationship between daily crypto market sentiment (of positive, negative and neutral polarity) and the intra-daily (hourly) price log-return dynamics of crypto markets.
The sentiment indices constructed for a variety of “topics” and news sources are produced as a collection of time-series capturing the daily sentiment polarity signals for each “topic”, namely each particular market or crypto asset. Different sentiment methods are developed in a time-series context and utilised in the proposed hybrid regression framework. Furthermore, technology factors are introduced to capture network effects, such as the hash rate, which is an important aspect of the money supply relating to the mining of new crypto assets, and block hashing for transaction verification. Throughout our real-data study, we provide guidance and insights on how to use our hybrid model to combine, in a transparent, non-black-box way, covariates obtained at different time resolutions, how to understand the dynamics arising between these covariates, potentially in the presence of long memory structure, and, finally, how to leverage these successfully in forecasting applications. The hybrid model outperformed the alternatives both in-sample and in forecasting applications on real data.
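
The mixed-frequency setting described in the abstract (a daily response regressed on hourly covariates) can be illustrated with a minimal sketch. The code below is not the authors' ARDL-MIDAS-Transformer model: it omits the Transformer component, the instrumental variables and the long memory structure, and shows only a basic ARDL(1)-MIDAS regression with exponential Almon lag weights, a standard weighting scheme in the MIDAS literature. All function names, parameter values and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def exp_almon_weights(theta1, theta2, n_lags):
    """Normalised exponential Almon lag weights for lags k = 1..n_lags."""
    k = np.arange(1, n_lags + 1)
    z = theta1 * k + theta2 * k**2
    w = np.exp(z - z.max())  # subtract the max for numerical stability
    return w / w.sum()

def midas_aggregate(x_high, weights, m):
    """Collapse a high-frequency series (m observations per low-frequency
    period) into one weighted regressor per period. Lag k = 1 is the most
    recent high-frequency observation of the period."""
    n_lags = len(weights)
    n_periods = len(x_high) // m
    out = np.full(n_periods, np.nan)
    for t in range(n_periods):
        end = (t + 1) * m  # one past the last hourly observation of day t
        if end >= n_lags:
            window = x_high[end - n_lags:end][::-1]  # most recent first
            out[t] = weights @ window
    return out

def fit_ardl_midas(y, x_high, m, n_lags, theta_grid):
    """Profile the weight parameters over a grid; for each candidate,
    estimate (intercept, AR(1) coefficient, MIDAS slope) by OLS."""
    best = None
    for th1, th2 in theta_grid:
        w = exp_almon_weights(th1, th2, n_lags)
        z = midas_aggregate(x_high, w, m)
        valid = ~np.isnan(z[1:])
        X = np.column_stack([np.ones(valid.sum()), y[:-1][valid], z[1:][valid]])
        target = y[1:][valid]
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        sse = float(np.sum((target - X @ coef) ** 2))
        if best is None or sse < best["sse"]:
            best = {"sse": sse, "coef": coef, "theta": (th1, th2)}
    return best

# Synthetic data (hypothetical): hourly log-returns drive a daily "sentiment" series.
rng = np.random.default_rng(0)
m, n_days, n_lags = 24, 300, 48            # 24 hours per day, two days of lags
x_hourly = rng.normal(0.0, 0.01, n_days * m)
w_true = exp_almon_weights(0.0, -0.005, n_lags)
z_true = midas_aggregate(x_hourly, w_true, m)

y = np.zeros(n_days)
for t in range(1, n_days):
    y[t] = 0.1 + 0.5 * y[t - 1] + 20.0 * z_true[t] + rng.normal(0.0, 0.01)

theta_grid = [(0.0, -0.05), (0.0, -0.005), (0.0, -0.0005)]
best = fit_ardl_midas(y, x_hourly, m, n_lags, theta_grid)
print("selected theta:", best["theta"])
print("intercept, AR(1), MIDAS slope:", best["coef"])
```

The grid search over the weight parameters is chosen here only to keep the sketch dependency-free; in practice the weights are usually estimated by nonlinear least squares, for example with the midasr R package cited in the paper's reference list.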

Suggested Citation

  • Ioannis Chalkiadakis & Gareth W. Peters & Matthew Ames, 2023. "Hybrid ARDL-MIDAS-Transformer time-series regressions for multi-topic crypto market sentiment driven by price and technology factors," Digital Finance, Springer, vol. 5(2), pages 295-365, June.
  • Handle: RePEc:spr:digfin:v:5:y:2023:i:2:d:10.1007_s42521-023-00079-9
    DOI: 10.1007/s42521-023-00079-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42521-023-00079-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sarun Kamolthip, 2021. "Macroeconomic Forecasting with LSTM and Mixed Frequency Time Series Data," PIER Discussion Papers 165, Puey Ungphakorn Institute for Economic Research.
    2. Laine, Olli-Matti & Lindblad, Annika, 2020. "Nowcasting Finnish GDP growth using financial variables: a MIDAS approach," BoF Economics Review 4/2020, Bank of Finland.
    3. Qian Chen & Xiang Gao & Shan Xie & Li Sun & Shuairu Tian & Shigeyuki Hamori, 2021. "On the Predictability of China Macro Indicator with Carbon Emissions Trading," Energies, MDPI, vol. 14(5), pages 1-24, February.
    4. Hanan Naser, 2015. "Estimating and forecasting Bahrain quarterly GDP growth using simple regression and factor-based methods," Empirical Economics, Springer, vol. 49(2), pages 449-479, September.
    5. Nuttanan Wichitaksorn, 2020. "Analyzing and Forecasting Thai Macroeconomic Data using Mixed-Frequency Approach," PIER Discussion Papers 146, Puey Ungphakorn Institute for Economic Research.
    6. Wichitaksorn, Nuttanan, 2022. "Analyzing and forecasting Thai macroeconomic data using mixed-frequency approach," Journal of Asian Economics, Elsevier, vol. 78(C).
    7. Valadkhani, Abbas & Smyth, Russell, 2017. "How do daily changes in oil prices affect US monthly industrial output?," Energy Economics, Elsevier, vol. 67(C), pages 83-90.
    8. Santiago Etchegaray Alvarez, 2022. "Proyecciones macroeconómicas con datos en frecuencias mixtas. Modelos ADL-MIDAS, U-MIDAS y TF-MIDAS con aplicaciones para Uruguay," Documentos de trabajo 2022004, Banco Central del Uruguay.
    9. Goldmann, Leonie & Crook, Jonathan & Calabrese, Raffaella, 2024. "A new ordinal mixed-data sampling model with an application to corporate credit rating levels," European Journal of Operational Research, Elsevier, vol. 314(3), pages 1111-1126.
    10. Nava, Consuelo R. & Osti, Linda & Zoia, Maria Grazia, 2022. "Forecasting Domestic Tourism across Regional Destinations through MIDAS Regressions," Department of Economics and Statistics Cognetti de Martiis. Working Papers 202207, University of Turin.
    11. Zhang, Yue-Jun & Wang, Jin-Li, 2019. "Do high-frequency stock market data help forecast crude oil prices? Evidence from the MIDAS models," Energy Economics, Elsevier, vol. 78(C), pages 192-201.
    12. Valadkhani, Abbas & Smyth, Russell, 2018. "Asymmetric responses in the timing, and magnitude, of changes in Australian monthly petrol prices to daily oil price changes," Energy Economics, Elsevier, vol. 69(C), pages 89-100.
    13. Rong Fu & Luze Xie & Tao Liu & Juan Huang & Binbin Zheng, 2022. "Chinese Economic Growth Projections Based on Mixed Data of Carbon Emissions under the COVID-19 Pandemic," Sustainability, MDPI, vol. 14(24), pages 1-16, December.
    14. Schumacher, Christian, 2016. "A comparison of MIDAS and bridge equations," International Journal of Forecasting, Elsevier, vol. 32(2), pages 257-270.
    15. Knotek, Edward S. & Zaman, Saeed, 2019. "Financial nowcasts and their usefulness in macroeconomic forecasting," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1708-1724.
    16. Kuzin, Vladimir & Marcellino, Massimiliano & Schumacher, Christian, 2011. "MIDAS vs. mixed-frequency VAR: Nowcasting GDP in the euro area," International Journal of Forecasting, Elsevier, vol. 27(2), pages 529-542.
    17. Axel Groß-Klußmann, 2024. "Learning deep news sentiment representations for macro-finance," Digital Finance, Springer, vol. 6(3), pages 341-377, September.
    18. Dhaene, Geert & Wu, Jianbin, 2020. "Incorporating overnight and intraday returns into multivariate GARCH volatility models," Journal of Econometrics, Elsevier, vol. 217(2), pages 471-495.

    More about this item

    Keywords

    Mixed-data sampling time-series regression (MIDAS); Transformer deep neural network; Multi-scale resolution data; Natural language processing (NLP); Text sentiment NLP time-series modelling; Gegenbauer long memory; Econometrics; Time-series;

    JEL classification:

    • C32 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Time-Series Models; Dynamic Quantile Regressions; Dynamic Treatment Effect Models; Diffusion Processes; State Space Models
    • C36 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Instrumental Variables (IV) Estimation
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C49 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Other
    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
