
Pruned Wasserstein Index Generation Model and wigpy Package

Author

Listed:
  • Fangzhou Xie

Abstract

The recently proposed Wasserstein Index Generation (WIG) model has shown a new direction for automatically generating indices. However, it is challenging to fit to large datasets in practice, for two reasons. First, the Sinkhorn distance is notoriously expensive to compute and scales poorly with dimensionality. Second, it requires computing a full $N \times N$ matrix that must fit into memory, where $N$ is the size of the vocabulary. When the dimensionality is too large, the computation becomes infeasible. I therefore propose a Lasso-based shrinkage method that reduces the dimensionality of the vocabulary as a pre-processing step prior to fitting the WIG model. After obtaining word embeddings from a Word2Vec model, we cluster these high-dimensional vectors by $k$-means and pick the most frequent tokens within each cluster to form the "base vocabulary". Non-base tokens are then regressed on the vectors of the base tokens to obtain transformation weights, so that the whole vocabulary can be represented by the "base tokens" alone. This variant, called pruned WIG (pWIG), allows the vocabulary dimension to be shrunk at will while still achieving high accuracy. I also provide a Python module, wigpy, to carry out computation in both flavors. An application to the Economic Policy Uncertainty (EPU) index is showcased as a comparison with existing methods of generating time-series sentiment indices.
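
The pruning step described in the abstract (cluster the embeddings, keep the most frequent token per cluster, then Lasso-regress every other token on the base tokens) can be sketched roughly as follows. This is a minimal illustration in NumPy, not the wigpy implementation: the function names, the plain Lloyd's k-means, the ISTA solver used for the Lasso, the penalty `alpha`, and the random toy embeddings are all assumptions made for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def pick_base_tokens(labels, counts):
    """Index of the most frequent token inside each non-empty cluster."""
    base = []
    for j in np.unique(labels):
        members = np.flatnonzero(labels == j)
        base.append(int(members[np.argmax(counts[members])]))
    return sorted(base)


def lasso_ista(B, y, alpha=0.01, n_iter=500):
    """Minimise 1/(2d) * ||y - B w||^2 + alpha * ||w||_1 by ISTA."""
    d = B.shape[0]
    lipschitz = np.linalg.norm(B, 2) ** 2 / d  # gradient Lipschitz constant
    w = np.zeros(B.shape[1])
    for _ in range(n_iter):
        z = w - B.T @ (B @ w - y) / (d * lipschitz)  # gradient step
        # Soft-thresholding is the proximal step for the L1 penalty.
        w = np.sign(z) * np.maximum(np.abs(z) - alpha / lipschitz, 0.0)
    return w


def prune_vocabulary(embeddings, counts, k, alpha=0.01):
    """Return base-token indices and a (n_tokens, n_base) weight matrix."""
    X = np.asarray(embeddings, dtype=float)
    counts = np.asarray(counts, dtype=float)
    base = pick_base_tokens(kmeans(X, k), counts)
    base_pos = {tok: pos for pos, tok in enumerate(base)}
    B = X[base].T  # columns are the base-token vectors
    W = np.zeros((len(X), len(base)))
    for i in range(len(X)):
        if i in base_pos:
            W[i, base_pos[i]] = 1.0  # a base token represents itself
        else:
            W[i] = lasso_ista(B, X[i], alpha)
    return base, W


# Toy data standing in for Word2Vec vectors and token frequencies.
rng = np.random.default_rng(42)
toy_embeddings = rng.normal(size=(30, 8))
toy_counts = rng.integers(1, 100, size=30)
base, W = prune_vocabulary(toy_embeddings, toy_counts, k=5)
print(len(base), W.shape)
```

Under these assumptions, each row of `W` expresses one token as a sparse combination of at most `k` base tokens, so any downstream cost matrix only needs to be built over the base vocabulary rather than all $N$ tokens.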

Suggested Citation

  • Fangzhou Xie, 2020. "Pruned Wasserstein Index Generation Model and wigpy Package," Papers 2004.00999, arXiv.org, revised Jul 2020.
  • Handle: RePEc:arx:papers:2004.00999

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2004.00999
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Castelnuovo, Efrem & Tran, Trung Duc, 2017. "Google It Up! A Google Trends-based Uncertainty index for the United States and Australia," Economics Letters, Elsevier, vol. 161(C), pages 149-153.
    2. Xie, Fangzhou, 2020. "Wasserstein Index Generation Model: Automatic generation of time-series index with application to Economic Policy Uncertainty," Economics Letters, Elsevier, vol. 186(C).
    3. Scott R. Baker & Nicholas Bloom & Steven J. Davis, 2016. "Measuring Economic Policy Uncertainty," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 131(4), pages 1593-1636.
    4. Ghirelli, Corinna & Pérez, Javier J. & Urtasun, Alberto, 2019. "A new economic policy uncertainty index for Spain," Economics Letters, Elsevier, vol. 182(C), pages 64-67.
    5. Robert J. Shiller, 2017. "Narrative Economics," American Economic Review, American Economic Association, vol. 107(4), pages 967-1004, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xie, Fangzhou, 2020. "Wasserstein Index Generation Model: Automatic generation of time-series index with application to Economic Policy Uncertainty," Economics Letters, Elsevier, vol. 186(C).
    2. Bhanu Pratap & Nalin Priyaranjan, 2023. "Macroeconomic effects of uncertainty: a Google trends-based analysis for India," Empirical Economics, Springer, vol. 65(4), pages 1599-1625, October.
    3. Catalina Bolancé & Carlos Alberto Acuña & Salvador Torra, 2022. "Non-Normal Market Losses and Spatial Dependence Using Uncertainty Indices," Mathematics, MDPI, vol. 10(8), pages 1-23, April.
    4. Donadelli, Michael & Gufler, Ivan & Pellizzari, Paolo, 2020. "The macro and asset pricing implications of rising Italian uncertainty: Evidence from a novel news-based macroeconomic policy uncertainty index," Economics Letters, Elsevier, vol. 197(C).
    5. Naboka-Krell, Viktoriia, 2024. "Construction and analysis of uncertainty indices based on multilingual text representations," Economics Letters, Elsevier, vol. 237(C).
    6. Corinna Ghirelli & María Gil & Javier J. Pérez & Alberto Urtasun, 2021. "Measuring economic and economic policy uncertainty and their macroeconomic effects: the case of Spain," Empirical Economics, Springer, vol. 60(2), pages 869-892, February.
    7. Viktoriia Naboka-Krell, 2023. "Construction and Analysis of Uncertainty Indices based on Multilingual Text Representations," MAGKS Papers on Economics 202310, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    8. Nikolay Hristov & Markus Roth, 2019. "Uncertainty Shocks and Financial Crisis Indicators," CESifo Working Paper Series 7839, CESifo.
    9. Marina Diakonova & Luis Molina & Hannes Mueller & Javier J. Pérez & Cristopher Rauh, 2022. "The information content of conflict, social unrest and policy uncertainty measures for macroeconomic forecasting," Working Papers 2232, Banco de España.
    10. Himounet, Nicolas, 2022. "Searching the nature of uncertainty: Macroeconomic and financial risks VS geopolitical and pandemic risks," International Economics, Elsevier, vol. 170(C), pages 1-31.
    11. Azqueta-Gavaldon, Andres, 2023. "Political referenda and investment: Evidence from Scotland," European Journal of Political Economy, Elsevier, vol. 80(C).
    12. Efrem Castelnuovo, 2019. "Yield Curve and Financial Uncertainty: Evidence Based on US Data," Australian Economic Review, The University of Melbourne, Melbourne Institute of Applied Economic and Social Research, vol. 52(3), pages 323-335, September.
    13. Efrem Castelnuovo & Guay Lim, 2019. "What Do We Know About the Macroeconomic Effects of Fiscal Policy? A Brief Survey of the Literature on Fiscal Multipliers," Australian Economic Review, The University of Melbourne, Melbourne Institute of Applied Economic and Social Research, vol. 52(1), pages 78-93, March.
    14. Mueller, Hannes & Garcia-Uribe, Sandra & Sanz, Carlos, 2020. "Economic Uncertainty and Divisive Politics: Evidence from the "dos Españas"," CEPR Discussion Papers 15479, C.E.P.R. Discussion Papers.
    15. Gabriel Caldas Montes & Victor Maia, 2023. "The reaction of disagreements in inflation expectations to fiscal sentiment obtained from information in official communiqués," Bulletin of Economic Research, Wiley Blackwell, vol. 75(4), pages 828-859, October.
    16. Boungou, Whelsy & Mawusi, Charles, 2022. "The impact of economic policy uncertainty on banks' non-interest income activities," International Economics, Elsevier, vol. 169(C), pages 89-97.
    17. Yuting Chen & Don Bredin & Valerio Potì & Roman Matkovskyy, 2022. "COVID risk narratives: a computational linguistic approach to the econometric identification of narrative risk during a pandemic," Digital Finance, Springer, vol. 4(1), pages 17-61, March.
    18. Anna Matzner & Birgit Meyer & Harald Oberhofer, 2023. "Trade in times of uncertainty," The World Economy, Wiley Blackwell, vol. 46(9), pages 2564-2597, September.
    19. Beckmann, Joscha & Davidson, Sharada Nia & Koop, Gary & Schüssler, Rainer, 2023. "Cross-country uncertainty spillovers: Evidence from international survey data," Journal of International Money and Finance, Elsevier, vol. 130(C).
    20. John Garcia, 2021. "Analyst herding and firm-level investor sentiment," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 35(4), pages 461-494, December.
