Printed from https://ideas.repec.org/a/gam/jmathe/v10y2022i14p2529-d867513.html

HJ-Biplot as a Tool to Give an Extra Analytical Boost for the Latent Dirichlet Assignment (LDA) Model: With an Application to Digital News Analysis about COVID-19

Author

Listed:
  • Luis Pilacuan-Bonete

    (Department of Statistics, University of Salamanca, 37008 Salamanca, Spain
    Faculty of Industrial Engineering, Universidad de Guayaquil, Guayaquil 090514, Ecuador)

  • Purificación Galindo-Villardón

    (Department of Statistics, University of Salamanca, 37008 Salamanca, Spain
    Escuela Superior Politécnica del Litoral (ESPOL), Centro de Estudios e Investigaciones Estadísticas, Campus Gustavo Galindo, Km. 30.5 Via Perimetral, Guayaquil P.O. Box 09-01-5863, Ecuador
    Centro de Gestión de Estudios Estadísticos, Universidad Estatal de Milagro (UNEMI), Ciudadela Universitaria Km. 1.5 vía al Km 26, Guayas 091050, Ecuador)

  • Francisco Delgado-Álvarez

    (Department of Statistics, University of Salamanca, 37008 Salamanca, Spain)

Abstract

The objective of this work is to generate an HJ-biplot representation of the content analysis obtained by latent Dirichlet assignment (LDA) of the headlines of three Spanish newspapers, in their web versions, about the pandemic caused by the SARS-CoV-2 virus (COVID-19), which has affected more than 500 million people and caused almost six million deaths to date. The HJ-biplot is used to give an extra analytical boost to the model. It is an easy-to-interpret multivariate technique that does not require in-depth knowledge of statistics and that captures the relationship between the COVID-19 news topics and the three digital newspapers. Compared with LDAvis and heatmap representations, the HJ-biplot provides a better representation and visualization, allowing us to analyze the relationship between each newspaper (column markers, represented by vectors) and the 14 topics obtained from the LDA model (row markers, represented by points) in the plane with the greatest informative capacity. It is concluded that the newspapers El Mundo and 20 M present greater homogeneity between the topics published during the pandemic, while El País presents topics that are less related to those of the other two newspapers, highlighting topics such as t_12 (Government_Madrid) and t_13 (Government_millions).
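The row and column markers mentioned in the abstract can be illustrated with a minimal sketch of the usual HJ-biplot construction from a singular value decomposition, X = U D Vᵀ, with row markers J = U D and column markers H = V D. This is not the authors' code: the function name `hj_biplot`, the column-centring step, and the 14 x 3 matrix `theta` (filled here with synthetic random values, not the paper's LDA output) are assumptions made only for illustration.

```python
import numpy as np


def hj_biplot(X, n_components=2):
    """Return HJ-biplot row markers, column markers and explained inertia.

    Hypothetical helper: X is a (topics x newspapers) matrix.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: centre each column so axes describe variation around the mean.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    row_markers = U[:, :k] * s[:k]    # J = U D  (one point per topic)
    col_markers = Vt[:k].T * s[:k]    # H = V D  (one vector per newspaper)
    explained = s**2 / np.sum(s**2)   # share of inertia carried by each axis
    return row_markers, col_markers, explained[:k]


# Synthetic placeholder data: 14 topics x 3 newspapers, each column sums to 1.
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.ones(14), size=3).T

rows, cols, explained = hj_biplot(theta)
print("inertia explained by the first plane:", explained.sum())
```

In a plot, the rows of `rows` would be drawn as points (topics) and the rows of `cols` as vectors from the origin (newspapers); a small angle between two vectors suggests that the corresponding newspapers have similar topic profiles.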

Suggested Citation

  • Luis Pilacuan-Bonete & Purificación Galindo-Villardón & Francisco Delgado-Álvarez, 2022. "HJ-Biplot as a Tool to Give an Extra Analytical Boost for the Latent Dirichlet Assignment (LDA) Model: With an Application to Digital News Analysis about COVID-19," Mathematics, MDPI, vol. 10(14), pages 1-17, July.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:14:p:2529-:d:867513

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/14/2529/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/14/2529/
    Download Restriction: no

    References listed on IDEAS

    1. Ortal Slobodin & Ilia Plochotnikov & Idan-Chaim Cohen & Aviad Elyashar & Odeya Cohen & Rami Puzis, 2022. "Global and Local Trends Affecting the Experience of US and UK Healthcare Professionals during COVID-19: Twitter Text Analysis," IJERPH, MDPI, vol. 19(11), pages 1-17, June.
    2. He, Wu & Zha, Shenghua & Li, Ling, 2013. "Social media competitive analysis and text mining: A case study in the pizza industry," International Journal of Information Management, Elsevier, vol. 33(3), pages 464-472.
    3. Javier De la Hoz-M & Mª José Fernández-Gómez & Susana Mendes, 2021. "LDAShiny: An R Package for Exploratory Review of Scientific Literature Based on a Bayesian Probabilistic Model and Machine Learning Tools," Mathematics, MDPI, vol. 9(14), pages 1-21, July.
    4. Warwick McKibbin & Roshen Fernando, 2021. "The Global Macroeconomic Impacts of COVID-19: Seven Scenarios," Asian Economic Papers, MIT Press, vol. 20(2), pages 1-30, Summer.
    5. Carl Eckart & Gale Young, 1936. "The approximation of one matrix by another of lower rank," Psychometrika, Springer;The Psychometric Society, vol. 1(3), pages 211-218, September.
    6. Scott Deerwester & Susan T. Dumais & George W. Furnas & Thomas K. Landauer & Richard Harshman, 1990. "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 41(6), pages 391-407, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andreas Falke & Harald Hruschka, 2022. "Analyzing browsing across websites by machine learning methods," Journal of Business Economics, Springer, vol. 92(5), pages 829-852, July.
    2. Paramveer S. Dhillon & Sinan Aral, 2021. "Modeling Dynamic User Interests: A Neural Matrix Factorization Approach," Marketing Science, INFORMS, vol. 40(6), pages 1059-1080, November.
    3. Miguel Acosta, 2015. "FOMC Responses to Calls for Transparency," Finance and Economics Discussion Series 2015-60, Board of Governors of the Federal Reserve System (U.S.).
    4. Arno de Caigny & Kristof Coussement & Koen W. de Bock & Stefan Lessmann, 2019. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," Post-Print hal-02275958, HAL.
    5. Tom Magerman & Bart Looy & Xiaoyan Song, 2010. "Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(2), pages 289-306, February.
    6. Sewell, Daniel K., 2018. "Visualizing data through curvilinear representations of matrices," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 255-270.
    7. A. G. Aganbegyan & A. N. Klepach & B. N. Porfiryev & M. N. Uzyakov & A. A. Shirov, 2020. "Post-Pandemic Recovery: The Russian Economy and the Transition to Sustainable Social and Economic Development," Studies on Russian Economic Development, Springer, vol. 31(6), pages 599-605, November.
    8. Piotr Sorokowski & Agata Groyecka & Marta Kowal & Agnieszka Sorokowska & Michał Białek & Izabela Lebuda & Małgorzata Dobrowolska & Przemysław Zdybek & Maciej Karwowski, 2020. "Can Information about Pandemics Increase Negative Attitudes toward Foreign Groups? A Case of COVID-19 Outbreak," Sustainability, MDPI, vol. 12(12), pages 1-10, June.
    9. Rubén Muñoz Pavón & Antonio A. Arcos Alvarez & Marcos G. Alberti, 2020. "Possibilities of BIM-FM for the Management of COVID in Public Buildings," Sustainability, MDPI, vol. 12(23), pages 1-21, November.
    10. Akhtaruzzaman, Md & Boubaker, Sabri & Sensoy, Ahmet, 2021. "Financial contagion during COVID–19 crisis," Finance Research Letters, Elsevier, vol. 38(C).
    11. Curci, Ylenia & Mongeau Ospina, Christian A., 2016. "Investigating biofuels through network analysis," Energy Policy, Elsevier, vol. 97(C), pages 60-72.
    12. Claudia Salceanu & Mariana Floricica Calin, 2022. "The Pandemic Context and Quality of Life for Youth in Constanta County," Technium Social Sciences Journal, Technium Science, vol. 27(1), pages 687-696, January.
    13. Chao Wei & Senlin Luo & Xincheng Ma & Hao Ren & Ji Zhang & Limin Pan, 2016. "Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation," PLOS ONE, Public Library of Science, vol. 11(1), pages 1-20, January.
    14. Yanguas Parra, Paola & Hauenstein, Christian & Oei, Pao-Yu, 2021. "The death valley of coal – Modelling COVID-19 recovery scenarios for steam coal markets," Applied Energy, Elsevier, vol. 288(C).
    15. Di Bartolomeo, Giovanni & D'Imperio, Paolo & Felici, Francesco, 2022. "The fiscal response to the Italian COVID-19 crisis: A counterfactual analysis," Journal of Macroeconomics, Elsevier, vol. 73(C).
    16. Brum, Matias & De Rosa, Mauricio, 2021. "Too little but not too late: nowcasting poverty and cash transfers’ incidence during COVID-19’s crisis," World Development, Elsevier, vol. 140(C).
    17. Jushan Bai & Serena Ng, 2020. "Simpler Proofs for Approximate Factor Models of Large Dimensions," Papers 2008.00254, arXiv.org.
    18. Adele Ravagnani & Fabrizio Lillo & Paola Deriu & Piero Mazzarisi & Francesca Medda & Antonio Russo, 2024. "Dimensionality reduction techniques to support insider trading detection," Papers 2403.00707, arXiv.org, revised May 2024.
    19. Alfredo García-Hiernaux & José Casals & Miguel Jerez, 2012. "Estimating the system order by subspace methods," Computational Statistics, Springer, vol. 27(3), pages 411-425, September.
    20. Oleksiuk Adam & Pleśniak Agnieszka, 2022. "Environment Characteristics and Internationalization of SMEs: Insights from a Polish and Finnish Sample," Journal of Management and Business Administration. Central Europe, Sciendo, vol. 30(3), pages 175-194, September.
