
Predicting the Brexit Vote by Tracking and Classifying Public Opinion Using Twitter Data

Author

Listed:
  • Amador Diaz Lopez Julio Cesar

    (Imperial College London, London SW7 2AZ, United Kingdom of Great Britain and Northern Ireland)

  • Collignon-Delmar Sofia

    (University College London, London, United Kingdom of Great Britain and Northern Ireland)

  • Benoit Kenneth

    (London School of Economics and Political Science – Methodology, London, United Kingdom of Great Britain and Northern Ireland)

  • Matsuo Akitaka

    (London School of Economics and Political Science – Methodology, London, United Kingdom of Great Britain and Northern Ireland)

Abstract

We use 23M Tweets related to the EU referendum in the UK to predict the Brexit vote. In particular, we use user-generated labels known as hashtags to build training sets related to the Leave and Remain campaigns. Next, we train support vector machines (SVMs) to classify Tweets. Finally, we compare our results to Internet and telephone polls. This approach not only reduces the time needed to hand-code data for a training set, but also achieves a high level of correlation with Internet polls. Our results suggest that Twitter data may be a suitable substitute for Internet polls and a useful complement to telephone polls. We also discuss the reach and limitations of this method.
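
The pipeline described in the abstract (hashtag-based labels, an SVM classifier, comparison with polls) can be illustrated with a minimal sketch. This is not the authors' implementation: the seed hashtags, the example tweets, the bag-of-words features, and the simple subgradient-descent trainer for a linear SVM are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical seed hashtags (assumption: the abstract does not list the ones used).
LEAVE_TAGS = {"#voteleave", "#leaveeu"}
REMAIN_TAGS = {"#strongerin", "#voteremain"}

def hashtag_label(text):
    """Return +1 (Leave), -1 (Remain), or None when hashtags are absent or mixed."""
    tags = set(re.findall(r"#\w+", text.lower()))
    leave, remain = tags & LEAVE_TAGS, tags & REMAIN_TAGS
    if leave and not remain:
        return 1
    if remain and not leave:
        return -1
    return None

def features(text):
    """Bag-of-words counts; hashtags are stripped so the label cannot leak in."""
    cleaned = re.sub(r"#\w+", " ", text.lower())
    return Counter(re.findall(r"[a-z']+", cleaned))

def score(w, b, x):
    """Linear decision value w.x + b for a sparse feature dict x."""
    return sum(w.get(t, 0.0) * v for t, v in x.items()) + b

def train_linear_svm(examples, epochs=50, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularized hinge loss (a simple linear SVM)."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in examples:
            margin = y * score(w, b, x)
            for t in w:                      # shrink weights (regularization term)
                w[t] *= 1 - lr * lam
            if margin < 1:                   # hinge loss is active: move toward y
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b

def predict(w, b, text):
    """Classify a tweet as +1 (Leave) or -1 (Remain)."""
    return 1 if score(w, b, features(text)) >= 0 else -1

def leave_share(texts, w, b):
    """Fraction of tweets classified as Leave, e.g. for one day of data."""
    return sum(predict(w, b, t) == 1 for t in texts) / len(texts)

def pearson(xs, ys):
    """Pearson correlation, e.g. between daily Leave shares and poll figures."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up example tweets (not from the paper's data set).
tweets = [
    "Take back control of our borders #VoteLeave",
    "Time to leave the EU and take control #LeaveEU",
    "We are stronger and safer together in Europe #StrongerIn",
    "Stay in Europe, safer together #VoteRemain",
    "Referendum day #VoteLeave #VoteRemain",   # mixed hashtags: left unlabeled
    "Just had lunch",                          # no hashtags: left unlabeled
]
labeled = [(features(t), hashtag_label(t)) for t in tweets
           if hashtag_label(t) is not None]
w, b = train_linear_svm(labeled)
print(predict(w, b, "take back control of our borders"))
```

In a setting like the one the abstract describes, `leave_share` would be computed per day over the classified tweets, and `pearson` would compare the resulting series with poll results for the same days. Stripping the hashtags before building features is a design choice made here so that the classifier has to learn from the remaining words rather than from the labels themselves.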

Suggested Citation

  • Amador Diaz Lopez Julio Cesar & Collignon-Delmar Sofia & Benoit Kenneth & Matsuo Akitaka, 2017. "Predicting the Brexit Vote by Tracking and Classifying Public Opinion Using Twitter Data," Statistics, Politics and Policy, De Gruyter, vol. 8(1), pages 85-104, October.
  • Handle: RePEc:bpj:statpp:v:8:y:2017:i:1:p:85-104:n:7
    DOI: 10.1515/spp-2017-0006

    Download full text from publisher

    File URL: https://doi.org/10.1515/spp-2017-0006
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Nicholas Beauchamp, 2017. "Predicting and Interpolating State‐Level Polls Using Twitter Textual Data," American Journal of Political Science, John Wiley & Sons, vol. 61(2), pages 490-503, April.
    2. Settle, Jaime E. & Bond, Robert M. & Coviello, Lorenzo & Fariss, Christopher J. & Fowler, James H. & Jones, Jason J., 2016. "From Posting to Voting: The Effects of Political Competition on Online Political Engagement," Political Science Research and Methods, Cambridge University Press, vol. 4(2), pages 361-378, May.
    3. Huberty, Mark, 2015. "Can we vote with our tweet? On the perennial difficulty of election forecasting with social media," International Journal of Forecasting, Elsevier, vol. 31(3), pages 992-1007.

    Citations

    Cited by:

    1. Sequeira, Sandra & Nardotto, Mattia, 2021. "Identity, Media and Consumer Behavior," CEPR Discussion Papers 15765, C.E.P.R. Discussion Papers.
    2. Simon Rudkin & Lucy Barros & Paweł Dłotko & Wanling Qiu, 2024. "An economic topology of the Brexit vote," Regional Studies, Taylor & Francis Journals, vol. 58(3), pages 601-618, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sandra Wankmüller, 2023. "A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis," Journal of Computational Social Science, Springer, vol. 6(1), pages 91-163, April.
    2. Ronald McDonald & Xuxin Mao, 2015. "Forecasting the 2015 General Election with Internet Big Data: An Application of the TRUST Framework," Working Papers 2016_03, Business School - Economics, University of Glasgow.
    3. Anselm Hager & Johannes Hermle & Lukas Hensel & Christopher Roth, 2020. "Does Party Competition Affect Political Activism?," CESifo Working Paper Series 8431, CESifo.
    4. Ali, Maged & Gomes, Lucas Moreira & Azab, Nahed & de Moraes Souza, João Gabriel & Sorour, M. Karim & Kimura, Herbert, 2023. "Panic buying and fake news in urban vs. rural England: A case study of twitter during COVID-19," Technological Forecasting and Social Change, Elsevier, vol. 193(C).
    5. Shelleka Gupta & Vinay Chauhan, 2023. "Understanding the Role of Social Networking Sites in Political Marketing," Jindal Journal of Business Research, vol. 12(1), pages 58-72, June.
    6. Coble, David & Pincheira, Pablo, 2017. "Nowcasting Building Permits with Google Trends," MPRA Paper 76514, University Library of Munich, Germany.
    7. Li, Xixi & Bai, Yun & Kang, Yanfei, 2022. "Exploring the social influence of the Kaggle virtual community on the M5 competition," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1507-1518.
    8. Stefan Stieglitz & Christian Meske & Björn Ross & Milad Mirbabaie, 2020. "Going Back in Time to Predict the Future - The Complex Role of the Data Collection Period in Social Media Analytics," Information Systems Frontiers, Springer, vol. 22(2), pages 395-409, April.
    9. Valerio Astuti & Marta Crispino & Marco Langiulli & Juri Marcucci, 2022. "Textual analysis of a Twitter corpus during the COVID-19 pandemics," Questioni di Economia e Finanza (Occasional Papers) 692, Bank of Italy, Economic Research and International Relations Area.
    10. Keng-Chi Chang & Chun-Fang Chiang & Ming-Jen Lin, 2021. "Using Facebook data to predict the 2016 U.S. presidential election," PLOS ONE, Public Library of Science, vol. 16(12), pages 1-24, December.
    11. Carlos Arcila-Calderón & David Blanco-Herrero & Maximiliano Frías-Vázquez & Francisco Seoane-Pérez, 2021. "Refugees Welcome? Online Hate Speech and Sentiments in Twitter in Spain during the Reception of the Boat Aquarius," Sustainability, MDPI, vol. 13(5), pages 1-16, March.
    12. Gutierrez-Barroso Josue & Báez-García Alberto Javier & Flores-Muñoz Francisco & Ruiz Medina Luis Javier & Trujillo González Juan Vianney & Padrón-Armas Ana Goretty, 2024. "Google Trends of political parties in Europe: a fractal exploration," Central European Journal of Public Policy, Sciendo, vol. 18(1), pages 24-36.
    13. Schaer, Oliver & Kourentzes, Nikolaos & Fildes, Robert, 2019. "Demand forecasting with user-generated online information," International Journal of Forecasting, Elsevier, vol. 35(1), pages 197-212.
    14. Green, Lawrence & Sung, Ming-Chien & Ma, Tiejun & Johnson, Johnnie E. V., 2019. "To what extent can new web-based technology improve forecasts? Assessing the economic value of information derived from Virtual Globes and its rate of diffusion in a financial market," European Journal of Operational Research, Elsevier, vol. 278(1), pages 226-239.
    15. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    16. Saeed-Ul Hassan & Timothy D. Bowman & Mudassir Shabbir & Aqsa Akhtar & Mubashir Imran & Naif Radi Aljohani, 2019. "Influential tweeters in relation to highly cited articles in altmetric big data," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 481-493, April.
    17. Franch, Fabio, 2021. "Political preferences nowcasting with factor analysis and internet data: The 2012 and 2016 US presidential elections," Technological Forecasting and Social Change, Elsevier, vol. 166(C).
    18. Brown, Alasdair & Reade, J. James & Vaughan Williams, Leighton, 2019. "When are prediction market prices most informative?," International Journal of Forecasting, Elsevier, vol. 35(1), pages 420-428.
    19. Fronzetti Colladon, Andrea, 2020. "Forecasting election results by studying brand importance in online news," International Journal of Forecasting, Elsevier, vol. 36(2), pages 414-427.
    20. Sheng, Jie & Amankwah-Amoah, Joseph & Wang, Xiaojun, 2017. "A multidisciplinary perspective of big data in management research," International Journal of Production Economics, Elsevier, vol. 191(C), pages 97-112.
