A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis
DOI: 10.1007/s42001-022-00191-7
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Dehler-Holland, Joris & Schumacher, Kira & Fichtner, Wolf, 2021. "Topic Modeling Uncovers Shifts in Media Framing of the German Renewable Energy Act," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 2(1).
- Zhang, Han, 2021. "How Using Machine Learning Classification as a Variable in Regression Leads to Attenuation Bias and What to Do About It," SocArXiv 453jk, Center for Open Science.
- Dehler-Holland, Joris & Okoh, Marvin & Keles, Dogan, 2022. "Assessing technology legitimacy with topic models and sentiment analysis – The case of wind power in Germany," Technological Forecasting and Social Change, Elsevier, vol. 175(C).
- Dehler-Holland, Joris & Okoh, Marvin & Keles, Dogan, 2021. "The legitimacy of wind power in Germany," Working Paper Series in Production and Energy 54, Karlsruhe Institute of Technology (KIT), Institute for Industrial Production (IIP).
- Mourtgos, Scott M. & Adams, Ian T., 2019. "The rhetoric of de-policing: Evaluating open-ended survey responses from police officers with machine learning-based structural topic modeling," Journal of Criminal Justice, Elsevier, vol. 64(C), pages 1-1.
- Sumeet Sahay & Hemant Kumar Kaushik & Shikha Singh, 2023. "Discovering themes and trends in electricity supply chain area research," OPSEARCH, Springer;Operational Research Society of India, vol. 60(3), pages 1525-1560, September.
- Sanders, James & Lisi, Giulio & Schonhardt-Bailey, Cheryl, 2018. "Themes and topics in parliamentary oversight hearings: a new direction in textual data analysis," LSE Research Online Documents on Economics 87624, London School of Economics and Political Science, LSE Library.
- McCannon, Bryan & Zhou, Yang & Hall, Joshua, 2021. "Measuring a Contract’s Breadth: A Text Analysis," Working Papers 11013, George Mason University, Mercatus Center.
- Marcel Fratzscher & Tobias Heidland & Lukas Menkhoff & Lucio Sarno & Maik Schmeling, 2023. "Foreign Exchange Intervention: A New Database," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(4), pages 852-884, December.
- Fratzscher, Marcel & Heidland, Tobias & Menkhoff, Lukas & Sarno, Lucio & Schmeling, Maik, 2020. "Foreign exchange intervention: A new database," Kiel Working Papers 2171, Kiel Institute for the World Economy (IfW Kiel).
- Fratzscher, Marcel & Heidland, Tobias & Menkhoff, Lukas & Sarno, Lucio & Schmeling, Maik, 2022. "Foreign exchange intervention: A new database," CEPR Discussion Papers 17558, C.E.P.R. Discussion Papers.
- Marcel Fratzscher & Tobias Heidland & Lukas Menkhoff & Lucio Sarno & Maik Schmeling, 2020. "Foreign Exchange Intervention: A New Database," Discussion Papers of DIW Berlin 1915, DIW Berlin, German Institute for Economic Research.
- Li Tang & Jennifer Kuzma & Xi Zhang & Xinyu Song & Yin Li & Hongxu Liu & Guangyuan Hu, 2023. "Synthetic biology and governance research in China: a 40-year evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5293-5310, September.
- Han, Chunjia & Yang, Mu & Piterou, Athena, 2021. "Do news media and citizens have the same agenda on COVID-19? an empirical comparison of twitter posts," Technological Forecasting and Social Change, Elsevier, vol. 169(C).
- Mohamed M. Mostafa, 2023. "A one-hundred-year structural topic modeling analysis of the knowledge structure of international management research," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3905-3935, August.
- Ferrara, Federico M. & Masciandaro, Donato & Moschella, Manuela & Romelli, Davide, 2022. "Political voice on monetary policy: Evidence from the parliamentary hearings of the European Central Bank," European Journal of Political Economy, Elsevier, vol. 74(C).
- Federico M. Ferrara & Donato Masciandaro & Manuela Moschella & Davide Romelli, 2021. "Political Voice on Monetary Policy: Evidence from the Parliamentary Hearings of the European Central Bank," BAFFI CAREFIN Working Papers 21159, BAFFI CAREFIN, Centre for Applied Research on International Markets Banking Finance and Regulation, Universita' Bocconi, Milano, Italy.
- Ferrara, Federico M. & Masciandaro, Donato & Moschella, Manuela & Romelli, Davide, 2022. "Political voice on monetary policy: evidence from the parliamentary hearings of the European Central Bank," LSE Research Online Documents on Economics 114278, London School of Economics and Political Science, LSE Library.
- Camilla Salvatore & Silvia Biffignandi & Annamaria Bianchi, 2022. "Corporate Social Responsibility Activities Through Twitter: From Topic Model Analysis to Indexes Measuring Communication Characteristics," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 164(3), pages 1217-1248, December.
- Lüdering, Jochen & Winker, Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
- Jochen Lüdering & Peter Winker, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," MAGKS Papers on Economics 201607, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
- Grajzl, Peter & Murrell, Peter, 2024. "Caselaw and England's economic performance during the Industrial Revolution: Data and evidence," Journal of Comparative Economics, Elsevier, vol. 52(1), pages 145-165.
- Andreas Rehs, 2020. "A structural topic model approach to scientific reorientation of economics and chemistry after German reunification," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 1229-1251, November.
- Eunhye Park & Junehee Kwon & Bongsug (Kevin) Chae & Sung-Bum Kim, 2021. "What Are the Salient and Memorable Green-Restaurant Attributes? Capturing Customer Perceptions From User-Generated Content," SAGE Open, vol. 11(3), pages 21582440211, July.
- Oliver Wieczorek & Saïd Unger & Jan Riebling & Lukas Erhard & Christian Koß & Raphael Heiberger, 2021. "Mapping the field of psychology: Trends in research topics 1995–2015," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9699-9731, December.
- Ulrich Fritsche & Johannes Puckelwald, 2018. "Deciphering Professional Forecasters’ Stories - Analyzing a Corpus of Textual Predictions for the German Economy," Macroeconomics and Finance Series 201804, University of Hamburg, Department of Socioeconomics.
- Arina Wischnewsky & David‐Jan Jansen & Matthias Neuenkirch, 2021. "Financial stability and the Fed: Evidence from congressional hearings," Economic Inquiry, Western Economic Association International, vol. 59(3), pages 1192-1214, July.
- Arina Wischnewsky & David-Jan Jansen & Matthias Neuenkirch, 2019. "Financial stability and the Fed: evidence from congressional hearings," CESifo Working Paper Series 7657, CESifo.
- Wischnewsky, Arina & Jansen, David-Jan & Neuenkirch, Matthias, 2020. "Financial Stability and the Fed: Evidence from Congressional Hearings," VfS Annual Conference 2020 (Virtual Conference): Gender Economics 224527, Verein für Socialpolitik / German Economic Association.
- Arina Wischnewsky & David-Jan Jansen & Matthias Neuenkirch, 2019. "Financial Stability and the Fed: Evidence from Congressional Hearings," Working Paper Series 2019-05, University of Trier, Research Group Quantitative Finance and Risk Analysis.
- Arina Wischnewsky & David-Jan Jansen & Matthias Neuenkirch, 2019. "Financial Stability and the Fed: Evidence from Congressional Hearings," Research Papers in Economics 2019-08, University of Trier, Department of Economics.
More about this item
Keywords
Imbalanced classification; Boolean query; Keyword lists; Query expansion; Topic models; Active learning
Handle: RePEc:spr:jcsosc:v:6:y:2023:i:1:d:10.1007_s42001-022-00191-7