
We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments

Authors

  • Brodeur, Abel
  • Cook, Nikolai
  • Heyes, Anthony

Abstract

Amazon Mechanical Turk is a widely used tool in business and economics research, but how trustworthy are the results of well-published studies that use it? Analyzing the universe of hypotheses tested on the platform and published in leading journals between 2010 and 2020, we find evidence of widespread p-hacking, publication bias, and over-reliance on results from plausibly under-powered studies. Even setting aside questions arising from the characteristics and behaviors of study recruits, the conduct of the research community itself substantially erodes the credibility of these studies' conclusions. The extent of the problems varies across the business, economics, management, and marketing research fields (with marketing especially afflicted). The problems are not getting better over time and are much more prevalent than in a comparison set of non-online experiments. We explore correlates of increased credibility.
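For readers unfamiliar with the method, the detection strategy in this literature examines the distribution of reported test statistics for excess mass just above conventional significance thresholds. Below is a minimal sketch of one such diagnostic, a caliper test around z = 1.96; the z-statistics and window width are illustrative assumptions, not data from the paper.

    # Caliper test for bunching of z-statistics just above the 5%
    # threshold (z = 1.96). In a narrow window around the threshold,
    # roughly half of the tests should fall on either side; systematic
    # excess mass above it is a signature of p-hacking or publication
    # bias. The z-statistics below are made-up illustrative values,
    # not the 22,989 tests analyzed in the paper.
    from scipy.stats import binomtest

    z_stats = [0.45, 1.88, 1.97, 2.01, 2.03, 1.99, 2.10, 1.50, 2.61, 1.98]
    threshold, width = 1.96, 0.20  # caliper window: [1.76, 2.16]

    in_window = [z for z in z_stats if abs(z - threshold) <= width]
    above = sum(z > threshold for z in in_window)

    # Two-sided binomial test of the share above the threshold against
    # the 50% expected under the no-manipulation null.
    result = binomtest(above, n=len(in_window), p=0.5)
    print(f"{above}/{len(in_window)} in-window tests exceed z = {threshold}; "
          f"p-value = {result.pvalue:.3f}")

The flat 50/50 null is only an approximation, since the underlying density of test statistics is smooth but not flat; analyses of this kind instead benchmark against local polynomial density discontinuity tests such as Cattaneo, Jansson and Ma (2020), item 8 in the reference list below.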

Suggested Citation

  • Brodeur, Abel & Cook, Nikolai & Heyes, Anthony, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments," MetaArXiv a9vhr_v1, Center for Open Science.
  • Handle: RePEc:osf:metaar:a9vhr_v1
    DOI: 10.31219/osf.io/a9vhr_v1

    Download full text from publisher

    File URL: https://osf.io/download/62f54cce0beb5f0542b0ae1e/
    Download Restriction: no


    References listed on IDEAS

    1. John Horton & David Rand & Richard Zeckhauser, 2011. "The online laboratory: conducting experiments in a real labor market," Experimental Economics, Springer;Economic Science Association, vol. 14(3), pages 399-425, September.
    2. Camerer, Colin F & Hogarth, Robin M, 1999. "The Effects of Financial Incentives in Experiments: A Review and Capital-Labor-Production Framework," Journal of Risk and Uncertainty, Springer, vol. 19(1-3), pages 7-42, December.
    3. Francesco Guala & Luigi Mittone, 2005. "Experiments in economics: External validity and the robustness of phenomena," Journal of Economic Methodology, Taylor & Francis Journals, vol. 12(4), pages 495-515.
4. Andreas Ortmann & Le Zhang, 2013. "Exploring the Meaning of Significance in Experimental Economics," Discussion Papers 2013-32, School of Economics, The University of New South Wales.
    5. Steven D. Levitt & John A. List, 2007. "Viewpoint: On the generalizability of lab behaviour to the field," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 40(2), pages 347-370, May.
    6. Tomáš Havránek, 2015. "Measuring Intertemporal Substitution: The Importance Of Method Choices And Selective Reporting," Journal of the European Economic Association, European Economic Association, vol. 13(6), pages 1180-1204, December.
    7. Abel Brodeur & Mathias Lé & Marc Sangnier & Yanos Zylberberg, 2016. "Star Wars: The Empirics Strike Back," American Economic Journal: Applied Economics, American Economic Association, vol. 8(1), pages 1-32, January.
    8. Matias D. Cattaneo & Michael Jansson & Xinwei Ma, 2020. "Simple Local Polynomial Density Estimators," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(531), pages 1449-1455, July.
    9. Harrison, Glenn W. & Lau, Morten I. & Elisabet Rutström, E., 2009. "Risk attitudes, randomization to treatment, and self-selection into experiments," Journal of Economic Behavior & Organization, Elsevier, vol. 70(3), pages 498-507, June.
    10. Erik Snowberg & Leeat Yariv, 2021. "Testing the Waters: Behavior across Participant Pools," American Economic Review, American Economic Association, vol. 111(2), pages 687-719, February.
    11. Stefano DellaVigna & Elizabeth Linos, 2022. "RCTs to Scale: Comprehensive Evidence From Two Nudge Units," Econometrica, Econometric Society, vol. 90(1), pages 81-116, January.
    12. David Johnson & John Barry Ryan, 2020. "Amazon Mechanical Turk workers can provide consistent and economically meaningful data," Southern Economic Journal, John Wiley & Sons, vol. 87(1), pages 369-385, July.
    13. Isaiah Andrews & Maximilian Kasy, 2019. "Identification of and Correction for Publication Bias," American Economic Review, American Economic Association, vol. 109(8), pages 2766-2794, August.
    14. Eva Vivalt, 2019. "Specification Searching and Significance Inflation Across Time, Methods and Disciplines," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 81(4), pages 797-816, August.
    15. Antonio A. Arechar & Gordon T. Kraft-Todd & David G. Rand, 2017. "Turking overtime: how participant characteristics and behavior vary over time and day on Amazon Mechanical Turk," Journal of the Economic Science Association, Springer;Economic Science Association, vol. 3(1), pages 1-11, July.
    16. Nicholas Swanson & Garret Christensen & Rebecca Littman & David Birke & Edward Miguel & Elizabeth Levy Paluck & Zenan Wang, 2020. "Research Transparency Is on the Rise in Economics," AEA Papers and Proceedings, American Economic Association, vol. 110, pages 61-65, May.
    17. Armin Falk & Stephan Meier & Christian Zehnder, 2013. "Do Lab Experiments Misrepresent Social Preferences? The Case Of Self-Selected Student Samples," Journal of the European Economic Association, European Economic Association, vol. 11(4), pages 839-852, August.
    18. Ben Gillen & Erik Snowberg & Leeat Yariv, 2019. "Experimenting with Measurement Error: Techniques with Applications to the Caltech Cohort Study," Journal of Political Economy, University of Chicago Press, vol. 127(4), pages 1826-1863.
    19. Coppock, Alexander, 2019. "Generalizing from Survey Experiments Conducted on Mechanical Turk: A Replication Approach," Political Science Research and Methods, Cambridge University Press, vol. 7(3), pages 613-628, July.
    20. Yun Shin Lee & Yong Won Seo & Enno Siemsen, 2018. "Running Behavioral Operations Experiments Using Amazon's Mechanical Turk," Production and Operations Management, Production and Operations Management Society, vol. 27(5), pages 973-989, May.
    21. Berinsky, Adam J. & Huber, Gregory A. & Lenz, Gabriel S., 2012. "Evaluating Online Labor Markets for Experimental Research: Amazon.com's Mechanical Turk," Political Analysis, Cambridge University Press, vol. 20(3), pages 351-368, July.
    22. Chris Doucouliagos & T.D. Stanley, 2013. "Are All Economic Facts Greatly Exaggerated? Theory Competition And Selectivity," Journal of Economic Surveys, Wiley Blackwell, vol. 27(2), pages 316-339, April.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Brodeur, Abel & Cook, Nikolai & Heyes, Anthony, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell us about p-Hacking and Publication Bias in Online Experiments," GLO Discussion Paper Series 1157, Global Labor Organization (GLO).
2. Abel Brodeur & Nikolai M. Cook & Anthony Heyes, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments," LCERPA Working Papers am0133, Laurier Centre for Economic Research and Policy Analysis.
    3. Abel Brodeur & Nikolai Cook & Carina Neisser, 2024. "p-Hacking, Data type and Data-Sharing Policy," The Economic Journal, Royal Economic Society, vol. 134(659), pages 985-1018.
    4. Abel Brodeur & Scott Carrell & David Figlio & Lester Lusher, 2023. "Unpacking P-hacking and Publication Bias," American Economic Review, American Economic Association, vol. 113(11), pages 2974-3002, November.
    5. Johannes G. Jaspersen & Marc A. Ragin & Justin R. Sydnor, 2022. "Insurance demand experiments: Comparing crowdworking to the lab," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 89(4), pages 1077-1107, December.
    6. Brodeur, Abel & Cook, Nikolai & Hartley, Jonathan & Heyes, Anthony, 2022. "Do Pre-Registration and Pre-analysis Plans Reduce p-Hacking and Publication Bias?," MetaArXiv uxf39, Center for Open Science.
    7. Abel Brodeur & Nikolai M. Cook & Jonathan S. Hartley & Anthony Heyes, 2024. "Do Preregistration and Preanalysis Plans Reduce p-Hacking and Publication Bias? Evidence from 15,992 Test Statistics and Suggestions for Improvement," Journal of Political Economy Microeconomics, University of Chicago Press, vol. 2(3), pages 527-561.
    8. Eszter Czibor & David Jimenez‐Gomez & John A. List, 2019. "The Dozen Things Experimental Economists Should Do (More of)," Southern Economic Journal, John Wiley & Sons, vol. 86(2), pages 371-432, October.
    9. Cristina Blanco-Perez & Abel Brodeur, 2020. "Publication Bias and Editorial Statement on Negative Findings," The Economic Journal, Royal Economic Society, vol. 130(629), pages 1226-1247.
    10. Abel Brodeur & Nikolai Cook & Anthony Heyes, 2020. "Methods Matter: p-Hacking and Publication Bias in Causal Analysis in Economics," American Economic Review, American Economic Association, vol. 110(11), pages 3634-3660, November.
    11. Doucouliagos, Hristos & Hinz, Thomas & Zigova, Katarina, 2022. "Bias and careers: Evidence from the aid effectiveness literature," European Journal of Political Economy, Elsevier, vol. 71(C).
    12. Graham Elliott & Nikolay Kudrin & Kaspar Wüthrich, 2022. "Detecting p‐Hacking," Econometrica, Econometric Society, vol. 90(2), pages 887-906, March.
    13. Tomas Havranek & Zuzana Irsova & Lubica Laslopova & Olesia Zeynalova, 2020. "Skilled and Unskilled Labor Are Less Substitutable than Commonly Thought," Working Papers IES 2020/29, Charles University Prague, Faculty of Social Sciences, Institute of Economic Studies, revised Sep 2020.
    14. Brodeur, Abel & Cook, Nikolai & Heyes, Anthony, 2018. "Methods Matter: P-Hacking and Causal Inference in Economics," IZA Discussion Papers 11796, Institute of Labor Economics (IZA).
15. Graham Elliott & Nikolay Kudrin & Kaspar Wüthrich, 2022. "The Power of Tests for Detecting p-Hacking," Papers 2205.07950, arXiv.org, revised Apr 2024.
    16. Cazachevici, Alina & Havranek, Tomas & Horvath, Roman, 2020. "Remittances and economic growth: A meta-analysis," World Development, Elsevier, vol. 134(C).
    17. Dominika Ehrenbergerova & Josef Bajzik & Tomas Havranek, 2023. "When Does Monetary Policy Sway House Prices? A Meta-Analysis," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(2), pages 538-573, June.
    18. Jindrich Matousek & Tomas Havranek & Zuzana Irsova, 2022. "Individual discount rates: a meta-analysis of experimental evidence," Experimental Economics, Springer;Economic Science Association, vol. 25(1), pages 318-358, February.
    19. Tomas Havranek & Anna Sokolova, 2020. "Do Consumers Really Follow a Rule of Thumb? Three Thousand Estimates from 144 Studies Say 'Probably Not'," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 35, pages 97-122, January.
    20. Yariv, Leeat & Snowberg, Erik, 2018. "Testing the Waters: Behavior across Participant Pools," CEPR Discussion Papers 13015, C.E.P.R. Discussion Papers.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:osf:metaar:a9vhr_v1. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: OSF (email available below). General contact details of provider: https://osf.io/preprints/metaarxiv.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.