
Does big data serve policy? Not without context. An experiment with in silico social science

Author

Listed:
  • Chris Graziul

    (University of Chicago)

  • Alexander Belikov

    (University of Chicago)

  • Ishanu Chattopadyay

    (University of Chicago)

  • Ziwen Chen

    (University of Chicago)

  • Hongbo Fang

    (Carnegie Mellon University)

  • Anuraag Girdhar

    (University of Chicago)

  • Xiaoshuang Jia

    (Sun Yat-sen University)

  • P. M. Krafft

    (University of Oxford)

  • Max Kleiman-Weiner

    (MIT; Harvard University)

  • Candice Lewis

    (University of Chicago)

  • Chen Liang

    (University of Chicago)

  • John Muchovej

    (MIT; Harvard University)

  • Alejandro Vientós

    (MIT; Rutgers University)

  • Meg Young

    (Cornell University)

  • James Evans

    (University of Chicago; Santa Fe Institute)

Abstract

The DARPA Ground Truth project sought to evaluate social science by constructing four varied simulated social worlds with hidden causality and unleashing teams of scientists to collect data, discover their causal structure, predict their future, and prescribe policies to create desired outcomes. This large-scale, long-term experiment in in silico social science, in which the ground truth of the simulated worlds was known, but not by us, reveals the limits of contemporary quantitative social science methodology. First, problem solving without a shared ontology, in which many world characteristics remain existentially uncertain, poses strong limits to quantitative analysis even when scientists share a common task, and suggests how those limits could become insurmountable without such a task. Second, data labels biased the associations our analysts made and the assumptions they employed, often away from the simulated causal processes those labels signified, suggesting limits on the degree to which analytic concepts developed in one domain may port to others. Third, the current standard for computational social science publication is a demonstration of novel causes, but this limits the relevance of models for solving problems and proposing policies, which benefit from the simpler and less surprising answers associated with the most important causes, or with the combination of all causes. Fourth, most singular quantitative methods applied on their own did not help to solve most analytical challenges, and we explored a range of established and emerging methods, including probabilistic programming, deep neural networks, systems of predictive probabilistic finite state machines, and more, to achieve plausible solutions. However, despite these limitations common to the current practice of computational social science, we find on the positive side that even imperfect knowledge can be sufficient to identify robust predictions if a more pluralistic approach is applied.
Applying competing approaches by distinct subteams, including at one point the vast TopCoder.com global community of problem solvers, enabled discovery of many aspects of the relevant structure underlying the worlds that singular methods could not uncover. Together, these lessons suggest how different a policy-oriented computational social science would be from the computational social science we have inherited. Computational social science that serves policy would need to endure more failure, sustain more diversity, maintain more uncertainty, and allow for more complexity than current institutions support.

Suggested Citation

  • Chris Graziul & Alexander Belikov & Ishanu Chattopadyay & Ziwen Chen & Hongbo Fang & Anuraag Girdhar & Xiaoshuang Jia & P. M. Krafft & Max Kleiman-Weiner & Candice Lewis & Chen Liang & John Muchovej et al., 2023. "Does big data serve policy? Not without context. An experiment with in silico social science," Computational and Mathematical Organization Theory, Springer, vol. 29(1), pages 188-219, March.
  • Handle: RePEc:spr:comaot:v:29:y:2023:i:1:d:10.1007_s10588-022-09362-3
    DOI: 10.1007/s10588-022-09362-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10588-022-09362-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



