
Is there a role for statistics in artificial intelligence?

Authors

  • Sarah Friedrich (University Medical Center Göttingen)
  • Gerd Antes (University of Freiburg)
  • Sigrid Behr (Novartis Pharma AG)
  • Harald Binder (University of Freiburg)
  • Werner Brannath (University Bremen)
  • Florian Dumpert (Federal Statistical Office of Germany)
  • Katja Ickstadt (TU Dortmund University)
  • Hans A. Kestler (Ulm University)
  • Johannes Lederer (Ruhr-Universität Bochum)
  • Heinz Leitgöb (Department of Sociology, University of Eichstätt-Ingolstadt)
  • Markus Pauly (TU Dortmund University)
  • Ansgar Steland (RWTH Aachen University)
  • Adalbert Wilhelm (Jacobs University Bremen)
  • Tim Friede (University Medical Center Göttingen)

Abstract

The research on and application of artificial intelligence (AI) has triggered a comprehensive scientific, economic, social and political discussion. Here we argue that statistics, as an interdisciplinary scientific field, plays a substantial role both for the theoretical and practical understanding of AI and for its future development. Statistics might even be considered a core element of AI. With its specialist knowledge of data evaluation, starting with the precise formulation of the research question and passing through a study design stage on to analysis and interpretation of the results, statistics is a natural partner for other disciplines in teaching, research and practice. This paper aims at highlighting the relevance of statistical methodology in the context of AI development. In particular, we discuss contributions of statistics to the field of artificial intelligence concerning methodological development, planning and design of studies, assessment of data quality and data collection, differentiation of causality and associations and assessment of uncertainty in results. Moreover, the paper also discusses the equally necessary and meaningful extensions of curricula in schools and universities to integrate statistical aspects into AI teaching.

Suggested Citation

  • Sarah Friedrich & Gerd Antes & Sigrid Behr & Harald Binder & Werner Brannath & Florian Dumpert & Katja Ickstadt & Hans A. Kestler & Johannes Lederer & Heinz Leitgöb & Markus Pauly & Ansgar Steland & A, 2022. "Is there a role for statistics in artificial intelligence?," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(4), pages 823-846, December.
  • Handle: RePEc:spr:advdac:v:16:y:2022:i:4:d:10.1007_s11634-021-00455-6
    DOI: 10.1007/s11634-021-00455-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-021-00455-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    2. Rina Friedberg & Julie Tibshirani & Susan Athey & Stefan Wager, 2018. "Local Linear Forests," Papers 1807.11408, arXiv.org, revised Sep 2020.
    3. Borup, Daniel & Christensen, Bent Jesper & Mühlbach, Nicolaj Søndergaard & Nielsen, Mikkel Slot, 2023. "Targeting predictors in random forest regression," International Journal of Forecasting, Elsevier, vol. 39(2), pages 841-868.
    4. Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2021. "Economic Predictions With Big Data: The Illusion of Sparsity," Econometrica, Econometric Society, vol. 89(5), pages 2409-2437, September.
    5. Greenstone, Michael & Gayer, Ted, 2009. "Quasi-experimental and experimental approaches to environmental economics," Journal of Environmental Economics and Management, Elsevier, vol. 57(1), pages 21-44, January.
    6. Jörg Peters & Jörg Langbein & Gareth Roberts, 2018. "Generalization in the Tropics – Development Policy, Randomized Controlled Trials, and External Validity," The World Bank Research Observer, World Bank, vol. 33(1), pages 34-64.
    7. Burgess, Simon & Metcalfe, Robert & Sadoff, Sally, 2021. "Understanding the response to financial and non-financial incentives in education: Field experimental evidence using high-stakes assessments," Economics of Education Review, Elsevier, vol. 85(C).
    8. Guido W. Imbens, 2022. "Causality in Econometrics: Choice vs Chance," Econometrica, Econometric Society, vol. 90(6), pages 2541-2566, November.
    9. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    10. Victor Chernozhukov & Mert Demirer & Esther Duflo & Iván Fernández-Val, 2017. "Fisher-Schultz Lecture: Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments, with an Application to Immunization in India," Papers 1712.04802, arXiv.org, revised Oct 2023.
    11. Salomo Hirvonen & Maarit Lassander & Lauri Sääksvuori & Janne Tukiainen, 2023. "Who is mobilized to vote by short text messages? Evidence from a nationwide field experiment with young voters," Discussion Papers 157, Aboa Centre for Economics.
    12. Maria Nareklishvili & Nicholas Polson & Vadim Sokolov, 2022. "Feature Selection for Personalized Policy Analysis," Papers 2301.00251, arXiv.org, revised Jul 2023.
    13. Victor Chernozhukov & Mert Demirer & Esther Duflo & Ivan Fernandez-Val, 2017. "Generic machine learning inference on heterogenous treatment effects in randomized experiments," CeMMAP working papers 61/17, Institute for Fiscal Studies.
    14. Liesbeth Colen & Sergio Gomez y Paloma & Uwe Latacz-Lohmann & Marianne Lefebvre & Raphaële Préget & Sophie Thoyer, 2016. "Economic Experiments as a Tool for Agricultural Policy Evaluation: Insights from the European CAP," Canadian Journal of Agricultural Economics/Revue canadienne d'agroeconomie, Canadian Agricultural Economics Society/Societe canadienne d'agroeconomie, vol. 64(4), pages 667-694, December.
    15. Escobari, Diego & Hoover, Gary A., 2024. "Late-Arriving Votes and Electoral Fraud: A Natural Experiment and Regression Discontinuity Evidence from Bolivia," World Development, Elsevier, vol. 173(C).
    16. Asresu Yitayew & Awudu Abdulai & Yigezu A Yigezu, 2022. "Improved agricultural input delivery systems for enhancing technology adoption: evidence from a field experiment in Ethiopia," European Review of Agricultural Economics, Oxford University Press and the European Agricultural and Applied Economics Publications Foundation, vol. 49(3), pages 527-556.
    17. Tatsushi Oka & Shota Yasui & Yuta Hayakawa & Undral Byambadalai, 2024. "Regression Adjustment for Estimating Distributional Treatment Effects in Randomized Controlled Trials," Papers 2407.14074, arXiv.org.
    18. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    19. Shi, Zhentao & Huang, Jingyi, 2023. "Forward-selected panel data approach for program evaluation," Journal of Econometrics, Elsevier, vol. 234(2), pages 512-535.
    20. Ágoston Reguly, 2021. "Heterogeneous Treatment Effects in Regression Discontinuity Designs," Papers 2106.11640, arXiv.org, revised Oct 2021.
