IDEAS home Printed from https://ideas.repec.org/p/ifs/cemmap/61-17.html

Generic machine learning inference on heterogenous treatment effects in randomized experiments

Author

Listed:
  • Victor Chernozhukov

    (Institute for Fiscal Studies and MIT)

  • Mert Demirer

    (Institute for Fiscal Studies)

  • Esther Duflo

    (Institute for Fiscal Studies)

  • Ivan Fernandez-Val

    (Institute for Fiscal Studies and Boston University)

Abstract

We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of the most and least impacted units. The approach is valid in high-dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, deep and shallow neural networks, canonical and new random forests, boosted trees, and ensemble methods. Our approach is agnostic and does not make unrealistic or hard-to-check assumptions; in particular, we do not require conditions for consistency of the ML methods. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. For inference, we take medians of p-values and medians of confidence intervals resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. This variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. The inference method could be of substantial independent interest in many machine learning applications. An empirical application to the impact of micro-credit on economic development illustrates the use of the approach in randomized experiments. An additional application to the impact of gender discrimination on wages illustrates the potential use of the approach in observational studies, where machine learning methods can be used to condition flexibly on very high-dimensional controls.
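The split-and-aggregate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simulated data, a known treatment probability of 0.5, a simple per-arm least-squares fit standing in for the machine learning proxy, and two impact groups (above and below the median proxy). It also assumes that the "adjustment of the nominal level" amounts to doubling the median p-value and building each per-split interval at level 1 - alpha/2; the abstract does not spell this out. All function and variable names are hypothetical.

```python
import random
from statistics import NormalDist, mean, median, variance

def fit_line(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def diff_in_means(rows):
    """Treated-minus-control mean outcome and its squared standard error."""
    y1 = [y for y, d, _ in rows if d == 1]
    y0 = [y for y, d, _ in rows if d == 0]
    return mean(y1) - mean(y0), variance(y1) / len(y1) + variance(y0) / len(y0)

def one_split(data, rng, level):
    """One random half split: proxy on the auxiliary half, group contrast on the main half."""
    rows = data[:]
    rng.shuffle(rows)
    aux, main = rows[: len(rows) // 2], rows[len(rows) // 2:]
    # Proxy for the conditional effect: difference of per-arm linear fits
    # (a stand-in for any ML method; an assumption of this sketch).
    a1, b1 = fit_line([x for y, d, x in aux if d == 1], [y for y, d, x in aux if d == 1])
    a0, b0 = fit_line([x for y, d, x in aux if d == 0], [y for y, d, x in aux if d == 0])

    def proxy(x):
        return (a1 - a0) + (b1 - b0) * x

    # Sort main-half units into most and least impacted groups by the proxy.
    cut = median(proxy(x) for _, _, x in main)
    top = [r for r in main if proxy(r[2]) > cut]
    bottom = [r for r in main if proxy(r[2]) <= cut]
    e_top, v_top = diff_in_means(top)
    e_bot, v_bot = diff_in_means(bottom)
    est, se = e_top - e_bot, (v_top + v_bot) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    pval = 2 * (1 - NormalDist().cdf(abs(est / se)))
    return pval, est - z * se, est + z * se

# Simulated randomized experiment with effect 1 + x (purely illustrative).
rng = random.Random(0)
data = []
for _ in range(2000):
    x, d = rng.gauss(0, 1), rng.randint(0, 1)
    data.append((x + d * (1 + x) + rng.gauss(0, 1), d, x))

alpha, n_splits = 0.05, 50
results = [one_split(data, rng, level=1 - alpha / 2) for _ in range(n_splits)]

# Aggregate over splits: medians, with the assumed level adjustment.
p_adj = min(1.0, 2 * median(p for p, _, _ in results))
ci_lo = median(lo for _, lo, _ in results)
ci_hi = median(hi for _, _, hi in results)
print(f"adjusted p-value: {p_adj:.4f}, interval: [{ci_lo:.2f}, {ci_hi:.2f}]")
```

Taking medians over many splits makes the reported p-value and interval less sensitive to any single random partition, and the level adjustment is the price paid for that aggregation.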

Suggested Citation

  • Victor Chernozhukov & Mert Demirer & Esther Duflo & Ivan Fernandez-Val, 2017. "Generic machine learning inference on heterogenous treatment effects in randomized experiments," CeMMAP working papers CWP61/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
  • Handle: RePEc:ifs:cemmap:61/17

    Download full text from publisher

    File URL: https://www.ifs.org.uk/uploads/CWP611717.pdf
    Download Restriction: no


    Citations


    Cited by:

    1. Daniel J. Lewis & Davide Melcangi & Laura Pilossoph, 2019. "Latent Heterogeneity in the Marginal Propensity to Consume," Staff Reports 902, Federal Reserve Bank of New York.
    2. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    3. Alex Armand & Britta Augsburg & Antonella Bancalari, 2021. "Coordination and the poor maintenance trap: an experiment on public infrastructure in India," NOVAFRICA Working Paper Series wp2110, Universidade Nova de Lisboa, Nova School of Business and Economics, NOVAFRICA.
    4. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised Feb 2023.
    5. Bluwstein, Kristina & Buckmann, Marcus & Joseph, Andreas & Kapadia, Sujit & Şimşek, Özgür, 2023. "Credit growth, the yield curve and financial crisis prediction: Evidence from a machine learning approach," Journal of International Economics, Elsevier, vol. 145(C).
    6. Paul B. Ellickson & Wreetabrata Kar & James C. Reeder, 2023. "Estimating Marketing Component Effects: Double Machine Learning from Targeted Digital Promotions," Marketing Science, INFORMS, vol. 42(4), pages 704-728, July.
    7. Riccardo Di Francesco, 2024. "Aggregation Trees," Papers 2410.11408, arXiv.org.
    8. Stephen Coussens & Jann Spiess, 2021. "Improving Inference from Simple Instruments through Compliance Estimation," Papers 2108.03726, arXiv.org.
    9. O'Neill, E. & Weeks, M., 2018. "Causal Tree Estimation of Heterogeneous Household Response to Time-Of-Use Electricity Pricing Schemes," Cambridge Working Papers in Economics 1865, Faculty of Economics, University of Cambridge.
    10. Matias D. Cattaneo & Max H. Farrell & Yingjie Feng, 2018. "Large Sample Properties of Partitioning-Based Series Estimators," Papers 1804.04916, arXiv.org, revised Jun 2019.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Mert Demirer & Esther Duflo & Iván Fernández-Val, 2017. "Fisher-Schultz Lecture: Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments, with an Application to Immunization in India," Papers 1712.04802, arXiv.org, revised Oct 2023.
    2. Pedro Carneiro & Sokbae Lee & Daniel Wilhelm, 2020. "Optimal data collection for randomized control trials," The Econometrics Journal, Royal Economic Society, vol. 23(1), pages 1-31.
    3. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Jonathan Fu & Annette Krauss, 2024. "Preparing fertile ground: how does the quality of business environments affect MSE growth?," Small Business Economics, Springer, vol. 63(1), pages 51-103, June.
    5. Fumagalli, Laura & Martin, Thomas, 2023. "Child labor among farm households in Mozambique and the role of reciprocal adult labor," World Development, Elsevier, vol. 161(C).
    6. Nusrat Abedin Jimi & Plamen V. Nikolov & Mohammad Abdul Malek & Subal Kumbhakar, 2019. "The effects of access to credit on productivity: separating technological changes from changes in technical efficiency," Journal of Productivity Analysis, Springer, vol. 52(1), pages 37-55, December.
    7. Lota Tamini & Ibrahima Bocoum & Ghislain Auger & Kotchikpa Gabriel Lawin & Arahama Traoré, 2019. "Enhanced Microfinance Services and Agricultural Best Management Practices: What Benefits for Smallholders Farmers? An Evidence from Burkina Faso," CIRANO Working Papers 2019s-11, CIRANO.
    8. Nusrat Abedin Jimi & Plamen Nikolov & Mohammad Abdul Malek & Subal Kumbhakar, 2020. "The Effects of Access to Credit on Productivity Among Microenterprises: Separating Technological Changes from Changes in Technical Efficiency," Papers 2006.03650, arXiv.org.
    9. Abhijit Banerjee & Emily Breza & Esther Duflo & Cynthia Kinnan, 2019. "Can Microfinance Unlock a Poverty Trap for Some Entrepreneurs?," NBER Working Papers 26346, National Bureau of Economic Research, Inc.
    10. Ahlin, Christian & Gulesci, Selim & Madestam, Andreas & Stryjan, Miri, 2020. "Loan contract structure and adverse selection: Survey evidence from Uganda," Journal of Economic Behavior & Organization, Elsevier, vol. 172(C), pages 180-195.
    11. Nakano, Yuko & Magezi, Eustadius F., 2020. "The impact of microcredit on agricultural technology adoption and productivity: Evidence from randomized control trial in Tanzania," World Development, Elsevier, vol. 133(C).
    12. Oriana Bandiera & Robin Burgess & Erika Deserranno & Ricardo Morel & Imran Rasul & Munshi Sulaiman & Jack Thiemel, 2022. "Microfinance and Diversification," Economica, London School of Economics and Political Science, vol. 89(S1), pages 239-275, June.
    13. Daniel Kandie & Khan Jahirul Islam, 2022. "A new era of microfinance: The digital microcredit and its impact on poverty," Journal of International Development, John Wiley & Sons, Ltd., vol. 34(3), pages 469-492, April.
    14. Dahal, Mahesh & Fiala, Nathan, 2020. "What do we know about the impact of microfinance? The problems of statistical power and precision," World Development, Elsevier, vol. 128(C).
    15. Dammert, Ana C. & de Hoop, Jacobus & Mvukiyehe, Eric & Rosati, Furio C., 2018. "Effects of public policy on child labor: Current knowledge, gaps, and implications for program design," World Development, Elsevier, vol. 110(C), pages 104-123.
    16. Kersten, Renate & Harms, Job & Liket, Kellie & Maas, Karen, 2017. "Small Firms, large Impact? A systematic review of the SME Finance Literature," World Development, Elsevier, vol. 97(C), pages 330-348.
    17. Meager, Rachael & Sturdy, Jennifer, 2017. "Aggregating Distributional Treatment Effects: A Bayesian Hierarchical Analysis of the Microcredit Literature," MetaArXiv 7tkvm, Center for Open Science.
    18. Rachael Meager, 2015. "Understanding the Impact of Microcredit Expansions: A Bayesian Hierarchical Analysis of 7 Randomised Experiments," Papers 1506.06669, arXiv.org, revised Jul 2016.
    19. Michael Zimmert & Michael Lechner, 2019. "Nonparametric estimation of causal heterogeneity under high-dimensional confounding," Papers 1908.08779, arXiv.org.
    20. Athey, Susan & Imbens, Guido W. & Metzger, Jonas & Munro, Evan, 2024. "Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations," Journal of Econometrics, Elsevier, vol. 240(2).

    More about this item

    Keywords

    Agnostic Inference; Machine Learning; Confidence Intervals; Causal Effects; Variational P-values and Confidence Intervals; Uniformly Valid Inference; Quantification of Uncertainty; Sample Splitting; Multiple Splitting; Assumption-Freeness;

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • D14 - Microeconomics - - Household Behavior - - - Household Saving; Personal Finance
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • O16 - Economic Development, Innovation, Technological Change, and Growth - - Economic Development - - - Financial Markets; Saving and Capital Investment; Corporate Finance and Governance
