Printed from https://ideas.repec.org/p/arx/papers/2011.06158.html

Mostly Harmless Machine Learning: Learning Optimal Instruments in Linear IV Models

Authors
  • Jiafeng Chen
  • Daniel L. Chen
  • Greg Lewis

Abstract

We offer straightforward theoretical results that justify incorporating machine learning in the standard linear instrumental variable setting. The key idea is to use machine learning, combined with sample-splitting, to predict the treatment variable from the instrument and any exogenous covariates, and then use this predicted treatment and the covariates as technical instruments to recover the coefficients in the second stage. This allows the researcher to extract non-linear co-variation between the treatment and instrument that may dramatically improve estimation precision and robustness by boosting instrument strength. Importantly, we constrain the machine-learned predictions to be linear in the exogenous covariates, thus avoiding spurious identification arising from non-linear relationships between the treatment and the covariates. We show that this approach delivers consistent and asymptotically normal estimates under weak conditions and that it may be adapted to be semiparametrically efficient (Chamberlain, 1992). Our method preserves standard intuitions and interpretations of linear instrumental variable methods, including under weak identification, and provides a simple, user-friendly upgrade to the applied economics toolbox. We illustrate our method with an example in law and criminal justice, examining the causal effect of appellate court reversals on district court sentencing decisions.
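
The recipe described in the abstract can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation: the data-generating process, the function names, and the use of a ridge-penalized polynomial in the instrument as a stand-in for a machine learner are all assumptions made for illustration. The sketch fits the first stage on one fold, predicts the treatment on the held-out fold, keeps the prediction linear in the covariate, and then uses the predicted treatment and the covariate as instruments.

```python
import numpy as np

def first_stage_design(z, x, degree=4):
    # Nonlinear terms in the instrument z, but only a LINEAR term in the
    # covariate x (the linearity constraint mentioned in the abstract).
    powers = [z ** k for k in range(1, degree + 1)]
    return np.column_stack([np.ones_like(z)] + powers + [x])

def fit_first_stage(z, x, d, ridge=1e-3):
    # Stand-in "machine learner" (an assumption): ridge-penalized polynomial.
    F = first_stage_design(z, x)
    penalty = ridge * np.eye(F.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(F.T @ F + penalty, F.T @ d)

def cross_fit_instrument(z, x, d, n_folds=2, seed=0):
    # Sample splitting: each observation's predicted treatment comes from a
    # model fitted on the other fold(s).
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(z)) % n_folds
    d_hat = np.empty_like(d)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        coef = fit_first_stage(z[train], x[train], d[train])
        d_hat[test] = first_stage_design(z[test], x[test]) @ coef
    return d_hat

def iv_estimate(y, d, x, d_hat):
    # Just-identified IV: regressors (d, x, 1), instruments (d_hat, x, 1).
    regressors = np.column_stack([d, x, np.ones_like(d)])
    instruments = np.column_stack([d_hat, x, np.ones_like(d)])
    return np.linalg.solve(instruments.T @ regressors, instruments.T @ y)

# Hypothetical simulation: the treatment depends on z only through z**2,
# so a first stage that is linear in z would be a weak instrument.
rng = np.random.default_rng(42)
n = 4000
z = rng.uniform(-2.0, 2.0, n)      # instrument
x = rng.normal(size=n)             # exogenous covariate
u = rng.normal(size=n)             # unobserved confounder
d = z ** 2 + 0.5 * x + u + rng.normal(size=n)
y = 1.5 * d + 0.8 * x + u + rng.normal(size=n)

d_hat = cross_fit_instrument(z, x, d)
beta_hat = iv_estimate(y, d, x, d_hat)  # order: treatment, covariate, intercept
print("IV estimate of the treatment coefficient:", beta_hat[0])
```

In this simulated design the true treatment coefficient is 1.5 and the true covariate coefficient is 0.8. Any flexible regression could replace the polynomial stand-in, as long as the fitted prediction remains linear in the covariates.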

Suggested Citation

  • Jiafeng Chen & Daniel L. Chen & Greg Lewis, 2020. "Mostly Harmless Machine Learning: Learning Optimal Instruments in Linear IV Models," Papers 2011.06158, arXiv.org, revised Jun 2021.
  • Handle: RePEc:arx:papers:2011.06158

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2011.06158
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Bai, Jushan & Ng, Serena, 2010. "Instrumental Variable Estimation In A Data Rich Environment," Econometric Theory, Cambridge University Press, vol. 26(6), pages 1577-1606, December.
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    3. Escanciano, Juan Carlos & Li, Wei, 2021. "Optimal Linear Instrumental Variables Approximations," Journal of Econometrics, Elsevier, vol. 221(1), pages 223-246.
    4. Lester Mackey & Vasilis Syrgkanis & Ilias Zadik, 2017. "Orthogonal Machine Learning: Power and Limitations," Papers 1711.00342, arXiv.org, revised Aug 2018.
    5. Severini, Thomas A. & Tripathi, Gautam, 2012. "Efficiency bounds for estimating linear functionals of nonparametric regression models with endogenous regressors," Journal of Econometrics, Elsevier, vol. 170(2), pages 491-498.
    6. Chamberlain, Gary, 1987. "Asymptotic efficiency in estimation with conditional moment restrictions," Journal of Econometrics, Elsevier, vol. 34(3), pages 305-334, March.
    7. Nishanth Dikkala & Greg Lewis & Lester Mackey & Vasilis Syrgkanis, 2020. "Minimax Estimation of Conditional Moment Models," Papers 2006.07201, arXiv.org.
    8. Antoine, Bertille & Lavergne, Pascal, 2023. "Identification-robust nonparametric inference in a linear IV model," Journal of Econometrics, Elsevier, vol. 235(1), pages 1-24.
    9. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    10. Janet Currie & Henrik Kleven & Esmée Zwiers, 2020. "Technology and Big Data Are Changing Economics: Mining Text to Track Methods," AEA Papers and Proceedings, American Economic Association, vol. 110, pages 42-48, May.
    11. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    12. Frank Kleibergen, 2002. "Pivotal Statistics for Testing Structural Parameters in Instrumental Variables Regression," Econometrica, Econometric Society, vol. 70(5), pages 1781-1803, September.
    13. Angrist, J D & Imbens, G W & Krueger, A B, 1999. "Jackknife Instrumental Variables Estimation," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 14(1), pages 57-67, Jan.-Feb.
    14. Dieterle, Steven G. & Snell, Andy, 2016. "A simple diagnostic to investigate instrument validity and heterogeneous effects when using a single instrument," Labour Economics, Elsevier, vol. 42(C), pages 76-86.
    15. Whitney K. Newey & James L. Powell, 2003. "Instrumental Variable Estimation of Nonparametric Models," Econometrica, Econometric Society, vol. 71(5), pages 1565-1578, September.
    16. Chunrong Ai & Xiaohong Chen, 2003. "Efficient Estimation of Models with Conditional Moment Restrictions Containing Unknown Functions," Econometrica, Econometric Society, vol. 71(6), pages 1795-1843, November.
    17. Joel L. Horowitz & Sokbae Lee, 2007. "Nonparametric Instrumental Variables Estimation of a Quantile Regression Model," Econometrica, Econometric Society, vol. 75(4), pages 1191-1208, July.

    Citations


    Cited by:

    1. Ziyu Wang & Yuhao Zhou & Jun Zhu, 2022. "Fast Instrument Learning with Faster Rates," Papers 2205.10772, arXiv.org, revised Oct 2022.
    2. Christopher D. Walker, 2024. "Semiparametric Bayesian Inference for a Conditional Moment Equality Model," Papers 2410.16017, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    2. Chen, Xiaohong & Pouzo, Demian & Powell, James L., 2019. "Penalized sieve GEL for weighted average derivatives of nonparametric quantile IV regressions," Journal of Econometrics, Elsevier, vol. 213(1), pages 30-53.
    3. Alena Skolkova, 2023. "Instrumental Variable Estimation with Many Instruments Using Elastic-Net IV," CERGE-EI Working Papers wp759, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    4. V Chernozhukov & W K Newey & R Singh, 2023. "A simple and general debiased machine learning theorem with finite-sample guarantees," Biometrika, Biometrika Trust, vol. 110(1), pages 257-264.
    5. Breunig, Christoph & Mammen, Enno & Simoni, Anna, 2020. "Ill-posed estimation in high-dimensional models with instrumental variables," Journal of Econometrics, Elsevier, vol. 219(1), pages 171-200.
    6. Xiaohong Chen & Victor Chernozhukov & Sokbae Lee & Whitney K. Newey, 2014. "Local Identification of Nonparametric and Semiparametric Models," Econometrica, Econometric Society, vol. 82(2), pages 785-809, March.
    7. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    8. Jonas Metzger, 2022. "Adversarial Estimators," Papers 2204.10495, arXiv.org, revised Jun 2022.
    9. Xiaohong Chen & Sokbae Lee & Myung Hwan Seo & Myunghyun Song, 2020. "Inference for parameters identified by conditional moment restrictions using a generalized Bierens maximum statistic," Papers 2008.11140, arXiv.org, revised Oct 2024.
    10. Carrasco, Marine & Tchuente, Guy, 2015. "Regularized LIML for many instruments," Journal of Econometrics, Elsevier, vol. 186(2), pages 427-442.
    11. Eric Gautier & Christiern Rose, 2022. "Fast, Robust Inference for Linear Instrumental Variables Models using Self-Normalized Moments," Papers 2211.02249, arXiv.org, revised Nov 2022.
    12. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
    13. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    14. Dennis Lim & Wenjie Wang & Yichong Zhang, 2022. "A Conditional Linear Combination Test with Many Weak Instruments," Papers 2207.11137, arXiv.org, revised Apr 2023.
    15. Thomas Wiemann, 2023. "Optimal Categorical Instrumental Variables," Papers 2311.17021, arXiv.org, revised May 2024.
    16. Matthew Backus & Christopher Conlon & Michael Sinkinson, 2021. "Common Ownership and Competition in the Ready-to-Eat Cereal Industry," NBER Working Papers 28350, National Bureau of Economic Research, Inc.
    17. Jean‐Pierre Florens & Jan Johannes & Sébastien Van Bellegem, 2012. "Instrumental regression in partially linear models," Econometrics Journal, Royal Economic Society, vol. 15(2), pages 304-324, June.
    18. Victor Chernozhukov & Whitney Newey & Rahul Singh & Vasilis Syrgkanis, 2020. "Adversarial Estimation of Riesz Representers," Papers 2101.00009, arXiv.org, revised Apr 2024.
    19. Qingliang Fan & Zijian Guo & Ziwei Mei, 2022. "A Heteroskedasticity-Robust Overidentifying Restriction Test with High-Dimensional Covariates," Papers 2205.00171, arXiv.org, revised May 2024.
    20. Jiafeng Chen & Xiaohong Chen & Elie Tamer, 2021. "Efficient Estimation in NPIV Models: A Comparison of Various Neural Networks-Based Estimators," Papers 2110.06763, arXiv.org, revised Oct 2022.
