
Random forests and selected samples

Author

Listed:
  • Jonathan A. Cook
  • Saad Siddiqui

Abstract

This paper presents a procedure for recovering causal coefficients from selected samples that uses random forests, a popular machine‐learning algorithm. This proposed method makes few assumptions regarding the selection equation and the distribution of the error terms. Our Monte Carlo results indicate that our method performs well, even when the selection and outcome equations contain the same variables, as long as the selection equation is nonlinear. The method can also be used when there are many variables in the selection equation. We also compare the results of our procedure with other parametric and semiparametric methods using real data.
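To make the setting in the abstract concrete, the following is a minimal sketch of a generic two-step selection correction in which a random forest estimates the selection probability. It is not the authors' procedure, which the abstract does not spell out. The simulated data-generating process, the polynomial control function in the estimated probability, the tuning values, and all function names (`simulate`, `naive_slope`, `two_step_slope`) are assumptions made for illustration. The design uses the same variable `x` in both equations and a nonlinear (here non-monotone) selection equation, mirroring the case the abstract highlights.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def simulate(n=4000, rho=0.7, seed=0):
    """Hypothetical DGP: same regressor in both equations, nonlinear selection."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=n)
    u = rng.standard_normal(n)  # selection error
    # Outcome error correlated with the selection error (source of the bias).
    e = rho * u + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    s = (x**2 + x - 1.0 + u > 0).astype(int)  # selection indicator
    y = 1.0 + 2.0 * x + e  # true slope is 2; y is used only where s == 1
    return x, y, s


def ols(X, y):
    """Least-squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def naive_slope(x, y, s):
    """OLS of y on x using the selected observations only, no correction."""
    keep = s == 1
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    return ols(X, y[keep])[1]


def two_step_slope(x, y, s, degree=3, seed=0):
    """Step 1: forest estimate of P(s = 1 | x). Step 2: OLS on the selected
    sample with a polynomial in that estimate as a control function."""
    forest = RandomForestClassifier(
        n_estimators=300, min_samples_leaf=25, oob_score=True, random_state=seed
    )
    forest.fit(x.reshape(-1, 1), s)
    # Out-of-bag probabilities avoid scoring a point with trees trained on it.
    p_hat = forest.oob_decision_function_[:, 1]
    keep = s == 1
    controls = [p_hat[keep] ** k for k in range(1, degree + 1)]
    X = np.column_stack([np.ones(keep.sum()), x[keep]] + controls)
    return ols(X, y[keep])[1], p_hat


x, y, s = simulate()
naive = naive_slope(x, y, s)
corrected, p_hat = two_step_slope(x, y, s)
print(f"naive slope: {naive:.3f}, two-step slope: {corrected:.3f}, truth: 2.0")
```

In this sketch the slope is separately identified only because the estimated selection probability is not a one-to-one function of `x`. If the selection probability were monotone in `x`, a flexible function of it could absorb the effect of `x` itself.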

Suggested Citation

  • Jonathan A. Cook & Saad Siddiqui, 2020. "Random forests and selected samples," Bulletin of Economic Research, Wiley Blackwell, vol. 72(3), pages 272-287, July.
  • Handle: RePEc:bla:buecrs:v:72:y:2020:i:3:p:272-287
    DOI: 10.1111/boer.12222

    Download full text from publisher

    File URL: https://doi.org/10.1111/boer.12222
    Download Restriction: no



    Citations



    Cited by:

    1. Wayne Taylor & Brett Hollenbeck, 2021. "Leveraging loyalty programs using competitor based targeting," Quantitative Marketing and Economics (QME), Springer, vol. 19(3), pages 417-455, December.
    2. Quiroga Gutierrez, Ana Cecilia, 2024. "Picture this: Making health insurance choices easier for those who need it," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 111(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ichimura, Hidehiko & Todd, Petra E., 2007. "Implementing Nonparametric and Semiparametric Estimators," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 74, Elsevier.
    2. Pigini Claudia, 2015. "Bivariate Non-Normality in the Sample Selection Model," Journal of Econometric Methods, De Gruyter, vol. 4(1), pages 123-144, January.
    3. Schafgans, Marcia M. A., 2000. "Gender wage differences in Malaysia: parametric and semiparametric estimation," Journal of Development Economics, Elsevier, vol. 63(2), pages 351-378, December.
    4. Huber, Martin & Melly, Blaise, 2011. "Quantile Regression in the Presence of Sample Selection," Economics Working Paper Series 1109, University of St. Gallen, School of Economics and Political Science.
    5. Lechner, Michael & Okasa, Gabriel, 2019. "Random Forest Estimation of the Ordered Choice Model," Economics Working Paper Series 1908, University of St. Gallen, School of Economics and Political Science.
    6. Ruoyao Shi, 2021. "An Averaging Estimator for Two Step M Estimation in Semiparametric Models," Working Papers 202105, University of California at Riverside, Department of Economics.
    7. Claudia PIGINI, 2012. "Of Butterflies and Caterpillars: Bivariate Normality in the Sample Selection Model," Working Papers 377, Universita' Politecnica delle Marche (I), Dipartimento di Scienze Economiche e Sociali.
    8. Lanot, Gauthier & Walker, Ian, 1998. "The union/non-union wage differential: An application of semi-parametric methods," Journal of Econometrics, Elsevier, vol. 84(2), pages 327-349, June.
    9. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    10. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    11. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    12. Augusto Cerqua & Marco Letta & Gabriele Pinto, 2024. "On the (Mis)Use of Machine Learning with Panel Data," Papers 2411.09218, arXiv.org.
    13. Miruna Oprescu & Vasilis Syrgkanis & Zhiwei Steven Wu, 2018. "Orthogonal Random Forest for Causal Inference," Papers 1806.03467, arXiv.org, revised Sep 2019.
    14. Justin L. Tobias, 2003. "Are Returns to Schooling Concentrated Among the Most Able? A Semiparametric Analysis of the Ability–earnings Relationships," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 65(1), pages 1-29, February.
    15. Huffman, Sonya Kostova, 1999. "Changes of household consumption behavior during the transition from centrally-planned to market-oriented economy," ISU General Staff Papers 1999010108000013568, Iowa State University, Department of Economics.
    16. Martin Huber & Giovanni Mellace, 2014. "Testing exclusion restrictions and additive separability in sample selection models," Empirical Economics, Springer, vol. 47(1), pages 75-92, August.
    17. Bas Bosma & Arjen Witteloostuijn, 2024. "Machine learning in international business," Journal of International Business Studies, Palgrave Macmillan;Academy of International Business, vol. 55(6), pages 676-702, August.
    18. Dylan Brewer & Alyssa Carlson, 2024. "Addressing sample selection bias for machine learning methods," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(3), pages 383-400, April.
    19. Angrist, Joshua D., 1997. "Conditional independence in sample selection models," Economics Letters, Elsevier, vol. 54(2), pages 103-112, February.
    20. Huber, Martin & Meier, Jonas & Wallimann, Hannes, 2022. "Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets," Transportation Research Part B: Methodological, Elsevier, vol. 163(C), pages 22-39.
