Inference on High Dimensional Selective Labeling Models

Author

Listed:
  • Shakeeb Khan
  • Elie Tamer
  • Qingsong Yao

Abstract

A class of simultaneous equation models arises in the many domains where observed binary outcomes are themselves a consequence of the existing choices of one of the agents in the model. These models are gaining increasing interest in the computer science and machine learning literatures, where the potentially endogenous sample selection is referred to as the "selective labels" problem. Empirical settings for such models arise in fields as diverse as criminal justice, health care, and insurance. For important recent work in this area, see, for example, Lakkaraju et al. (2017), Kleinberg et al. (2018), and Coston et al. (2021), where the authors focus on judicial bail decisions and where one observes whether a defendant failed to return for their court appearance only if the judge in the case decides to release the defendant on bail. Identifying and estimating such models can be computationally challenging for two reasons. One is the nonconcavity of the bivariate likelihood function, and the other is the large number of covariates in each equation. Despite these challenges, in this paper we propose a novel distribution-free estimation procedure that is computationally friendly in settings with many covariates. The new method combines the semiparametric batched gradient descent algorithm introduced in Khan et al. (2023) with a novel sorting algorithm incorporated to control for selection bias. Asymptotic properties of the new procedure are established under increasing dimension conditions in both equations, and its finite sample properties are explored through a simulation study and an application using judicial bail data.
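
The selection mechanism described in the abstract can be illustrated with a small simulation. This is only a sketch of the data structure, not the authors' estimator: the single covariate, the coefficients, the bivariate normal errors, and every name below (`simulate_selective_labels`, `rho`, `released`, `fta`) are assumptions made for illustration, whereas the paper's procedure is distribution free. The sketch shows that when the unobservables in the release equation and the outcome equation are correlated, the failure-to-appear rate among released defendants differs from the rate in the full population.

```python
import numpy as np


def simulate_selective_labels(n=20000, rho=0.6, seed=0):
    """Simulate a toy two-equation binary model with selective labels.

    Assumptions (illustrative only): one covariate, unit coefficients,
    and bivariate normal errors with correlation rho.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # covariate observed for every defendant

    # Correlated unobservables: u enters the judge's decision,
    # v enters the defendant's outcome.
    cov = [[1.0, rho], [rho, 1.0]]
    u, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

    released = (0.5 - x - u) > 0  # selection equation: judge grants bail
    fta = (x + v) > 0             # outcome equation: failure to appear

    # The outcome label is observed only for released defendants.
    observed_fta = np.where(released, fta, np.nan)
    return x, released, fta, observed_fta


x, released, fta, observed_fta = simulate_selective_labels()

# Rate computed from the labels an analyst actually sees.
naive_rate = np.nanmean(observed_fta)
# Rate in the full population, which is unobservable in practice.
full_rate = fta.mean()

print(f"FTA rate among released defendants: {naive_rate:.3f}")
print(f"FTA rate in the full population:    {full_rate:.3f}")
```

In this toy design the judge tends to release defendants with low unobserved risk, so the rate computed from observed labels understates the population rate. That gap is the selection bias an estimator for such models has to control for.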

Suggested Citation

  • Shakeeb Khan & Elie Tamer & Qingsong Yao, 2024. "Inference on High Dimensional Selective Labeling Models," Papers 2410.18381, arXiv.org, revised Oct 2024.
  • Handle: RePEc:arx:papers:2410.18381

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2410.18381
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Azeem M. Shaikh & Edward J. Vytlacil, 2011. "Partial Identification in Triangular Systems of Equations With Binary Dependent Variables," Econometrica, Econometric Society, vol. 79(3), pages 949-955, May.
    2. Klein, Roger W & Spady, Richard H, 1993. "An Efficient Semiparametric Estimator for Binary Response Models," Econometrica, Econometric Society, vol. 61(2), pages 387-421, March.
    3. Jason Abrevaya & Jerry A. Hausman & Shakeeb Khan, 2010. "Testing for Causal Effects in a Generalized Regression Model With Endogenous Regressors," Econometrica, Econometric Society, vol. 78(6), pages 2043-2061, November.
    4. Yatchew, A., 1997. "An elementary estimator of the partial linear model," Economics Letters, Elsevier, vol. 57(2), pages 135-143, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Han, Sukjin, 2021. "Identification in nonparametric models for dynamic treatment effects," Journal of Econometrics, Elsevier, vol. 225(2), pages 132-147.
    2. Shakeeb Khan & Arnaud Maurel & Yichong Zhang, 2023. "Informational Content of Factor Structures in Simultaneous Binary Response Models," Advances in Econometrics, in: Essays in Honor of Joon Y. Park: Econometric Methodology in Empirical Applications, volume 45, pages 385-410, Emerald Group Publishing Limited.
    3. Anirban Basu & Norma Coe & Cole G. Chapman, 2017. "Comparing 2SLS vs 2SRI for Binary Outcomes and Binary Exposures," NBER Working Papers 23840, National Bureau of Economic Research, Inc.
    4. Aradillas-Lopez, Andres, 2010. "Semiparametric estimation of a simultaneous game with incomplete information," Journal of Econometrics, Elsevier, vol. 157(2), pages 409-431, August.
    5. Shakeeb Khan & Fu Ouyang & Elie Tamer, 2020. "Inference on Semiparametric Multinomial Response Models," Discussion Papers Series 627, School of Economics, University of Queensland, Australia.
    6. Shakeeb Khan & Fu Ouyang & Elie Tamer, 2019. "Inference on Semiparametric Multinomial Response Models," Boston College Working Papers in Economics 980, Boston College Department of Economics.
    7. Anirban Basu & Norma B. Coe & Cole G. Chapman, 2018. "2SLS versus 2SRI: Appropriate methods for rare outcomes and/or rare exposures," Health Economics, John Wiley & Sons, Ltd., vol. 27(6), pages 937-955, June.
    8. Lewbel, Arthur & Schennach, Susanne M., 2007. "A simple ordered data estimator for inverse density weighted expectations," Journal of Econometrics, Elsevier, vol. 136(1), pages 189-211, January.
    9. Shakeeb Khan & Fu Ouyang & Elie Tamer, 2021. "Inference on semiparametric multinomial response models," Quantitative Economics, Econometric Society, vol. 12(3), pages 743-777, July.
    10. Shakeeb Khan & Xiaoying Lan & Elie Tamer & Qingsong Yao, 2021. "Estimating High Dimensional Monotone Index Models by Iterative Convex Optimization," Papers 2110.04388, arXiv.org, revised Feb 2023.
    11. Alexander Torgovitsky, 2019. "Partial identification by extending subdistributions," Quantitative Economics, Econometric Society, vol. 10(1), pages 105-144, January.
    12. Klein, Roger & Shen, Chan & Vella, Francis, 2015. "Estimation of marginal effects in semiparametric selection models with binary outcomes," Journal of Econometrics, Elsevier, vol. 185(1), pages 82-94.
    13. Arthur Lewbel & Susanne M. Schennach, 2003. "A Simple Ordered Data Estimator For Inverse Density Weighted Functions," Boston College Working Papers in Economics 557, Boston College Department of Economics, revised 01 May 2005.
    14. Ismaël Mourifié & Marc Henry & Romuald Méango, 2020. "Sharp Bounds and Testability of a Roy Model of STEM Major Choices," Journal of Political Economy, University of Chicago Press, vol. 128(8), pages 3220-3283.
    15. Shun-Yang Lee & Julian Runge & Daniel Yoo & Yakov Bart & Anett Gyurak & J. W. Schneider, 2023. "COVID-19 Demand Shocks Revisited: Did Advertising Technology Help Mitigate Adverse Consequences for Small and Midsize Businesses?," Papers 2307.09035, arXiv.org, revised Jan 2024.
    16. Ai, Chunrong & Chen, Xiaohong, 2007. "Estimation of possibly misspecified semiparametric conditional moment restriction models with different conditioning variables," Journal of Econometrics, Elsevier, vol. 141(1), pages 5-43, November.
    17. Ichimura, Hidehiko & Todd, Petra E., 2007. "Implementing Nonparametric and Semiparametric Estimators," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 74, Elsevier.
    18. Fosgerau, Mogens & Bierlaire, Michel, 2007. "A practical test for the choice of mixing distribution in discrete choice models," Transportation Research Part B: Methodological, Elsevier, vol. 41(7), pages 784-794, August.
    19. Lanot, Gauthier & Walker, Ian, 1998. "The union/non-union wage differential: An application of semi-parametric methods," Journal of Econometrics, Elsevier, vol. 84(2), pages 327-349, June.
    20. Philippe Bracke & Edward W. Pinchbeck & James Wyatt, 2018. "The Time Value of Housing: Historical Evidence on Discount Rates," Economic Journal, Royal Economic Society, vol. 128(613), pages 1820-1843, August.
