
Tailored inference for finite populations: conditional validity and transfer across distributions

Author

Listed:
  • Ying Jin
  • Dominik Rothenhäusler

Abstract

Parameters of subpopulations can be more relevant than those of superpopulations. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned with the impact of a policy in a particular state within a given population. In these cases, the focus is on a specific finite population, as opposed to an infinite superpopulation. Such a population can be characterized by fixing some attributes that are intrinsic to them, leaving unexplained variations like measurement error as random. Inference for a population with fixed attributes can then be modelled as inferring parameters of a conditional distribution. Accordingly, it is desirable that confidence intervals are conditionally valid for the realized population, instead of marginalizing over many possible draws of populations. We provide a statistical inference framework for parameters of finite populations with known attributes. Leveraging the attribute information, our estimators and confidence intervals closely target a specific finite population. When the data are from the population of interest, our confidence intervals attain asymptotic conditional validity, given the attributes, and are shorter than those for superpopulation inference. In addition, we develop procedures to infer parameters of new populations with differing covariate distributions; the confidence intervals are also conditionally valid for the new populations under mild conditions. Our methods extend to situations where the fixed information has a weaker structure or is only partially observed. We demonstrate the validity and applicability of our methods using simulated data and a real-world dataset for predicting car prices.

Suggested Citation

  • Ying Jin & Dominik Rothenhäusler, 2024. "Tailored inference for finite populations: conditional validity and transfer across distributions," Biometrika, Biometrika Trust, vol. 111(1), pages 215-233.
  • Handle: RePEc:oup:biomet:v:111:y:2024:i:1:p:215-233.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/asad022
    Download Restriction: Access to full text is restricted to subscribers.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey M. Wooldridge, 2020. "Sampling‐Based versus Design‐Based Uncertainty in Regression Analysis," Econometrica, Econometric Society, vol. 88(1), pages 265-296, January.
    2. Sloczynski, Tymon, 2020. "Interpreting OLS Estimands When Treatment Effects Are Heterogeneous: Smaller Groups Get Larger Weights," IZA Discussion Papers 13283, Institute of Labor Economics (IZA).
    3. Alberto Abadie & Anish Agarwal & Raaz Dwivedi & Abhin Shah, 2024. "Doubly Robust Inference in Causal Latent Factor Models," Papers 2402.11652, arXiv.org, revised Apr 2024.
    4. Su, Miaomiao & Wang, Ruoyu & Wang, Qihua, 2022. "A two-stage optimal subsampling estimation for missing data problems with large-scale data," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    5. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Mar 2024.
    6. Roth, Jonathan & Sant’Anna, Pedro H.C. & Bilinski, Alyssa & Poe, John, 2023. "What’s trending in difference-in-differences? A synthesis of the recent econometrics literature," Journal of Econometrics, Elsevier, vol. 235(2), pages 2218-2244.
    7. Ding, Peng, 2021. "The Frisch–Waugh–Lovell theorem for standard errors," Statistics & Probability Letters, Elsevier, vol. 168(C).
    8. Damian Clarke & Nicolás Paris & Benjamín Villena-Roldán, 2023. "(Frisch-Waugh-Lovell)': On the Estimation of Regression Models by Row," Papers 2311.15829, arXiv.org.
    9. Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey M. Wooldridge, 2017. "Sampling-based vs. Design-based Uncertainty in Regression Analysis," Papers 1706.01778, arXiv.org, revised Jun 2019.
    10. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    11. AmirEmad Ghassami & Andrew Ying & Ilya Shpitser & Eric Tchetgen Tchetgen, 2021. "Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals with Application to Proximal Causal Inference," Papers 2104.02929, arXiv.org, revised Mar 2022.
    12. Fengshi Niu & Harsha Nori & Brian Quistorff & Rich Caruana & Donald Ngwe & Aadharsh Kannan, 2022. "Differentially Private Estimation of Heterogeneous Causal Effects," Papers 2202.11043, arXiv.org.
    13. Peng Ding, 2020. "The Frisch–Waugh–Lovell Theorem for Standard Errors," Papers 2009.06621, arXiv.org.
    14. Nicholas Williams & Michael Rosenblum & Iván Díaz, 2022. "Optimising precision and power by machine learning in randomised trials with ordinal and time‐to‐event outcomes with an application to COVID‐19," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2156-2178, October.
    15. Oliver Hines & Stijn Vansteelandt & Karla Diaz-Ordaz, 2021. "Robust Inference for Mediated Effects in Partially Linear Models," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 595-618, June.
    16. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Becker, Sascha & Hvide, Hans V, 2013. "Do entrepreneurs matter?," CAGE Online Working Paper Series 109, Competitive Advantage in the Global Economy (CAGE).
    18. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    19. Blackburn, McKinley L. & Vermilyea, Todd, 2012. "The prevalence and impact of misstated incomes on mortgage loan applications," Journal of Housing Economics, Elsevier, vol. 21(2), pages 151-168.
    20. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
