Printed from https://ideas.repec.org/p/arx/papers/2207.09016.html

The role of the geometric mean in case-control studies

Author

Listed:
  • Amanda Coston
  • Edward H. Kennedy

Abstract

Historically used in settings where the outcome is rare or data collection is expensive, outcome-dependent sampling is relevant to many modern settings where data is readily available for a biased sample of the target population, such as public administrative data. Under outcome-dependent sampling, common effect measures such as the average risk difference and the average risk ratio are not identified, but the conditional odds ratio is. Aggregation of the conditional odds ratio is challenging since summary measures are generally not identified. Furthermore, the marginal odds ratio can be larger (or smaller) than all conditional odds ratios. This so-called non-collapsibility of the odds ratio is avoidable if we use an alternative aggregation to the standard arithmetic mean. We provide a new definition of collapsibility that makes this choice of aggregation method explicit, and we demonstrate that the odds ratio is collapsible under geometric aggregation. We describe how to partially identify, estimate, and do inference on the geometric odds ratio under outcome-dependent sampling. Our proposed estimator is based on the efficient influence function and therefore has doubly robust-style properties.
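The two claims in the abstract about the odds ratio can be checked with a small numeric sketch. The numbers below are hypothetical and not taken from the paper. The sketch assumes two equally weighted strata, exposure independent of stratum (so marginal risks are weighted averages of stratum risks), and it reads "geometric aggregation" as the exponential of the weighted mean of the log conditional odds ratios; the paper's exact definition may differ.

```python
import math


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def odds_ratio(p1, p0):
    """Odds ratio comparing risk p1 (exposed) with risk p0 (unexposed)."""
    return odds(p1) / odds(p0)


def risk_after_outcome_sampling(p, keep_case, keep_control):
    """Risk among sampled units when cases and non-cases are kept at different rates."""
    return keep_case * p / (keep_case * p + keep_control * (1.0 - p))


# Hypothetical strata: (stratum weight, risk if exposed, risk if unexposed).
strata = [(0.5, 0.8, 0.5), (0.5, 0.5, 0.2)]

# Conditional (within-stratum) odds ratios.
conditional_ors = [odds_ratio(p1, p0) for _, p1, p0 in strata]

# Marginal odds ratio: average the risks arithmetically first, then form the odds ratio.
marginal_p1 = sum(w * p1 for w, p1, _ in strata)
marginal_p0 = sum(w * p0 for w, _, p0 in strata)
marginal_or = odds_ratio(marginal_p1, marginal_p0)

# Geometric aggregation (assumed form): exp of the weighted mean log odds ratio.
geometric_or = math.exp(
    sum(w * math.log(odds_ratio(p1, p0)) for w, p1, p0 in strata)
)

# Outcome-dependent sampling in the first stratum: keep all cases, 10% of non-cases.
_, p1, p0 = strata[0]
sampled_p1 = risk_after_outcome_sampling(p1, keep_case=1.0, keep_control=0.1)
sampled_p0 = risk_after_outcome_sampling(p0, keep_case=1.0, keep_control=0.1)
sampled_rd = sampled_p1 - sampled_p0          # differs from the population value p1 - p0
sampled_or = odds_ratio(sampled_p1, sampled_p0)  # matches the population odds ratio

print("conditional ORs:", conditional_ors)
print("marginal OR:", marginal_or)
print("geometric OR:", geometric_or)
print("risk difference, population vs sampled:", p1 - p0, sampled_rd)
print("odds ratio, population vs sampled:", conditional_ors[0], sampled_or)
```

In this example the marginal odds ratio falls below both conditional odds ratios, while the geometric aggregate stays within their range. Sampling on the outcome changes the risk difference but leaves the odds ratio unchanged, because it multiplies both odds by the same factor.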

Suggested Citation

  • Amanda Coston & Edward H. Kennedy, 2022. "The role of the geometric mean in case-control studies," Papers 2207.09016, arXiv.org.
  • Handle: RePEc:arx:papers:2207.09016

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2207.09016
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sung Jae Jun & Sokbae Lee, 2024. "Causal Inference Under Outcome-Based Sampling with Monotonicity Assumptions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 998-1009, July.
    2. Sung Jae Jun & Sokbae (Simon) Lee, 2020. "Causal inference in case-control studies," CeMMAP working papers CWP19/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Kyungchul Song, 2009. "Efficient Estimation of Average Treatment Effects under Treatment-Based Sampling," PIER Working Paper Archive 09-011, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    4. Esmeralda A. Ramalho & Richard J. Smith, 2013. "Discrete Choice Non-Response," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 80(1), pages 343-364.
    5. Bryan S. Graham & Cristine Campos De Xavier Pinto & Daniel Egel, 2012. "Inverse Probability Tilting for Moment Condition Models with Missing Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 79(3), pages 1053-1079.
    6. Richard Disney & Eleonora Fischera & Trudy Owens, 2010. "Has the Introduction of Microfinance Crowded-out Informal Loans in Malawi?," Discussion Papers 10/08, University of Nottingham, CREDIT.
    7. Lahiri, Kajal & Yang, Liu, 2013. "Forecasting Binary Outcomes," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 1025-1106, Elsevier.
    8. Esmeralda A. Ramalho & Richard Smith, 2003. "Discrete choice non-response," CeMMAP working papers 07/03, Institute for Fiscal Studies.
    9. Lancaster, Tony & Imbens, Guido, 1996. "Case-control studies with contaminated controls," Journal of Econometrics, Elsevier, vol. 71(1-2), pages 145-160.
    10. Smith, Jeffrey A. & Todd, Petra E., 2005. "Does matching overcome LaLonde's critique of nonexperimental estimators?," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 305-353.
    11. Daniel McFadden, 2001. "Economic Choices," American Economic Review, American Economic Association, vol. 91(3), pages 351-378, June.
    12. Tomz, Michael & King, Gary & Zeng, Langche, 2003. "ReLogit: Rare Events Logistic Regression," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 8(i02).
    13. Prokhorov, Artem & Schmidt, Peter, 2009. "GMM redundancy results for general missing data problems," Journal of Econometrics, Elsevier, vol. 151(1), pages 47-55, July.
    14. Ramalho Esmeralda A., 2010. "Covariate Measurement Error: Bias Reduction under Response-Based Sampling," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 14(4), pages 1-34, September.
    15. Lenis, David & Ackerman, Benjamin & Stuart, Elizabeth A., 2018. "Measuring model misspecification: Application to propensity score methods with complex survey data," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 48-57.
    16. Gerard J. van den Berg & Johan Vikström, 2014. "Monitoring Job Offer Decisions, Punishments, Exit to Work, and Job Quality," Scandinavian Journal of Economics, Wiley Blackwell, vol. 116(2), pages 284-334, April.
    17. Springborn, Michael & Romagosa, Christina M. & Keller, Reuben P., 2011. "The value of nonindigenous species risk assessment in international trade," Ecological Economics, Elsevier, vol. 70(11), pages 2145-2153, September.
    18. Tan, Zhiqiang, 2019. "On doubly robust estimation for logistic partially linear models," Statistics & Probability Letters, Elsevier, vol. 155(C), pages 1-1.
    19. James J. Heckman & Petra E. Todd, 2009. "A note on adapting propensity score matching and selection models to choice based samples," Econometrics Journal, Royal Economic Society, vol. 12(s1), pages 230-234, January.
    20. Ramalho, Esmeralda A., 2007. "Binary models with misclassification in the variable of interest and nonignorable nonresponse," Economics Letters, Elsevier, vol. 96(1), pages 70-76, July.
