Printed from https://ideas.repec.org/p/pra/mprapa/99887.html

Modeling Qualitative Outcomes by Supplementing Participant Data with General Population Data: A New and More Versatile Approach

Author

Listed:
  • Erard, Brian

Abstract

Although one often has detailed information about participants in a program, the lack of comparable information on non-participants precludes standard qualitative choice estimation. This challenge can be overcome by incorporating a supplementary sample of covariate values from the general population. New estimators are introduced that exploit the parameter restrictions implied by the relationship between the marginal and conditional response probabilities in the supplementary sample. An important advantage of these estimators over the existing alternatives is that they can be applied to exogenously stratified samples even when the underlying stratification criteria are unknown. The ability of these new estimators to readily incorporate sample weights makes them applicable to a much wider range of data sources. The new estimators are also easily generalized to address polychotomous outcomes.
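The restriction the abstract refers to can be read as follows: averaging the conditional participation probability over the general-population covariates should reproduce the marginal participation rate. The sketch below only illustrates that moment condition; it is not the paper's estimator. It assumes a binary logit for the conditional probability (the abstract does not name a functional form), and the function names, argument names, and the idea of passing a known marginal rate `q` are hypothetical choices made for illustration.

```python
import math


def logistic(z):
    """Logit response probability for linear index z (assumed functional form)."""
    return 1.0 / (1.0 + math.exp(-z))


def implied_marginal_rate(beta, rows, weights=None):
    """Weighted average of P(y=1 | x; beta) over a supplementary sample.

    beta    : list of coefficients (hypothetical parameter vector)
    rows    : list of covariate vectors drawn from the general population
    weights : optional sample weights; equal weights are used if omitted
    """
    if weights is None:
        weights = [1.0] * len(rows)
    total_weight = sum(weights)
    acc = 0.0
    for x, w in zip(rows, weights):
        index = sum(b * xi for b, xi in zip(beta, x))
        acc += w * logistic(index)
    return acc / total_weight


def moment_gap(beta, rows, q, weights=None):
    """Difference between the implied marginal rate and a marginal rate q.

    A value of zero means beta satisfies the restriction linking the
    marginal and conditional response probabilities in this sample.
    """
    return implied_marginal_rate(beta, rows, weights) - q
```

For example, with all coefficients set to zero every conditional probability is 0.5, so `moment_gap([0.0, 0.0], rows, 0.5)` is zero for any supplementary sample `rows`, weighted or not. Sample weights enter only through the weighted average, which is why a condition of this kind does not need the stratification rule itself.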

Suggested Citation

  • Erard, Brian, 2017. "Modeling Qualitative Outcomes by Supplementing Participant Data with General Population Data: A New and More Versatile Approach," MPRA Paper 99887, University Library of Munich, Germany, revised 26 Apr 2020.
  • Handle: RePEc:pra:mprapa:99887

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/99887/1/MPRA_paper_99887.pdf
    File Function: original version
    Download Restriction: no

    Citations

    Cited by:

    1. Erard, Brian & Langetieg, Patrick & Payne, Mark & Plumley, Alan, 2020. "Ghosts in the Income Tax Machinery," MPRA Paper 100036, University Library of Munich, Germany.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sung Jae Jun & Sokbae Lee, 2024. "Causal Inference Under Outcome-Based Sampling with Monotonicity Assumptions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 998-1009, July.
    2. Lee, Kangbok & Joo, Sunghoon & Baik, Hyeoncheol & Han, Sumin & In, Joonhwan, 2020. "Unbalanced data, type II error, and nonlinearity in predicting M&A failure," Journal of Business Research, Elsevier, vol. 109(C), pages 271-287.
    3. Adam M. Kleinbaum & Toby E. Stuart & Michael L. Tushman, 2013. "Discretion Within Constraint: Homophily and Structure in a Formal Organization," Organization Science, INFORMS, vol. 24(5), pages 1316-1336, October.
    4. Shi Chang & Rohan Singh Wilkho & Nasir Gharaibeh & Garett Sansom & Michelle Meyer & Francisco Olivera & Lei Zou, 2023. "Environmental, climatic, and situational factors influencing the probability of fatality or injury occurrence in flash flooding: a rare event logistic regression predictive model," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 116(3), pages 3957-3978, April.
    5. Małgorzata Łazęcka & Jan Mielniczuk & Paweł Teisseyre, 2021. "Estimating the class prior for positive and unlabelled data via logistic regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(4), pages 1039-1068, December.
    6. Butler, J. S., 2000. "Efficiency results of MLE and GMM estimation with sampling weights," Journal of Econometrics, Elsevier, vol. 96(1), pages 25-37, May.
    7. Erard, Brian, 2017. "Modeling Qualitative Outcomes by Supplementing Participant Data with General Population Data: A Calibrated Qualitative Response Estimation Approach," MPRA Paper 79927, University Library of Munich, Germany.
    8. Esmeralda A. Ramalho & Richard J. Smith, 2013. "Discrete Choice Non-Response," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 80(1), pages 343-364.
    9. Esmeralda A. Ramalho & Richard Smith, 2003. "Discrete choice non-response," CeMMAP working papers 07/03, Institute for Fiscal Studies.
    10. Sung Jae Jun & Sokbae (Simon) Lee, 2020. "Causal inference in case-control studies," CeMMAP working papers CWP19/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Ramalho, Esmeralda A., 2007. "Binary models with misclassification in the variable of interest and nonignorable nonresponse," Economics Letters, Elsevier, vol. 96(1), pages 70-76, July.
    12. Nail Kashaev, 2022. "Estimation of Parametric Binary Outcome Models with Degenerate Pure Choice-Based Data with Application to COVID-19-Positive Tests from British Columbia," University of Western Ontario, Departmental Research Report Series 20225, University of Western Ontario, Department of Economics.
    13. Wenkai Li & Yuanchi Liu & Ziyue Liu & Zhen Gao & Huabing Huang & Weijun Huang, 2022. "A Positive-Unlabeled Learning Algorithm for Urban Flood Susceptibility Modeling," Land, MDPI, vol. 11(11), pages 1-17, November.
    14. Lancaster, Tony, 1997. "Bayes WESML Posterior inference from choice-based samples," Journal of Econometrics, Elsevier, vol. 79(2), pages 291-303, August.
    15. Robert M. Dorazio, 2012. "Predicting the Geographic Distribution of a Species from Presence-Only Data Subject to Detection Errors," Biometrics, The International Biometric Society, vol. 68(4), pages 1303-1312, December.
    16. Esmeralda Ramalho, 2004. "Covariate Measurement Error in Endogenous Stratified Samples," Economics Working Papers 2_2004, University of Évora, Department of Economics (Portugal).
    17. Lahiri, Kajal & Yang, Liu, 2013. "Forecasting Binary Outcomes," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 1025-1106, Elsevier.
    18. Lancaster, Tony & Imbens, Guido, 1996. "Case-control studies with contaminated controls," Journal of Econometrics, Elsevier, vol. 71(1-2), pages 145-160.
    19. Schwemmer, Philipp & Güpner, Franziska & Adler, Sven & Klingbeil, Knut & Garthe, Stefan, 2016. "Modelling small-scale foraging habitat use in breeding Eurasian oystercatchers (Haematopus ostralegus) in relation to prey distribution and environmental predictors," Ecological Modelling, Elsevier, vol. 320(C), pages 322-333.
    20. Daniel McFadden, 2001. "Economic Choices," American Economic Review, American Economic Association, vol. 91(3), pages 351-378, June.

    More about this item

    Keywords

    Qualitative response; Discrete choice; Choice-based sampling; Supplementary sampling; Contaminated controls;

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C25 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions; Probabilities
    • C35 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions
