
Binary surrogates with stratified samples when weights are unknown

Author

Listed:
  • Yu-Min Huang

    (Tunghai University)

Abstract

In clinical practice, surrogate variables are commonly used as indirect measures when the primary outcome variable X, on which the assessment of disease status is based, is difficult or expensive to measure. In this article, we consider the problem of constructing an optimal binary surrogate Y to substitute for such a variable X. To retain samples that have rare values of X, the paired sample (X, Y) is usually selected by stratified sampling, where the strata are constructed from disjoint intervals of the support of X. Under such a sampling design, the stratum proportions are usually unknown, so proportional allocation is infeasible and the pairs (X, Y) cannot be regarded as an i.i.d. sample across strata. We estimate the unknown cutoff separating higher and lower levels of X that optimally matches the variable Y, and we provide true positive rates (TPR) adjusted for the disproportionate stratum weights. Our approach is to estimate the underlying distribution of X and then conduct an ad hoc estimation of the TPR and of the expected prediction error under the zero-one loss function. We develop a parametric estimate of the distribution of X under an exponential family assumption and a weighted kernel density estimator for the case where the distribution of X is unspecified. We illustrate our methods in various simulation studies and on a real example in which binary surrogates were evaluated for a medical device. The simulation results indicate that our approach performs well.
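
To make the idea of weight-adjusted rates concrete, the following is a minimal sketch and not the article's estimator. It assumes, for illustration only, that the stratum weights are supplied, whereas the article treats them as unknown and works through an estimate of the distribution of X. It also reads the TPR as P(Y = 1 | X > c) for a cutoff c, which is an assumption about the definition. All function names and the toy data are hypothetical.

```python
from collections import Counter

def observation_weights(strata, stratum_weights):
    """Give each observation weight W_k / n_k, so stratum k carries total mass W_k."""
    counts = Counter(strata)
    return [stratum_weights[s] / counts[s] for s in strata]

def weighted_tpr(x, y, w, cutoff):
    """Weighted estimate of P(Y = 1 | X > cutoff) (assumed TPR definition)."""
    pos = sum(wi for xi, wi in zip(x, w) if xi > cutoff)
    hit = sum(wi for xi, yi, wi in zip(x, y, w) if xi > cutoff and yi == 1)
    return hit / pos if pos > 0 else float("nan")

def weighted_error(x, y, w, cutoff):
    """Weighted zero-one error when Y is matched against the indicator 1{X > cutoff}."""
    miss = sum(wi for xi, yi, wi in zip(x, y, w) if (xi > cutoff) != (yi == 1))
    return miss / sum(w)

def best_cutoff(x, y, w):
    """Search the observed X values for the cutoff with the smallest weighted error."""
    return min(sorted(set(x)), key=lambda c: weighted_error(x, y, w, c))

# Toy stratified sample: the "high" stratum is rare in the population (weight 0.1)
# but over-represented in the sample (2 of 6 observations).
x = [1, 2, 3, 4, 8, 9]
y = [0, 0, 0, 1, 0, 1]
strata = ["low", "low", "low", "low", "high", "high"]
w = observation_weights(strata, {"low": 0.9, "high": 0.1})

c_hat = best_cutoff(x, y, w)
print(c_hat, weighted_tpr(x, y, w, c_hat), weighted_error(x, y, w, c_hat))
```

In this toy sample the unweighted TPR at the selected cutoff would be 2/3, while the weighted version is larger because the misclassified observation lies in the over-sampled rare stratum. This is the kind of distortion that an adjustment for disproportionate stratum weights is meant to correct.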

Suggested Citation

  • Yu-Min Huang, 2019. "Binary surrogates with stratified samples when weights are unknown," Computational Statistics, Springer, vol. 34(2), pages 653-682, June.
  • Handle: RePEc:spr:compst:v:34:y:2019:i:2:d:10.1007_s00180-018-0838-3
    DOI: 10.1007/s00180-018-0838-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-018-0838-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
