IDEAS home Printed from https://ideas.repec.org/p/ehl/lserod/100873.html

Optimal stopping and worker selection in crowdsourcing: an adaptive sequential probability ratio test framework

Authors

  • Li, Xiaoou
  • Chen, Yunxiao
  • Chen, Xi
  • Liu, Jingchen
  • Ying, Zhiliang

Abstract

In this study, we solve a class of multiple testing problems under a Bayesian sequential decision framework. Our work is motivated by binary labeling tasks in crowdsourcing, where a requestor needs to simultaneously choose a worker to provide a label and decide when to stop collecting labels, under a certain budget constraint. We begin by using a binary hypothesis testing problem to determine the true label of a single object, and provide an optimal solution by casting it under an adaptive sequential probability ratio test framework. Then, we characterize the structure of the optimal solution, that is, the optimal adaptive sequential design, which minimizes the Bayes risk using a log-likelihood ratio statistic. We also develop a dynamic programming algorithm to efficiently compute the optimal solution. For the multiple testing problem, we propose an empirical Bayes approach for estimating the class priors, and show that the average loss of our method converges to the minimal Bayes risk under the true model. Experiments on both simulated and real data show the robustness of our method, as well as its superiority over existing methods in terms of its labeling accuracy.
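To make the log-likelihood ratio statistic concrete, here is a minimal sketch of a classical fixed-threshold sequential probability ratio test for one binary label. It is not the paper's optimal adaptive design: the workers arrive in a given order rather than being selected, and the stopping threshold is a fixed constant rather than the output of the dynamic programming algorithm. The function name `sprt_label`, its parameters, and the assumption that each worker's accuracy is known and symmetric across the two classes are all illustrative choices, not taken from the paper.

```python
import math
from typing import Iterable, Tuple


def sprt_label(
    responses: Iterable[Tuple[int, float]],
    prior_pos: float = 0.5,
    threshold: float = 2.0,
    budget: int = 10,
) -> Tuple[int, int, float]:
    """Classical SPRT sketch for a single binary label (hypothetical helper).

    responses: (label, accuracy) pairs, where label is 0 or 1 and accuracy
               is the assumed-known probability that the worker is correct.
    prior_pos: prior probability that the true label is 1.
    threshold: stop once the absolute posterior log-odds reaches this value.
    budget:    maximum number of labels to collect.

    Returns (decided_label, labels_used, final_log_odds).
    """
    # Start from the prior log-odds of label 1 versus label 0.
    log_odds = math.log(prior_pos / (1.0 - prior_pos))
    used = 0
    for label, accuracy in responses:
        # Stop when the budget is spent or the evidence is strong enough.
        if used >= budget or abs(log_odds) >= threshold:
            break
        # Each response shifts the log-odds by the worker's log-likelihood ratio.
        step = math.log(accuracy / (1.0 - accuracy))
        log_odds += step if label == 1 else -step
        used += 1
    decided = 1 if log_odds >= 0 else 0
    return decided, used, log_odds


if __name__ == "__main__":
    stream = [(1, 0.8), (1, 0.8), (0, 0.6), (1, 0.9)]
    print(sprt_label(stream, prior_pos=0.5, threshold=2.0, budget=10))
```

With the stream above, two agreeing responses from workers of accuracy 0.8 already push the log-odds past the threshold, so the remaining responses are never requested. In the paper's setting, both the choice of the next worker and the stopping boundary would instead come from the optimal adaptive design.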

Suggested Citation

  • Li, Xiaoou & Chen, Yunxiao & Chen, Xi & Liu, Jingchen & Ying, Zhiliang, 2021. "Optimal stopping and worker selection in crowdsourcing: an adaptive sequential probability ratio test framework," LSE Research Online Documents on Economics 100873, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:100873

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/100873/
    File Function: Open access version.
    Download Restriction: no

    Citations

    Cited by:

    1. Xi Chen & Quanquan Liu & Yining Wang, 2023. "Active Learning for Contextual Search with Binary Feedback," Management Science, INFORMS, vol. 69(4), pages 2165-2181, April.
    2. Estey, Clayton, 2024. "Robust Bellman State Prediction with Learning and Model Preferences," OSF Preprints 75fc9, Center for Open Science.

    More about this item

    Keywords

    Bayesian decision theory; crowdsourcing; empirical Bayes; sequential analysis; sequential probability ratio test;

    JEL classification:

    • R14 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General Regional Economics - - - Land Use Patterns
    • J01 - Labor and Demographic Economics - - General - - - Labor Economics: General
    • J50 - Labor and Demographic Economics - - Labor-Management Relations, Trade Unions, and Collective Bargaining - - - General
    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General
