Printed from https://ideas.repec.org/p/ehl/lserod/118348.html

Optimal parallel sequential change detection under generalized performance measures

Author

Listed:
  • Lu, Zexian
  • Chen, Yunxiao
  • Li, Xiaoou

Abstract

This paper considers the detection of change points in parallel data streams, a problem widely encountered in the analysis of large-scale real-time streaming data. Each stream may have its own change point, at which the distribution of its data changes. As data are observed sequentially, a decision maker needs to declare, at each time point, which streams have already changed. Once a stream is declared to have changed, it is deactivated permanently and its future data are no longer collected. This is a compound decision problem: the decision maker may want to optimize compound performance metrics that concern all the streams as a whole, so the decisions for different streams are not independent. Our contribution is threefold. First, we propose a general framework for compound performance metrics that includes those considered in existing work as special cases and introduces new ones that connect closely with the performance metrics for single-stream sequential change detection and large-scale hypothesis testing. Second, we develop data-driven decision procedures under this framework. Finally, we establish optimality results for the proposed decision procedures. The proposed methods and theory are evaluated through simulation studies and a case study.
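
The setup described in the abstract can be made concrete with a small simulation. The sketch below is only an illustration and is not the authors' procedure: it assumes a Gaussian mean shift from N(0, 1) to N(mu, 1), a geometric prior on each change point with parameter rho, and a simple compound rule that keeps active the largest set of streams whose average posterior change probability stays at or below alpha. The function names and the parameters rho, mu and alpha are all hypothetical.

```python
import math
import random


def update_posterior(p_prev, x, rho, mu):
    """One recursive update of P(change has occurred | data) for a single stream."""
    prior = p_prev + (1.0 - p_prev) * rho      # the change may occur at this step
    lr = math.exp(mu * x - 0.5 * mu * mu)      # N(mu, 1) vs N(0, 1) likelihood ratio
    return prior * lr / (prior * lr + 1.0 - prior)


def select_to_keep(posteriors, alpha):
    """Return the largest set of stream ids whose mean posterior is <= alpha.

    posteriors maps stream id -> posterior change probability. Streams are
    added in ascending order of posterior, so the running mean never decreases
    and the first violation marks the cut-off.
    """
    keep, total = set(), 0.0
    for k in sorted(posteriors, key=posteriors.get):
        if (total + posteriors[k]) / (len(keep) + 1) > alpha:
            break
        total += posteriors[k]
        keep.add(k)
    return keep


def simulate(n_streams=50, horizon=100, rho=0.05, mu=1.5, alpha=0.1, seed=0):
    """Simulate parallel streams; return (change points, deactivation times)."""
    rng = random.Random(seed)
    # Assumed geometric change points: stream k is post-change from time tau[k] on.
    tau = []
    for _ in range(n_streams):
        t = 1
        while rng.random() >= rho:
            t += 1
        tau.append(t)
    post = {k: 0.0 for k in range(n_streams)}  # active streams only
    stopped_at = {}
    for t in range(1, horizon + 1):
        for k in post:
            mean = mu if t >= tau[k] else 0.0
            post[k] = update_posterior(post[k], rng.gauss(mean, 1.0), rho, mu)
        keep = select_to_keep(post, alpha)
        for k in list(post):
            if k not in keep:
                stopped_at[k] = t              # deactivated permanently
                del post[k]
        if not post:
            break
    return tau, stopped_at
```

Calling `simulate()` returns the simulated change points together with the time at which each deactivated stream was stopped. Comparing the two gives detection delays and false alarms per stream. Because the keep/deactivate decision depends on the posteriors of all active streams jointly, the decisions are coupled across streams, which is the compound aspect the abstract emphasizes.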

Suggested Citation

  • Lu, Zexian & Chen, Yunxiao & Li, Xiaoou, 2022. "Optimal parallel sequential change detection under generalized performance measures," LSE Research Online Documents on Economics 118348, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:118348

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/118348/
    File Function: Open access version.
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chen, Yunxiao & Lee, Yi-Hsuan & Li, Xiaoou, 2022. "Item pool quality control in educational testing: change point model, compound risk, and sequential detection," LSE Research Online Documents on Economics 112498, London School of Economics and Political Science, LSE Library.
    2. Yunxiao Chen & Yi-Hsuan Lee & Xiaoou Li, 2022. "Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection," Journal of Educational and Behavioral Statistics, vol. 47(3), pages 322-352, June.
    3. Pounds Stanley B. & Gao Cuilan L. & Zhang Hui, 2012. "Empirical Bayesian Selection of Hypothesis Testing Procedures for Analysis of Sequence Count Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(5), pages 1-32, October.
    4. Shigeyuki Matsui & Hisashi Noma, 2011. "Estimating Effect Sizes of Differentially Expressed Genes for Power and Sample-Size Assessments in Microarray Experiments," Biometrics, The International Biometric Society, vol. 67(4), pages 1225-1235, December.
    5. Won, Joong-Ho & Lim, Johan & Yu, Donghyeon & Kim, Byung Soo & Kim, Kyunga, 2014. "Monotone false discovery rate," Statistics & Probability Letters, Elsevier, vol. 87(C), pages 86-93.
    6. van Wieringen, Wessel N. & Stam, Koen A. & Peeters, Carel F.W. & van de Wiel, Mark A., 2020. "Updating of the Gaussian graphical model through targeted penalized estimation," Journal of Multivariate Analysis, Elsevier, vol. 178(C).
    7. Ian W. McKeague & Min Qian, 2015. "An Adaptive Resampling Test for Detecting the Presence of Significant Predictors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1422-1433, December.
    8. Angela Schörgendorfer & Adam J. Branscum & Timothy E. Hanson, 2013. "A Bayesian Goodness of Fit Test and Semiparametric Generalization of Logistic Regression with Measurement Data," Biometrics, The International Biometric Society, vol. 69(2), pages 508-519, June.
    9. Victor F. Araman & René A. Caldentey, 2022. "Diffusion Approximations for a Class of Sequential Experimentation Problems," Management Science, INFORMS, vol. 68(8), pages 5958-5979, August.
    10. Han, Bing & Dalal, Siddhartha R., 2012. "A Bernstein-type estimator for decreasing density with application to p-value adjustments," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 427-437.
    11. Dalia Valencia & Rosa E. Lillo & Juan Romo, 2019. "A Kendall correlation coefficient between functional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 1083-1103, December.
    12. Kline, Patrick & Walters, Christopher, 2019. "Audits as Evidence: Experiments, Ensembles, and Enforcement," Institute for Research on Labor and Employment, Working Paper Series qt3z72m9kn, Institute of Industrial Relations, UC Berkeley.
    13. He, Yi & Pan, Wei & Lin, Jizhen, 2006. "Cluster analysis using multivariate normal mixture models to detect differential gene expression with microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 641-658, November.
    14. Jay Bartroff & Jinlin Song, 2016. "A Rejection Principle for Sequential Tests of Multiple Hypotheses Controlling Familywise Error Rates," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 3-19, March.
    15. Yudong Chen & Tengyao Wang & Richard J. Samworth, 2022. "High‐dimensional, multiscale online changepoint detection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(1), pages 234-266, February.
    16. Savas Dayanik & Warren Powell & Kazutoshi Yamazaki, 2013. "Asymptotically optimal Bayesian sequential change detection and identification rules," Annals of Operations Research, Springer, vol. 208(1), pages 337-370, September.
    17. Cheng, Cheng, 2009. "Internal validation inferences of significant genomic features in genome-wide screening," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 788-800, January.
    18. Sinjini Sikdar & Somnath Datta & Susmita Datta, 2017. "EAMA: Empirically adjusted meta-analysis for large-scale simultaneous hypothesis testing in genomic experiments," PLOS ONE, Public Library of Science, vol. 12(10), pages 1-19, October.
    19. Chen, Yudong & Wang, Tengyao & Samworth, Richard J., 2022. "High-dimensional, multiscale online changepoint detection," LSE Research Online Documents on Economics 113665, London School of Economics and Political Science, LSE Library.
    20. Tianwei Yu, 2018. "A new dynamic correlation algorithm reveals novel functional aspects in single cell and bulk RNA-seq data," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-22, August.

    More about this item

    Keywords

    large-scale inference; multiple change detection; sequential analysis; multiple hypothesis testing

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

