
An algorithmic approach to identification of gray areas: Analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model

Author

Listed:
  • Jouan, Gabriel
  • Arnardottir, Erna Sif
  • Islind, Anna Sigridur
  • Óskarsdóttir, María

Abstract

Machine learning (ML) models have become a key component of modern services. In decision-making domains where human expertise is crucial, such as the manual scoring of biological signal data, human uncertainties undermine experts’ trust in the outcomes of these models. The field of sleep staging in particular, which requires experts to score complex biological signals, is notably impacted by scoring uncertainties. Data from an ensemble of independent scorers are collected to estimate inter-scorer agreement and the uncertainty associated with manual scoring. However, scorers’ uncertainty lacks statistical modeling, which makes ML algorithms difficult to validate and leads to issues of reliability and explainability. Within the ensemble of scorers, uncertainty zones, called gray areas, are revealed by the samples on which scorers disagree. The objective of our work is to provide a framework for introducing and inferring gray areas. We present a flexible and easy-to-use multi-objective method based on multinomial mixture models that clusters the different levels of scorer agreement and summarizes the results into two sets of clusters, high-agreement and gray-area, which are called supra-clusters. The threshold between them is selected by maximizing the distance between two distributions of the scorer agreement measure. The method produced effective results when fitted on simulated data. Additionally, the method was applied to a real case of uncertainty analysis in the sleep staging domain, using a series of actual sleep stages scored by an ensemble of 10 independent scorers for a dataset of 50 participants.
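
The clustering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the simulated vote profiles, the plain EM fit, the function names, and the split rule (a difference in mean majority share, used here as a simple stand-in for the paper's distance between two agreement distributions) are all assumptions made for illustration.

```python
import numpy as np

def simulate_votes(n_epochs=600, n_scorers=10, seed=0):
    """Simulate per-epoch vote counts over 5 stages (hypothetical profiles)."""
    rng = np.random.default_rng(seed)
    # Assumed profiles: two confident patterns and one ambiguous pattern.
    profiles = np.array([
        [0.92, 0.02, 0.02, 0.02, 0.02],
        [0.02, 0.02, 0.92, 0.02, 0.02],
        [0.05, 0.45, 0.45, 0.03, 0.02],
    ])
    labels = rng.integers(0, len(profiles), size=n_epochs)
    counts = np.stack([rng.multinomial(n_scorers, profiles[k]) for k in labels])
    return counts, labels

def fit_multinomial_mixture(counts, n_components, n_iter=200, seed=0, eps=1e-9):
    """Fit a multinomial mixture to vote counts with plain EM."""
    rng = np.random.default_rng(seed)
    n_stages = counts.shape[1]
    weights = np.full(n_components, 1.0 / n_components)
    probs = rng.dirichlet(np.ones(n_stages), size=n_components)
    for _ in range(n_iter):
        # E-step: the multinomial coefficient is the same for every
        # component, so it cancels when responsibilities are normalized.
        log_resp = np.log(weights + eps) + counts @ np.log(probs + eps).T
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and per-cluster stage probabilities.
        weights = resp.mean(axis=0)
        probs = resp.T @ counts + eps
        probs /= probs.sum(axis=1, keepdims=True)
    return weights, probs, resp

def split_supra_clusters(counts, probs, resp):
    """Group clusters into gray-area and high-agreement sets (illustrative rule)."""
    agreement = counts.max(axis=1) / counts.sum(axis=1)  # majority share per epoch
    assign = resp.argmax(axis=1)                         # hard cluster assignment
    order = np.argsort(probs.max(axis=1))                # least to most agreeing
    best_cut, best_dist = 1, -np.inf
    for cut in range(1, len(order)):
        in_gray = np.isin(assign, order[:cut])
        if in_gray.all() or not in_gray.any():
            continue  # one side has no epochs, so the distance is undefined
        # Assumed distance: gap between the mean agreement of the two groups.
        dist = agreement[~in_gray].mean() - agreement[in_gray].mean()
        if dist > best_dist:
            best_cut, best_dist = cut, dist
    return order[:best_cut], order[best_cut:]

counts, _ = simulate_votes()
weights, probs, resp = fit_multinomial_mixture(counts, n_components=3)
gray, high = split_supra_clusters(counts, probs, resp)
print("gray-area clusters:", gray, "high-agreement clusters:", high)
```

In this sketch the number of components is fixed by hand and EM is run from a single random start; a fuller treatment would compare several component counts and initializations before forming the supra-clusters.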

Suggested Citation

  • Jouan, Gabriel & Arnardottir, Erna Sif & Islind, Anna Sigridur & Óskarsdóttir, María, 2024. "An algorithmic approach to identification of gray areas: Analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model," European Journal of Operational Research, Elsevier, vol. 317(2), pages 352-365.
  • Handle: RePEc:eee:ejores:v:317:y:2024:i:2:p:352-365
    DOI: 10.1016/j.ejor.2023.09.039

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221723007567
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jianhong Shi & Qian Yang & Xiongya Li & Weixing Song, 2017. "Effects of measurement error on a class of single-index varying coefficient regression models," Computational Statistics, Springer, vol. 32(3), pages 977-1001, September.
    2. Villalonga, Belen, 2004. "Intangible resources, Tobin's q, and sustainability of performance differences," Journal of Economic Behavior & Organization, Elsevier, vol. 54(2), pages 205-230, June.
    3. Brockmeier, M., 1991. "Entwicklung und Aufhebung von Reinheitsgeboten im Nahrungsmittelbereich – Analyse und Bewertung," Proceedings “Schriften der Gesellschaft für Wirtschafts- und Sozialwissenschaften des Landbaues e.V.”, German Association of Agricultural Economists (GEWISOLA), vol. 27.
    4. Miller, Steve & Startz, Richard, 2019. "Feasible generalized least squares using support vector regression," Economics Letters, Elsevier, vol. 175(C), pages 28-31.
    5. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    6. Prendergast, Luke A. & Li Wai Suen, Connie, 2011. "A new and practical influence measure for subsets of covariance matrix sample principal components with applications to high dimensional datasets," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 752-764, January.
    7. Tizheng Li & Xiaojuan Kang, 2022. "Variable selection of higher-order partially linear spatial autoregressive model with a diverging number of parameters," Statistical Papers, Springer, vol. 63(1), pages 243-285, February.
    8. Deac Dan Stelian & Schebesch Klaus Bruno, 2018. "Market Forecasts and Client Behavioral Data: Towards Finding Adequate Model Complexity," Studia Universitatis „Vasile Goldis” Arad – Economics Series, Sciendo, vol. 28(3), pages 50-75, September.
    9. Francisco Javier Santos Arteaga & Debora Di Caprio & David Cucchiari & Josep M Campistol & Federico Oppenheimer & Fritz Diekmann & Ignacio Revuelta, 2021. "Modeling patients as decision making units: evaluating the efficiency of kidney transplantation through data envelopment analysis," Health Care Management Science, Springer, vol. 24(1), pages 55-71, March.
    10. Juan Ignacio Zoloa, 2020. "Noise pollution and housing markets: A spatial hedonic analysis for La Plata City," Ensayos de Política Económica, Departamento de Investigación Francisco Valsecchi, Facultad de Ciencias Económicas, Pontificia Universidad Católica Argentina., vol. 3(2), pages 129-152, Octubre.
    11. Cheng, Tsung-Chi, 2012. "On simultaneously identifying outliers and heteroscedasticity without specific form," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2258-2272.
    12. Bodhisattva Sen & Mary Meyer, 2017. "Testing against a linear regression model using ideas from shape-restricted estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(2), pages 423-448, March.
    13. Benítez-Peña, Sandra & Blanquero, Rafael & Carrizosa, Emilio & Ramírez-Cobo, Pepa, 2024. "Cost-sensitive probabilistic predictions for support vector machines," European Journal of Operational Research, Elsevier, vol. 314(1), pages 268-279.
    14. repec:wyi:journl:002176 is not listed on IDEAS
    15. Steve Gibbons & Stephan Heblich & Esther Lho & Christopher Timmins, 2016. "Fear of Fracking? The Impact of the Shale Gas Exploration on House Prices in Britain," SERC Discussion Papers 0207, Centre for Economic Performance, LSE.
    16. Sanying Feng & Liugen Xue, 2014. "Bias-corrected statistical inference for partially linear varying coefficient errors-in-variables models with restricted condition," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(1), pages 121-140, February.
    17. Luo, Changqi & Zhu, Shun-Peng & Keshtegar, Behrooz & Niu, Xiaopeng & Taylan, Osman, 2023. "An enhanced uniform simulation approach coupled with SVR for efficient structural reliability analysis," Reliability Engineering and System Safety, Elsevier, vol. 237(C).
    18. He, Wanxin & Wang, Yiyuan & Li, Gang & Zhou, Jinhang, 2024. "A novel maximum entropy method based on the B-spline theory and the low-discrepancy sequence for complex probability distribution reconstruction," Reliability Engineering and System Safety, Elsevier, vol. 243(C).
    19. Takafumi Kato, 2020. "Likelihood-based strategies for estimating unknown parameters and predicting missing data in the simultaneous autoregressive model," Journal of Geographical Systems, Springer, vol. 22(1), pages 143-176, January.
    20. Brown, James N & Rosen, Harvey S, 1982. "On the Estimation of Structural Hedonic Price Models," Econometrica, Econometric Society, vol. 50(3), pages 765-768, May.
    21. Xue, Jiacheng & Yao, Weixin, 2022. "Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions," Econometrics and Statistics, Elsevier, vol. 22(C), pages 159-171.
