
A Bayesian approach for analysis of ordered categorical responses subject to misclassification

Authors
  • Ashley Ling
  • El Hamidi Hay
  • Samuel E Aggrey
  • Romdhane Rekaya

Abstract

Ordinal categorical responses are frequently collected in survey studies, human medicine, and animal and plant improvement programs, among other fields. Errors in this type of data are neither rare nor easy to detect. These errors tend to bias inference and reduce statistical power and, ultimately, the efficiency of the decision-making process. In contrast to the binary situation, where misclassification occurs between two response classes, noise in ordinal categorical data is more complex because of the larger number of categories and the diversity and asymmetry of errors. Although several approaches have been presented for dealing with misclassification in binary data, only a limited number of practical methods have been proposed to analyze noisy categorical responses. A latent variable model implemented within a Bayesian framework was proposed to analyze ordinal categorical data subject to misclassification and was evaluated using simulated and real datasets. The simulated scenario consisted of a discrete response with three categories and a symmetric error rate of 5% between any two classes. The real data consisted of calving ease records of beef cows. With both real and simulated data, ignoring misclassification resulted in substantial bias in the estimation of genetic parameters and reduced the accuracy of predicted breeding values. Using our proposed approach, a significant reduction in bias and an increase in accuracy ranging from 11% to 17% were observed. Furthermore, most of the misclassified observations (in the simulated data) were identified with a substantially higher probability. Similar results were observed for a scenario with asymmetric misclassification. While the extension to traits with more categories between adjacent classes is straightforward, it could be computationally costly. For traits with high heritability, the performance of the methodology would be expected to improve.
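
To make the simulated scenario concrete, here is a minimal sketch of how three-category ordinal data with a 5% symmetric error rate might be generated. It is not the authors' code. It assumes a threshold (liability) formulation, one common latent variable model for ordinal data, with hypothetical thresholds of -0.5 and 0.5 on a standard-normal latent scale. It also assumes that "symmetric" means each record is misrecorded with probability 0.05 and then assigned to one of the other two classes with equal probability. The function names `simulate_ordinal` and `misclassify` are illustrative only.

```python
import numpy as np


def simulate_ordinal(n, thresholds, rng):
    """Draw standard-normal liabilities and cut them at the thresholds.

    With two thresholds this yields classes 0, 1 and 2.
    """
    liability = rng.standard_normal(n)
    # np.digitize returns the index of the interval each liability falls in.
    return np.digitize(liability, thresholds)


def misclassify(y, n_classes, error_rate, rng):
    """Flip each record with probability error_rate (assumed error mechanism).

    A flipped record moves to one of the other classes, chosen uniformly.
    Returns the observed classes and a boolean mask of flipped records.
    """
    y_obs = y.copy()
    flip = rng.random(y.size) < error_rate
    # An offset in 1..n_classes-1 taken modulo n_classes always lands on a
    # class different from the true one.
    offset = rng.integers(1, n_classes, size=int(flip.sum()))
    y_obs[flip] = (y[flip] + offset) % n_classes
    return y_obs, flip


rng = np.random.default_rng(0)
true_y = simulate_ordinal(10_000, thresholds=[-0.5, 0.5], rng=rng)
obs_y, flipped = misclassify(true_y, n_classes=3, error_rate=0.05, rng=rng)

print("share of flipped records:", flipped.mean())
```

Fitting a model to `obs_y` as if it were error-free corresponds to the "ignoring misclassification" analysis described in the abstract. The proposed approach would instead treat the flip indicator of each record as unknown and infer it jointly with the other parameters.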

Suggested Citation

  • Ashley Ling & El Hamidi Hay & Samuel E Aggrey & Romdhane Rekaya, 2018. "A Bayesian approach for analysis of ordered categorical responses subject to misclassification," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-17, December.
  • Handle: RePEc:plo:pone00:0208433
    DOI: 10.1371/journal.pone.0208433

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0208433
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0208433&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0208433?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Al-Kandari Noriah M. & Lahiri Partha, 2016. "Prediction of a Function of Misclassified Binary Data," Statistics in Transition New Series, Polish Statistical Association, vol. 17(3), pages 429-447, September.
    2. Martijn van Hasselt & Christopher R. Bollinger & Jeremy W. Bray, 2022. "A Bayesian approach to account for misclassification in prevalence and trend estimation," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(2), pages 351-367, March.
    3. Anil Gaba & W. Kip Viscusi, 1998. "Differences in Subjective Risk Thresholds: Worker Groups as an Example," Management Science, INFORMS, vol. 44(6), pages 801-811, June.
    4. M. Ruiz & F. J. Giron & C. J. Perez & J. Martin & C. Rojano, 2008. "A Bayesian model for multinomial sampling with misclassified data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 35(4), pages 369-382.
    5. Quinino, R. C. & Lee Ho, L., 2004. "Repetitive tests as an economic alternative procedure to control attributes with diagnosis errors," European Journal of Operational Research, Elsevier, vol. 155(1), pages 209-225, May.
    6. Noriah M. Al-Kandari & Partha Lahiri, 2016. "Prediction Of A Function Of Misclassified Binary Data," Statistics in Transition New Series, Polish Statistical Association, vol. 17(3), pages 429-447, September.
    7. Rahardja, Dewi & Young, Dean M., 2010. "Credible sets for risk ratios in over-reported two-sample binomial data using the double-sampling scheme," Computational Statistics & Data Analysis, Elsevier, vol. 54(5), pages 1281-1287, May.
    8. Klein, Barbara D., 2001. "Detecting errors in data: clarification of the impact of base rate expectations and incentives," Omega, Elsevier, vol. 29(5), pages 391-404, October.
    9. T. Pham-Gia & N. Turkhan, 2005. "Bayesian decision criteria in the presence of noises under quadratic and absolute value loss functions," Statistical Papers, Springer, vol. 46(2), pages 247-266, April.
    10. Bollinger, Christopher R. & van Hasselt, Martijn, 2017. "A Bayesian analysis of binary misclassification," Economics Letters, Elsevier, vol. 156(C), pages 68-73.
    11. Rahardja, Dewi & Young, Dean M., 2011. "Likelihood-based confidence intervals for the risk ratio using double sampling with over-reported binary data," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 813-823, January.
    12. Carlos Daniel Paulino & Paulo Soares & John Neuhaus, 2003. "Binomial Regression with Misclassification," Biometrics, The International Biometric Society, vol. 59(3), pages 670-675, September.
    13. Boese, Doyle H. & Young, Dean M. & Stamey, James D., 2006. "Confidence intervals for a binomial parameter based on binary data subject to false-positive misclassification," Computational Statistics & Data Analysis, Elsevier, vol. 50(12), pages 3369-3385, August.
    14. Partha Lahiri & Noriah M. Al-Kandari, 2016. "Prediction of a Function of Misclassified Binary Data," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 17(3), pages 429-447, September.
