Goodness-of-fit tests for categorical data

Author

Listed:
  • Rino Bellocco

    (University of Milano–Bicocca; Karolinska Institutet)

  • Sara Algeri

    (Texas A&M University)

Abstract

A significant aspect of data modeling with categorical predictors is the definition of a saturated model. In fact, there are different ways of specifying it—the casewise, the contingency table, and the collapsing approaches—and they strictly depend on the unit of analysis considered. The analytical units of reference could be the subjects or, alternatively, groups of subjects that have the same covariate pattern. In the first case, the goal is to predict the probability of success (failure) for each individual; in the second case, the goal is to predict the proportion of successes (failures) in each group. The analytical unit adopted does not affect the estimation process; however, it does affect the definition of a saturated model. Consequently, measures and tests of goodness of fit can lead to different results and interpretations. Thus one must carefully consider which approach to choose. In this article, we focus on the deviance test for logistic regression models. However, the results and the conclusions are easily applicable to other linear models involving categorical regressors. We show how Stata 12.1 performs when implementing goodness of fit. In this situation, it is important to clarify which one of the three approaches is implemented as default. Furthermore, a prominent role is played by the shape of the dataset considered (individual format or events–trials format) in accordance with the analytical unit choice. In fact, the same procedure applied to different data structures leads to different approaches to a saturated model. Thus one must attend to practical and theoretical statistical issues to avoid inappropriate analyses. Copyright 2013 by StataCorp LP.
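
The distinction between the casewise and the contingency table approaches can be made concrete with a small numerical sketch. The Python snippet below is not taken from the article and does not reproduce Stata's implementation. It uses hypothetical events-trials counts and an intercept-only logistic model, chosen only because its maximum likelihood estimate has a closed form (the pooled proportion of successes). The fitted probability is the same in both formats, but the deviance and its degrees of freedom change with the saturated model used as the reference.

```python
import math


def xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


# Hypothetical data (an assumption for illustration, not from the article):
# (successes, trials) for each of two covariate patterns.
groups = [(3, 10), (7, 10)]

total_y = sum(y for y, n in groups)
total_n = sum(n for y, n in groups)

# Intercept-only logistic model: the MLE of the success probability is the
# pooled proportion, whichever analytical unit is adopted.
p_hat = total_y / total_n

# Casewise approach: each subject is a unit. The saturated model predicts
# every 0/1 outcome perfectly, so its log-likelihood is 0.
loglik_casewise = sum(
    xlogy(y, p_hat) + xlogy(n - y, 1 - p_hat) for y, n in groups
)
deviance_casewise = -2.0 * loglik_casewise
df_casewise = total_n - 1  # subjects minus estimated parameters

# Contingency table approach: each covariate pattern is a unit. The
# saturated model fits the observed proportion y/n in every group.
deviance_grouped = 2.0 * sum(
    xlogy(y, y / (n * p_hat)) + xlogy(n - y, (n - y) / (n * (1 - p_hat)))
    for y, n in groups
)
df_grouped = len(groups) - 1  # covariate patterns minus estimated parameters

print(f"fitted probability: {p_hat:.3f}")
print(f"casewise deviance:  {deviance_casewise:.4f} on {df_casewise} df")
print(f"grouped deviance:   {deviance_grouped:.4f} on {df_grouped} df")
```

The two deviances come from the same fitted model and differ only in the reference saturated model, which is why a goodness-of-fit test based on them can lead to different conclusions.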

Suggested Citation

  • Rino Bellocco & Sara Algeri, 2013. "Goodness-of-fit tests for categorical data," Stata Journal, StataCorp LP, vol. 13(2), pages 356-365, June.
  • Handle: RePEc:tsj:stataj:y:13:y:2013:i:2:p:356-365
    Note: to access software from within Stata, net describe http://www.stata-journal.com/software/sj13-2/st0299/

    Download full text from publisher

    File URL: http://www.stata-journal.com/article.html?article=st0299
    File Function: link to article purchase
    Download Restriction: no

    Citations

    Cited by:

    1. Jörg Stolz & Anaïd Lindemann & Jean-Philippe Antonietti, 2019. "Sociological explanation and mixed methods: the example of the Titanic," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(3), pages 1623-1643, May.
