Printed from https://ideas.repec.org/a/eee/csdana/v56y2012i8p2404-2409.html

On quantile quantile plots for generalized linear models

Authors

  • Augustin, Nicole H.
  • Sauleau, Erik-André
  • Wood, Simon N.

Abstract

The distributional assumption for a generalized linear model is often checked by plotting the ordered deviance residuals against the quantiles of a standard normal distribution. Such plots can be difficult to interpret because, even when the model is correct, the plot often deviates substantially from a straight line. To rectify this problem, Ben and Yohai (2004) proposed plotting the deviance residuals against their theoretical quantiles, computed under the assumption that the model is correct. When the model is correct, such plots are closer to a straight line, making them much more useful for model checking. However, the quantile computation proposed by Ben and Yohai is, in general, relatively complicated to implement and computationally expensive, so general-purpose software for these plots is only available for the Poisson and binary cases, in the R package robust. As an alternative, the theoretical quantiles can be estimated simply and efficiently by repeatedly simulating new response data from the fitted model and computing the corresponding residuals. This method also provides reference bands for judging the significance of departures of QQ-plots from the ideal straight-line form. A second alternative is to estimate the quantiles using quantiles of the response variable distribution according to the estimated model. This latter alternative generally has lower computational cost than the first, but does not yield QQ-plot reference bands. In simulations, the quantiles produced by the new methods give results indistinguishable from the original Ben and Yohai quantile computations, but the scaling of computational cost with sample size is much improved: a 500-fold reduction in computation time was observed at sample size 50,000. Application of the methods to generalized linear models fitted to prostate cancer incidence data suggests that they are particularly useful in large-dataset cases that might otherwise be incorrectly viewed as zero-inflated. The new approaches are simple enough to implement for any exponential family distribution and for several alternative types of residual, and this has been done for all the families available for use with generalized linear models in the basic distribution of R.
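
The simulation-based alternative described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Poisson model whose fitted means are already available, simulates new responses from those means without refitting, and takes pointwise percentiles of the simulated order statistics as reference bands. The function names (poisson_draw, deviance_residual, simulated_qq), the number of replicates, the band level, and the fitted means are all hypothetical choices made for illustration.

```python
import math
import random


def poisson_draw(mu, rng):
    """Draw one Poisson(mu) variate by Knuth's method (adequate for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def deviance_residual(y, mu):
    """Poisson deviance residual; y*log(y/mu) is taken as 0 when y == 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    dev = 2.0 * (term - (y - mu))
    return math.copysign(math.sqrt(max(dev, 0.0)), y - mu)


def simulated_qq(y, mu, n_rep=200, level=0.9, seed=1):
    """Return (estimated quantiles, lower band, upper band, sorted observed residuals)."""
    rng = random.Random(seed)
    n = len(y)
    observed = sorted(deviance_residual(yi, mi) for yi, mi in zip(y, mu))
    # Each replicate: simulate responses from the fitted means, sort the residuals.
    reps = [
        sorted(deviance_residual(poisson_draw(mi, rng), mi) for mi in mu)
        for _ in range(n_rep)
    ]
    lo_idx = int((1 - level) / 2 * (n_rep - 1))
    hi_idx = n_rep - 1 - lo_idx
    quantiles, lower, upper = [], [], []
    for j in range(n):
        col = sorted(rep[j] for rep in reps)  # j-th order statistic across replicates
        quantiles.append(sum(col) / n_rep)    # estimated theoretical quantile
        lower.append(col[lo_idx])             # pointwise reference band
        upper.append(col[hi_idx])
    return quantiles, lower, upper, observed


# Hypothetical fitted means and responses, for illustration only.
mu_hat = [0.5 + 0.1 * i for i in range(30)]
data_rng = random.Random(0)
y_obs = [poisson_draw(m, data_rng) for m in mu_hat]
q, lo, hi, r = simulated_qq(y_obs, mu_hat)
outside = sum(not (a <= b <= c) for a, b, c in zip(lo, r, hi))
print(f"{outside} of {len(r)} sorted residuals fall outside the pointwise band")
```

Plotting r against q would give the QQ-plot, with lo and hi drawn as the reference band. In R, the qq.gam function in the mgcv package provides QQ-plots of this kind for fitted models.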

Suggested Citation

  • Augustin, Nicole H. & Sauleau, Erik-André & Wood, Simon N., 2012. "On quantile quantile plots for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 56(8), pages 2404-2409.
  • Handle: RePEc:eee:csdana:v:56:y:2012:i:8:p:2404-2409
    DOI: 10.1016/j.csda.2012.01.026

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947312000692
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Chen, Xue-Dong & Fu, Ying-Zi, 2011. "Model selection for zero-inflated regression with missing covariates," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 765-773, January.
    2. Garay, Aldo M. & Hashimoto, Elizabeth M. & Ortega, Edwin M.M. & Lachos, Víctor H., 2011. "On estimation and influence diagnostics for zero-inflated negative binomial regression models," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1304-1318, March.

    Citations

    Cited by:

    1. Simon N. Wood & Natalya Pya & Benjamin Säfken, 2016. "Smoothing Parameter and Model Selection for General Smooth Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1548-1563, October.
    2. Simon N. Wood, 2020. "Inference and computation with generalized additive models and their extensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 307-339, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aldo M. Garay & Victor H. Lachos & Heleno Bolfarine, 2015. "Bayesian estimation and case influence diagnostics for the zero-inflated negative binomial regression model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(6), pages 1148-1165, June.
    2. Feng-Chang Xie & Jin-Guan Lin & Bo-Cheng Wei, 2014. "Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(6), pages 1383-1392, June.
    3. Damien Besancenot & Kim Huynh & Francisco Serranito, 2015. "Co-Authorship And Individual Research Productivity In Economics: Assessing The Assortative Matching Hypothesis," Working Papers halshs-01252373, HAL.
    4. Damien Besancenot & Kim Van Huynh & Francisco Serranito, 2015. "Thou shalt not work alone," Working Papers hal-01175758, HAL.
    5. Lukusa, Martin T. & Phoa, Frederick Kin Hing, 2020. "A note on the weighting-type estimations of the zero-inflated Poisson regression model with missing data in covariates," Statistics & Probability Letters, Elsevier, vol. 158(C).
    6. T. Martin Lukusa & Shen-Ming Lee & Chin-Shang Li, 2016. "Semiparametric estimation of a zero-inflated Poisson regression model with missing covariates," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 79(4), pages 457-483, May.
    7. Bermúdez, Lluís & Karlis, Dimitris, 2012. "A finite mixture of bivariate Poisson regression models with an application to insurance ratemaking," Computational Statistics & Data Analysis, Elsevier, vol. 56(12), pages 3988-3999.
    8. Kong, Maiying & Xu, Sheng & Levy, Steven M. & Datta, Somnath, 2015. "GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 54-66.
    9. Marco Schito, 2021. "A Sectoral Approach to the Politics of State Aid in the European Union: an Analysis of the European Automotive Industry," Journal of Industry, Competition and Trade, Springer, vol. 21(1), pages 1-31, March.
    10. Jussiane Nader Gonçalves & Wagner Barreto-Souza, 2020. "Flexible regression models for counts with high-inflation of zeros," METRON, Springer;Sapienza Università di Roma, vol. 78(1), pages 71-95, April.
    11. Shen-Ming Lee & T. Martin Lukusa & Chin-Shang Li, 2020. "Estimation of a zero-inflated Poisson regression model with missing covariates via nonparametric multiple imputation methods," Computational Statistics, Springer, vol. 35(2), pages 725-754, June.
    12. Derek S. Young & Andrew M. Raim & Nancy R. Johnson, 2017. "Zero-inflated modelling for characterizing coverage errors of extracts from the US Census Bureau's Master Address File," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(1), pages 73-97, January.
    13. Fatemeh Hassanzadeh & Iraj Kazemi, 2017. "Regression modeling of one-inflated positive count data," Statistical Papers, Springer, vol. 58(3), pages 791-809, September.
    14. Antonio J. Sáez-Castillo & Antonio Conde-Sánchez, 2017. "Detecting over- and under-dispersion in zero inflated data with the hyper-Poisson regression model," Statistical Papers, Springer, vol. 58(1), pages 19-33, March.
    15. Sáez-Castillo, A.J. & Conde-Sánchez, A., 2013. "A hyper-Poisson regression model for overdispersed and underdispersed count data," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 148-157.
    16. Elton G. Aráujo & Julio C. S. Vasconcelos & Denize P. Santos & Edwin M. M. Ortega & Dalton Souza & João P. F. Zanetoni, 2023. "The Zero-Inflated Negative Binomial Semiparametric Regression Model: Application to Number of Failing Grades Data," Annals of Data Science, Springer, vol. 10(4), pages 991-1006, August.
    17. Maria Goranova & Rahi Abouk & Paul C. Nystrom & Ehsan S. Soofi, 2017. "Corporate governance antecedents to shareholder activism: A zero-inflated process," Strategic Management Journal, Wiley Blackwell, vol. 38(2), pages 415-435, February.
    18. Young, Jesse D. & Anderson, Nathaniel M. & Naughton, Helen T. & Mullan, Katrina, 2018. "Economic and policy factors driving adoption of institutional woody biomass heating systems in the U.S," Energy Economics, Elsevier, vol. 69(C), pages 456-470.
    19. Jesse D. Young & Nathaniel M. Anderson & Helen T. Naughton, 2018. "Influence of Policy, Air Quality, and Local Attitudes toward Renewable Energy on the Adoption of Woody Biomass Heating Systems," Energies, MDPI, vol. 11(11), pages 1-24, October.
    20. Yang, Miao & Das, Kalyan & Majumdar, Anandamayee, 2016. "Analysis of bivariate zero inflated count data with missing responses," Journal of Multivariate Analysis, Elsevier, vol. 148(C), pages 73-82.
