IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v85y2015icp54-66.html

GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries

Author

Listed:
  • Kong, Maiying
  • Xu, Sheng
  • Levy, Steven M.
  • Datta, Somnath

Abstract

Zero-inflated count data models are commonly used in applications where the number of zero counts exceeds that predicted by a traditional count data model such as the Poisson or negative binomial. When count data with inflated zeros are correlated among subjects, a natural approach is to fit a marginal model using generalized estimating equations (GEE), which can incorporate subject-to-subject correlations. A GEE-based zero-inflated negative binomial (ZINB) model is proposed for fitting clustered counts with excess zeros. However, the corresponding sandwich variance estimator appears to underestimate the true variance. The theoretical reasons for its failure are explained, and a correction under additional modeling assumptions is offered. In addition, a clustered resampling (bootstrap) procedure is proposed for estimating the variance, and it is shown that the bootstrap procedure captures the correct variance without additional model assumptions. The utility of this marginal GEE-based ZINB model relative to two competing models is assessed in a thorough simulation study. The resulting inference procedure is applied to study the association between dental caries and fluoride exposures using a dataset extracted from the Iowa Fluoride Study. A number of risk factors of clinical significance are reliably identified using the proposed model.

Suggested Citation

  • Kong, Maiying & Xu, Sheng & Levy, Steven M. & Datta, Somnath, 2015. "GEE type inference for clustered zero-inflated negative binomial regression with application to dental caries," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 54-66.
  • Handle: RePEc:eee:csdana:v:85:y:2015:i:c:p:54-66
    DOI: 10.1016/j.csda.2014.11.014

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947314003375
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Lim, Hwa Kyung & Song, Juwon & Jung, Byoung Cheol, 2013. "Score tests for zero-inflation and overdispersion in two-level count data," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 67-82.
    2. Garay, Aldo M. & Hashimoto, Elizabeth M. & Ortega, Edwin M.M. & Lachos, Víctor H., 2011. "On estimation and influence diagnostics for zero-inflated negative binomial regression models," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1304-1318, March.
    3. Zeileis, Achim & Kleiber, Christian & Jackman, Simon, 2008. "Regression Models for Count Data in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 27(i08).
    4. Feng-Chang Xie & Jin-Guan Lin & Bo-Cheng Wei, 2014. "Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(6), pages 1383-1392, June.
    5. Abbas Moghimbeigi & Mohammed Reza Eshraghian & Kazem Mohammad & Brian Mcardle, 2008. "Multilevel zero-inflated negative binomial regression modeling for over-dispersed count data with extra zeros," Journal of Applied Statistics, Taylor & Francis Journals, vol. 35(10), pages 1193-1202.

    Citations

    Cited by:

    1. Nasim Vahabi & Anoshirvan Kazemnejad & Somnath Datta, 2018. "A Marginalized Overdispersed Location Scale Model for Clustered Ordinal Data," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 103-134, December.
    2. Adrian Fianu & Hind Aissaoui & Nadège Naty & Victorine Lenclume & Anne-Françoise Casimir & Emmanuel Chirpaz & Olivier Maillard & Michel Spodenkiewicz & Nicolas Bouscaren & Michelle Kelly-Irving & Emma, 2022. "Health Impacts of the COVID-19 Lockdown Measure in a Low Socio-Economic Setting: A Cross-Sectional Study on Reunion Island," IJERPH, MDPI, vol. 19(21), pages 1-22, October.
    3. Soutik Ghosal & Timothy S. Lau & Jeremy Gaskins & Maiying Kong, 2020. "A hierarchical mixed effect hurdle model for spatiotemporal count data and its application to identifying factors impacting health professional shortages," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(5), pages 1121-1144, November.
    4. Chung‐Wei Shen & Chun‐Shu Chen, 2024. "Estimation and selection for spatial zero‐inflated count models," Environmetrics, John Wiley & Sons, Ltd., vol. 35(4), June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Antonio J. Sáez-Castillo & Antonio Conde-Sánchez, 2017. "Detecting over- and under-dispersion in zero inflated data with the hyper-Poisson regression model," Statistical Papers, Springer, vol. 58(1), pages 19-33, March.
    2. Johanna Eklund & Julia P. G. Jones & Matti Räsänen & Jonas Geldmann & Ari-Pekka Jokinen & Adam Pellegrini & Domoina Rakotobe & O. Sarobidy Rakotonarivo & Tuuli Toivonen & Andrew Balmford, 2022. "Elevated fires during COVID-19 lockdown and the vulnerability of protected areas," Nature Sustainability, Nature, vol. 5(7), pages 603-609, July.
    3. Totterman, Stephen, 2021. "Vehicle-based recreation and compliance for three beaches in northern New South Wales," OSF Preprints ja8h6, Center for Open Science.
    4. Wang, Liang & Xie, Zaiyang & Abdi, Majid & Lee, June Y. & Li, Stan Xiao, 2024. "The rise of female board representation in China as a glocalization process (2010–2018)," Journal of Business Research, Elsevier, vol. 172(C).
    5. Jong-Hyun Kim & Yong-Gil Lee, 2021. "Factors of Collaboration Affecting the Performance of Alternative Energy Patents in South Korea from 2010 to 2017," Sustainability, MDPI, vol. 13(18), pages 1-25, September.
    6. Olga Alipova & Lada Litvinova & Andrey Lovakov & Maria Yudkevich, 2018. "Inbreds And Non-Inbreds Among Russian Academics: Short-Term Similarity And Long-Term Differences In Productivity," HSE Working papers WP BRP 48/EDU/2018, National Research University Higher School of Economics.
    7. Christian Kleiber & Achim Zeileis, 2016. "Visualizing Count Data Regressions Using Rootograms," The American Statistician, Taylor & Francis Journals, vol. 70(3), pages 296-303, July.
    8. Sewando, Ponsian T. & Mdoe, N. Y. S. & Mutabazi, K. D. S, 2011. "Farmers’ preferential choice decisions to alternative cassava value chain strands in Morogoro rural district, Tanzania," MPRA Paper 29797, University Library of Munich, Germany.
    9. Merl, Robert & Palan, Stefan & Schmidt, Dominik & Stöckl, Thomas, 2023. "Insider trading regulation and trader migration," Journal of Financial Markets, Elsevier, vol. 66(C).
    10. Sean J. Blamires & Cheng-Hui Lai & Ren-Chung Cheng & Chen-Pan Liao & Pao-Sheng Shen & I-Min Tso, 2012. "Body spot coloration of a nocturnal sit-and-wait predator visually lures prey," Behavioral Ecology, International Society for Behavioral Ecology, vol. 23(1), pages 69-74.
    11. Lawrence N Kazembe, 2013. "A Bayesian Two Part Model Applied to Analyze Risk Factors of Adult Mortality with Application to Data from Namibia," PLOS ONE, Public Library of Science, vol. 8(9), pages 1-10, September.
    12. Soutik Ghosal & Timothy S. Lau & Jeremy Gaskins & Maiying Kong, 2020. "A hierarchical mixed effect hurdle model for spatiotemporal count data and its application to identifying factors impacting health professional shortages," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(5), pages 1121-1144, November.
    13. Erich Striessnig & Elke Loichinger, 2015. "Future differential vulnerability to natural disasters by level of education," Vienna Yearbook of Population Research, Vienna Institute of Demography (VID) of the Austrian Academy of Sciences in Vienna, vol. 13(1), pages 221-240.
    14. Isihara, Paul & Shi, Chaojun & Ward, Jonathan & O'Malley, Leo & Laney, Skyler & Diedrichs, Danilo & Flores, Gabriel, 2020. "Identifying most typical and most ideal attribute levels in small populations of expert decision makers: Studying the Go/No Go decision of disaster relief organizations," Journal of choice modelling, Elsevier, vol. 35(C).
    15. Ina Falfán & Luis Zambrano, 2023. "Lacustrine Urban Blue Spaces: Low Availability and Inequitable Distribution in the Most Populated Cities in Mexico," Land, MDPI, vol. 12(1), pages 1-18, January.
    16. Augustin, Nicole H. & Sauleau, Erik-André & Wood, Simon N., 2012. "On quantile quantile plots for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 56(8), pages 2404-2409.
    17. Gerike, Regine & Gehlert, Tina & Leisch, Friedrich, 2015. "Time use in travel surveys and time use surveys – Two sides of the same coin?," Transportation Research Part A: Policy and Practice, Elsevier, vol. 76(C), pages 4-24.
    18. Guarino, Ernestino de Souza Gomes & Barbosa, Ana Márcia & Waechter, Jorge Luiz, 2012. "Occurrence and abundance models of threatened plant species: Applications to mitigate the impact of hydroelectric power dams," Ecological Modelling, Elsevier, vol. 230(C), pages 22-33.
    19. Evgenii V. Gilenko & Elena A. Mironova, 2017. "Modern claim frequency and claim severity models: An application to the Russian motor own damage insurance market," Cogent Economics & Finance, Taylor & Francis Journals, vol. 5(1), pages 1311097-131, January.
    20. Livio Finos & Fortunato Pesarin, 2020. "On zero-inflated permutation testing and some related problems," Statistical Papers, Springer, vol. 61(5), pages 2157-2174, October.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.