
Semiparametric count data regression for self‐reported mental health

Author

Listed:
  • Daniel R. Kowal
  • Bohan Wu

Abstract

“For how many days during the past 30 days was your mental health not good?” The responses to this question measure self-reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: the data are overdispersed, zero-inflated, bounded by 30, and heaped in 5- and 7-day increments. To address these challenges, which are especially common for health questionnaire data, we design a semiparametric estimation and inference framework for count data regression. The data-generating process is defined by simultaneously transforming and rounding (STAR) a latent Gaussian regression model. The transformation is estimated nonparametrically and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an expectation-maximization (EM) algorithm that is compatible with any continuous data model estimable by least squares. STAR regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. Using STAR regression, we identify key factors associated with self-reported mental health and demonstrate substantial improvements in goodness-of-fit compared to existing count data regression models.
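
As a rough illustration of the data-generating process described in the abstract, the sketch below simulates counts by rounding a transformed latent Gaussian variable and evaluates the implied probability mass function. It is not the authors' implementation. Several pieces are assumptions made only for this example: the transformation is fixed to a logarithm (the paper estimates it nonparametrically), the rounding operator maps latent values below 1 to zero and caps at 30, and the names `g`, `round_op`, `star_pmf`, `simulate_star`, and the coefficients in `BETA` are hypothetical.

```python
import math
import random

Y_MAX = 30          # upper bound of the 30-day count
BETA = (0.5, 0.8)   # hypothetical intercept and slope, for illustration only
SIGMA = 1.0         # hypothetical latent standard deviation


def g(t):
    """Assumed transformation: log on (0, inf), extended to -inf and +inf."""
    if t == math.inf:
        return math.inf
    if t <= 0:
        return -math.inf
    return math.log(t)


def round_op(y_star):
    """Assumed rounding operator: 0 below 1, floor otherwise, capped at Y_MAX."""
    if y_star < 1:
        return 0
    return min(int(math.floor(y_star)), Y_MAX)


def norm_cdf(x):
    """Standard normal CDF; math.erf handles infinite arguments."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def star_pmf(j, mu, sigma):
    """P(y = j): latent Gaussian mass between the transformed cut points of j."""
    lower = -math.inf if j == 0 else float(j)
    upper = math.inf if j == Y_MAX else float(j + 1)
    return norm_cdf((g(upper) - mu) / sigma) - norm_cdf((g(lower) - mu) / sigma)


def simulate_star(mu, sigma, rng):
    """Draw the latent Gaussian, invert the log transformation, then round."""
    z_star = rng.gauss(mu, sigma)
    return round_op(math.exp(z_star))


# Latent mean for one hypothetical covariate value x = 0.25.
mu = BETA[0] + BETA[1] * 0.25
rng = random.Random(1)
draws = [simulate_star(mu, SIGMA, rng) for _ in range(20000)]
pmf = [star_pmf(j, mu, SIGMA) for j in range(Y_MAX + 1)]
print("simulated share of zeros:", sum(d == 0 for d in draws) / len(draws))
print("model probability of zero:", pmf[0])
```

Because the first and last cut points are infinite, the probabilities over 0, ..., 30 sum to one by construction, which is how the rounding step enforces the bounded support. Zero inflation arises in this sketch because every latent value below 1 is mapped to zero.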

Suggested Citation

  • Daniel R. Kowal & Bohan Wu, 2023. "Semiparametric count data regression for self‐reported mental health," Biometrics, The International Biometric Society, vol. 79(2), pages 1520-1533, June.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:2:p:1520-1533
    DOI: 10.1111/biom.13617

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13617
    Download Restriction: no

    File URL: https://libkey.io/10.1111/biom.13617?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Dunson, David B., 2005. "Bayesian Semiparametric Isotonic Regression for Count Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 618-627, June.
    2. Amy H. Herring & David B. Dunson & Nancy Dole, 2004. "Modeling the Effects of a Bidirectional Latent Predictor from Multivariate Questionnaire Data," Biometrics, The International Biometric Society, vol. 60(4), pages 926-935, December.
    3. Antonio Canale & David B. Dunson, 2013. "Nonparametric Bayes modelling of count processes," Biometrika, Biometrika Trust, vol. 100(4), pages 801-816.
    4. Xinyuan Song & Yemao Xia & Hongtu Zhu, 2017. "Hidden Markov latent variable models with multivariate longitudinal data," Biometrics, The International Biometric Society, vol. 73(1), pages 313-323, March.
    5. Victor Kipnis & Douglas Midthune & Dennis W. Buckman & Kevin W. Dodd & Patricia M. Guenther & Susan M. Krebs-Smith & Amy F. Subar & Janet A. Tooze & Raymond J. Carroll & Laurence S. Freedman, 2009. "Modeling Data with Excess Zeros and Measurement Error: Application to Evaluating Relationships between Episodically Consumed Foods and Health Outcomes," Biometrics, The International Biometric Society, vol. 65(4), pages 1003-1010, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nicholas Beyler & Susanne James-Burdumy & Martha Bleeker & Jane Fortson & Max Benjamin, "undated". "Measurement Error Properties in an Accelerometer Sample of U.S. Elementary School Children," Mathematica Policy Research Reports 6c99580fa94443459f3cbd005, Mathematica Policy Research.
    2. Abel Rodriguez & Enrique ter Horst, 2008. "Measuring expectations in options markets: An application to the SP500 index," Papers 0901.0033, arXiv.org.
    3. Lin, Yiqi & Song, Xinyuan, 2022. "Order selection for regression-based hidden Markov model," Journal of Multivariate Analysis, Elsevier, vol. 192(C).
    4. Shively, Thomas S. & Kockelman, Kara & Damien, Paul, 2010. "A Bayesian semi-parametric model to estimate relationships between crash counts and roadway characteristics," Transportation Research Part B: Methodological, Elsevier, vol. 44(5), pages 699-715, June.
    5. Qi Zhang & Yihui Zhang & Yemao Xia, 2024. "Bayesian Feature Extraction for Two-Part Latent Variable Model with Polytomous Manifestations," Mathematics, MDPI, vol. 12(5), pages 1-23, March.
    6. Kelly R. Moran & Matthew W. Wheeler, 2022. "Fast increased fidelity samplers for approximate Bayesian Gaussian process regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1198-1228, September.
    7. Liu, Hefei & Song, Xinyuan & Zhang, Baoxue, 2022. "Varying-coefficient hidden Markov models with zero-effect regions," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    8. Zhou, Jie & Song, Xinyuan & Sun, Liuquan, 2020. "Continuous time hidden Markov model for longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 179(C).
    9. Zarepour, Mahmoud & Labadi, Luai Al, 2012. "On a rapid simulation of the Dirichlet process," Statistics & Probability Letters, Elsevier, vol. 82(5), pages 916-924.
    10. Kelly R. Moran & Elizabeth L. Turner & David Dunson & Amy H. Herring, 2021. "Bayesian hierarchical factor regression models to infer cause of death from verbal autopsy data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(3), pages 532-557, June.
    11. Harris-Fry, Helen & Saville, Naomi M. & Paudel, Puskar & Manandhar, Dharma S. & Cortina-Borja, Mario & Skordis, Jolene, 2022. "Relative power: Explaining the effects of food and cash transfers on allocative behaviour in rural Nepalese households," Journal of Development Economics, Elsevier, vol. 154(C).
    12. Shively, Thomas S. & Walker, Stephen G. & Damien, Paul, 2011. "Nonparametric function estimation subject to monotonicity, convexity and other shape constraints," Journal of Econometrics, Elsevier, vol. 161(2), pages 166-181, April.
    13. Liu, Hefei & Song, Xinyuan, 2021. "Bayesian analysis of hidden Markov structural equation models with an unknown number of hidden states," Econometrics and Statistics, Elsevier, vol. 18(C), pages 29-43.
    14. repec:jss:jstsof:46:c03 is not listed on IDEAS
    15. Chenguang Wang & Ao Yuan & Leslie Cope & Jing Qin, 2022. "A semiparametric isotonic regression model for skewed distributions with application to DNA–RNA–protein analysis," Biometrics, The International Biometric Society, vol. 78(4), pages 1464-1474, December.
    16. Xia, Ye-Mao & Tang, Nian-Sheng, 2019. "Bayesian analysis for mixture of latent variable hidden Markov models with multivariate longitudinal data," Computational Statistics & Data Analysis, Elsevier, vol. 132(C), pages 190-211.
    17. Ching-Yun Wang & Jean de Dieu Tapsoba & Catherine Duggan & Anne McTiernan, 2024. "Generalized Linear Models with Covariate Measurement Error and Zero-Inflated Surrogates," Mathematics, MDPI, vol. 12(2), pages 1-14, January.
    18. Abel Rodríguez & Enrique ter Horst, 2011. "Measuring expectations in options markets: an application to the S&P500 index," Quantitative Finance, Taylor & Francis Journals, vol. 11(9), pages 1393-1405, July.
    19. repec:mpr:mprres:7903 is not listed on IDEAS
    20. Burda, Martin & Harding, Matthew & Hausman, Jerry, 2008. "A Bayesian mixed logit-probit model for multinomial choice," Journal of Econometrics, Elsevier, vol. 147(2), pages 232-246, December.
    21. Zhang Saijuan & Krebs-Smith Susan M. & Midthune Douglas & Perez Adriana & Buckman Dennis W. & Kipnis Victor & Freedman Laurence S. & Dodd Kevin W. & Carroll Raymond J, 2011. "Fitting a Bivariate Measurement Error Model for Episodically Consumed Dietary Components," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-32, January.
    22. John Haslett & Andrew Parnell, 2008. "A simple monotone process with application to radiocarbon‐dated depth chronologies," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 57(4), pages 399-418, September.
