IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i5p555-d511924.html

On the Discretization of Continuous Probability Distributions Using a Probabilistic Rounding Mechanism

Author

Listed:
  • Chénangnon Frédéric Tovissodé

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin)

  • Sèwanou Hermann Honfo

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin
    These authors contributed equally to this work.)

  • Jonas Têlé Doumatè

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin
    Faculté des Sciences et Techniques, Université d’Abomey-Calavi, Abomey-Calavi, Benin
    These authors contributed equally to this work.)

  • Romain Glèlè Kakaï

    (Laboratoire de Biomathématiques et d’Estimations Forestières, Université d’Abomey-Calavi, Abomey-Calavi, Benin)

Abstract

Most existing flexible count distributions allow only approximate inference when used in a regression context. This work proposes a new framework to provide an exact and flexible alternative for modeling and simulating count data with various types of dispersion (equi-, under-, and over-dispersion). The new method, referred to as “balanced discretization”, consists of discretizing continuous probability distributions while preserving expectations. It is easy to generate pseudo-random variates from the resulting balanced discrete distribution since it has a simple stochastic representation (probabilistic rounding) in terms of the continuous distribution. For illustrative purposes, we develop the family of balanced discrete gamma distributions, which can model equi-, under-, and over-dispersed count data. This family of count distributions is appropriate for building flexible count regression models because the expectation of the distribution has a simple expression in terms of its parameters. Using the Jensen–Shannon divergence measure, we show that under the equidispersion restriction, the family of balanced discrete gamma distributions is similar to the Poisson distribution. Based on this, we conjecture that, while covering all types of dispersion, a count regression model based on the balanced discrete gamma distribution will recover a model fit close to the Poisson fit when the data are Poisson distributed.
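
The probabilistic rounding idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the term: a continuous draw x is rounded up with probability equal to its fractional part and rounded down otherwise, so the rounded value has conditional expectation x. The abstract does not spell out the construction, so this reading, the function names, and the parameter values below are assumptions made for illustration and not the authors' implementation.

```python
import math
import random


def probabilistic_round(x, rng):
    """Round x to floor(x) or floor(x) + 1 so that the expected result is x.

    Assumed mechanism: round up with probability equal to the fractional part.
    """
    lower = math.floor(x)
    frac = x - lower
    return lower + (1 if rng.random() < frac else 0)


def sample_rounded_gamma(shape, scale, size, seed=0):
    """Draw gamma variates and round each one probabilistically (illustrative)."""
    rng = random.Random(seed)
    return [
        probabilistic_round(rng.gammavariate(shape, scale), rng)
        for _ in range(size)
    ]


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


if __name__ == "__main__":
    # Hypothetical parameters: the gamma mean is shape * scale = 3.0.
    counts = sample_rounded_gamma(shape=4.0, scale=0.75, size=20000)
    mean, var = mean_and_variance(counts)
    print(f"sample mean = {mean:.3f}, variance/mean = {var / mean:.3f}")
```

Under this assumption the sample mean of the counts should sit close to shape * scale, which is the "preserving expectations" property the abstract describes. The variance-to-mean ratio printed at the end gives a quick check of whether a given parameter choice yields under-, equi-, or over-dispersed counts.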

Suggested Citation

  • Chénangnon Frédéric Tovissodé & Sèwanou Hermann Honfo & Jonas Têlé Doumatè & Romain Glèlè Kakaï, 2021. "On the Discretization of Continuous Probability Distributions Using a Probabilistic Rounding Mechanism," Mathematics, MDPI, vol. 9(5), pages 1-17, March.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:5:p:555-:d:511924

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/5/555/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/5/555/
    Download Restriction: no


