Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i19p2477-d649532.html

Compositional Data Modeling through Dirichlet Innovations

Author

Listed:
  • Seitebaleng Makgai

    (Department of Statistics, University of Pretoria, Pretoria 0028, South Africa
    These authors contributed equally to this work.)

  • Andriette Bekker

    (Department of Statistics, University of Pretoria, Pretoria 0028, South Africa
    These authors contributed equally to this work.)

  • Mohammad Arashi

    (Department of Statistics, University of Pretoria, Pretoria 0028, South Africa
    Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran)

Abstract

The Dirichlet distribution is a well-known candidate for modeling compositional data sets. In the presence of outliers, however, the Dirichlet distribution fails to model such data sets adequately, making model extensions necessary. In this paper, the Kummer–Dirichlet distribution and the gamma distribution are coupled using the beta-generating technique. The result is the Kummer–Dirichlet gamma distribution, which offers greater flexibility in modeling compositional data sets. Some general properties, such as the probability density functions and the moments, are presented for this new candidate. The parameters are estimated by the method of maximum likelihood. The usefulness of the model is demonstrated through applications to synthetic and real data sets in which outliers are present.
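
As a rough illustration of the baseline the abstract starts from, the sketch below fits an ordinary Dirichlet distribution to synthetic compositional data by maximum likelihood. It does not implement the Kummer–Dirichlet gamma model proposed in the paper. The function names, the synthetic parameter values, and the choice of optimizer are assumptions made for this example only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln


def dirichlet_neg_loglik(log_alpha, x):
    """Negative Dirichlet log-likelihood; log_alpha keeps alpha positive."""
    alpha = np.exp(log_alpha)
    n = x.shape[0]
    # log of the normalizing constant, summed over n observations
    log_norm = n * (gammaln(alpha.sum()) - gammaln(alpha).sum())
    # sum over components of (alpha_i - 1) * sum_j log x_ji
    log_kernel = ((alpha - 1.0) * np.log(x).sum(axis=0)).sum()
    return -(log_norm + log_kernel)


def fit_dirichlet(x):
    """Hypothetical helper: ML estimate of alpha for rows of x on the simplex."""
    k = x.shape[1]
    result = minimize(
        dirichlet_neg_loglik,
        np.zeros(k),                 # start at alpha = (1, ..., 1)
        args=(x,),
        method="L-BFGS-B",
        bounds=[(-10.0, 10.0)] * k,  # keeps exp(log_alpha) in a safe range
    )
    return np.exp(result.x)


# Synthetic compositional data (assumed values, not from the paper)
rng = np.random.default_rng(0)
true_alpha = np.array([2.0, 5.0, 3.0])
data = rng.dirichlet(true_alpha, size=2000)

alpha_hat = fit_dirichlet(data)
print(alpha_hat)
```

With clean data of this size the estimate should land near the generating parameters; the abstract's point is that such a fit degrades once outliers are present, which motivates the more flexible model.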

Suggested Citation

  • Seitebaleng Makgai & Andriette Bekker & Mohammad Arashi, 2021. "Compositional Data Modeling through Dirichlet Innovations," Mathematics, MDPI, vol. 9(19), pages 1-18, October.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:19:p:2477-:d:649532

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/19/2477/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/19/2477/
    Download Restriction: no

    References listed on IDEAS

    1. Thomas, Seemon & Jacob, Joy, 2006. "A generalized Dirichlet model," Statistics & Probability Letters, Elsevier, vol. 76(16), pages 1761-1767, October.
    2. Barndorff-Nielsen, O. E. & Jørgensen, B., 1991. "Some parametric models on the simplex," Journal of Multivariate Analysis, Elsevier, vol. 39(1), pages 106-116, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lucio Masserini & Matilde Bini & Monica Pratesi, 2017. "Effectiveness of non-selective evaluation test scores for predicting first-year performance in university career: a zero-inflated beta regression approach," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(2), pages 693-708, March.
    2. Silvia De Nicolò & Maria Rosaria Ferrante & Silvia Pacei, 2021. "Mind the Income Gap: Bias Correction of Inequality Estimators in Small-Sized Samples," Papers 2107.08950, arXiv.org, revised May 2023.
    3. Andrés Ramírez Hassan & Johnatan Cardona Jiménez, 2014. "Which team will win the 2014 FIFA World Cup? A Bayesian approach for dummies," Documentos de Trabajo de Valor Público 10898, Universidad EAFIT.
    4. Jay Verkuilen & Michael Smithson, 2012. "Mixed and Mixture Regression Models for Continuous Bounded Responses Using the Beta Distribution," Journal of Educational and Behavioral Statistics, vol. 37(1), pages 82-113, February.
    5. Rashad A. R. Bantan & Christophe Chesneau & Farrukh Jamal & Mohammed Elgarhy & Muhammad H. Tahir & Aqib Ali & Muhammad Zubair & Sania Anam, 2020. "Some New Facts about the Unit-Rayleigh Distribution with Applications," Mathematics, MDPI, vol. 8(11), pages 1-23, November.
    6. Peter Xue-Kun Song & Ming Tan, 2000. "Marginal Models for Longitudinal Continuous Proportional Data," Biometrics, The International Biometric Society, vol. 56(2), pages 496-502, June.
    7. Rosineide Fernando da Paz & Jorge Luis Bazán & Luis Aparecido Milan, 2017. "Bayesian estimation for a mixture of simplex distributions with an unknown number of components: HDI analysis in Brazil," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(9), pages 1630-1643, July.
    8. Hoyle, Edward & Hughston, Lane P. & Macrina, Andrea, 2011. "Lévy random bridges and the modelling of financial information," Stochastic Processes and their Applications, Elsevier, vol. 121(4), pages 856-884, April.
    9. Ongaro, A. & Migliorati, S., 2013. "A generalization of the Dirichlet distribution," Journal of Multivariate Analysis, Elsevier, vol. 114(C), pages 412-426.
    10. Barros, C.P. & Wanke, Peter & Dumbo, Silvestre & Manso, Jose Pires, 2017. "Efficiency in angolan hydro-electric power station: A two-stage virtual frontier dynamic DEA and simplex regression approach," Renewable and Sustainable Energy Reviews, Elsevier, vol. 78(C), pages 588-596.
    11. Malini Iyengar & Dipak Dey, 2002. "A semiparametric model for compositional data analysis in presence of covariates on the simplex," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 11(2), pages 303-315, December.
    12. Josmar Mazucheli & Bruna Alves & Mustafa Ç. Korkmaz & Víctor Leiva, 2022. "Vasicek Quantile and Mean Regression Models for Bounded Data: New Formulation, Mathematical Derivations, and Numerical Applications," Mathematics, MDPI, vol. 10(9), pages 1-23, April.
    13. Patrícia L. Espinheira & Alisson Oliveira Silva, 2020. "Residual and influence analysis to a general class of simplex regression," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 523-552, June.
    14. Aknouche, Abdelhakim & Dimitrakopoulos, Stefanos, 2021. "Autoregressive conditional proportion: A multiplicative-error model for (0,1)-valued time series," MPRA Paper 110954, University Library of Munich, Germany, revised 06 Dec 2021.
    15. Ricardo Rasmussen Petterle & Wagner Hugo Bonat & Cassius Tadeu Scarpin, 2019. "Quasi-beta Longitudinal Regression Model Applied to Water Quality Index Data," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 24(2), pages 346-368, June.
    16. Wenting Liu & Huiqiong Li & Anmin Tang & Zixin Cui, 2023. "Bayesian Joint Modeling Analysis of Longitudinal Proportional and Survival Data," Mathematics, MDPI, vol. 11(16), pages 1-17, August.
    17. Wanke, Peter & Barros, C.P., 2016. "Efficiency in Latin American airlines: A two-stage approach combining Virtual Frontier Dynamic DEA and Simplex Regression," Journal of Air Transport Management, Elsevier, vol. 54(C), pages 93-103.
    18. Rodrigues, Antonio Carlos & Martins, Ricardo Silveira & Wanke, Peter Fernandes & Siegler, Janaina, 2018. "Efficiency of specialized 3PL providers in an emerging economy," International Journal of Production Economics, Elsevier, vol. 205(C), pages 163-178.
    19. Chunsheng Ma, 2023. "Vector Random Fields on the Probability Simplex with Metric-Dependent Covariance Matrix Functions," Journal of Theoretical Probability, Springer, vol. 36(3), pages 1922-1938, September.
    20. Hui Song & Yingwei Peng & Dongsheng Tu, 2017. "Jointly modeling longitudinal proportional data and survival times with an application to the quality of life data in a breast cancer trial," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(2), pages 183-206, April.
