
Clustering for time-varying relational count data

Authors

  • Goto, Satoshi
  • Takagishi, Mariko
  • Yadohisa, Hiroshi

Abstract

Relational count data are often obtained from sources such as simultaneous purchases in online shops and social networking service information. Clustering such relational count data reveals the latent structure of the relationships between objects such as household items or people. When relational count data observed at multiple time points are available, it is worthwhile to incorporate the time structure into the clustering result to understand how objects move between clusters over time. In this paper, we propose two clustering methods for analyzing time-varying relational count data. The first model, the dynamic Poisson infinite relational model (dPIRM), handles time-varying relational count data. The second model, which we call the dynamic zero-inflated Poisson infinite relational model, extends the dPIRM so that it can handle zero-inflated data. Proposing both models is important because zero-inflated data are often encountered, especially when the time intervals are short. In addition, by explicitly deriving the relevant full conditional distributions, we describe the features of the estimated parameters and, in turn, the relationship between the two models. We show the effectiveness of both models through a simulation study and a real data example.
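
To make the distinction between the two models in the abstract concrete, here is a minimal Python sketch of the standard Poisson and zero-inflated Poisson probability mass functions. It is not the authors' implementation: this listing does not give the paper's actual specification. The sketch assumes, as in block-structured relational models generally, that the count between two objects has a rate determined only by their cluster pair. The names `block_rate`, `z`, `rates` and `pi`, and all numeric values, are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """P(X = k) under a zero-inflated Poisson: with probability pi the count
    is a structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def block_rate(z, rates, i, j):
    """Rate for the pair (i, j), looked up from the clusters of i and j
    (an illustrative assumption, not the paper's stated model)."""
    return rates[z[i]][z[j]]

# Hypothetical toy setting at a single time point: 4 objects in 2 clusters.
z = [0, 0, 1, 1]                   # cluster label of each object
rates = [[3.0, 0.5], [0.5, 2.0]]   # Poisson rate for each cluster pair
pi = 0.4                           # assumed zero-inflation probability

lam = block_rate(z, rates, 0, 2)   # objects 0 and 2 lie in different clusters
p0_poisson = poisson_pmf(0, lam)   # chance of a zero count without inflation
p0_zip = zip_pmf(0, lam, pi)       # chance of a zero count with inflation
```

For any `pi` greater than zero, `p0_zip` exceeds `p0_poisson`, which is the excess of zeros that the abstract says becomes common when time intervals are short. In a time-varying setting the labels in `z` would additionally be allowed to change from one time point to the next.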

Suggested Citation

  • Goto, Satoshi & Takagishi, Mariko & Yadohisa, Hiroshi, 2021. "Clustering for time-varying relational count data," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
  • Handle: RePEc:eee:csdana:v:156:y:2021:i:c:s0167947320302140
    DOI: 10.1016/j.csda.2020.107123

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michelle Dietzen & Haoran Zhai & Olivia Lucas & Oriol Pich & Christopher Barrington & Wei-Ting Lu & Sophia Ward & Yanping Guo & Robert E. Hynds & Simone Zaccaria & Charles Swanton & Nicholas McGranaha, 2024. "Replication timing alterations are associated with mutation acquisition during breast and lung cancer evolution," Nature Communications, Nature, vol. 15(1), pages 1-23, December.
    2. Yip, Karen C.H. & Yau, Kelvin K.W., 2005. "On modeling claim frequency data in general insurance with extra zeros," Insurance: Mathematics and Economics, Elsevier, vol. 36(2), pages 153-163, April.
    3. Redivo, Edoardo & Nguyen, Hien D. & Gupta, Mayetri, 2020. "Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
    4. Jin, Xin & Maheu, John M., 2016. "Bayesian semiparametric modeling of realized covariance matrices," Journal of Econometrics, Elsevier, vol. 192(1), pages 19-39.
    5. Parvin Ahmadi & Iman Gholampour & Mahmoud Tabandeh, 2018. "Cluster-based sparse topical coding for topic mining and document clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 537-558, September.
    6. Jeffrey L. Furman & Florenta Teodoridis, 2020. "Automation, Research Technology, and Researchers’ Trajectories: Evidence from Computer Science and Electrical Engineering," Organization Science, INFORMS, vol. 31(2), pages 330-354, March.
    7. Xin Jin & John M. Maheu & Qiao Yang, 2019. "Bayesian parametric and semiparametric factor models for large realized covariance matrices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 34(5), pages 641-660, August.
    8. Csereklyei, Zsuzsanna & Anantharama, Nandini & Kallies, Anne, 2021. "Electricity market transitions in Australia: Evidence using model-based clustering," Energy Economics, Elsevier, vol. 103(C).
    9. Shu-Ping Shi & Yong Song, 2012. "Identifying Speculative Bubbles with an Infinite Hidden Markov Model," Working Paper series 26_12, Rimini Centre for Economic Analysis.
    10. Lu Huang & Xiang Chen & Yi Zhang & Changtian Wang & Xiaoli Cao & Jiarun Liu, 2022. "Identification of topic evolution: network analytics with piecewise linear representation and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5353-5383, September.
    11. Gael M. Martin & David T. Frazier & Ruben Loaiza-Maya & Florian Huber & Gary Koop & John Maheu & Didier Nibbering & Anastasios Panagiotelis, 2023. "Bayesian Forecasting in the 21st Century: A Modern Review," Monash Econometrics and Business Statistics Working Papers 1/23, Monash University, Department of Econometrics and Business Statistics.
    12. M. Tariqul Hasan & Gary Sneddon & Renjun Ma, 2012. "Regression analysis of zero-inflated time-series counts: application to air pollution related emergency room visit data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(3), pages 467-476, June.
    13. Jin, Xin & Maheu, John M. & Yang, Qiao, 2022. "Infinite Markov pooling of predictive distributions," Journal of Econometrics, Elsevier, vol. 228(2), pages 302-321.
    14. Thomas R. W. Oliver & Lia Chappell & Rashesh Sanghvi & Lauren Deighton & Naser Ansari-Pour & Stefan C. Dentro & Matthew D. Young & Tim H. H. Coorens & Hyunchul Jung & Tim Butler & Matthew D. C. Nevill, 2022. "Clonal diversification and histogenesis of malignant germ cell tumours," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    15. Gustaf Bellstam & Sanjai Bhagat & J. Anthony Cookson, 2021. "A Text-Based Analysis of Corporate Innovation," Management Science, INFORMS, vol. 67(7), pages 4004-4031, July.
    16. Michael L. Pennell & David B. Dunson, 2008. "Nonparametric Bayes Testing of Changes in a Response Distribution with an Ordinal Predictor," Biometrics, The International Biometric Society, vol. 64(2), pages 413-423, June.
    17. Hossein Kavand & Marcel Voia, 2018. "Estimation of Health Care Demand and its Implication on Income Effects of Individuals," Springer Proceedings in Business and Economics, in: William H. Greene & Lynda Khalaf & Paul Makdissi & Robin C. Sickles & Michael Veall & Marcel-Cristia (ed.), Productivity and Inequality, pages 275-304, Springer.
    18. Bruno Scarpa & David B. Dunson, 2009. "Bayesian Hierarchical Functional Data Analysis Via Contaminated Informative Priors," Biometrics, The International Biometric Society, vol. 65(3), pages 772-780, September.
    19. Hassan Akell & Farkhondeh-Alsadat Sajadi & Iraj Kazemi, 2023. "Construction of Jointly Distributed Random Samples Drawn from the Beta Two-Parameter Process," Methodology and Computing in Applied Probability, Springer, vol. 25(3), pages 1-12, September.
    20. Feng-Chang Xie & Jin-Guan Lin & Bo-Cheng Wei, 2014. "Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(6), pages 1383-1392, June.
