Printed from https://ideas.repec.org/a/spr/stpapr/v65y2024i9d10.1007_s00362-024-01608-3.html

Model selection for mixture hidden Markov models: an application to clickstream data

Author

Listed:
  • Furio Urso

    (University of Palermo)

  • Antonino Abbruzzo

    (University of Palermo)

  • Marcello Chiodi

    (University of Palermo)

  • Maria Francesca Cracolici

    (University of Palermo)

Abstract

In a clickstream analysis setting, Mixture Hidden Markov Models (MHMMs) can be used to examine categorical sequences, assuming they evolve according to a mixture of latent Markov processes, each related to a different subpopulation. Fitting these models requires identifying both the number of subpopulations and the number of hidden states. This study proposes a model selection criterion based on an integrated completed likelihood approach that accounts for the two latent classes in the model. We implemented a Monte Carlo simulation study to compare the performance of selection criteria. In scenarios characterised by short categorical sequences, our proposed measure outperforms the most commonly used model selection criteria in identifying the number of components and states. The paper presents a case study on clickstream data collected from the website of a company operating in the hospitality industry and modelled by an MHMM selected by the proposed score.
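
To make the idea of an integrated completed likelihood (ICL) score concrete, the sketch below computes the widely used BIC-type approximation to ICL for a generic latent-class model: BIC plus twice the entropy of the posterior class memberships. This is not the criterion proposed in the paper, which is not specified in this abstract. The function names, the choice of `n_obs`, and the numbers in the usage note are all assumptions made for illustration.

```python
import math


def bic(log_lik, n_params, n_obs):
    """Standard BIC on the 'smaller is better' scale."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def classification_entropy(posteriors):
    """Entropy of soft assignments: -sum_i sum_k tau_ik * log(tau_ik).

    `posteriors` is a list of rows; row i holds the posterior
    probabilities that unit i belongs to each latent class.
    Zero probabilities contribute nothing (0 * log 0 is taken as 0).
    """
    total = 0.0
    for row in posteriors:
        for tau in row:
            if tau > 0.0:
                total -= tau * math.log(tau)
    return total


def icl_bic(log_lik, n_params, n_obs, posteriors):
    """BIC-type ICL approximation: BIC plus twice the entropy term.

    Assumption: a single layer of latent classes. A criterion for an
    MHMM would have to handle both mixture components and hidden
    states; that extension is not attempted here.
    """
    return bic(log_lik, n_params, n_obs) + 2.0 * classification_entropy(posteriors)
```

With hypothetical inputs, `icl_bic(-120.0, 5, 50, [[0.9, 0.1], [0.2, 0.8]])` is larger than the corresponding `bic(-120.0, 5, 50)`, because uncertain class assignments add a positive entropy penalty. When every posterior row is a hard 0/1 assignment the two scores coincide.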

Suggested Citation

  • Furio Urso & Antonino Abbruzzo & Marcello Chiodi & Maria Francesca Cracolici, 2024. "Model selection for mixture hidden Markov models: an application to clickstream data," Statistical Papers, Springer, vol. 65(9), pages 5797-5834, December.
  • Handle: RePEc:spr:stpapr:v:65:y:2024:i:9:d:10.1007_s00362-024-01608-3
    DOI: 10.1007/s00362-024-01608-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-024-01608-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roland Langrock & Thomas Kneib & Alexander Sohn & Stacy L. DeRuiter, 2015. "Nonparametric inference in hidden Markov models using P-splines," Biometrics, The International Biometric Society, vol. 71(2), pages 520-528, June.
    2. Lauren Hoskovec & Matthew D. Koslovsky & Kirsten Koehler & Nicholas Good & Jennifer L. Peel & John Volckens & Ander Wilson, 2023. "Infinite hidden Markov models for multiple multivariate time series with missing data," Biometrics, The International Biometric Society, vol. 79(3), pages 2592-2604, September.
    3. Schücking, Maximilian & Jochem, Patrick, 2021. "Two-stage stochastic program optimizing the cost of electric vehicles in commercial fleets," Applied Energy, Elsevier, vol. 293(C).
    4. Trindade, Graça & Dias, José G. & Ambrósio, Jorge, 2017. "Extracting clusters from aggregate panel data: A market segmentation study," Applied Mathematics and Computation, Elsevier, vol. 296(C), pages 277-288.
    5. Simon DeDeo, 2016. "Conflict and Computation on Wikipedia: A Finite-State Machine Analysis of Editor Interactions," Future Internet, MDPI, vol. 8(3), pages 1-23, July.
    6. Spezia, L. & Cooksley, S.L. & Brewer, M.J. & Donnelly, D. & Tree, A., 2014. "Modelling species abundance in a river by Negative Binomial hidden Markov models," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 599-614.
    7. Marino, Maria Francesca & Alfó, Marco, 2016. "Gaussian quadrature approximations in mixed hidden Markov models for longitudinal data: A simulation study," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 193-209.
    8. Colombi, R. & Giordano, S., 2015. "Multiple hidden Markov models for categorical time series," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 19-30.
    9. F. Bartolucci & A. Farcomeni & F. Pennoni, 2014. "Rejoinder on: Latent Markov models: a review of a general framework for the analysis of longitudinal data with covariates," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 23(3), pages 484-486, September.
    10. Moliner, Jesús & Epifanio, Irene, 2019. "Robust multivariate and functional archetypal analysis with application to financial time series analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 519(C), pages 195-208.
    11. Janczura, Joanna & Weron, Rafal, 2010. "Goodness-of-fit testing for regime-switching models," MPRA Paper 22871, University Library of Munich, Germany.
    12. Benny Ren & Ian Barnett, 2023. "Combining mixed effects hidden Markov models with latent alternating recurrent event processes to model diurnal active–rest cycles," Biometrics, The International Biometric Society, vol. 79(4), pages 3402-3417, December.
    13. Lin, Yong & Huang, Mian, 2025. "Penalized composite likelihood estimation for hidden Markov models with unknown number of states," Statistics & Probability Letters, Elsevier, vol. 216(C).
    14. Eric Lucas dos Santos Cabral & Mario Orestes Aguirre Gonzalez & Priscila da Cunha Jacome Vidal & Joao Florencio da Costa Junior & Rafael Monteiro de Vasconcelos & David Cassimiro de Melo & Ruan Lucas, 2024. "Optimization Models for Operations and Maintenance of Offshore Wind Turbines Based on Artificial Intelligence and Operations Research: A Systematic Literature Review," International Journal of Business and Management, Canadian Center of Science and Education, vol. 19(3), pages 1-1, June.
    15. Jennifer Pohle & Roland Langrock & Floris M. Beest & Niels Martin Schmidt, 2017. "Selecting the Number of States in Hidden Markov Models: Pragmatic Solutions Illustrated Using Animal Movement," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 22(3), pages 270-293, September.
    16. Chaudhuri, Kausik & Sen, Rituparna & Tan, Zheng, 2018. "Testing extreme dependence in financial time series," Economic Modelling, Elsevier, vol. 73(C), pages 378-394.
    17. Victor Medina-Olivares & Raffaella Calabrese, 2023. "Detecting Consumers' Financial Vulnerability using Open Banking Data: Evidence from UK Payday Loans," Papers 2306.01749, arXiv.org.
    18. Gordon Anderson & Alessio Farcomeni & Maria Grazia Pittau & Roberto Zelli, 2019. "Rectangular latent Markov models for time‐specific clustering, with an analysis of the wellbeing of nations," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 68(3), pages 603-621, April.
    19. Fulvia Pennoni & Francesco Bartolucci & Gianfranco Forte & Ferdinando Ametrano, 2022. "Exploring the dependencies among main cryptocurrency log‐returns: A hidden Markov model," Economic Notes, Banca Monte dei Paschi di Siena SpA, vol. 51(1), February.
    20. Ruijin Lu & Tonja R. Nansel & Zhen Chen, 2023. "A Perception-Augmented Hidden Markov Model for Parent–Child Relations in Families of Youth with Type 1 Diabetes," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(1), pages 288-308, April.
