Printed from https://ideas.repec.org/p/bay/rdwiwi/34994.html

Hidden Variable Models for Market Basket Data. Statistical Performance and Managerial Implications

Author

Listed:
  • Hruschka, Harald

Abstract

We compare the performance of several hidden variable models, namely binary factor analysis, topic models (latent Dirichlet allocation and the correlated topic model), the restricted Boltzmann machine, and the deep belief net. We briefly present these models and outline their estimation. Performance is measured by the log-likelihood values of these models for a holdout data set of market baskets. For each model we estimate and evaluate variants with increasing numbers of hidden variables. Binary factor analysis vastly outperforms topic models. The restricted Boltzmann machine and the deep belief net, in turn, attain a similar performance advantage over binary factor analysis. For each model we interpret the relationships between the most important hidden variables and observed category purchases. To demonstrate managerial implications, we compute for the better performing models the relative increase in basket size due to promoting each category. Recommendations based on the restricted Boltzmann machine and the deep belief net not only carry lower uncertainty because of these models' statistical performance, they also have more managerial appeal than those derived from binary factor analysis. The impressive performance of the restricted Boltzmann machine and the deep belief net suggests that further research should extend these models, e.g., by including marketing variables as predictors.
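The holdout log-likelihood criterion mentioned in the abstract can be illustrated for a binary restricted Boltzmann machine. The following is only a minimal sketch, not the paper's estimation or evaluation code: it assumes the standard energy function E(v, h) = -b·v - c·h - v'Wh, computes the partition function exactly by enumerating all hidden states (feasible only for a small number of hidden variables), and uses made-up toy parameters. All function names and values are hypothetical.

```python
import itertools
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def logsumexp(values):
    """Stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_unnormalized(v, W, b, c):
    """log sum_h exp(-E(v, h)) for one basket v (0/1 list over J categories).

    W is a J x K weight matrix, b the J visible biases, c the K hidden biases.
    The hidden units are summed out analytically.
    """
    J, K = len(b), len(c)
    visible_term = sum(b[j] * v[j] for j in range(J))
    hidden_term = sum(
        softplus(c[k] + sum(v[j] * W[j][k] for j in range(J))) for k in range(K)
    )
    return visible_term + hidden_term


def log_partition(W, b, c):
    """Exact log Z: enumerate the 2^K hidden states, sum out the visibles."""
    J, K = len(b), len(c)
    terms = []
    for h in itertools.product((0, 1), repeat=K):
        t = sum(c[k] * h[k] for k in range(K))
        t += sum(
            softplus(b[j] + sum(W[j][k] * h[k] for k in range(K))) for j in range(J)
        )
        terms.append(t)
    return logsumexp(terms)


def holdout_log_likelihood(baskets, W, b, c):
    """Sum of log p(v) over the holdout baskets."""
    log_z = log_partition(W, b, c)
    return sum(log_unnormalized(v, W, b, c) - log_z for v in baskets)


# Hypothetical toy model: 3 product categories, 2 hidden variables.
W = [[0.5, -0.3], [0.2, 0.8], [-0.6, 0.1]]
b = [-1.0, -0.5, -1.5]
c = [0.0, -0.2]
holdout = [[1, 1, 0], [0, 0, 0], [0, 1, 1]]  # made-up holdout baskets
ll = holdout_log_likelihood(holdout, W, b, c)
```

Comparing `ll` across fitted models with different numbers of hidden variables mirrors the comparison the abstract describes; for larger models the exact enumeration would have to be replaced by an approximation of the partition function.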

Suggested Citation

  • Hruschka, Harald, 2016. "Hidden Variable Models for Market Basket Data. Statistical Performance and Managerial Implications," University of Regensburg Working Papers in Business, Economics and Management Information Systems 489, University of Regensburg, Department of Economics.
  • Handle: RePEc:bay:rdwiwi:34994

    Download full text from publisher

    File URL: https://epub.uni-regensburg.de/34994/1/dbn_baksets_diskp_all.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Bruno J.D. Jacobs & Bas Donkers & Dennis Fok, 2016. "Model-Based Purchase Predictions for Large Assortments," Marketing Science, INFORMS, vol. 35(3), pages 389-404, May.
    2. P. Seetharaman & Siddhartha Chib & Andrew Ainslie & Peter Boatwright & Tat Chan & Sachin Gupta & Nitin Mehta & Vithala Rao & Andrei Strijnev, 2005. "Models of Multi-Category Choice Behavior," Marketing Letters, Springer, vol. 16(3), pages 239-254, December.
    3. Chalmers, R. Philip, 2012. "mirt: A Multidimensional Item Response Theory Package for the R Environment," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 48(i06).
    4. Grün, Bettina & Hornik, Kurt, 2011. "topicmodels: An R Package for Fitting Topic Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i13).

    Citations


    Cited by:

    1. Mariflor Vega Carrasco & Ioanna Manolopoulou & Jason O'Sullivan & Rosie Prior & Mirco Musolesi, 2022. "Posterior summaries of grocery retail topic models: Evaluation, interpretability and credibility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 562-588, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Harald Hruschka, 2021. "Comparing unsupervised probabilistic machine learning methods for market basket analysis," Review of Managerial Science, Springer, vol. 15(2), pages 497-527, February.
    2. Justyna Klejdysz & Robin L. Lumsdaine, 2023. "Shifts in ECB Communication: A Textual Analysis of the Press Conference," International Journal of Central Banking, International Journal of Central Banking, vol. 19(2), pages 473-542, June.
    3. Schröder, Nadine & Falke, Andreas & Hruschka, Harald & Reutterer, Thomas, 2019. "Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool," Journal of Interactive Marketing, Elsevier, vol. 47(C), pages 181-197.
    4. Martin Reisenbichler & Thomas Reutterer, 2019. "Topic modeling in marketing: recent advances and research opportunities," Journal of Business Economics, Springer, vol. 89(3), pages 327-356, April.
    5. Andreas Falke & Harald Hruschka, 2022. "Analyzing browsing across websites by machine learning methods," Journal of Business Economics, Springer, vol. 92(5), pages 829-852, July.
    6. Oscar Calvo-González & Axel Eizmendi & Germán Reyes, 2022. "The Shifting Attention of Political Leaders: Evidence from Two Centuries of Presidential Speeches," Papers 2209.00540, arXiv.org, revised Jun 2023.
    7. Izolda Pristojkovic Suko & Magdalena Holter & Erwin Stolz & Elfriede Renate Greimel & Wolfgang Freidl, 2022. "Acculturation, Adaptation, and Health among Croatian Migrants in Austria and Ireland: A Cross-Sectional Study," IJERPH, MDPI, vol. 19(24), pages 1-15, December.
    8. Sandra Wankmüller, 2023. "A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis," Journal of Computational Social Science, Springer, vol. 6(1), pages 91-163, April.
    9. Nana Kim & Daniel M. Bolt & James Wollack, 2022. "Noncompensatory MIRT For Passage-Based Tests," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 992-1009, September.
    10. Hakyeon Lee & Hanbin Seo & Youngjung Geum, 2018. "Uncovering the Topic Landscape of Product-Service System Research: from Sustainability to Value Creation," Sustainability, MDPI, vol. 10(4), pages 1-15, March.
    11. Harald Hruschka, 2017. "Analyzing the dependences of multi-category purchases on interactions of marketing variables," Journal of Business Economics, Springer, vol. 87(3), pages 295-313, April.
    12. Cristina Oliveira & Paulo Rita & Sérgio Moro, 2021. "Unveiling Island Tourism in Cape Verde through Online Reviews," Sustainability, MDPI, vol. 13(15), pages 1-14, July.
    13. Christian Weismayer, 2022. "Applied Research in Quality of Life: A Computational Literature Review," Applied Research in Quality of Life, Springer;International Society for Quality-of-Life Studies, vol. 17(3), pages 1433-1458, June.
    14. Daphna Harel & Russell J. Steele, 2018. "An Information Matrix Test for the Collapsing of Categories Under the Partial Credit Model," Journal of Educational and Behavioral Statistics, vol. 43(6), pages 721-750, December.
    15. Arsenyan, Jbid & Mirowska, Agata & Piepenbrink, Anke, 2023. "Close encounters with the virtual kind: Defining a human-virtual agent coexistence framework," Technological Forecasting and Social Change, Elsevier, vol. 193(C).
    16. Triss Ashton & Nicholas Evangelopoulos & Victor Prybutok, 2014. "Extending monitoring methods to textual data: a research agenda," Quality & Quantity: International Journal of Methodology, Springer, vol. 48(4), pages 2277-2294, July.
    17. Laura Maldonado-Murciano & Halley M. Pontes & Mark D. Griffiths & Maite Barrios & Juana Gómez-Benito & Georgina Guilera, 2020. "The Spanish Version of the Internet Gaming Disorder Scale-Short Form (IGDS9-SF): Further Examination Using Item Response Theory," IJERPH, MDPI, vol. 17(19), pages 1-14, September.
    18. Qi Wang & Tobias Jeppsson, 2022. "Identifying benchmark units for research management and evaluation," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 7557-7574, December.
    19. Sutthipong Meeyai, 2015. "Modeling Store Patronage: A Systematic Review," International Conference on Marketing and Business Development Journal, The Bucharest University of Economic Studies, vol. 1(1), pages 40-48, July.
    20. Davit Khachatryan & Brigitte Muehlmann, 2020. "Measuring the drafting alignment of patent documents using text mining," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-20, July.

    More about this item

    Keywords

    Marketing; Market Basket Analysis; Factor Analysis; Topic Models; Restricted Boltzmann Machine; Deep Belief Net;

