
Linking Multi-Category Purchases to Latent Activities of Shoppers: Analysing Market Baskets by Topic Models

Author

Listed:
  • Hruschka, Harald

Abstract

We investigate the application of two topic models, latent Dirichlet allocation (LDA) and the correlated topic model (CTM), to market basket analysis. Topic models measure the association between observed purchases and underlying latent activities of shoppers by conceiving each basket as a random mixture of latent activities. We explain the structure of the two topic models used and discuss the estimation of LDA models by blocked Gibbs sampling. In addition, we show how to evaluate the performance of topic models on estimation and holdout data. In the empirical study, we analyse a total of 18,000 purchases made at a medium-sized supermarket, covering 60 product categories. The LDA model performs better than the CTM in terms of log likelihood values. The latent activities inferred by this model are intuitive and interpretable, e.g., related to shopping for beverages or personal care, to baking, or to an inclination towards luxury food. To illustrate the managerial relevance of the estimated topic models, we sketch the core of a recommender system that ranks the purchase probabilities of other product categories conditional on a shopper's basket.
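
The recommender idea described in the abstract can be illustrated with a minimal sketch. It assumes that the activity-by-category probabilities (`phi`) have already been estimated; the numbers below are made up for illustration and are not results from the paper. Activity weights for a given basket are inferred by a simple EM fold-in step, which is an assumption swapped in here for brevity (the paper estimates LDA by blocked Gibbs sampling). All names, such as `infer_theta` and `rank_categories`, are hypothetical.

```python
# Illustrative sketch only: toy numbers and hypothetical names, not the paper's code.

def infer_theta(basket, phi, n_iter=50):
    """Estimate activity (topic) weights for one basket with phi held fixed.

    basket: list of category indices purchased together
    phi:    phi[k][v] = probability of category v under latent activity k
    Uses EM fold-in without a prior (an assumption; not the paper's estimator).
    """
    n_topics = len(phi)
    theta = [1.0 / n_topics] * n_topics  # start from uniform weights
    if not basket:
        return theta  # nothing observed: keep uniform weights
    for _ in range(n_iter):
        counts = [0.0] * n_topics
        for v in basket:
            # E-step: responsibility of each activity for purchased category v
            joint = [theta[k] * phi[k][v] for k in range(n_topics)]
            total = sum(joint)
            for k in range(n_topics):
                counts[k] += joint[k] / total
        # M-step: renormalise expected counts into weights
        theta = [c / len(basket) for c in counts]
    return theta


def rank_categories(basket, phi, names):
    """Rank categories not in the basket by sum_k theta_k * phi[k][v]."""
    theta = infer_theta(basket, phi)
    in_basket = set(basket)
    scores = [
        (names[v], sum(theta[k] * phi[k][v] for k in range(len(phi))))
        for v in range(len(names))
        if v not in in_basket
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy example: two latent activities over five categories (made-up values).
names = ["flour", "sugar", "eggs", "beer", "soft drinks"]
phi = [
    [0.40, 0.30, 0.25, 0.02, 0.03],  # a "baking"-like activity
    [0.03, 0.07, 0.05, 0.45, 0.40],  # a "beverages"-like activity
]
ranking = rank_categories([0, 1], phi, names)  # basket: flour and sugar
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because this fold-in step uses no Dirichlet prior, the inferred weights can collapse onto a single activity for small baskets; a full LDA treatment would smooth them towards the prior.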

Suggested Citation

  • Hruschka, Harald, 2014. "Linking Multi-Category Purchases to Latent Activities of Shoppers: Analysing Market Baskets by Topic Models," University of Regensburg Working Papers in Business, Economics and Management Information Systems 482, University of Regensburg, Department of Economics.
  • Handle: RePEc:bay:rdwiwi:30747

    Download full text from publisher

    File URL: https://epub.uni-regensburg.de/30747/1/Diskussionsbeitrag%20Complete.pdf
    Download Restriction: no

    File URL: https://epub.uni-regensburg.de/30747/7/Diskussionsbeitrag%20Komplett.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Boztug, Yasemin & Reutterer, Thomas, 2008. "A combined approach for segment-specific market basket analysis," European Journal of Operational Research, Elsevier, vol. 187(1), pages 294-312, May.
    2. Yasemin Boztuğ & Lutz Hildebrandt, 2008. "Modeling Joint Purchases with a Multivariate MNL Approach," Schmalenbach Business Review (sbr), LMU Munich School of Management, vol. 60(4), pages 400-422, October.

    Citations


    Cited by:

    1. Schröder, Nadine & Falke, Andreas & Hruschka, Harald & Reutterer, Thomas, 2019. "Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool," Journal of Interactive Marketing, Elsevier, vol. 47(C), pages 181-197.
    2. Mariflor Vega Carrasco & Ioanna Manolopoulou & Jason O'Sullivan & Rosie Prior & Mirco Musolesi, 2022. "Posterior summaries of grocery retail topic models: Evaluation, interpretability and credibility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 562-588, June.
    3. Nica-Avram, Georgiana & Harvey, John & Smith, Gavin & Smith, Andrew & Goulding, James, 2021. "Identifying food insecurity in food sharing networks via machine learning," Journal of Business Research, Elsevier, vol. 131(C), pages 469-484.
    4. Martin Reisenbichler & Thomas Reutterer, 2019. "Topic modeling in marketing: recent advances and research opportunities," Journal of Business Economics, Springer, vol. 89(3), pages 327-356, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dippold, Katrin & Hruschka, Harald, 2010. "Variable Selection for Market Basket Analysis," University of Regensburg Working Papers in Business, Economics and Management Information Systems 443, University of Regensburg, Department of Economics.
    2. Katrin Dippold & Harald Hruschka, 2013. "Variable selection for market basket analysis," Computational Statistics, Springer, vol. 28(2), pages 519-539, April.
    3. Harald Hruschka, 2017. "Analyzing the dependences of multi-category purchases on interactions of marketing variables," Journal of Business Economics, Springer, vol. 87(3), pages 295-313, April.
    4. Harald Hruschka, 2017. "Multi-category purchase incidences with marketing cross effects," Review of Managerial Science, Springer, vol. 11(2), pages 443-469, March.
    5. Creed, Bernard & Ning Shen, Kathy & Ashill, Nick & Wu, Tianshi, 2021. "Retail shopping at airports: Making travellers buy again," Journal of Business Research, Elsevier, vol. 137(C), pages 293-307.
    6. Feihong Xia & Rabikar Chatterjee & Jerrold H. May, 2019. "Using Conditional Restricted Boltzmann Machines to Model Complex Consumer Shopping Patterns," Marketing Science, INFORMS, vol. 38(4), pages 711-727, July.
    7. Anastasia Griva & Cleopatra Bardaki & Katerina Pramatari & Georgios Doukidis, 2022. "Factors Affecting Customer Analytics: Evidence from Three Retail Cases," Information Systems Frontiers, Springer, vol. 24(2), pages 493-516, April.
    8. Harald Hruschka, 2021. "Comparing unsupervised probabilistic machine learning methods for market basket analysis," Review of Managerial Science, Springer, vol. 15(2), pages 497-527, February.
    9. Harald Hruschka, 2022. "Analyzing joint brand purchases by conditional restricted Boltzmann machines," Review of Managerial Science, Springer, vol. 16(4), pages 1117-1145, May.
    10. Berndt Jesenko & Christian Schlögl, 2021. "The effect of web of science subject categories on clustering: the case of data-driven methods in business and economic sciences," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6785-6801, August.
    11. Andrews, Rick L. & Brusco, Michael J. & Currim, Imran S., 2010. "Amalgamation of partitions from multiple segmentation bases: A comparison of non-model-based and model-based methods," European Journal of Operational Research, Elsevier, vol. 201(2), pages 608-618, March.
    12. Vithala R. Rao & Gary J. Russell & Hemant Bhargava & Alan Cooke & Tim Derdenger & Hwang Kim & Nanda Kumar & Irwin Levin & Yu Ma & Nitin Mehta & John Pracejus & R. Venkatesh, 2018. "Emerging Trends in Product Bundling: Investigating Consumer Choice and Firm Behavior," Customer Needs and Solutions, Springer;Institute for Sustainable Innovation and Growth (iSIG), vol. 5(1), pages 107-120, March.
    13. Thomas Reutterer & Kurt Hornik & Nicolas March & Kathrin Gruber, 2017. "A data mining framework for targeted category promotions," Journal of Business Economics, Springer, vol. 87(3), pages 337-358, April.
    14. Kwak, Kyuseop & Duvvuri, Sri Devi & Russell, Gary J., 2015. "An Analysis of Assortment Choice in Grocery Retailing," Journal of Retailing, Elsevier, vol. 91(1), pages 19-33.
    15. Dippold Katrin & Hruschka Harald, 2013. "A Model of Heterogeneous Multicategory Choice for Market Basket Analysis," Review of Marketing Science, De Gruyter, vol. 11(1), pages 1-31, September.
    16. Michael Hahsler & Radoslaw Karpienko, 2017. "Visualizing association rules in hierarchical groups," Journal of Business Economics, Springer, vol. 87(3), pages 317-335, April.
    17. Jiang, Yuanchun & Shang, Jennifer & Liu, Yezheng & May, Jerrold, 2015. "Redesigning promotion strategy for e-commerce competitiveness through pricing and recommendation," International Journal of Production Economics, Elsevier, vol. 167(C), pages 257-270.
    18. Park, Sangwon & Nicolau, Juan L., 2015. "Differentiated effect of advertising: Joint vs. separate consumption," Tourism Management, Elsevier, vol. 47(C), pages 107-114.
    19. Gökgür, Burak & Karabatı, Selçuk, 2019. "Dynamic and targeted bundle pricing of two independently valued products," European Journal of Operational Research, Elsevier, vol. 279(1), pages 184-198.
    20. Bagirov, Adil M. & Ugon, Julien & Mirzayeva, Hijran, 2013. "Nonsmooth nonconvex optimization approach to clusterwise linear regression problems," European Journal of Operational Research, Elsevier, vol. 229(1), pages 132-142.

    More about this item

    Keywords

    multi-category buying behavior; market basket analysis; topic models

    JEL classification:

    • L81 - Industrial Organization - - Industry Studies: Services - - - Retail and Wholesale Trade; e-Commerce
    • C35 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions
    • C11 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Bayesian Analysis: General

