IDEAS home Printed from https://ideas.repec.org/a/spr/jbecon/v92y2022i5d10.1007_s11573-021-01067-4.html

Analyzing browsing across websites by machine learning methods

Author

Listed:
  • Andreas Falke

    (Universität Regensburg Wirtschaftswissenschaftliche Fakultät Regensburg)

  • Harald Hruschka

    (Universität Regensburg Wirtschaftswissenschaftliche Fakultät Regensburg)

Abstract

The increasing importance of online distribution channels is paralleled by a rising interest in gaining insights into the customer journey and browsing behavior. We evaluate several machine learning methods (latent Dirichlet allocation, correlated topic model, structural topic model, replicated softmax model) with respect to their ability to reproduce the browsing behavior of households across websites. In addition, we compare these machine learning methods to a related classical technique, singular value decomposition. In our study, the replicated softmax model outperforms latent Dirichlet allocation, but the correlated topic model attains the best overall performance. Compared to singular value decomposition, both the correlated topic model and the replicated softmax model lead to a more efficient compression of web browsing data. On the other hand, singular value decomposition surpasses latent Dirichlet allocation. We interpret the results of the correlated topic model and the replicated softmax model by determining combinations of topics or hidden variables that are heterogeneous with respect to visited websites. We show that decision makers should not rely on bivariate measures of site visits, as these do not agree with the measures of interdependence between sites that can be inferred from the correlated topic model or the replicated softmax model. We investigate how well topics or hidden variables measured by these methods predict yearly household expenditures. The correlated topic model leads to the best predictive performance, followed by the replicated softmax model. We also discuss how the replicated softmax model can be used to support the online marketing decisions of websites.
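
The abstract treats singular value decomposition as a classical way to compress web browsing data. As a rough illustration of what such compression means, the sketch below applies a truncated SVD to a small household-by-website visit matrix. The data, the function name `truncated_svd`, and the choice of two retained dimensions are all hypothetical; this is not the authors' implementation or data, only a minimal example of the general technique.

```python
import numpy as np


def truncated_svd(X, k):
    """Return the rank-k SVD approximation of X plus the factors used to build it."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]  # keep only the k largest singular values
    return X_k, U[:, :k], s, Vt[:k, :]


# Hypothetical toy data: 6 households (rows) x 5 websites (columns), visit counts.
visits = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 1, 0, 0],
    [6, 5, 0, 1, 0],
    [0, 1, 4, 5, 0],
    [1, 0, 5, 4, 1],
    [0, 0, 6, 5, 0],
], dtype=float)

k = 2  # assumed number of retained dimensions
approx, household_scores, singular_values, site_loadings = truncated_svd(visits, k)

# Squared reconstruction error of the compressed representation.
sq_error = np.sum((visits - approx) ** 2)
# By the Eckart-Young result, this equals the sum of squared discarded singular values.
discarded = np.sum(singular_values[k:] ** 2)
# Share of the total sum of squares retained by the k dimensions.
share_retained = 1.0 - sq_error / np.sum(visits ** 2)

print(f"squared error: {sq_error:.4f}")
print(f"discarded singular values squared: {discarded:.4f}")
print(f"share of sum of squares retained: {share_retained:.4f}")
```

In this toy setting each household is summarized by `k` scores instead of five visit counts. A lower reconstruction error at the same `k` is one simple way to read "more efficient compression", although the paper's own evaluation criteria may differ.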

Suggested Citation

  • Andreas Falke & Harald Hruschka, 2022. "Analyzing browsing across websites by machine learning methods," Journal of Business Economics, Springer, vol. 92(5), pages 829-852, July.
  • Handle: RePEc:spr:jbecon:v:92:y:2022:i:5:d:10.1007_s11573-021-01067-4
    DOI: 10.1007/s11573-021-01067-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11573-021-01067-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Linda Hagen & Kosuke Uetake & Nathan Yang & Bryan Bollinger & Allison J. B. Chaney & Daria Dzyabura & Jordan Etkin & Avi Goldfarb & Liu Liu & K. Sudhir & Yanwen Wang & James R. Wright & Ying Zhu, 2020. "How can machine learning aid behavioral marketing research?," Marketing Letters, Springer, vol. 31(4), pages 361-370, December.
    2. Carl Eckart & Gale Young, 1936. "The approximation of one matrix by another of lower rank," Psychometrika, Springer;The Psychometric Society, vol. 1(3), pages 211-218, September.
    3. Grün, Bettina & Hornik, Kurt, 2011. "topicmodels: An R Package for Fitting Topic Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i13).
    4. Bruno J.D. Jacobs & Bas Donkers & Dennis Fok, 2016. "Model-Based Purchase Predictions for Large Assortments," Marketing Science, INFORMS, vol. 35(3), pages 389-404, May.
    5. Bradlow, Eric T. & Gangwar, Manish & Kopalle, Praveen & Voleti, Sudhir, 2017. "The Role of Big Data and Predictive Analytics in Retailing," Journal of Retailing, Elsevier, vol. 93(1), pages 79-95.
    6. Ma, Liye & Sun, Baohong, 2020. "Machine learning and AI in marketing – Connecting computing power to human insights," International Journal of Research in Marketing, Elsevier, vol. 37(3), pages 481-504.
    7. Feihong Xia & Rabikar Chatterjee & Jerrold H. May, 2019. "Using Conditional Restricted Boltzmann Machines to Model Complex Consumer Shopping Patterns," Marketing Science, INFORMS, vol. 38(4), pages 711-727, July.
    8. Harald Hruschka, 2021. "Comparing unsupervised probabilistic machine learning methods for market basket analysis," Review of Managerial Science, Springer, vol. 15(2), pages 497-527, February.
    9. Michael Trusov & Liye Ma & Zainab Jamal, 2016. "Crumbs of the Cookie: User Profiling in Customer-Base Analysis and Behavioral Targeting," Marketing Science, INFORMS, vol. 35(3), pages 405-426, May.
    10. Schröder, Nadine & Falke, Andreas & Hruschka, Harald & Reutterer, Thomas, 2019. "Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool," Journal of Interactive Marketing, Elsevier, vol. 47(C), pages 181-197.
    11. Pradeep Chintagunta & Dominique M. Hanssens & John R. Hauser, 2016. "Editorial—Marketing Science and Big Data," Marketing Science, INFORMS, vol. 35(3), pages 341-342, May.
    12. Scott Deerwester & Susan T. Dumais & George W. Furnas & Thomas K. Landauer & Richard Harshman, 1990. "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 41(6), pages 391-407, September.

    Citations

    Cited by:

    1. Wolfgang Breuer & Jannis Bischof & Christian Hofmann & Jochen Hundsdoerfer & Hans-Ulrich Küpper & Marko Sarstedt & Philipp Schreck & Tim Weitzel & Peter Witt, 2023. "Recent developments in Business Economics," Journal of Business Economics, Springer, vol. 93(6), pages 989-1013, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paramveer S. Dhillon & Sinan Aral, 2021. "Modeling Dynamic User Interests: A Neural Matrix Factorization Approach," Marketing Science, INFORMS, vol. 40(6), pages 1059-1080, November.
    2. Schröder, Nadine & Falke, Andreas & Hruschka, Harald & Reutterer, Thomas, 2019. "Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool," Journal of Interactive Marketing, Elsevier, vol. 47(C), pages 181-197.
    3. Harald Hruschka, 2022. "Analyzing joint brand purchases by conditional restricted Boltzmann machines," Review of Managerial Science, Springer, vol. 16(4), pages 1117-1145, May.
    4. Wang, Xin (Shane) & Ryoo, Jun Hyun (Joseph) & Bendle, Neil & Kopalle, Praveen K., 2021. "The role of machine learning analytics and metrics in retailing research," Journal of Retailing, Elsevier, vol. 97(4), pages 658-675.
    5. Bruno Jacobs & Dennis Fok & Bas Donkers, 2021. "Understanding Large-Scale Dynamic Purchase Behavior," Marketing Science, INFORMS, vol. 40(5), pages 844-870, September.
    6. Martin Reisenbichler & Thomas Reutterer, 2019. "Topic modeling in marketing: recent advances and research opportunities," Journal of Business Economics, Springer, vol. 89(3), pages 327-356, April.
    7. Herhausen, Dennis & Bernritter, Stefan F. & Ngai, Eric W.T. & Kumar, Ajay & Delen, Dursun, 2024. "Machine learning in marketing: Recent progress and future research directions," Journal of Business Research, Elsevier, vol. 170(C).
    8. Triss Ashton & Nicholas Evangelopoulos & Victor Prybutok, 2014. "Extending monitoring methods to textual data: a research agenda," Quality & Quantity: International Journal of Methodology, Springer, vol. 48(4), pages 2277-2294, July.
    9. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    10. Blasco-Arcas, Lorena & Lee, Hsin-Hsuan Meg & Kastanakis, Minas N. & Alcañiz, Mariano & Reyes-Menendez, Ana, 2022. "The role of consumer data in marketing: A research agenda," Journal of Business Research, Elsevier, vol. 146(C), pages 436-452.
    11. Ratchford, Brian & Soysal, Gonca & Zentner, Alejandro & Gauri, Dinesh K., 2022. "Online and offline retailing: What we know and directions for future research," Journal of Retailing, Elsevier, vol. 98(1), pages 152-177.
    12. Shr-Wei Kao & Pin Luarn, 2020. "Topic Modeling Analysis of Social Enterprises: Twitter Evidence," Sustainability, MDPI, vol. 12(8), pages 1-20, April.
    13. Chatterjee, Sheshadri & Chaudhuri, Ranjan & Vrontis, Demetris, 2022. "AI and digitalization in relationship management: Impact of adopting AI-embedded CRM system," Journal of Business Research, Elsevier, vol. 150(C), pages 437-450.
    14. Tom Magerman & Bart Looy & Xiaoyan Song, 2010. "Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(2), pages 289-306, February.
    15. Justyna Klejdysz & Robin L. Lumsdaine, 2023. "Shifts in ECB Communication: A Textual Analysis of the Press Conference," International Journal of Central Banking, International Journal of Central Banking, vol. 19(2), pages 473-542, June.
    16. Miguel Acosta, 2015. "FOMC Responses to Calls for Transparency," Finance and Economics Discussion Series 2015-60, Board of Governors of the Federal Reserve System (U.S.).
    17. Lüdering, Jochen & Tillmann, Peter, 2020. "Monetary policy on twitter and asset prices: Evidence from computational text analysis," The North American Journal of Economics and Finance, Elsevier, vol. 51(C).
    18. Lüdering Jochen & Winker Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
    19. Cho, Yung-Jan & Fu, Pei-Wen & Wu, Chi-Cheng, 2017. "Popular Research Topics in Marketing Journals, 1995–2014," Journal of Interactive Marketing, Elsevier, vol. 40(C), pages 52-72.
    20. Roozbeh Irani-Kermani & Edward C. Jaenicke & Ardalan Mirshani, 2023. "Accommodating heterogeneity in brand loyalty estimation: application to the U.S. beer retail market," Journal of Marketing Analytics, Palgrave Macmillan, vol. 11(4), pages 820-835, December.

    More about this item

    Keywords

    Online marketing; Web browsing; Machine learning; Topic models; Restricted Boltzmann machine;

    JEL classification:

    • M31 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Marketing and Advertising - - - Marketing
    • M37 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Marketing and Advertising - - - Advertising
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
