
Rethinking travel behavior modeling representations through embeddings

Author

Listed:
  • Francisco C. Pereira

Abstract

This paper introduces the concept of travel behavior embeddings, a method for re-representing discrete variables that are typically used in travel demand modeling, such as mode, trip purpose, education level, family type or occupation. This re-representation process maps those variables into a latent space called the embedding space. The benefit is that such spaces allow for richer nuances than the typical transformations used for categorical variables (e.g. dummy encoding, contrast encoding, principal components analysis). While the use of latent variable representations is not new per se in travel demand modeling, the idea presented here brings several innovations: it is an entirely data-driven algorithm; it is informative and consistent, since the latent space can be visualized and interpreted based on distances between different categories; it preserves interpretability of coefficients, despite being based on neural network principles; and it is transferable, in that embeddings learned from one dataset can be reused for others, as long as travel behavior remains consistent between the datasets. The idea is strongly inspired by natural language processing techniques, namely the word2vec algorithm, which underlies recent developments in areas such as automatic translation and next-word prediction. Our method is demonstrated using a mode choice model, and shows improvements of up to 60% with respect to initial likelihood, and up to 20% with respect to the likelihood of the corresponding traditional model (i.e. using dummy variables) in out-of-sample evaluation. We provide a new Python package, called PyTre (PYthon TRavel Embeddings), that others can straightforwardly use to replicate our results or improve their own models. Our experiments are themselves based on an open dataset (swissmetro).
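
As a minimal sketch of the re-representation idea described in the abstract, the snippet below contrasts dummy encoding with an embedding lookup for a single categorical variable. The category names, the embedding dimension K, and the random matrix are illustrative assumptions only; this is not the PyTre API, and in the paper's setting the matrix would be learned from data rather than drawn at random.

```python
import numpy as np

# Hypothetical categories for a "trip purpose" variable (illustrative only).
categories = ["commute", "shopping", "business", "leisure"]
index = {name: i for i, name in enumerate(categories)}


def dummy_encode(value):
    """One-hot (dummy) vector: every category is equally far from every other."""
    vec = np.zeros(len(categories))
    vec[index[value]] = 1.0
    return vec


# Embedding matrix: one row per category, K latent dimensions.
# Random values stand in for weights that would be learned from data.
K = 2
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(categories), K))


def embed(value):
    """Embedding lookup; equivalent to dummy_encode(value) @ embedding_matrix."""
    return embedding_matrix[index[value]]


def embedding_distance(a, b):
    """Euclidean distance between two categories in the embedding space."""
    return float(np.linalg.norm(embed(a) - embed(b)))


if __name__ == "__main__":
    print(dummy_encode("shopping"))
    print(embed("shopping"))
    print(embedding_distance("commute", "business"))
```

With dummy encoding, the distance between any two distinct categories is always the same, so the encoding carries no notion of similarity. In an embedding space the distances differ across pairs, which is what allows categories to be visualized and compared once the matrix has been estimated.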

Suggested Citation

  • Francisco C. Pereira, 2019. "Rethinking travel behavior modeling representations through embeddings," Papers 1909.00154, arXiv.org.
  • Handle: RePEc:arx:papers:1909.00154

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1909.00154
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Neeraj Arora & Greg M. Allenby & James L. Ginter, 1998. "A Hierarchical Bayes Model of Primary and Secondary Demand," Marketing Science, INFORMS, vol. 17(1), pages 29-44.
    2. Brathwaite, Timothy & Walker, Joan L., 2018. "Asymmetric, closed-form, finite-parameter models of multinomial choice," Journal of choice modelling, Elsevier, vol. 29(C), pages 78-112.

    Citations

    Cited by:

    1. S. Van Cranenburgh & S. Wang & A. Vij & F. Pereira & J. Walker, 2021. "Choice modelling in the age of machine learning -- discussion paper," Papers 2101.11948, arXiv.org, revised Nov 2021.
    2. Lahoz, Lorena Torres & Pereira, Francisco Camara & Sfeir, Georges & Arkoudi, Ioanna & Monteiro, Mayara Moraes & Azevedo, Carlos Lima, 2023. "Attitudes and Latent Class Choice Models using Machine Learning," Journal of choice modelling, Elsevier, vol. 49(C).
    3. Teodóra Szép & Sander Cranenburgh & Caspar Chorus, 2024. "Moral rhetoric in discrete choice models: a Natural Language Processing approach," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(1), pages 179-206, February.
    4. Arkoudi, Ioanna & Krueger, Rico & Azevedo, Carlos Lima & Pereira, Francisco C., 2023. "Combining discrete choice models and neural networks through embeddings: Formulation, interpretability and performance," Transportation Research Part B: Methodological, Elsevier, vol. 175(C).
    5. Lorena Torres Lahoz & Francisco Camara Pereira & Georges Sfeir & Ioanna Arkoudi & Mayara Moraes Monteiro & Carlos Lima Azevedo, 2023. "Attitudes and Latent Class Choice Models using Machine learning," Papers 2302.09871, arXiv.org.
    6. Ioanna Arkoudi & Carlos Lima Azevedo & Francisco C. Pereira, 2021. "Combining Discrete Choice Models and Neural Networks through Embeddings: Formulation, Interpretability and Performance," Papers 2109.12042, arXiv.org, revised Sep 2021.
    7. Ortelli, Nicola & Hillel, Tim & Pereira, Francisco C. & de Lapparent, Matthieu & Bierlaire, Michel, 2021. "Assisted specification of discrete choice models," Journal of choice modelling, Elsevier, vol. 39(C).
    8. Smeele, Nicholas V.R. & Chorus, Caspar G. & Schermer, Maartje H.N. & de Bekker-Grob, Esther W., 2023. "Towards machine learning for moral choice analysis in health economics: A literature review and research agenda," Social Science & Medicine, Elsevier, vol. 326(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jaehwan Kim & Greg M. Allenby & Peter E. Rossi, 2002. "Modeling Consumer Demand for Variety," Marketing Science, INFORMS, vol. 21(3), pages 229-250, December.
    2. Charles Cunningham & Ken Deal & Yvonne Chen, 2010. "Adaptive Choice-Based Conjoint Analysis," The Patient: Patient-Centered Outcomes Research, Springer;International Academy of Health Preference Research, vol. 3(4), pages 257-273, December.
    3. Ioanna Arkoudi & Carlos Lima Azevedo & Francisco C. Pereira, 2021. "Combining Discrete Choice Models and Neural Networks through Embeddings: Formulation, Interpretability and Performance," Papers 2109.12042, arXiv.org, revised Sep 2021.
    4. Bhat, Chandra R., 2005. "A multiple discrete-continuous extreme value model: formulation and application to discretionary time-use decisions," Transportation Research Part B: Methodological, Elsevier, vol. 39(8), pages 679-707, September.
    5. Neeraj Arora & Ty Henderson, 2007. "Embedded Premium Promotion: Why It Works and How to Make It More Effective," Marketing Science, INFORMS, vol. 26(4), pages 514-531, 07-08.
    6. Yu, Jie & Goos, Peter & Vandebroek, Martina, 2011. "Individually adapted sequential Bayesian conjoint-choice designs in the presence of consumer heterogeneity," International Journal of Research in Marketing, Elsevier, vol. 28(4), pages 378-388.
    7. Theodoros Evgeniou & Constantinos Boussios & Giorgos Zacharia, 2005. "Generalized Robust Conjoint Estimation," Marketing Science, INFORMS, vol. 24(3), pages 415-429, May.
    8. Kim, Chul & Jun, Duk Bin & Park, Sungho, 2018. "Capturing flexible correlations in multiple-discrete choice outcomes using copulas," International Journal of Research in Marketing, Elsevier, vol. 35(1), pages 34-59.
    9. Albrecht, Tobias & Rausch, Theresa Maria & Derra, Nicholas Daniel, 2021. "Call me maybe: Methods and practical implementation of artificial intelligence in call center arrivals’ forecasting," Journal of Business Research, Elsevier, vol. 123(C), pages 267-278.
    10. Olivier Toubia & Duncan I. Simester & John R. Hauser & Ely Dahan, 2003. "Fast Polyhedral Adaptive Conjoint Estimation," Marketing Science, INFORMS, vol. 22(3), pages 273-303.
    11. Bhat, Chandra R., 2008. "The multiple discrete-continuous extreme value (MDCEV) model: Role of utility function parameters, identification considerations, and model extensions," Transportation Research Part B: Methodological, Elsevier, vol. 42(3), pages 274-303, March.
    12. Nitin Mehta, 2007. "Investigating Consumers' Purchase Incidence and Brand Choice Decisions Across Multiple Product Categories: A Theoretical and Empirical Analysis," Marketing Science, INFORMS, vol. 26(2), pages 196-217, 03-04.
    13. von Haefen, Roger H., 2003. "Incorporating observed choice into the construction of welfare measures from random utility models," Journal of Environmental Economics and Management, Elsevier, vol. 45(2), pages 145-165, March.
    14. Victor Martínez‐de‐Albéniz & Arnau Planas & Stefano Nasini, 2020. "Using Clickstream Data to Improve Flash Sales Effectiveness," Production and Operations Management, Production and Operations Management Society, vol. 29(11), pages 2508-2531, November.
    15. Stephen D. Wong & Jacquelyn C. Broader & Joan L. Walker & Susan A. Shaheen, 2023. "Understanding California wildfire evacuee behavior and joint choice making," Transportation, Springer, vol. 50(4), pages 1165-1211, August.
    16. Lynd Bacon & Peter Lenk, 2012. "Augmenting discrete-choice data to identify common preference scales for inter-subject analyses," Quantitative Marketing and Economics (QME), Springer, vol. 10(4), pages 453-474, December.
    17. Kwangpil Chang & S. Siddarth & Charles B. Weinberg, 1999. "The Impact of Heterogeneity in Purchase Timing and Price Responsiveness on Estimates of Sticker Shock Effects," Marketing Science, INFORMS, vol. 18(2), pages 178-192.
    18. Braun, Alexander & Schmeiser, Hato & Schreiber, Florian, 2016. "On consumer preferences and the willingness to pay for term life insurance," European Journal of Operational Research, Elsevier, vol. 253(3), pages 761-776.
    19. Benoit Playe & Chloé-Agathe Azencott & Véronique Stoven, 2018. "Efficient multi-task chemogenomics for drug specificity prediction," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-34, October.
    20. Crabbe, M. & Vandebroek, M., 2012. "Improving the efficiency of individualized designs for the mixed logit choice model by including covariates," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 2059-2072.
