Robust identification of email tracking: A machine learning approach

Author

Listed:
  • Haupt, Johannes
  • Bender, Benedict
  • Fabian, Benjamin
  • Lessmann, Stefan

Abstract

Email tracking allows email senders to collect fine-grained behavior and location data on email recipients, who are uniquely identifiable via their email address. Such tracking invades user privacy in that email tracking techniques gather data without user consent or awareness. Striving to increase privacy in email communication, this paper develops a detection engine to be the core of a selective tracking blocking mechanism in the form of three contributions. First, a large collection of email newsletters is analyzed to show the wide usage of tracking over different countries, industries and time. Second, we propose a set of features geared towards the identification of tracking images under real-world conditions. Novel features are devised to be computationally feasible and efficient, generalizable and resilient towards changes in tracking infrastructure. Third, we test the predictive power of these features in a benchmarking experiment using a selection of state-of-the-art classifiers to clarify the effectiveness of model-based tracking identification. We evaluate the expected accuracy of the approach on out-of-sample data, over increasing periods of time, and when faced with unknown senders.

Suggested Citation

  • Haupt, Johannes & Bender, Benedict & Fabian, Benjamin & Lessmann, Stefan, 2018. "Robust identification of email tracking: A machine learning approach," European Journal of Operational Research, Elsevier, vol. 271(1), pages 341-356.
  • Handle: RePEc:eee:ejores:v:271:y:2018:i:1:p:341-356
    DOI: 10.1016/j.ejor.2018.05.018

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221718304120
    Download Restriction: Full text for ScienceDirect subscribers only

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthias Bogaert & Lex Delaere, 2023. "Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
    2. Arno de Caigny & Kristof Coussement & Koen de Bock, 2020. "Leveraging fine-grained transaction data for customer life event predictions," Post-Print hal-02507998, HAL.
    3. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    4. Liu, Yi & Yang, Menglong & Wang, Yudong & Li, Yongshan & Xiong, Tiancheng & Li, Anzhe, 2022. "Applying machine learning algorithms to predict default probability in the online credit market: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 79(C).
    5. De Bock, Koen W. & Coussement, Kristof & Lessmann, Stefan, 2020. "Cost-sensitive business failure prediction when misclassification costs are uncertain: A heterogeneous ensemble selection approach," European Journal of Operational Research, Elsevier, vol. 285(2), pages 612-630.
    6. Kolesnikova, A. & Yang, Y. & Lessmann, S. & Ma, T. & Sung, M.-C. & Johnson, J.E.V., 2019. "Can Deep Learning Predict Risky Retail Investors? A Case Study in Financial Risk Behavior Forecasting," IRTG 1792 Discussion Papers 2019-023, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    7. Koen W. de Bock & Kristof Coussement & Stefan Lessmann, 2020. "Cost-sensitive business failure prediction when misclassification costs are uncertain: A heterogeneous ensemble selection approach," Post-Print hal-02863245, HAL.
    8. De Bock, Koen W. & Coussement, Kristof & Caigny, Arno De & Słowiński, Roman & Baesens, Bart & Boute, Robert N. & Choi, Tsan-Ming & Delen, Dursun & Kraus, Mathias & Lessmann, Stefan & Maldonado, Sebast, 2024. "Explainable AI for Operational Research: A defining framework, methods, applications, and a research agenda," European Journal of Operational Research, Elsevier, vol. 317(2), pages 249-272.
    9. Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
    10. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    11. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    12. Kim, A. & Yang, Y. & Lessmann, S. & Ma, T. & Sung, M.-C. & Johnson, J.E.V., 2020. "Can deep learning predict risky retail investors? A case study in financial risk behavior forecasting," European Journal of Operational Research, Elsevier, vol. 283(1), pages 217-234.
    13. Lessmann, Stefan & Coussement, Kristof & De Bock, Koen W. & Haupt, Johannes, 2018. "Targeting customers for profit: An ensemble learning framework to support marketing decision making," IRTG 1792 Discussion Papers 2018-012, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    14. K. Coussement & K. W. Bock & S. Geuens, 2022. "A decision-analytic framework for interpretable recommendation systems with multiple input data sources: a case study for a European e-tailer," Annals of Operations Research, Springer, vol. 315(2), pages 671-694, August.
    15. Tine Van Calster & Filip Van den Bossche & Bart Baesens & Wilfried Lemahieu, 2020. "Profit-oriented sales forecasting: a comparison of forecasting techniques from a business perspective," Papers 2002.00949, arXiv.org.
    16. Debaere, Steven & Coussement, Kristof & De Ruyck, Tom, 2018. "Multi-label classification of member participation in online innovation communities," European Journal of Operational Research, Elsevier, vol. 270(2), pages 761-774.
    17. Gubela, Robin M. & Lessmann, Stefan & Jaroszewicz, Szymon, 2020. "Response transformation and profit decomposition for revenue uplift modeling," European Journal of Operational Research, Elsevier, vol. 283(2), pages 647-661.
    18. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    19. Chou, Ping & Chuang, Howard Hao-Chun & Chou, Yen-Chun & Liang, Ting-Peng, 2022. "Predictive analytics for customer repurchase: Interdisciplinary integration of buy till you die modeling and machine learning," European Journal of Operational Research, Elsevier, vol. 296(2), pages 635-651.
    20. Mercedes Ayuso (Universitat de Barcelona) & Miguel Santolino (Universitat de Barcelona), 2009. "Individual prediction of automobile bodily injury claims liabilities," Working Papers in Economics 220, Universitat de Barcelona. Espai de Recerca en Economia.
