Application of word embedding and machine learning in detecting phishing websites

Author

Routhu Srinivasa Rao
(GMR Institute of Technology)
Amey Umarekar
(National Institute of Technology)
Alwyn Roshan Pais
(National Institute of Technology)

Abstract

Phishing is an attack that aims to obtain personal information, such as passwords and credit card details, from online users by deceiving them through fake websites, emails, or any legitimate internet service. Many techniques exist to detect phishing sites, including third-party-based techniques, source-code-based methods, and URL-based methods, yet users are still tricked into revealing their sensitive information. In this paper, we propose a new technique that detects phishing sites with word embeddings, using plain text and domain-specific text extracted from the source code. We applied various word embeddings to evaluate our model using ensemble and multimodal approaches. In the experimental evaluation, the multimodal approach with domain-specific text achieved an accuracy of 99.34%, with a TPR of 99.59%, an FPR of 0.93%, and an MCC of 98.68%.
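
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, TPR, FPR, and MCC are conventionally computed from binary confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper. The sketch also assumes that phishing is the positive class and that the abstract reports MCC as a percentage (the coefficient multiplied by 100).

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, TPR, FPR and MCC for binary confusion-matrix counts.

    tp: phishing sites flagged as phishing   fn: phishing sites missed
    tn: legitimate sites passed as legitimate fp: legitimate sites flagged
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # True positive rate (recall): share of phishing sites that were caught.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: share of legitimate sites wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Matthews correlation coefficient, in the range [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr, "mcc": mcc}


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the phishing and legitimate classes are imbalanced.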

Suggested Citation

Routhu Srinivasa Rao & Amey Umarekar & Alwyn Roshan Pais, 2022. "Application of word embedding and machine learning in detecting phishing websites," Telecommunication Systems: Modelling, Analysis, Design and Management, Springer, vol. 79(1), pages 33-45, January.

Handle: RePEc:spr:telsys:v:79:y:2022:i:1:d:10.1007_s11235-021-00850-6
DOI: 10.1007/s11235-021-00850-6

Citations

Cited by:

Phavithra Manoharan & Jiao Yin & Hua Wang & Yanchun Zhang & Wenjie Ye, 2024. "Insider threat detection using supervised machine learning algorithms," Telecommunication Systems: Modelling, Analysis, Design and Management, Springer, vol. 87(4), pages 899-915, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Emtethal K. Alamri & Abdullah M. Alnajim & Suliman A. Alsuhibany, 2022. "Investigation of Using CAPTCHA Keystroke Dynamics to Enhance the Prevention of Phishing Attacks," Future Internet, MDPI, vol. 14(3), pages 1-21, March.
Scott Robbins & Aimee van Wynsberghe, 2022. "Our New Artificial Intelligence Infrastructure: Becoming Locked into an Unsustainable Future," Sustainability, MDPI, vol. 14(8), pages 1-11, April.
Kumar Prateek & Nitish Kumar Ojha & Fahiem Altaf & Soumyadev Maity, 2023. "Quantum secured 6G technology-based applications in Internet of Everything," Telecommunication Systems: Modelling, Analysis, Design and Management, Springer, vol. 82(2), pages 315-344, February.
Nikola Anđelić & Sandi Baressi Šegota & Ivan Lorencin & Matko Glučina, 2022. "Detection of Malicious Websites Using Symbolic Classifier," Future Internet, MDPI, vol. 14(12), pages 1-30, November.
Tepede Dipo & Akpa Michael Onyedikachi, 2024. "Developing a Biblical Solution Model for Mitigating Phishing Risks Among Internet Banking Users in Nigeria: The Initial Investigation," International Journal of Latest Technology in Engineering, Management & Applied Science, International Journal of Latest Technology in Engineering, Management & Applied Science (IJLTEMAS), vol. 13(4), pages 61-75, April.
Hernández-Rivera, Ariadna, 2023. "Brecha de género en la confianza de productos y servicios financieros desde la perspectiva del comportamiento," Revista Finanzas y Politica Economica, Universidad Católica de Colombia, vol. 15(1), pages 245-273, January.

More about this item

Keywords

URL; Phishing; Anti-phishing; TF-IDF; Hostname; Random forest
