
LAMBERT: Leveraging Attention Mechanisms to Improve the BERT Fine-Tuning Model for Encrypted Traffic Classification

Author

Listed:
  • Tao Liu

    (Institute of Cyberspace Security, Guangzhou University, Guangzhou 510006, China)

  • Xiting Ma

    (Institute of Cyberspace Security, Guangzhou University, Guangzhou 510006, China)

  • Ling Liu

    (Institute of Cyberspace Security, Guangzhou University, Guangzhou 510006, China)

  • Xin Liu

    (College of Computer Engineering and Applied Math, Changsha University, Changsha 410022, China)

  • Yue Zhao

    (Science and Technology on Communication Security Laboratory, Chengdu 610041, China)

  • Ning Hu

    (Institute of Cyberspace Security, Guangzhou University, Guangzhou 510006, China)

  • Kayhan Zrar Ghafoor

    (Department of Computer Science, Knowledge University, Erbil 44001, Iraq)

Abstract

Encrypted traffic classification is a crucial part of privacy-preserving research. With the great success of artificial intelligence (AI) in fields such as image recognition and natural language processing, classifying encrypted traffic with AI has become an attractive topic in information security. Because of their good generalization ability and high training accuracy, pre-training-based encrypted traffic classification methods have become the first choice. The accuracy of these methods depends heavily on the fine-tuning model. However, existing fine-tuning models struggle to effectively integrate the packet and byte feature representations extracted via pre-training. This article proposes a novel fine-tuning model, LAMBERT. By introducing an attention mechanism to capture the relationship between BiGRU and byte sequences, LAMBERT not only effectively mitigates the sequence loss phenomenon of BiGRU but also improves the processing performance of encrypted traffic classification. LAMBERT can quickly and accurately classify multiple types of encrypted traffic. The experimental results show that our model performs well on datasets with uneven sample distribution, no pre-training, and large sample classification. LAMBERT was tested on four datasets, namely ISCX-VPN-Service, ISCX-VPN-APP, USTC-TFC, and CSTNET-TLS 1.3, and achieved F1 scores of 99.15%, 99.52%, 99.30%, and 97.41%, respectively.

Suggested Citation

  • Tao Liu & Xiting Ma & Ling Liu & Xin Liu & Yue Zhao & Ning Hu & Kayhan Zrar Ghafoor, 2024. "LAMBERT: Leveraging Attention Mechanisms to Improve the BERT Fine-Tuning Model for Encrypted Traffic Classification," Mathematics, MDPI, vol. 12(11), pages 1-22, May.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:11:p:1624-:d:1399516

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/11/1624/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/11/1624/
    Download Restriction: no

    References listed on IDEAS

    1. Niu, Dongxiao & Yu, Min & Sun, Lijie & Gao, Tian & Wang, Keke, 2022. "Short-term multi-energy load forecasting for integrated energy systems based on CNN-BiGRU optimized by attention mechanism," Applied Energy, Elsevier, vol. 313(C).
    2. Petr Velan & Milan Čermák & Pavel Čeleda & Martin Drašar, 2015. "A survey of methods for encrypted traffic classification and analysis," International Journal of Network Management, John Wiley & Sons, vol. 25(5), pages 355-374, September.
    3. Rahman, Aowabin & Srikumar, Vivek & Smith, Amanda D., 2018. "Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks," Applied Energy, Elsevier, vol. 212(C), pages 372-385.

