
FastText-Based Local Feature Visualization Algorithm for Merged Image-Based Malware Classification Framework for Cyber Security and Cyber Defense

Author

Listed:
  • Sejun Jang

    (Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea)

  • Shuyu Li

    (Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea)

  • Yunsick Sung

    (Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea)

Abstract

The importance of cybersecurity has been increasing. Malware authors embed malicious code in otherwise normal executable files, and a computer is more likely to be infected when users have easy access to a wide variety of executables. Malware is considered the starting point for cyber-attacks; thus, the timely detection, classification, and blocking of malware are important. Malware visualization is one method for detecting or classifying malware. A global image is generated from the binaries extracted from malware, and using global images takes the overall structure and behavior of the malware into account. However, visualizing obfuscated malware is difficult because local features are hard to extract. This paper proposes a merged image-based malware classification framework that includes local feature visualization, global image-based local feature visualization, and global and local image merging methods. This study introduces a fastText-based local feature visualization method: first, local features such as opcodes and API function names are extracted from the malware; second, the important local features in each malware family are selected via the term frequency-inverse document frequency (TF-IDF) algorithm; third, the fastText model embeds the selected local features; finally, the embedded local features are visualized through a normalization process. Malware classification based on the proposed method was experimentally verified using the Microsoft Malware Classification Challenge dataset. The accuracy of the proposed method was approximately 99.65%, which is 2.18% higher than that of another contemporary global image-based approach.
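The four steps named in the abstract (extract local features, select them with TF-IDF, embed them, normalize the embeddings into pixel values) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: all function names and the sample tokens are hypothetical, each malware family is treated as a single TF-IDF document (an assumption), and `toy_subword_embedding` is a hashed character n-gram placeholder standing in for a trained fastText model. The abstract does not specify the normalization, so min-max scaling to 0..255 is assumed.

```python
import hashlib
import math
from collections import Counter


def tfidf_top_features(family_docs, top_k):
    """Pick the top_k tokens per family by TF-IDF (one document per family, an assumption)."""
    n_docs = len(family_docs)
    doc_freq = Counter()
    for tokens in family_docs.values():
        doc_freq.update(set(tokens))
    selected = {}
    for family, tokens in family_docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores = {
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[family] = ranked[:top_k]
    return selected


def toy_subword_embedding(token, dim=8, n=3):
    """Placeholder for a trained fastText model: averages hashed character n-gram vectors."""
    padded = f"<{token}>"
    grams = [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]
    vec = [0.0] * dim
    for gram in grams:
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        for j in range(dim):
            vec[j] += digest[j] / 255.0 - 0.5
    return [v / len(grams) for v in vec]


def to_pixel_rows(vectors):
    """Min-max scale all embedding values to integers in 0..255, one image row per feature."""
    flat = [v for vec in vectors for v in vec]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[round(255 * (v - lo) / span) for v in vec] for vec in vectors]


# Hypothetical opcode / API-name tokens for two made-up families.
family_docs = {
    "family_a": ["mov", "push", "call", "CreateFileA", "mov"],
    "family_b": ["mov", "xor", "xor", "jmp", "VirtualAlloc"],
}
selected = tfidf_top_features(family_docs, top_k=2)
image = to_pixel_rows([toy_subword_embedding(t) for t in selected["family_a"]])
```

In this toy data, `mov` appears in both families, so its inverse document frequency is zero and it is never selected; tokens specific to one family rank highest. In practice the placeholder embedding would be replaced by vectors from a fastText model trained on the extracted features.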

Suggested Citation

  • Sejun Jang & Shuyu Li & Yunsick Sung, 2020. "FastText-Based Local Feature Visualization Algorithm for Merged Image-Based Malware Classification Framework for Cyber Security and Cyber Defense," Mathematics, MDPI, vol. 8(3), pages 1-13, March.
  • Handle: RePEc:gam:jmathe:v:8:y:2020:i:3:p:460-:d:336585

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/8/3/460/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/8/3/460/
    Download Restriction: no

    References listed on IDEAS

    1. Shuyu Li & Sejun Jang & Yunsick Sung, 2019. "Automatic Melody Composition Using Enhanced GAN," Mathematics, MDPI, vol. 7(10), pages 1-13, September.

    Citations



    Cited by:

    1. Muhammad Mudassar Yamin & Mohib Ullah & Habib Ullah & Basel Katt & Mohammad Hijji & Khan Muhammad, 2022. "Mapping Tools for Open Source Intelligence with Cyber Kill Chain for Adversarial Aware Security," Mathematics, MDPI, vol. 10(12), pages 1-25, June.
    2. Frank Cremer & Barry Sheehan & Michael Fortmann & Arash N. Kia & Martin Mullins & Finbarr Murphy & Stefan Materne, 2022. "Cyber risk and cybersecurity: a systematic review of data availability," The Geneva Papers on Risk and Insurance - Issues and Practice, Palgrave Macmillan;The Geneva Association, vol. 47(3), pages 698-736, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shuyu Li & Yunsick Sung, 2023. "Transformer-Based Seq2Seq Model for Chord Progression Generation," Mathematics, MDPI, vol. 11(5), pages 1-14, February.
    2. Arulsamy, Karen & Delaney, Liam, 2022. "The impact of automatic enrolment on the mental health gap in pension participation: Evidence from the UK," Journal of Health Economics, Elsevier, vol. 86(C).
    3. Wenkai Huang & Feng Zhan, 2023. "A Novel Probabilistic Diffusion Model Based on the Weak Selection Mimicry Theory for the Generation of Hypnotic Songs," Mathematics, MDPI, vol. 11(15), pages 1-26, July.
    4. Shuyu Li & Yunsick Sung, 2021. "INCO-GAN: Variable-Length Music Generation Method Based on Inception Model-Based Conditional GAN," Mathematics, MDPI, vol. 9(4), pages 1-16, February.
    5. Shuyu Li & Yunsick Sung, 2023. "MRBERT: Pre-Training of Melody and Rhythm for Automatic Music Generation," Mathematics, MDPI, vol. 11(4), pages 1-14, February.
    6. Hyewon Yoon & Shuyu Li & Yunsick Sung, 2021. "Style Transformation Method of Stage Background Images by Emotion Words of Lyrics," Mathematics, MDPI, vol. 9(15), pages 1-20, August.
    7. Lvyang Qiu & Shuyu Li & Yunsick Sung, 2021. "DBTMPE: Deep Bidirectional Transformers-Based Masked Predictive Encoder Approach for Music Genre Classification," Mathematics, MDPI, vol. 9(5), pages 1-17, March.
