
A Novel Bio-Inspired Approach for Multilingual Spam Filtering

Author

Listed:
  • Hadj Ahmed Bouarara

    (GeCode Laboratory, Department of Computer Science, Tahar Moulay University of Saida, Saida, Algeria)

  • Reda Mohamed Hamou

    (GeCode Laboratory, Department of Computer Science, Tahar Moulay University of Saida, Saida, Algeria)

  • Abdelmalek Amine

    (GeCode Laboratory, Department of Computer Science, Tahar Moulay University of Saida, Saida, Algeria)

Abstract

In today's digital world, email has revolutionized electronic communication and become a genuine social phenomenon in daily life. Unfortunately, this technology has also incontestably become a primary source of malicious activity, especially the plague of undesirable emails (spam), which has grown tremendously in the last few years. The battle against spam is extremely fierce. This paper presents an intelligent spam filtering system called the artificial heart-lungs system (AHLS), modeled on the biological phenomenon of general circulation and oxygenation of blood. It is composed of several steps: selection, which automatically stops emails with an undesirable identifier; multilingual pre-processing, which handles multilingual spam emails and converts them to vectors; and a heart filter and a lungs filter, which classify unwelcome emails into the spam folder and welcome emails into the ham folder for presentation to the recipient. The method automatically updates its learning basis and blacklist, and a ranking step orders the spam emails according to their spam relevancy. For their experiments, the authors constructed a new dataset, M.SPAM, composed of emails pre-classified as spam or ham in different languages (English, Spanish, French, and a mixture of these), and used the following validation measures: recall, precision, f-measure, entropy, accuracy and error, false positive rate and false negative rate, ROC, and learning curve. The authors optimized the sensitive parameters (text representation technique, lungs filters, and the size of the initial learning basis). The results compare favorably with those of other bio-inspired techniques (artificial social bees, artificial social cockroaches), a supervised algorithm (decision tree C4.5), and an automatic algorithm (K-means). Finally, a visual result mining tool was developed to display the results in graphical form (3D cube and cobweb) with more realism, using zooming and rotation functionality. The authors' aims are to eliminate a large proportion of unwelcome emails, handle multilingual emails, ensure automatic updating of their system, and pose a minimal risk of eliminating ham emails.
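
Several of the validation measures named in the abstract (recall, precision, f-measure, accuracy and error, false positive rate, false negative rate) can be derived from a confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' implementation. It assumes spam is the positive class, and the names `Confusion` and `spam_metrics` are hypothetical. Entropy, ROC, and the learning curve are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, assuming spam is the positive class."""
    tp: int  # spam correctly placed in the spam folder
    fp: int  # ham wrongly placed in the spam folder
    tn: int  # ham correctly placed in the ham folder
    fn: int  # spam wrongly placed in the ham folder


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def spam_metrics(c: Confusion) -> dict:
    """Compute standard validation measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "precision": precision,
        "recall": recall,
        # F-measure: harmonic mean of precision and recall.
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        "accuracy": _ratio(c.tp + c.tn, total),
        "error": _ratio(c.fp + c.fn, total),
        # Share of ham emails wrongly flagged as spam.
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        # Share of spam emails that slipped into the ham folder.
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    example = Confusion(tp=90, fp=5, tn=95, fn=10)
    for name, value in spam_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

In a spam filter the false positive rate is usually the most sensitive of these measures, since it corresponds to the risk of eliminating ham email that the abstract mentions.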

Suggested Citation

  • Hadj Ahmed Bouarara & Reda Mohamed Hamou & Abdelmalek Amine, 2015. "A Novel Bio-Inspired Approach for Multilingual Spam Filtering," International Journal of Intelligent Information Technologies (IJIIT), IGI Global, vol. 11(3), pages 45-87, July.
  • Handle: RePEc:igg:jiit00:v:11:y:2015:i:3:p:45-87

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IJIIT.2015070104
    Download Restriction: no