Printed from https://ideas.repec.org/a/plo/pone00/0165972.html

Fast Outlier Detection Using a Grid-Based Algorithm

Author

Listed:
  • Jihwan Lee
  • Nam-Wook Cho

Abstract

As a data mining technique, outlier detection aims to discover outlying observations that deviate substantially from the remainder of the data. Recently, the Local Outlier Factor (LOF) algorithm has been successfully applied to outlier detection. However, due to the computational complexity of the LOF algorithm, its application to large, high-dimensional data has been limited. The aim of this paper is to propose a grid-based algorithm that reduces the computation time required by the LOF algorithm to determine the k-nearest neighbors. The algorithm divides the data space into a smaller number of regions, each called a “grid”, and calculates the LOF value of each grid. To examine the effectiveness of the proposed method, several experiments incorporating different parameters were conducted. The proposed method demonstrated a significant reduction in computation time with predictable and acceptable trade-off errors. The proposed methodology was then successfully applied to real database transaction logs of the Korea Atomic Energy Research Institute. As a result, we show that for a very large dataset, grid-LOF can be considered an acceptable approximation of the original LOF. Moreover, it can also be used effectively for real-time outlier detection.
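
The grid idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how a grid is represented or how its LOF is computed, so the sketch assumes that each occupied cell is summarized by the centroid of its points, that a simplified brute-force LOF (exactly k neighbors, ties ignored) is run on those centroids, and that every point inherits the score of its cell. The names `lof_scores`, `grid_lof`, and `cell_size` are hypothetical.

```python
import math
from collections import defaultdict


def lof_scores(points, k):
    """Simplified brute-force LOF over a list of distinct points (tuples).

    Assumption: each point uses exactly k neighbours; distance ties are
    broken by index rather than enlarging the neighbourhood.
    """
    n = len(points)
    k = min(k, n - 1)
    # For every point, the k nearest other points as (distance, index) pairs.
    neighbours = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbours.append(dists[:k])
    # k-distance: distance to the k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], d) for d, j in neighbours[i])
        lrd.append(k / reach)
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]


def grid_lof(points, cell_size, k):
    """Approximate per-point LOF by scoring grid cells instead of points."""
    # Assign each point to a cell by flooring its scaled coordinates.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(idx)
    # Summarize each occupied cell by the centroid of its members.
    keys = list(cells)
    dim = len(points[0])
    centroids = [
        tuple(
            sum(points[i][d] for i in cells[key]) / len(cells[key])
            for d in range(dim)
        )
        for key in keys
    ]
    # Run LOF on the (much smaller) set of centroids.
    cell_scores = lof_scores(centroids, k)
    # Every point inherits the score of the cell it falls in.
    scores = [0.0] * len(points)
    for key, score in zip(keys, cell_scores):
        for i in cells[key]:
            scores[i] = score
    return scores


# Usage: a dense 10 x 10 lattice plus one distant point.
data = [(x * 0.5, y * 0.5) for x in range(10) for y in range(10)]
data.append((20.0, 20.0))
scores = grid_lof(data, cell_size=1.0, k=3)
print("highest-scoring point:", data[scores.index(max(scores))])
```

In this sketch the nearest-neighbor search runs over occupied cells rather than individual points, which is where the time saving would come from; the cost is that all points in a cell share one score, so `cell_size` controls the trade-off between speed and approximation error.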

Suggested Citation

  • Jihwan Lee & Nam-Wook Cho, 2016. "Fast Outlier Detection Using a Grid-Based Algorithm," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-11, November.
  • Handle: RePEc:plo:pone00:0165972
    DOI: 10.1371/journal.pone.0165972

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0165972
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0165972&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0165972?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Seung Kim & Nam Wook Cho & Young Joo Lee & Suk-Ho Kang & Taewan Kim & Hyeseon Hwang & Dongseop Mun, 2013. "Application of density-based outlier detection to database activity monitoring," Information Systems Frontiers, Springer, vol. 15(1), pages 55-65, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Carly L. Huth & David W. Chadwick & William R. Claycomb & Ilsun You, 2013. "Guest editorial: A brief overview of data leakage and insider threats," Information Systems Frontiers, Springer, vol. 15(1), pages 1-4, March.
    2. Himeur, Yassine & Ghanem, Khalida & Alsalemi, Abdullah & Bensaali, Faycal & Amira, Abbes, 2021. "Artificial intelligence based anomaly detection of energy consumption in buildings: A review, current trends and new perspectives," Applied Energy, Elsevier, vol. 287(C).
