Printed from https://ideas.repec.org/a/eee/reensy/v164y2017icp55-65.html

Hard drive failure prediction using Decision Trees

Authors

  • Li, Jing
  • Stones, Rebecca J.
  • Wang, Gang
  • Liu, Xiaoguang
  • Li, Zhongwei
  • Xu, Ming

Abstract

This paper proposes two hard drive failure prediction models, one based on Decision Trees (DTs) and one on Gradient Boosted Regression Trees (GBRTs), which combine strong prediction performance with stability and interpretability. The models are evaluated on a real-world dataset containing 121,698 drives in total. Experimental results show that the DT model predicts over 93% of failures at a false alarm rate under 0.01%, and that the GBRT model can achieve a failure detection rate of about 90% without any false alarms. Moreover, the GBRT model evaluates drive health (or fault probability), providing a quantitative indicator of failure urgency. This enables operators to allocate system resources accordingly for pre-warning migrations while maintaining the quality of user services.
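
The two metrics quoted in the abstract, failure detection rate and false alarm rate, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the idea of flagging a drive when its fault probability reaches a threshold are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def flag_by_fault_probability(fault_probs: Sequence[float], threshold: float) -> List[bool]:
    """Flag a drive as 'predicted to fail' when its fault probability
    reaches the threshold (hypothetical decision rule, not from the paper)."""
    return [p >= threshold for p in fault_probs]


def detection_and_false_alarm_rates(
    failed: Sequence[bool], flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Return (failure detection rate, false alarm rate) at drive level.

    failure detection rate = flagged failed drives / all failed drives
    false alarm rate       = flagged good drives   / all good drives
    """
    if len(failed) != len(flagged):
        raise ValueError("failed and flagged must have the same length")
    n_failed = sum(1 for f in failed if f)
    n_good = len(failed) - n_failed
    detected = sum(1 for f, p in zip(failed, flagged) if f and p)
    false_alarms = sum(1 for f, p in zip(failed, flagged) if not f and p)
    fdr = detected / n_failed if n_failed else 0.0
    far = false_alarms / n_good if n_good else 0.0
    return fdr, far


if __name__ == "__main__":
    # Toy data (made up): three failed drives, five good drives.
    failed = [True, True, True, False, False, False, False, False]
    fault_probs = [0.9, 0.7, 0.2, 0.1, 0.05, 0.3, 0.0, 0.6]
    for threshold in (0.5, 0.65):
        flagged = flag_by_fault_probability(fault_probs, threshold)
        fdr, far = detection_and_false_alarm_rates(failed, flagged)
        print(f"threshold={threshold}: FDR={fdr:.2f}, FAR={far:.2f}")
```

In this toy data, raising the threshold removes the single false alarm without losing any detected failure. In general a stricter threshold trades detections for fewer false alarms, which is the kind of trade-off behind a result such as a roughly 90% detection rate with no false alarms.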

Suggested Citation

  • Li, Jing & Stones, Rebecca J. & Wang, Gang & Liu, Xiaoguang & Li, Zhongwei & Xu, Ming, 2017. "Hard drive failure prediction using Decision Trees," Reliability Engineering and System Safety, Elsevier, vol. 164(C), pages 55-65.
  • Handle: RePEc:eee:reensy:v:164:y:2017:i:c:p:55-65
    DOI: 10.1016/j.ress.2017.03.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832016301569
    Download Restriction: Full text for ScienceDirect subscribers only


    Citations

    Cited by:

    1. Muhammad Zafran Muhammad Zaly Shah & Anazida Zainal & Taiseer Abdalla Elfadil Eisa & Hashim Albasheer & Fuad A. Ghaleb, 2023. "A Semisupervised Concept Drift Adaptation via Prototype-Based Manifold Regularization Approach with Knowledge Transfer," Mathematics, MDPI, vol. 11(2), pages 1-30, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Moradi, Ramin & Groth, Katrina M., 2020. "Modernizing risk assessment: A systematic integration of PRA and PHM techniques," Reliability Engineering and System Safety, Elsevier, vol. 204(C).
    2. Chen, Zhen & Li, Yaping & Xia, Tangbin & Pan, Ershun, 2019. "Hidden Markov model with auto-correlated observations for remaining useful life prediction and optimal maintenance policy," Reliability Engineering and System Safety, Elsevier, vol. 184(C), pages 123-136.
    3. Li, Rui & Verhagen, Wim J.C. & Curran, Richard, 2020. "A systematic methodology for Prognostic and Health Management system architecture definition," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    4. Yang, Li & Ma, Xiaobing & Peng, Rui & Zhai, Qingqing & Zhao, Yu, 2017. "A preventive maintenance policy based on dependent two-stage deterioration and external shocks," Reliability Engineering and System Safety, Elsevier, vol. 160(C), pages 201-211.
    5. Liu, Xingheng & Matias, José & Jäschke, Johannes & Vatn, Jørn, 2022. "Gibbs sampler for noisy Transformed Gamma process: Inference and remaining useful life estimation," Reliability Engineering and System Safety, Elsevier, vol. 217(C).
    6. Chatenet, Q. & Remy, E. & Gagnon, M. & Fouladirad, M. & Tahan, A.S., 2021. "Modeling cavitation erosion using non-homogeneous gamma process," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    7. Li, Naipeng & Gebraeel, Nagi & Lei, Yaguo & Fang, Xiaolei & Cai, Xiao & Yan, Tao, 2021. "Remaining useful life prediction based on a multi-sensor data fusion model," Reliability Engineering and System Safety, Elsevier, vol. 208(C).
    8. Lewis, Austin D. & Groth, Katrina M., 2022. "Metrics for evaluating the performance of complex engineering system health monitoring models," Reliability Engineering and System Safety, Elsevier, vol. 223(C).
    9. Belkacem, Lobna & Simeu-Abazi, Zineb & Dhouibi, Hedi & Gascard, Eric & Messaoud, Hassani, 2017. "Diagnostic and prognostic of hybrid dynamic systems: Modeling and RUL evaluation for two maintenance policies," Reliability Engineering and System Safety, Elsevier, vol. 164(C), pages 98-109.
    10. Van Dyck, Jozef & Verdonck, Tim, 2014. "Precision of power-law NHPP estimates for multiple systems with known failure rate scaling," Reliability Engineering and System Safety, Elsevier, vol. 126(C), pages 143-152.
    11. Prakash, Om & Samantaray, Arun Kumar, 2021. "Prognosis of Dynamical System Components with Varying Degradation Patterns using model–data–fusion," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    12. Zhu, Xiaoyan & Wang, Jun & Yuan, Tao, 2019. "Design and maintenance for the data storage system considering system rebuilding process," Reliability Engineering and System Safety, Elsevier, vol. 191(C).
    13. Hao, Songhua & Yang, Jun & Berenguer, Christophe, 2019. "Degradation analysis based on an extended inverse Gaussian process model with skew-normal random effects and measurement errors," Reliability Engineering and System Safety, Elsevier, vol. 189(C), pages 261-270.
    14. Huang, Xucong & Peng, Zhaoqin & Tang, Diyin & Chen, Juan & Zio, Enrico & Zheng, Zaiping, 2024. "A physics-informed autoencoder for system health state assessment based on energy-oriented system performance," Reliability Engineering and System Safety, Elsevier, vol. 242(C).
    15. Blancke, Olivier & Tahan, Antoine & Komljenovic, Dragan & Amyot, Normand & Lévesque, Mélanie & Hudon, Claude, 2018. "A holistic multi-failure mode prognosis approach for complex equipment," Reliability Engineering and System Safety, Elsevier, vol. 180(C), pages 136-151.
    16. Xiangqin Hou & Yihuan Wang & Peng Zhang & Guojin Qin, 2019. "Non-Probabilistic Time-Varying Reliability-Based Analysis of Corroded Pipelines Considering the Interaction of Multiple Uncertainty Variables," Energies, MDPI, vol. 12(10), pages 1-18, May.
    17. Compare, Michele & Bellani, Luca & Zio, Enrico, 2017. "Reliability model of a component equipped with PHM capabilities," Reliability Engineering and System Safety, Elsevier, vol. 168(C), pages 4-11.
    18. Kim, Hyeonmin & Kim, Jung Taek & Heo, Gyunyoung, 2018. "Failure rate updates using condition-based prognostics in probabilistic safety assessments," Reliability Engineering and System Safety, Elsevier, vol. 175(C), pages 225-233.
    19. Tsoumpris, Charalampos & Theotokatos, Gerasimos, 2023. "A decision-making approach for the health-aware energy management of ship hybrid power plants," Reliability Engineering and System Safety, Elsevier, vol. 235(C).
    20. Hazra, Indranil & Pandey, Mahesh D. & Manzana, Noldainerick, 2020. "Approximate Bayesian computation (ABC) method for estimating parameters of the gamma process using noisy data," Reliability Engineering and System Safety, Elsevier, vol. 198(C).
