
Natural language processing (NLP) and association rules (AR)-based knowledge extraction for intelligent fault analysis: a case study in semiconductor industry

Author

Listed:
  • Zhiqiang Wang

    (Research Center)

  • Kenneth Ezukwoke

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Anis Hoayek

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Mireille Batton-Hubert

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Henri FAYOL Institute)

  • Xavier Boucher

    (Mines Saint-Étienne, Univ. Clermont Auvergne, CNRS UMR 6158 LIMOS
    Center for Biomedical and Healthcare Engineering)

Abstract

Fault analysis (FA) is the process of collecting and analyzing data to determine the cause of a failure. It plays an important role in ensuring quality in the manufacturing process. Traditional FA techniques are time-consuming and labor-intensive, relying heavily on human expertise and the availability of failure inspection equipment. In the semiconductor industry, experts generate a large number of FA reports that record fault descriptions, fault analysis paths, and fault root causes. With the development of Artificial Intelligence, it is possible to automate the industrial FA process while extracting expert knowledge from the vast body of FA report data. The goal of this research is to develop a complete expert knowledge extraction pipeline for FA in the semiconductor industry based on advanced Natural Language Processing and Machine Learning. Our research aims to automatically predict the fault root cause from the fault descriptions. First, the text data from the FA reports are transformed into numerical data using Sentence Transformer embedding. The numerical data are then converted into latent spaces using a Generalized-Controllable Variational AutoEncoder. Next, the latent spaces are classified by a Gaussian Mixture Model. Finally, Association Rules are applied to establish the relationship between the labels in the latent space of the fault descriptions and those in the latent space of the fault root causes. The proposed algorithm has been evaluated on real semiconductor-industry data collected over three years. The predicted labels achieve an average correctness of 97.8%. The method can effectively reduce failure identification time and cost during the inspection stage.
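
The final step of the pipeline, relating fault-description labels to root-cause labels through association rules, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the earlier steps (embedding, latent-space encoding, Gaussian Mixture Model labelling) have already assigned one label to each report's description and one to its root cause. The function names `mine_rules` and `predict`, the thresholds, and the toy label pairs are all hypothetical, chosen only to show how support, confidence, and lift are computed for rules of the form "description label -> root-cause label".

```python
from collections import Counter


def mine_rules(pairs, min_support=0.1, min_confidence=0.6):
    """Mine single-antecedent rules 'description label -> root-cause label'.

    pairs: list of (description_label, root_cause_label), one per report.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(pairs)
    pair_counts = Counter(pairs)
    desc_counts = Counter(d for d, _ in pairs)
    cause_counts = Counter(c for _, c in pairs)

    rules = []
    for (d, c), k in pair_counts.items():
        support = k / n                           # P(d and c)
        confidence = k / desc_counts[d]           # P(c | d)
        lift = confidence / (cause_counts[c] / n)  # P(c | d) / P(c)
        if support >= min_support and confidence >= min_confidence:
            rules.append({
                "antecedent": d,
                "consequent": c,
                "support": support,
                "confidence": confidence,
                "lift": lift,
            })
    # Highest-confidence rules first, so prediction can take the first match.
    return sorted(rules, key=lambda r: r["confidence"], reverse=True)


def predict(rules, desc_label):
    """Return the root-cause label of the best matching rule, or None."""
    for rule in rules:
        if rule["antecedent"] == desc_label:
            return rule["consequent"]
    return None


# Toy (hypothetical) label pairs standing in for labelled FA reports.
pairs = (
    [("D0", "C1")] * 4
    + [("D0", "C2")] * 1
    + [("D1", "C2")] * 3
    + [("D2", "C0")] * 2
)

rules = mine_rules(pairs)
for rule in rules:
    print(rule)
print(predict(rules, "D0"))
```

In this toy data the pair ("D0", "C2") occurs only once among five "D0" reports, so its confidence falls below the threshold and no rule is emitted for it. A new report labelled "D0" is therefore mapped to "C1".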

Suggested Citation

  • Zhiqiang Wang & Kenneth Ezukwoke & Anis Hoayek & Mireille Batton-Hubert & Xavier Boucher, 2025. "Natural language processing (NLP) and association rules (AR)-based knowledge extraction for intelligent fault analysis: a case study in semiconductor industry," Journal of Intelligent Manufacturing, Springer, vol. 36(1), pages 357-372, January.
  • Handle: RePEc:spr:joinma:v:36:y:2025:i:1:d:10.1007_s10845-023-02245-7
    DOI: 10.1007/s10845-023-02245-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10845-023-02245-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jesus Crespo Cuaresma & Bettina Grün & Paul Hofmarcher & Stefan Humer & Mathias Moser, 2015. "A Comprehensive Approach to Posterior Jointness Analysis in Bayesian Model Averaging Applications," Department of Economics Working Papers wuwp193, Vienna University of Economics and Business, Department of Economics.
    2. Yoichi Matsumoto, 2013. "Heterogeneous Combinations of Knowledge Elements: How the Knowledge Base Structure Impacts Knowledge-related Outcomes of a Firm," Discussion Paper Series DP2013-15, Research Institute for Economics & Business Administration, Kobe University.
    3. Wang, Zuyi & Takagi, Chifumi & Kim, Man-Keun & Chung, Anh, 2022. "Uncover Drivers Influencing Consumers' WTP Using Machine Learning: Case of Organic Coffee in Taiwan," 2022 Annual Meeting, July 31-August 2, Anaheim, California 322150, Agricultural and Applied Economics Association.
    4. Jr-Fong Dang, 2024. "The multisensor information fusion-based deep learning model for equipment health monitor integrating subject matter expert knowledge," Journal of Intelligent Manufacturing, Springer, vol. 35(8), pages 4055-4069, December.
    5. Kurt Hornik & Christian Buchta & Achim Zeileis, 2009. "Open-source machine learning: R meets Weka," Computational Statistics, Springer, vol. 24(2), pages 225-232, May.
    6. Jeongsub Choi & Mengmeng Zhu & Jihoon Kang & Myong K. Jeong, 2024. "Convolutional neural network based multi-input multi-output model for multi-sensor multivariate virtual metrology in semiconductor manufacturing," Annals of Operations Research, Springer, vol. 339(1), pages 185-201, August.
    7. Hasan Tercan & Tobias Meisen, 2022. "Machine learning and deep learning based predictive quality in manufacturing: a systematic review," Journal of Intelligent Manufacturing, Springer, vol. 33(7), pages 1879-1905, October.
    8. Hofmarcher, Paul & Crespo Cuaresma, Jesus & Grün, Bettina & Humer, Stefan & Moser, Mathias, 2018. "Bivariate jointness measures in Bayesian Model Averaging: Solving the conundrum," Journal of Macroeconomics, Elsevier, vol. 57(C), pages 150-165.
    9. Małecka-Ziembińska Edyta & Siwiec Anna, 2020. "Searching for similarities in EU corporate income taxes for their harmonization," Economics and Business Review, Sciendo, vol. 6(4), pages 72-94, December.
    10. Nancy Awad & Jean-Francois Couchot & Bechara Al Bouna & Laurent Philippe, 2020. "Publishing Anonymized Set-Valued Data via Disassociation towards Analysis," Future Internet, MDPI, vol. 12(4), pages 1-21, April.
    11. Scholz, Michael, 2016. "R Package clickstream: Analyzing Clickstream Data with Markov Chains," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i04).
    12. Khanh Giang Le & Quang Hoc Tran & Van Manh Do, 2023. "Urban Traffic Accident Features Investigation to Improve Urban Transportation Infrastructure Sustainability by Integrating GIS and Data Mining Techniques," Sustainability, MDPI, vol. 16(1), pages 1-19, December.
    13. Jasleen Kaur & Khushdeep Dharni, 2022. "Assessing efficacy of association rules for predicting global stock indices," DECISION: Official Journal of the Indian Institute of Management Calcutta, Springer;Indian Institute of Management Calcutta, vol. 49(3), pages 329-339, September.
    14. Deszczyński, Bartosz & Beręsewicz, Maciej, 2021. "The maturity of relationship management and firm performance – A step toward relationship management middle-range theory," Journal of Business Research, Elsevier, vol. 135(C), pages 358-372.
    15. Peng Zhan & Shaokun Wang & Jun Wang & Leigang Qu & Kun Wang & Yupeng Hu & Xueqing Li, 2021. "Temporal anomaly detection on IIoT-enabled manufacturing," Journal of Intelligent Manufacturing, Springer, vol. 32(6), pages 1669-1678, August.
    16. Michael Hahsler & Radoslaw Karpienko, 2017. "Visualizing association rules in hierarchical groups," Journal of Business Economics, Springer, vol. 87(3), pages 317-335, April.
    17. Ji Yeon Lee & Richa Kumari & Jae Yun Jeong & Tae-Hyun Kim & Byeong-Hee Lee, 2020. "Knowledge Discovering on Graphene Green Technology by Text Mining in National R&D Projects in South Korea," Sustainability, MDPI, vol. 12(23), pages 1-16, November.
    18. Yoonju Lee & Heejin Kim & Hyesun Jeong & Yunhwan Noh, 2020. "Patterns of Multimorbidity in Adults: An Association Rules Analysis Using the Korea Health Panel," IJERPH, MDPI, vol. 17(8), pages 1-14, April.
    19. Sun, Chenhao & Wang, Xin & Zheng, Yihui, 2020. "An ensemble system to predict the spatiotemporal distribution of energy security weaknesses in transmission networks," Applied Energy, Elsevier, vol. 258(C).
    20. Suelane Garcia Fontes & Ronaldo Gonçalves Morato & Silvio Luiz Stanzani & Pedro Luiz Pizzigatti Corrêa, 2021. "Jaguar movement behavior: using trajectories and association rule mining algorithms to unveil behavioral states and social interactions," PLOS ONE, Public Library of Science, vol. 16(2), pages 1-18, February.
