Printed from https://ideas.repec.org/a/gam/jeners/v12y2019i13p2530-d244639.html

Effect of Irrelevant Variables on Faulty Wafer Detection in Semiconductor Manufacturing

Author

Listed:
  • Dongil Kim

    (Department of Computer Science & Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Korea)

  • Seokho Kang

    (Department of Systems Management Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Korea)

Abstract

Machine learning has been applied successfully to faulty wafer detection tasks in semiconductor manufacturing. For these tasks, prediction models are built with prior data to predict the quality of future wafers as a function of their preceding process parameters and measurements. In real-world problems, it is common for a portion of the input variables to be irrelevant to the prediction of the output variable. The inclusion of many irrelevant variables negatively affects the performance of prediction models. Typically, prediction models learned by different learning algorithms exhibit different sensitivities to irrelevant variables. Algorithms with low sensitivity are preferred as a first trial for building prediction models, whereas a variable selection procedure must be considered for highly sensitive algorithms. In this study, we investigate the effect of irrelevant variables on three well-known representative learning algorithms that can be applied to both classification and regression tasks: artificial neural network, decision tree (DT), and k-nearest neighbors (k-NN). We analyze the characteristics of these learning algorithms in the presence of irrelevant variables with different model complexity settings. An empirical analysis is performed using real-world datasets collected from a semiconductor manufacturer to examine how the number of irrelevant variables affects the behavior of prediction models trained with different learning algorithms and model complexity settings. The results indicate that the prediction accuracy of k-NN is highly degraded, whereas DT demonstrates the highest robustness in the presence of many irrelevant variables. In addition, a higher model complexity of learning algorithms leads to a higher sensitivity to irrelevant variables.
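
The kind of comparison the abstract describes can be illustrated with a small, self-contained sketch. It is not the authors' experiment: the data are synthetic (one relevant variable plus a chosen number of uniform-noise variables), the depth-1 decision stump is only a stand-in for a decision tree, and all function names (`make_data`, `knn_predict`, `fit_stump`, `run_experiment`) are hypothetical. The two classes are separated by construction, so the stump always finds the relevant variable; the sketch only shows why distance-based k-NN can be distracted by noise dimensions while a split-based learner can ignore them.

```python
import random

def make_data(n_per_class, n_irrelevant, rng):
    """Column 0 is relevant; the remaining n_irrelevant columns are pure noise."""
    rows, labels = [], []
    for label, low in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            relevant = rng.uniform(low, low + 1.0)  # class 0 in [0, 1], class 1 in [3, 4]
            noise = [rng.uniform(0.0, 4.0) for _ in range(n_irrelevant)]
            rows.append([relevant] + noise)
            labels.append(label)
    return rows, labels

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0

def fit_stump(train_x, train_y):
    """Depth-1 tree: pick the (feature, threshold) with the lowest training error.

    The stump predicts class 1 when x[feature] > threshold.
    """
    best = (len(train_y) + 1, 0, 0.0)  # (errors, feature, threshold)
    for j in range(len(train_x[0])):
        values = sorted({row[j] for row in train_x})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            errors = sum((row[j] > t) != bool(y) for row, y in zip(train_x, train_y))
            if errors < best[0]:  # strict improvement only, so earlier features win ties
                best = (errors, j, t)
    return best[1], best[2]

def accuracy(predict, test_x, test_y):
    """Fraction of test points for which predict(x) equals the true label."""
    return sum(predict(x) == y for x, y in zip(test_x, test_y)) / len(test_y)

def run_experiment(n_irrelevant, seed=0):
    """Return (k-NN accuracy, stump accuracy) for a given number of noise variables."""
    rng = random.Random(seed)
    train_x, train_y = make_data(40, n_irrelevant, rng)
    test_x, test_y = make_data(20, n_irrelevant, rng)
    feature, threshold = fit_stump(train_x, train_y)
    knn_acc = accuracy(lambda x: knn_predict(train_x, train_y, x), test_x, test_y)
    stump_acc = accuracy(lambda x: int(x[feature] > threshold), test_x, test_y)
    return knn_acc, stump_acc

if __name__ == "__main__":
    for n in (0, 10, 50, 200):
        knn_acc, stump_acc = run_experiment(n)
        print(f"irrelevant={n:3d}  k-NN={knn_acc:.2f}  stump={stump_acc:.2f}")
```

Running the script prints one accuracy pair per noise level. With no irrelevant variables both learners classify the toy data perfectly; how far k-NN drops as noise variables are added depends on the seed and the noise scale assumed here, so the numbers should not be read as a reproduction of the paper's results.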

Suggested Citation

  • Dongil Kim & Seokho Kang, 2019. "Effect of Irrelevant Variables on Faulty Wafer Detection in Semiconductor Manufacturing," Energies, MDPI, vol. 12(13), pages 1-11, July.
  • Handle: RePEc:gam:jeners:v:12:y:2019:i:13:p:2530-:d:244639

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/12/13/2530/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/12/13/2530/
    Download Restriction: no

    References listed on IDEAS

    1. Goldstein, William M. & Busemeyer, Jerome R., 1992. "The effect of "irrelevant" variables on decision making: Criterion shifts in preferential choice?," Organizational Behavior and Human Decision Processes, Elsevier, vol. 52(3), pages 425-454, August.
    2. Fomby, Thomas B., 1981. "Loss of efficiency in regression analysis due to irrelevant variables : A generalization," Economics Letters, Elsevier, vol. 7(4), pages 319-322.
    3. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.

    Citations



    Cited by:

    1. Hugo Siqueira & Mariana Macedo & Yara de Souza Tadano & Thiago Antonini Alves & Sergio L. Stevan & Domingos S. Oliveira & Manoel H.N. Marinho & Paulo S.G. de Mattos Neto & João F. L. de Oliveira & Ive, 2020. "Selection of Temporal Lags for Predicting Riverflow Series from Hydroelectric Plants Using Variable Selection Methods," Energies, MDPI, vol. 13(16), pages 1-35, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ariana Chang & Tian‐Shyug Lee & Hsiu‐Mei Lee, 2024. "Applying sustainable development goals in financial forecasting using machine learning techniques," Corporate Social Responsibility and Environmental Management, John Wiley & Sons, vol. 31(3), pages 2277-2289, May.
    2. Farkas, Sébastien & Lopez, Olivier & Thomas, Maud, 2021. "Cyber claim analysis using Generalized Pareto regression trees with applications to insurance," Insurance: Mathematics and Economics, Elsevier, vol. 98(C), pages 92-105.
    3. Miljkovic, Dragan & Gong, Jian & Lehrke, Linda, 2009. "The Effects of Trivial Attributes on Choice of Food Products," Agricultural and Resource Economics Review, Cambridge University Press, vol. 38(2), pages 142-152, October.
    4. Emilio Aguirre & Federico García-Suárez & Gabriela Sicilia, 2021. "Eficiencia técnica en la ganadería de carne bovina pastoril. Medición y exploración de sus determinantes en Uruguay," Documentos de Trabajo (working papers) 1321, Department of Economics - dECON.
    5. Gonzalez-Vallejo, Claudia & Moran, Elizabeth, 2001. "The Evaluability Hypothesis Revisited: Joint and Separate Evaluation Preference Reversal as a Function of Attribute Importance," Organizational Behavior and Human Decision Processes, Elsevier, vol. 86(2), pages 216-233, November.
    6. Lotfi Boudabsa & Damir Filipović, 2022. "Ensemble learning for portfolio valuation and risk management," Papers 2204.05926, arXiv.org.
    7. Yan, Ran & Wang, Shuaian & Du, Yuquan, 2020. "Development of a two-stage ship fuel consumption prediction and reduction model for a dry bulk ship," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 138(C).
    8. A. Poterie & J.-F. Dupuy & V. Monbet & L. Rouvière, 2019. "Classification tree algorithm for grouped variables," Computational Statistics, Springer, vol. 34(4), pages 1613-1648, December.
    9. Miguel A. Vallejo & Laura Vallejo-Slocker & Martin Offenbaecher & Jameson K. Hirsch & Loren L. Toussaint & Niko Kohls & Fuschia Sirois & Javier Rivera, 2021. "Psychological Flexibility Is Key for Reducing the Severity and Impact of Fibromyalgia," IJERPH, MDPI, vol. 18(14), pages 1-11, July.
    10. Eduardo Rodríguez Sánchez & Eduardo Filemón Vázquez Santacruz & Humberto Cervantes Maceda, 2023. "Effort and Cost Estimation Using Decision Tree Techniques and Story Points in Agile Software Development," Mathematics, MDPI, vol. 11(6), pages 1-31, March.
    11. Suryo Adi Rakhmawan & M. Hafidz Omar & Muhammad Riaz & Nasir Abbas, 2023. "Hotelling T² Control Chart for Detecting Changes in Mortality Models Based on Machine-Learning Decision Tree," Mathematics, MDPI, vol. 11(3), pages 1-14, January.
    12. Olga Takacs & Janos Vincze, 2018. "The within-job gender pay gap in Hungary," CERS-IE WORKING PAPERS 1834, Institute of Economics, Centre for Economic and Regional Studies.
    13. Michael Puglia & Adam Tucker, 2020. "Machine Learning, the Treasury Yield Curve and Recession Forecasting," Finance and Economics Discussion Series 2020-038, Board of Governors of the Federal Reserve System (U.S.).
    14. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    15. Kian Tehranian, 2023. "Can Machine Learning Catch Economic Recessions Using Economic and Market Sentiments?," Papers 2308.16200, arXiv.org.
    16. HOROBEȚ Alexandra & BULAI Vlad Cosmin, 2019. "Assessing the Local Developmental Impact of Hydrocarbon Exploitation in a Mature Region: A Random Forest Approach," European Journal of Interdisciplinary Studies, Bucharest Economic Academy, issue 02, June.
    17. Yu-Shan Shih & Kuang-Hsun Liu, 2019. "Regression trees for detecting preference patterns from rank data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 683-702, September.
    18. Deepankar Basu, 2018. "Bias of OLS Estimators due to Exclusion of Relevant Variables and Inclusion of Irrelevant Variables," UMASS Amherst Economics Working Papers 2018-19, University of Massachusetts Amherst, Department of Economics.
    19. Osman, Ibrahim H. & Anouze, Abdel Latef & Irani, Zahir & Lee, Habin & Medeni, Tunç D. & Weerakkody, Vishanth, 2019. "A cognitive analytics management framework for the transformation of electronic government services from users’ perspective to create sustainable shared values," European Journal of Operational Research, Elsevier, vol. 278(2), pages 514-532.
    20. Renato Bruni & Gianpiero Bianchi, 2018. "Robustness Analysis of a Website Categorization Procedure based on Machine Learning," DIAG Technical Reports 2018-04, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
