Elsevier

Computers in Industry

Volume 64, Issue 3, April 2013, Pages 203-213
Feature extraction, condition monitoring, and fault modeling in semiconductor manufacturing systems

https://doi.org/10.1016/j.compind.2012.10.002

Abstract

Reliable feature extraction, condition monitoring, and fault modeling are critical to understanding equipment degradation and implementing the proper maintenance decisions in manufacturing processes. Semiconductor manufacturing machines are highly sophisticated systems, consisting of multiple interacting components operating in highly variable operating conditions. This complicates performance monitoring since equipment condition must often be inferred through concurrent interpretation of multiple sensor readings originating from potentially very different subsystems of the tool. This paper presents an integrated approach to feature extraction, condition monitoring, and fault modeling applied to a set of standard built-in sensors on a modern 300-mm technology industrial Plasma Enhanced Chemical Vapor Deposition (PECVD) tool. Linear Discriminant Analysis was utilized to determine the set of dynamic features that are the most sensitive to different tool conditions brought about by chamber cleaning or various faults. Gaussian Mixture Models of the dynamic feature distributions were used to statistically quantify changes of these features as the condition of the tool changed. In addition, four highly detrimental faults were analyzed to demonstrate the fault modeling methodology. Data collected over 8 months from a PECVD tool being operated by a major microelectronics manufacturer was used to verify the methodology. Top sensitive features from various faults observed in this period were examined and physical connections to the chamber condition were interpreted through their behavior.

Highlights

  • Linear Discriminant Analysis gives features indicating prominent tool degradation.

  • Overlaps of Gaussian Mixtures show agreement with process fault anticipations.

  • A fault modeling strategy is applied to events of unacceptable process behavior.

Introduction

Maintenance is one of the key issues in modern semiconductor manufacturing [1], [2]. Presently, maintenance scheduling in this industry is predominantly based on historical reliability information pertaining to individual components, and on short-term performance signatures for detecting sudden failures. This leads to significant waste, either through interventions performed on machines that do not need them or through missed maintenance actions that result in machine failures, poor product quality, and unscheduled downtime [2], [3], [4], [5].

The condition-based maintenance (CBM) paradigm has been utilized in numerous areas of engineering to address maintenance wastes that currently plague many industries. CBM establishes a connection between the equipment condition and the readings of sensors existing in that equipment [5], [6]. Such information about the actual condition of the equipment can be used to make maintenance decisions that are optimally synchronized with human and material resources in the manufacturing system, and the least intrusive on the overall manufacturing operations.

Nevertheless, many challenges exist that have prevented the semiconductor manufacturing industry from pervasively implementing more proactive CBM policies. Some of those challenges are [4], [5]:

  1. Semiconductor manufacturing tools and equipment are highly sophisticated systems, consisting of multiple interacting components operating in highly variable operating conditions, which often hampers the existing performance monitoring methods in this area.

  2. Equipment condition must be inferred through concurrent interpretation of multiple in situ tool sensors, from potentially different subsystems of the tool, measuring at different sampling rates.

The foundation of condition monitoring in any area is the identification of a set of features within the equipment sensor readings that are indicative of the degradation processes, and the utilization of those features with an appropriate condition monitoring method to deduce the equipment health [7]. In general, the following feature extraction techniques have been widely utilized for CBM in manufacturing [8]:

  • Time domain methods: These methods are based on extracting signal features directly from the signal samples (i.e. from the time-domain description of the signal). Those include statistical moments (minimum, maximum, expected value, variance, skewness, kurtosis and higher order moments), as well as dynamic features, such as overshoot, settling time, rise time and others [5], [9].

  • Frequency domain methods: These methods are based on extracting signal features from the frequency domain description of the signal obtained from Fourier analysis or Fast Fourier Transforms [9], [10]. Statistical moments of the frequency domain distributions of signal energy and frequency peak locations and intensities are just some of the features that could be extracted in this domain.

  • Time–frequency methods: These methods are based on extracting signal features from the domains representing the temporal evolution of frequency domain signal representations (i.e. domains representing joint distributions of signal energy over time and frequency domains). Such signal representations can be obtained using methods that include spectrograms of the signal, wavelet transforms and Cohen's class time–frequency signal transformations [10], and are perfectly suited for non-stationary signals whose frequency content varies over time.

  • Model-based methods: These methods are based on postulating a dynamic model of the sensor readings and quantifying deviations from them [11].

Unfortunately, the standard sensors on even the most modern semiconductor manufacturing equipment are sampled at very low frequencies (1–10 Hz) compared to some other areas, such as aerospace or rotating machinery applications, where sampling frequencies are often in the kHz or even MHz range [7]. This greatly limits the feasibility of frequency and time–frequency based feature extraction in this area. Instead, the great majority of semiconductor CBM applications utilize raw sensor outputs or their time-domain statistical moments (mean, variance, etc.) for monitoring purposes. This is appropriate whenever sensor readings are very closely correlated with the monitored or controlled processes. For example, the thermocouple readings from a chemical vapor deposition (CVD) chamber can be used directly as a monitored feature in statistical process control (SPC) charts [15]. In addition, in situ particle counts can be used as an indicator of chamber contamination and are hence used directly for monitoring, if these sensors are available on the equipment. The chemical mechanical planarization (CMP) process is an example where layer thickness, copper residuals, and the wafer temperature are directly measured and used as monitoring parameters [12]. Spanos et al. [13] depart from this trend of using raw sensor readings for monitoring and present a novel feature extraction method for plasma-etch equipment sensors. Time series models [14] for each of these signals are utilized to analyze the residuals of the signals after eliminating trends.
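The residual-based idea mentioned above can be sketched with the simplest possible time series model. The example below assumes a first-order autoregressive model, x[t] = c + phi·x[t-1] + e[t], fitted by ordinary least squares; the model order and the function name `ar1_residuals` are assumptions made for illustration and are not claimed to match the models used in [13] or [14].

```python
def ar1_residuals(x):
    """Fit x[t] = c + phi * x[t-1] + e[t] by least squares.

    Returns (c, phi, residuals). The residuals are what remains of the
    signal once the fitted one-step-ahead prediction has been removed.
    """
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n

    # Simple linear regression of x[t] on x[t-1].
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev

    residuals = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, residuals


# A noise-free trace generated by x[t] = 2 + 0.5 * x[t-1]; residuals are ~0.
c, phi, residuals = ar1_residuals([0.0, 2.0, 3.0, 3.5, 3.75])
```

With real sensor data the residuals would not vanish; their statistics, rather than the raw readings, would then be the monitored quantities.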

The second fundamental enabler of proactive CBM policies is the approach to condition monitoring of the equipment using the features extracted from the sensor readings. Condition monitoring techniques determine the current system or subsystem conditions based on the features extracted from the relevant sensor readings [7]. SPC and advanced process control (APC) approaches, based on statistical or pattern recognition methods, are commonly utilized in semiconductor manufacturing. Montgomery [15] gives a thorough description of SPC, along with applications from various industries. SPC methods detect abnormalities in the process as statistically significant departures of process signatures from their behavior during normal conditions. For example, Spanos et al. [13] raised alarms if the multivariate T² statistic of relevant sensor features [16] fell outside the statistically determined process control limits. Bunkofske [17] used SPC for condition monitoring after reducing the number of equipment features using principal component analysis. Mai [18] used SPC to monitor the contamination inside a lithography tool in order to prevent product defects caused by excessive tool contamination.
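A T²-type alarm of the kind described above can be sketched as follows for the two-feature case. The helper names, the restriction to two features, and the user-supplied `limit` are assumptions for illustration; in practice the control limit is derived from a reference distribution (for example an F distribution) at a chosen significance level.

```python
def fit_reference(points):
    """Mean vector and sample covariance of 2-D features from normal operation."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def t_squared(point, mean, cov):
    """T^2 = d^T S^{-1} d for one 2-D observation, with d = point - mean."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Closed-form inverse of the 2x2 covariance matrix applied to d.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def alarm(point, mean, cov, limit):
    """Raise an alarm when T^2 exceeds the (assumed, user-chosen) control limit."""
    return t_squared(point, mean, cov) > limit


reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = fit_reference(reference)
is_abnormal = alarm((3.0, 1.0), mean, cov, limit=2.5)
```

Because the statistic accounts for correlation between features, it can flag combinations of readings that look unremarkable on separate univariate charts.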

APC methods for process monitoring can be seen as a special class of SPC methods that use elaborate multivariate models, such as artificial neural networks (ANNs) or multivariate regression models, to capture process dynamics. These methods have been increasingly employed in recent years to deal with diagnostic challenges associated with larger wafer sizes, higher costs, and smaller critical dimensions of electronic components. Card et al. [19] utilized neural network prediction errors to monitor and control a plasma etch process. Pompier et al. [20] monitored a multi-chamber oxide deposition process by tracking individual chambers using APC methods. Velichko [21] proposed a nonlinear multiple-input, multiple-output (MIMO) model-based APC framework for semiconductor manufacturing processes and showed its benefits over tracking raw features. Baek et al. [22] analyzed the electron collisions during plasma processes using APC methods in order to identify small, previously unknown trends between wet cleans.

The next fundamental enabler of CBM is fault diagnosis. This layer generates diagnostic records by identifying fault possibilities based on information about the current condition and known faulty conditions. It essentially enriches the condition monitoring information, which shows that “something is abnormal in the process”, with diagnostic information about the reason for that abnormality (which fault caused it). The predominant approach to fault diagnosis currently used in the semiconductor industry is based on SPC and APC concepts [15]. SPC charts are usually constructed so that each of them corresponds to one specific fault. Thus, whenever the monitored features violate the corresponding limits, the presence of a particular fault is established. These warning limits can also provide a statistical significance level that gives the user an assessment of how accurately the tool health is being estimated. For example, Chen et al. [23] reported using optical emission spectroscopy (OES) data to provide a real-time SPC diagnosis scheme for plasma performance, as well as to detect faults during the etch process. Matsuda et al. [24] presented the use of APC for equipment monitoring, error detection, and CBM in semiconductor thermal processes.

In addition to the SPC/APC methods, artificial intelligence and classification techniques have been employed in assessing the health of semiconductor fabrication systems. For example, Salahshoor and Keshtgar [25] proposed a method that performs Independent Component Analysis followed by neural network classification. This method is used to overcome the false alarms and missed fault detections frequently observed when conventional monitoring techniques deal with a large number of observation variables. Tu et al. [26] presented results obtained using principal component analysis (PCA) for fault detection and classification in 300-mm high-density plasma CVD tools.

In terms of health indicators, Blue and Chen [27] proposed the generalized moving variance as a tool health indicator, which is dependent on the changes of recipe in the semiconductor fabrication process. Djurdjanovic et al. [6] proposed a generic concept of ‘confidence values’ as an index to reflect how healthy the system is by evaluating the overlap between the most recently observed features and those observed during normal operation.
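One simple way to turn the overlap idea above into a number is to integrate the pointwise minimum of two probability density functions, which gives 1 for identical distributions and approaches 0 as they separate. The sketch below does this for one-dimensional Gaussian mixtures given as (weight, mean, standard deviation) triples. The function names, the midpoint-rule integration, and the choice of this particular overlap measure are assumptions; the exact definition of the confidence value in [6] may differ.

```python
from statistics import NormalDist


def mixture_pdf(x, components):
    """Density at x of a 1-D Gaussian mixture of (weight, mean, std) triples."""
    return sum(w * NormalDist(mu, sd).pdf(x) for w, mu, sd in components)


def overlap(baseline, current, lo, hi, steps=2000):
    """Area under min(p, q) on [lo, hi], approximated by the midpoint rule.

    baseline : mixture describing features observed during normal operation
    current  : mixture describing the most recently observed features
    lo, hi   : integration limits, assumed to cover both distributions
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += min(mixture_pdf(x, baseline), mixture_pdf(x, current))
    return total * h


normal_operation = [(1.0, 0.0, 1.0)]  # single-component mixture
drifted = [(1.0, 2.0, 1.0)]           # same spread, shifted mean
confidence = overlap(normal_operation, drifted, lo=-8.0, hi=10.0)
```

For two single Gaussians the result can be cross-checked against `NormalDist.overlap` from the Python standard library, which computes the same overlapping area analytically.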

This paper presents integrated methods for feature extraction, condition monitoring, and fault modeling applied to multiple, simultaneously sampled sensor readings from all relevant subsystems of an industrial Plasma Enhanced Chemical Vapor Deposition (PECVD) tool commonly used in the semiconductor manufacturing industry. The data set used in our study encompasses 8 months of high-volume production in a major domestic semiconductor fab. The remainder of the paper is organized as follows. In Section 2, PECVD tools are described briefly, followed by a description of the layout of the subsystems, sensors, and operation cycles of the tool. Section 3 describes the methodology utilized in this work. Sections 3.1 and 3.2, respectively, list the dynamic features that were extracted from the tool sensors and introduce Linear Discriminant Analysis (LDA) as a viable method for determining which of these dynamic features are the most sensitive to tool condition changes. Section 3.3 introduces the overlap of probability density functions of features yielded by the tool at various stages of its degradation as a way to quantitatively monitor the tool condition. Section 3.4 presents a technique for modeling and assessing a specific condition of the tool given feature patterns corresponding to the underlying faults. Section 4 gives the results obtained through our study. The data set collected from a major domestic 300 mm fab is described in Section 4.1, while Section 4.2 presents the results of monitoring the so-called accumulation drifts caused by short-term degradation between cleaning cycles. Section 4.3 presents the fault modeling results for four high-impact PECVD tool faults that occurred in the tool operation period considered in this paper. Section 5 presents a discussion of the results. Finally, conclusions and future work are discussed in Section 6.
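To give a feel for how a discriminant criterion can rank features by sensitivity to a change in tool condition, the sketch below scores each feature with the univariate two-class Fisher ratio, i.e. squared difference of class means divided by the sum of class variances. This is a deliberate simplification of LDA, which considers features jointly; the feature names, data values, and function names are hypothetical and do not come from the paper.

```python
from statistics import fmean, variance


def fisher_score(class_a, class_b):
    """Univariate Fisher ratio: between-class separation over within-class spread."""
    separation = (fmean(class_a) - fmean(class_b)) ** 2
    spread = variance(class_a) + variance(class_b)
    return separation / spread


def rank_features(before, after):
    """Rank features by how strongly they separate two tool conditions.

    before, after : dicts mapping a feature name to its observed values
                    in each condition (e.g. before and after a cleaning).
    """
    scores = {name: fisher_score(before[name], after[name]) for name in before}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature values for two tool conditions.
before_clean = {"settling_time": [1.0, 2.0, 3.0], "overshoot": [1.0, 2.0, 3.0]}
after_clean = {"settling_time": [7.0, 8.0, 9.0], "overshoot": [2.0, 3.0, 4.0]}
ranking = rank_features(before_clean, after_clean)
```

Here `settling_time` ranks first because its class means differ by far more than its within-class spread, which is the same intuition behind selecting the most sensitive features with a discriminant analysis.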

Section snippets

Description of PECVD tool

PECVD tools are used for depositing thin films onto silicon wafer substrates, which is one of the crucial steps in the manufacturing of microelectronic circuits and solar cells. PECVD is the most common method for producing conductors and dielectrics with the excellent film growth properties necessary for small chip components [28]. Inside a PECVD tool chamber, reactive gases pass over silicon wafers and are adsorbed onto the surfaces to form a thin layer. The gases are excited through radio frequency

Feature extraction methodology

Extraction of signal features that are the most descriptive of machine performance is one of the key elements of CBM [7]. Useful information can be gathered from extracting dynamic features by utilizing them in prediction models for capturing the features’ time-series performance. Predictions of system degradation, remaining useful life, and sudden developing failures are all examples of critical information that are found through dynamic features directly extracted from raw data. In this

Description of dataset

The data set used in this study was gathered from a standard 300 mm PECVD tool operating in the facilities of a major domestic manufacturer of integrated circuits. It was obtained from a single tool continuously depositing recipes of Si–O2 films with various thicknesses (TEOS is used as the main reactant). In this study, we focused on roughly 8 months of production data and on the recipe (film thickness) that significantly dominated the operations (about 80% of operation). All the signals in

Discussion

In summary, we created statistical models of the four faults that occurred in our dataset using PDFs of the sensor features that were found by LDA to be the most sensitive to those faults. As described in Section 3.4 and illustrated in Fig. 4, the tool can be monitored by comparing the current behavior of the sensor features with PDF models of each of the four faults we observed until now. When a similar fault occurs again (which unfortunately did not happen in the dataset used in this study),

Conclusions and future work

This paper presents an integrated feature extraction, equipment monitoring, and fault modeling approach applied to in-place sensors from a commonly used industrial PECVD tool. The Linear Discriminant Analysis (LDA) and Gaussian Mixture Models (GMMs) of dynamic features extracted from numerous in-place sensors on the tool were used to recognize and track features that show the most prominent statistical changes due to tool degradation and maintenance events. The concept of performance confidence

Acknowledgment

This work was supported in part by the International SEMATECH Manufacturing Initiative (ISMI).

References (34)

  • D. Djurdjanovic et al., Watchdog agent – an infotronics based prognostics approach for product performance assessment and prediction, International Journal of Advanced Engineering Informatics, Special Issue on Intelligent Maintenance Systems (2003)
  • ...
  • S. Fulton et al., ISMI Consensus Preventive and Predictive Maintenance Vision Ver. 1.1 (2007)
  • A. Buchner et al., Data mining in manufacturing environments: goals, techniques and applications, Studies in Informatics and Control (1997)
  • Y. Liu et al., Predictive Modeling and Intelligent Maintenance Tools for High Yield Next Generation Fab (2005)
  • D. Djurdjanovic et al., Survey of Predictive Maintenance Research and Industry Best Practice (2006)
  • M. Lebold et al., Using DCOM in an open system architecture framework for machinery monitoring and diagnostics
  • J.C. Li, Signal processing in manufacturing monitoring, Condition Monitoring and Control for Intelligent Manufacturing (2006)
  • G. Franklin et al., Feedback Control of Dynamic Systems (2006)
  • B. Boashash, Time Frequency Signal Analysis and Processing: A Comprehensive Reference (2003)
  • R. Serway et al., Physics for Scientists and Engineers (2010)
  • J. Tang et al., In-process detection of microscratching during CMP using acoustic emission sensing technology, Journal of Electronic Materials (1998)
  • C. Spanos et al., Real time statistical process control using tool data, IEEE Transactions on Semiconductor Manufacturing (1992)
  • S.M. Pandit et al., Time Series and System Analysis with Applications (1983)
  • D.C. Montgomery, Introduction to Statistical Quality Control (2001)
  • R. Harris, A Primer of Multivariate Statistics (1975)
  • R.J. Bunkofske et al., Real-time process monitoring


Alexander Q. Bleakie was born in Houston, Texas, USA. He received a B.S. degree in Mechanical Engineering from Texas A&M University in 2008 with Summa Cum Laude Honors, and a M.S. degree in Mechanical Engineering from the University of Texas at Austin in 2010. He is currently pursuing a Ph.D. degree in Mechanical Engineering at the University of Texas at Austin with emphasis on Dynamic Systems and Control. His current research interests include dynamic modeling, control theory, fault diagnosis, and estimation.

Dragan Djurdjanovic received the B.S. degree in mechanical engineering and applied mathematics from the University of Nis, Nis, Serbia, in 1997, the M.Eng. degree in mechanical engineering from Nanyang Technological University, Singapore, in 1999, the M.S. degree in electrical engineering (systems), and the Ph.D. degree in mechanical engineering from the University of Michigan, Ann Arbor, in 2002. He is currently an Assistant Professor of mechanical engineering, and operations research and industrial engineering, with the Department of Mechanical Engineering, University of Texas, Austin. He has co-authored 37 published or accepted journal publications, 3 book chapters, and 30 conference publications. His current research interests include advanced quality control in multistage manufacturing systems, intelligent proactive maintenance techniques, and applications of advanced signal processing in biomedical engineering. Dr. Djurdjanovic was the recipient of several prizes and awards, including the 2006 Outstanding Young Manufacturing Engineer Award from the Society of Manufacturing Engineers, the 2005 Teaching Incentive Award from the Department of Mechanical Engineering, University of Michigan, Nomination for the Distinguished Ph.D. Thesis from the Department of Mechanical Engineering, University of Michigan in 2003, and the Outstanding Paper Award at the 2001 SME North American Manufacturing Research Conference.
