Document: Computer-Aided.Design.Engineering.and.Manufacturing P8 pptx

Chang, Shing I. "A Hybrid Neural Fuzzy System for Statistical Process Control." Computational Intelligence in Manufacturing Handbook. Edited by Jun Wang et al. Boca Raton: CRC Press LLC, 2001.

©2001 CRC Press LLC

18 A Hybrid Neural Fuzzy System for Statistical Process Control

18.1 Statistical Process Control
18.2 Neural Network Control Charts
18.3 A Hybrid Neural Fuzzy Control Chart
18.4 Design, Operations, and Guidelines for Using the Proposed Hybrid Neural Fuzzy Control Chart
18.5 Properties of the Proposed Hybrid Neural Fuzzy Control Chart
18.6 Final Remarks

Abstract

A hybrid neural fuzzy system is proposed to monitor both process mean and variance shifts simultaneously. One of the major components of the proposed system is composed of several feedforward neural networks that are trained off-line via simulation data. Fuzzy sets are also used to provide decision-making capability on uncertain neural network output. The hybrid control chart provides an alternative to traditional statistical process control (SPC) methods. In addition, it is superior in that (1) it outperforms other SPC charts in most situations in terms of faster detection and more accurate diagnosis, and (2) it can be used in automatic production processes with minimal human intervention, a feature the other methods ignore. In this chapter, theoretical base, operations, user guidelines, chart properties, and examples are provided to assist those who seek an automatic SPC strategy.

18.1 Statistical Process Control

Statistical process control (SPC) is one of the most often applied quality improvement tools in today’s manufacturing as well as service industries. Instead of inspecting end products or services, SPC focuses on processes that produce products and services. The philosophy of a successful SPC application is to identify sources of special causes of production variation as soon as possible during production rather than wait until the very end.
Here “production” is defined as either a manufacturing or service activity. SPC provides savings over traditional inspection operations on end products or service because it eliminates accumulations of special causes of variation by monitoring key quality characteristics during production. Imagine how much waste is generated when a production mistake enters a stream of products during mid-day but inspection doesn’t take place until the end of an 8-hour shift. SPC can alleviate this situation by frequently monitoring the production process via product quality characteristics.

Shing I Chang, Kansas State University

A quality characteristic (QC) is a measure of quality on a product or service. Examples of QC are weight of a juice can, length of a cylinder part, the number of errors made during payroll operations, etc. A QC can be mathematically defined as a random variable, which is a function that takes values from a population or distribution. Denote a QC as random variable x. If a population Ω only contains discrete members, that is, $\Omega = \{x_1, x_2, \ldots, x_n\}$, then QC x is a discrete random variable. For example, if x is the number of errors made during payroll operations, then member $x_1$ is the value in January, $x_2$ is the value in February, and so on. In this case, attribute control charts can be used to monitor a QC with discrete distribution. A control chart for fraction nonconforming, also known as a P chart, based on binomial distribution, is the most frequently used chart (Montgomery, 1996). However, in this chapter, we will focus only on a more interesting class of control charts when QC x is a continuous random variable where x can take a value in a continuous range, i.e., $x \in \Omega = \{x \mid L \le x \le U\}$. For example, x is the weight of a juice can with a target weight of 8 oz.
The central limit theorem (CLT) implies that the sample mean of a continuous random variable x is approximately normally distributed, where the sample mean is calculated from n independently sampled observations of x. The approximation improves as n increases. In much of the quality control literature, n is chosen to be 5 to 10, at which point the approximation is considered good enough. Note that CLT does not impose any restriction on the original distribution of x, which provides the foundation for control charts. Since the sample mean of x, $\bar{x}$, is approximately normally distributed, i.e., $N(\mu, \sigma^2/n)$ where µ and σ are the mean and standard deviation of x, respectively, we can collect n observations of a QC, calculate the sample mean, and plot it on a control chart with three lines. If both µ and σ are known, the centerline is µ, with lower control limit $\mu - 3\sigma/\sqrt{n}$ and upper control limit $\mu + 3\sigma/\sqrt{n}$. If CLT holds and the process defined by QC x remains in control, 99.73% of the sample means will fall within the two control limits. On the other hand, if either µ or σ shifts from its target, the probability that sample points plot outside the control limits increases, which indicates an out-of-control condition.

A pair of control charts is often used simultaneously to monitor QC x: one for the mean µ and the other for the standard deviation σ. The goal is to make sure the process characterized by QC x is under statistical control. In other words, SPC charts are used to verify that the distribution of x remains the same over time. Since a probability distribution is usually characterized by two major parameters, µ and σ, SPC charts monitor the distribution through these two parameters. Figure 18.1 (Montgomery, 1996) demonstrates two out-of-control scenarios. At time $t_1$, the mean $\mu_0$ of x starts to shift to $\mu_1$. One of the most often used control charts, the $\bar{X}$ chart, can be used to detect this situation.
On the other hand, at time $t_2$, the mean is on target but the standard deviation has increased from $\sigma_0$ to $\sigma_1$ where $\sigma_1 > \sigma_0$. In this case, a control chart for ranges (R chart) can be used to detect the variation change. Notice that SPC charts are designed to detect assignable causes of variation as indicated by mean or standard deviation shifts and at the same time tolerate the chance variation as shown by the bell-shaped distribution of x. Such a chance variation is inevitable in any production process.

Statistical process control charts have been applied to a wide range of manufacturing and service industries since Shewhart first introduced the concept in the 1920s. There have been several improvements on the traditional control charts since then. Page (1954) first introduced cumulative sum (CUSUM) control charts to enhance the sensitivity of detecting small process shifts. Instead of depending solely on data collected in the most recent sample period, as the traditional Shewhart-type control chart does, the CUSUM chart’s plotting statistic involves all data points previously collected and assigns an equal weight to every point. If a small shift occurs, the CUSUM statistic can accumulate such a deviation in a short period of time and thus increase the sensitivity of an SPC chart. However, CUSUM charts cannot be plotted as easily as the Shewhart-type control charts. Roberts (1959) proposed an exponentially weighted moving average (EWMA) control chart that weighs the most recent observations more heavily than remote data points. EWMA charts were developed to have the structure of the traditional Shewhart charts, yet match the CUSUM charts’ capability of detecting small process shifts.

Most control chart improvements over the years have been focused on detecting process mean shifts, with a few exceptions that are discussed in the following section.
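The three plotting statistics mentioned so far can be compared side by side. The following is a minimal sketch, not the chapter's implementation: the function names, the smoothing constant `lam = 0.2`, and the value σ = 0.1 oz in the juice-can example are assumptions made for illustration. The CUSUM shown is the plain running sum of deviations from target, which is the equal-weight accumulation described above; practical CUSUM schemes add reference values and decision intervals that are not covered here.

```python
import math


def xbar_limits(mu, sigma, n):
    """Return (LCL, centerline, UCL) of a 3-sigma X-bar chart with known mu and sigma."""
    half_width = 3.0 * sigma / math.sqrt(n)
    return mu - half_width, mu, mu + half_width


def cusum(values, target):
    """Running sum of deviations from target; every past point carries equal weight."""
    total, out = 0.0, []
    for v in values:
        total += v - target
        out.append(total)
    return out


def ewma(values, target, lam=0.2):
    """EWMA statistic z_t = lam * x_t + (1 - lam) * z_{t-1}, started at the target.

    lam is an assumed smoothing constant; recent points get geometrically more weight.
    """
    z, out = target, []
    for v in values:
        z = lam * v + (1.0 - lam) * z
        out.append(z)
    return out


# Juice-can example: target 8 oz, samples of n = 5; sigma = 0.1 oz is assumed.
lcl, center, ucl = xbar_limits(8.0, 0.1, 5)
```

A sample mean would be flagged by the Shewhart rule when it falls outside `(lcl, ucl)`, whereas the CUSUM and EWMA sequences carry information from earlier samples forward, which is why they react to small sustained shifts.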
Shewhart R, S, and $S^2$ charts are the first statistical control charts for monitoring process variance changes. Johnson and Leone (1962a, 1962b) and Page (1963) later proposed CUSUM charts based on sample variance and sample range. As an alternative, Crowder and Hamilton (1992) developed an exponential weighted moving average (EWMA) scheme based on the log transformation of the sample variance $\ln(S^2)$. Their experimental results show that the EWMA chart outperforms the Shewhart $S^2$ chart and is comparable to the CUSUM chart for variation proposed by Page (1963). Using the concept of log transformation of sample variance, Chang and Gan (1995) suggest a CUSUM scheme based on $\ln(S^2)$, which performs as well as the corresponding EWMA. Performances of Chang and Gan’s (1995) CUSUM and Crowder and Hamilton’s (1992) EWMA are not significantly better than Page’s (1963) CUSUM; however, their development of design strategies and procedures are relatively easier for practitioners to use.

18.2 Neural Network Control Charts

In recent years, attempts to apply neural networks to process control have been investigated by several researchers. Guo and Dooley (1992) proposed network models that identify positive mean or variance changes using backpropagation training. Their best network performs 40% better on the average error rate than conventional control chart heuristic tests. Pugh (1989, 1991) also successfully trained backpropagation networks for detecting process mean shifts with subgrouping size of five. He found his networks equal in average run length (ARL) performance to a 2σ control chart in both type I and II errors. Hwarng and Hubele (1991, 1993) trained a backpropagation pattern recognition classifier to detect six unnatural control chart patterns: trend, cycle, stratification, systematic, mixture, and sudden shift. Their results were promising in recognizing various special causes in out-of-control situations.
Smith (1994) and Smith and Yazici (1993) described a combined X-bar and R chart backpropagation model to investigate both mean and variance shifts. They found their networks performed 50% better in average error rate when compared to Shewhart control charts. However, the majority of the wrong classification is of type I error. That is, the network signals too many out-of-control false alarms when the process is actually in control. Chang and Aw (1994) proposed a four-layer backpropagation network and a fuzzy inferencing system for detecting process mean shifts. Their network outperforms conventional Shewhart control charts in terms of both type I and type II errors, while Pugh’s and Smith’s charts have larger type I errors than that of the 3σ chart. Further, Chang and Aw’s scheme has the advantage of identifying the magnitude of shifts. None of the Shewhart-type charts, or the other neural network charts, offer this feature. Chang and Ho (1999) further introduced a two-stage neural network approach for detecting and classifying process variance shifts. The performance of the proposed method is comparable to that of the other control charts for detecting variance changes as well as being capable of estimating the magnitude of the variance change, which is not supported by the other control charts.

FIGURE 18.1 In-control and out-of-control scenarios in SPC. (From Montgomery, D.C., 1996, Introduction to Statistical Quality Control, 2nd ed. p. 131. Reproduced with the permission of John Wiley & Sons, Inc.)
[Figure: distributions of the process quality characteristic x plotted against time t between LSL and USL. Panels: only chance causes of variation present, process is in control; assignable causes one, two, and three present at $t_1$, $t_2$, and $t_3$, process is out of control.]
Furthermore, Ho and Chang (1999) integrated both neural network control chart schemes and compared this with many other approaches for monitoring process mean and variance shifts. In this chapter, we will summarize the proposed hybrid neural fuzzy system for monitoring both process mean and variance shifts, provide guidelines and examples for using this system, and list the properties.

18.3 A Hybrid Neural Fuzzy Control Chart

As shown in Figure 18.2 (Ho and Chang, 1999), the proposed hybrid neural fuzzy control chart, called C-NN (C stands for “combined” and NN means “neural network”), is composed of several modules: data input, data processing, decision making, and data summary. The data input module takes observations from QC x and transforms them into appropriate types for both control charts for mean M-NN and for variance V-NN, which are the major components of the data processing module. The decision-making module is responsible for interpreting the neural network outputs from the previous module. There are four distinct possibilities: no process shift, process mean shift only, process variance shift only, and both process mean and variance shifts. Note that two different classifiers (fuzzy and neural network) are adopted for the process mean and variance components, respectively. Finally, the data summary module calculates estimated shift magnitudes according to appropriate diagnosis. Details of each module will be discussed in the following sections.

18.3.1 Data Input Module

The data input module takes samples or observations of QC x in two ways. Sample observations, $x_1, x_2, \ldots, x_n$, in the first input method are independent of each other. In the proposed system, n is chosen as five, that is, each plotting point consists of a sample of five observations. Traditional Shewhart-type control charts normally use this input method. A moving window of five observations is used for the second method to select incoming observations.
For example, the first sample point consists of observations $x_1, x_2, \ldots, x_5$ and the second sample point is composed of $x_2, x_3, \ldots, x_6$, and so on. This method is explored because both CUSUM and EWMA charts for mean shifts are capable of taking individual observations. The proposed moving window method comes close to individual observation in terms of the number of observations used for decision making. Unlike the “true” individual observation input method, the moving window method must wait until the fifth observation completes the first sample point before the proposed chart can be used. After this point, it is on pace with the “true” individual observation input method in that it uses the most recent observation and the four immediately preceding ones. A few observations are kept in each sample point because they are needed to evaluate process variation; an individual observation does not provide such information.

Transformation is also a key component in the data input module. As we will discuss later, both neural networks were trained “off-line” from simulated observations. In order to make the proposed schemes work for various applications, data transformation is necessary to standardize the raw data into the value range that both neural network components can work with. Formulas for data transformation are as follows:

18.3.1.1 Transformation for M-NN Input

$$z_{ti} = \frac{x_{ti} - \bar{x}}{s}, \qquad i = 1, 2, 3, \ldots, 5 \tag{18.1}$$

where i is the index of observations in a sample or window; t is the index for the sample period; and $\bar{x}$ and s are estimates of process mean and standard deviation, respectively. In traditional control charts, it takes 100 to 125 observations, e.g., 25 samples of 4 or 5 observations each, to establish the control limits. However, in this case, 20 to 30 observations can provide reasonably good estimates.
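The two input methods and the standardization of Equation 18.1 can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the estimates `xbar` and `s` are assumed to have been computed beforehand from in-control observations.

```python
def independent_samples(obs, n=5):
    """First input method: consecutive, non-overlapping samples of n observations."""
    return [obs[k:k + n] for k in range(0, len(obs) - n + 1, n)]


def moving_windows(obs, n=5):
    """Second input method: a window of n observations that advances one at a time.

    The first window is only available once the n-th observation has arrived.
    """
    return [obs[k:k + n] for k in range(len(obs) - n + 1)]


def standardize(sample, xbar, s):
    """Equation 18.1: z_ti = (x_ti - xbar) / s for each observation in the sample."""
    return [(x - xbar) / s for x in sample]
```

With seven observations, `moving_windows` yields three sample points while `independent_samples` yields one, which illustrates why the window method keeps pace with individual observations after the initial wait.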
18.3.1.2 Transformation for V-NN Input

Given the data standardization in Equation 18.1, the input for V-NN of variance detection needs to be processed further as

$$I_{ti} = z_{ti} - \bar{z}_t, \qquad i = 1, 2, \ldots, 5 \tag{18.2}$$

where t and i are the same as those defined in Equation 18.1, and $\bar{z}_t$ is the average of the five transformed observations $z_{ti}$ of the sample at time t.

18.3.2 Data Processing Module

The heart and soul of the proposed system is a module composed of two independently developed neural networks: M-NN and V-NN. M-NN, developed by Chang and Aw (1996), is a 5–8–5–1 four-layer neural network for detecting process mean shift. On the other hand, Chang and Ho’s (1999) V-NN is a 5–12–12–1 neural network for detecting process variance shift. Data from transformation formulas (Equations 18.1 and 18.2) are fed into M-NN and V-NN, respectively. Both neural networks have single output nodes. M-NN’s output values range from –1 to +1. A negative output value indicates a decrease in the process mean, while a positive M-NN output value indicates a potential increase in the process mean. On the other hand, V-NN’s output ranges from 0 to 1 with larger values meaning larger shifts.

FIGURE 18.2 A schematic diagram of C-NN (combined neural network) control chart. (Adapted from Ho and Chang, 1999, Figure 3, p. 1891.)
[Diagram labels: sample observations, individual observations, transformation, M-NN, V-NN, cutoff value(s), fuzzy classifier, neural classifier, mean shift, variance shift, mean/variance shift, shift magnitude.]

Note that both neural networks were trained off-line using simulations. By incorporating the trained weight matrices, one can start using the proposed method. The only setup required is to estimate both process mean and variance for transformation. The central limit theorem guarantees that transformed data is similar to the simulated data used for training.
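The setup step described here (estimate the process mean and standard deviation, then transform each window) can be sketched as below. The sketch assumes that the V-NN input is the signed deviation of each standardized observation from the window average, $I_{ti} = z_{ti} - \bar{z}_t$; the function name `prepare_inputs` and the use of a separate `baseline` list of in-control observations are hypothetical choices made for illustration.

```python
import statistics


def prepare_inputs(window, baseline):
    """Return (M-NN input, V-NN input) for one window of observations.

    baseline: in-control observations used to estimate the process mean and
    standard deviation (the text suggests 20 to 30 observations can suffice).
    """
    xbar = statistics.mean(baseline)       # estimate of the process mean
    s = statistics.stdev(baseline)         # sample standard deviation
    z = [(x - xbar) / s for x in window]   # Equation 18.1
    zbar = sum(z) / len(z)                 # window average of the z values
    v = [zi - zbar for zi in z]            # Equation 18.2 (signed form assumed)
    return z, v
```

Under this assumed form, the V-NN inputs of a window always sum to zero, so a pure mean shift leaves them unchanged and only the spread of the window is passed on to V-NN.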
Thus the proposed method can be applied to many applications with various data types as long as they can be defined as QC x. Before M-NN and V-NN are introduced in detail, we first summarize calculation and training of any feedforward, multiple-layer neural networks as follows.

18.3.2.1 Computing in a Neural Network

The most commonly implemented neural network is the multilayer backpropagation network, which adapts weights according to the steepest gradient descent rule along a nonlinear transformation function. The reason for this popularity is due to the versatility of its paradigm in solving diverse problems, and its strong mathematical foundation. An example of a multilayer neural network is shown in Figure 18.3. In neural networks, information propagates from input nodes (or neurons) through the system’s weight connections in the middle layers (or hidden layers) of nodes, finally passing out the last layer of nodes, the output nodes. Each node, for example node j in the hidden and output layers, contains the input links with weights $w_{ij}$, an activation function (or transfer function) f, and output links to other nodes as shown in Figure 18.4. Assuming k input links are connected to node j, the output $V_j$ of node j is processed by the activation function

$$V_j = f(I_j), \qquad I_j = \sum_{i=1}^{k} w_{ij} V_{pi} \tag{18.3}$$

where $V_{pi}$ is the output of node i from the previous layer. Many activation functions, e.g., sigmoidal and hyperbolic-tangent functions, are available. We choose to use the sigmoidal function

$$f(I) = \frac{1}{1 + e^{-cI}} \tag{18.4}$$

where c is a coefficient that adjusts the abruptness of the function.

FIGURE 18.3 An example of a multilayer neural network.
[Figure labels: input layer, hidden layers, output layer.]

18.3.2.2 Training of a Neural Network

Backpropagation training is the most popular supervised neural network training algorithm. The training is designed to modify the thresholds and weights so that the overall error will be minimized.
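Before turning to the update equations, the forward computation of a node, $V_j = f(I_j)$ with $I_j = \sum_i w_{ij} V_{pi}$ and the sigmoidal function $f(I) = 1/(1 + e^{-cI})$, can be sketched as follows. The function names and the list-of-rows weight layout are assumptions made for illustration; thresholds (bias terms) are omitted for brevity.

```python
import math


def sigmoid(I, c=1.0):
    """Equation 18.4: f(I) = 1 / (1 + exp(-c * I)); c controls the abruptness."""
    return 1.0 / (1.0 + math.exp(-c * I))


def node_output(weights, prev_outputs, c=1.0):
    """Equation 18.3: weighted sum of the previous layer's outputs, then activation."""
    I = sum(w * v for w, v in zip(weights, prev_outputs))
    return sigmoid(I, c)


def layer_forward(weight_rows, prev_outputs, c=1.0):
    """Outputs of one layer; weight_rows[j] holds the input weights of node j."""
    return [node_output(row, prev_outputs, c) for row in weight_rows]
```

Chaining `layer_forward` from the input layer to the output layer gives the network output for one input vector.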
At each iteration, we first calculate error signals $\delta_o$, $o = 1, 2, \ldots, n_o$, for the output layer nodes as follows:

$$\delta_o = f'(I_o)\,(t_o - f(I_o)) = 0.5\,(1 + V_o)(1 - V_o)(t_o - V_o) \tag{18.5}$$

where $f'(I)$ is the first-order derivative of the activation function $f(I)$; $t_o$ is the desired target value; and $V_o$ is the actual output for output node o. We then update the weights connected between hidden layer nodes and output layer nodes:

$$w_{ho}(\text{new}) = w_{ho}(\text{old}) + \eta\,\delta_o V_h + \alpha\,[\Delta w_{ho}(\text{old})] \tag{18.6}$$

where η is a constant chosen by users for adjusting training rate; α is a momentum factor; $\delta_o$ is obtained from Equation 18.5; $V_h$ is the output of node h in the last hidden layer; and $\Delta w_{ho}$ is the previous weight change between node h and output node o. Subsequent steps include computing the error signals for a hidden layer(s) and propagating the errors backward toward the input layer. The error signals for node h in the current hidden layer are

$$\delta_h = f'(I_h) \sum_{i=1}^{n'} w_{ih}\,\delta'_i = 0.5\,(1 + V_h)(1 - V_h) \sum_{i=1}^{n'} w_{ih}\,\delta'_i \tag{18.7}$$

where $V_h$ is the output for node h in the current hidden layer under consideration; $w_{ih}$ is the weight coefficient between node h in the current hidden layer and node i in the next hidden layer; $\delta'_i$ is the error signal for node i in the next hidden layer; and $n'$ is the number of nodes in the next hidden layer. Given the error signals from Equation 18.7, the weight coefficient $w_{jh}$ between node j in the lower hidden layer and node h in the current hidden layer can be updated as follows:

$$w_{jh}(\text{new}) = w_{jh}(\text{old}) + \eta\,\delta_h V_j + \alpha\,[\Delta w_{jh}(\text{old})] \tag{18.8}$$

where the constants η and α are defined in Equation 18.6 and $V_j$ is the actual output from node j in the lower hidden layer.

FIGURE 18.4 Node j and its input–output values in a multilayer neural network.
[Figure labels: inputs $V_{p1}, V_{p2}, \ldots, V_{pi}, \ldots, V_{pk}$ with weights $W_{1j}, W_{2j}, \ldots, W_{ij}, \ldots, W_{kj}$; node j at the current layer with transfer function f; output $V_j$ to nodes in the next layer.]
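One update of this kind can be sketched for a small network with a single hidden layer and one output node. The sketch is illustrative rather than the authors' code: the network size, the function names, and the values `eta = 0.1` and `alpha = 0.5` are assumptions. Because the error signals carry the factor 0.5(1 + V)(1 − V), the sketch uses a bipolar sigmoid with outputs in (−1, 1), whose derivative has exactly that form; this choice of activation is also an assumption.

```python
import math


def f(I):
    """Bipolar sigmoid with outputs in (-1, 1); its derivative is 0.5*(1+V)*(1-V)."""
    return 2.0 / (1.0 + math.exp(-I)) - 1.0


def train_step(x, target, w_in, w_out, dw_in, dw_out, eta=0.1, alpha=0.5):
    """One backpropagation update for a network with one hidden layer and one output.

    w_in[h][j]: weight from input j to hidden node h; w_out[h]: hidden h to output.
    dw_in and dw_out hold the previous weight changes used by the momentum term.
    Weights are modified in place; the output computed before the update is returned.
    """
    # Forward pass through the hidden layer and the output node.
    v_hid = [f(sum(w * xj for w, xj in zip(row, x))) for row in w_in]
    v_out = f(sum(w * v for w, v in zip(w_out, v_hid)))

    # Output error signal (form of Equation 18.5).
    delta_o = 0.5 * (1 + v_out) * (1 - v_out) * (target - v_out)
    # Hidden error signals (form of Equation 18.7), using the weights before they change.
    delta_h = [0.5 * (1 + v) * (1 - v) * w * delta_o for v, w in zip(v_hid, w_out)]

    # Hidden-to-output update with momentum (form of Equation 18.6).
    for h, v in enumerate(v_hid):
        dw_out[h] = eta * delta_o * v + alpha * dw_out[h]
        w_out[h] += dw_out[h]
    # Input-to-hidden update with momentum (form of Equation 18.8).
    for h, row in enumerate(w_in):
        for j, xj in enumerate(x):
            dw_in[h][j] = eta * delta_h[h] * xj + alpha * dw_in[h][j]
            row[j] += dw_in[h][j]
    return v_out
```

Calling `train_step` repeatedly on entries drawn at random from a training set corresponds to one pass through the select, feed, compute, and adjust cycle of backpropagation.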
In summary, the procedure of backpropagation training is as follows:

Step 1. Initialize the weight coefficients.
Step 2. Randomly select a data entry from the training data set.
Step 3. Feed the input data of the data entry into the network under training.
Step 4. Calculate the network outputs.
Step 5. Calculate the error signals between the network outputs and desired targets using Equation 18.5.
Step 6. Adjust the weight coefficients between the output layer and closest hidden layer using Equation 18.6.
Step 7. Propagate the error signals and weight coefficients backward using Equations 18.7 and 18.8.
Step 8. Repeat steps 2 to 7 for each entry in the training set until the network error term drops to an acceptable level.

Note that calculations in steps 2 to 4 are done from the input layer toward the output layer, while the error signals and weight updates in steps 5 to 7 are calculated in a backward manner. The term “backpropagation” comes from the way the network weight coefficients are updated.

18.3.2.3 Computing and Training of M-NN

The first neural network is a backpropagation type trained by Chang and Aw (1996). It is a 5–8–5–1 four-layer network, i.e., five input nodes, two hidden layers with eight and five neurons, respectively, and one output node. This network has a unique feature in that the input layer is connected to all nodes in the other three layers, as shown in Figure 18.5. They trained M-NN by using 900 samples, each with five observations, simulated from $N(\mu_0 \pm \delta\sigma_0, \sigma_0^2)$ where $\mu_0 = 0$, $\sigma_0 = 1$, and δ = 0, ±1, ±2, ±3, and ±4. These observations were fed directly to the network, which was trained by a standard backpropagation algorithm to achieve a desired output between –1 and 1. The network was originally developed to detect both positive and negative mean shifts. Since we will analyze positive shifts only, our interest here is in positive output values between 0 and 1.
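A training set of the kind described for M-NN can be simulated as in the sketch below. Several details are assumptions rather than facts from the chapter: the even split of 100 samples per shift value (9 shift values × 100 = 900), the fixed random seed, and the function name. The desired target values assigned to each shift are not specified here, so the sketch only records the shift δ alongside each sample.

```python
import random


def simulate_mean_shift_samples(shifts=(-4, -3, -2, -1, 0, 1, 2, 3, 4),
                                per_shift=100, n=5,
                                mu0=0.0, sigma0=1.0, seed=0):
    """Draw samples of n observations from N(mu0 + delta*sigma0, sigma0^2).

    Returns a list of (delta, sample) pairs. per_shift=100 is an assumed split
    that reproduces the total of 900 samples.
    """
    rng = random.Random(seed)
    data = []
    for delta in shifts:
        for _ in range(per_shift):
            sample = [rng.gauss(mu0 + delta * sigma0, sigma0) for _ in range(n)]
            data.append((delta, sample))
    return data
```

Because $\mu_0 = 0$ and $\sigma_0 = 1$, these simulated observations are already on the standardized scale that Equation 18.1 produces for real process data, which is what allows a network trained off-line to be reused across applications.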
A value close to zero indicates that the process is in control, while an output value that exceeds a critical cutoff point triggers an out-of-control signal. The larger the output value, the larger the process mean shift.

18.3.2.4 Computing and Training of V-NN

Chang and Ho (1999) trained a neural network to detect process variation shifts, henceforth called V-NN. V-NN is a standard backpropagation network with a 5–12–12–1 structure. The numbers of input and output nodes were kept the same as in M-NN so that parallel use of both charts is possible. In training V-NN, 600 exemplar samples were taken from simulated distributions $N(\mu_0, (\rho\sigma_0)^2)$ where $\mu_0 = 0$, $\sigma_0 = 1$, and ρ = 1, 2, 3, 4, 5. They were then transformed into input values for the neural network by using Equation 18.2. The desired output, which represents different shift magnitudes, has values between 0 and 1. The network was trained by a standard backpropagation algorithm with adaptive learning rates. A V-NN output value close to 0 means the process variation is likely to be in control, while larger values indicate that the process variation has increased. The larger the V-NN output, the larger the magnitude of increase.

18.3.3 Decision-Making Module

The decision-making module is responsible for interpreting neural network outputs from both M-NN and V-NN. The fuzzy set theory is applied to justify human solutions to these problems. Before the decision rules for evaluating both M-NN and V-NN are given, fuzzy sets and fuzzy computing related to this module are briefly reviewed in the following sections.

18.3.3.1 Fuzzy Sets and Fuzzy Computing

Zadeh (1965) emphasized that applications based on fuzzy logic start with a human solution, which is distinctly different from a neural network solution. Motivated by solving complex systems, Zadeh observed that system models based on first principles, such as physics, are not always able to solve the problem.
Any attempt to enhance details of modeling of complex systems often leads to more uncertainties. On the other hand, a human being is able to offer a solution for such a system from his or her experiences. The fact is that human beings can handle uncertainties much better than a system model can.

18.3.3.1.1 Fuzzy Sets and Fuzzy Variables

Zadeh (1965) first introduced the concept of the fuzzy set. A member in a fuzzy set or subset has a membership value in the interval [0, 1] that describes the degree to which this member belongs to the fuzzy set. Let U be a collection of objects denoted generically by {x}, which could be discrete or continuous. U is called the universe of discourse and u represents a member of U (Yager and Filev, 1994). A fuzzy set F in a universe of discourse U is characterized by a membership function $\mu_F$ which takes values in the interval [0, 1], namely,

$$\mu_F : U \to [0, 1] \tag{18.9}$$

The fuzzy set F can then be represented as $F = \{(x, \mu_F(x)),\ x \in U\}$. An ordinary set may be viewed as a special case of the fuzzy set whose membership function only takes two values, 0 or 1. An example of probability modeling is throwing a die. Assuming a fair die, outcomes can be modeled as a precise set A = {1, 2, 3, 4, 5, 6} with probability 1/6 for the occurrence of each member in the set. To model this same event in fuzzy sets, we need six fuzzy subsets ONE, TWO, THREE, FOUR, FIVE, and SIX that contain the outcomes of the die throw. In this case, the universe of discourse U is the same as set A and six membership functions $\mu_1, \mu_2, \ldots, \mu_6$ are for the members in fuzzy

FIGURE 18.5 A proposed two-sided mean shift detection neural network model. (Adapted from Chang and Aw, 1996, Figure 1, p. 2266.)
[Figure labels: inputs Obs 1 to Obs 5; input layer, hidden layer 1, hidden layer 2, output layer; output.]
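The definition of a fuzzy set as a membership function $\mu_F : U \to [0, 1]$, with an ordinary set as the special case where the function only takes the values 0 and 1, can be illustrated with a short sketch over the die universe U = {1, ..., 6}. The helper name `as_fuzzy_set` and the two example subsets ("even" and "high") are hypothetical and do not come from the chapter.

```python
def as_fuzzy_set(universe, mu):
    """Represent F = {(x, mu_F(x)), x in U} as a dict, checking that mu maps into [0, 1]."""
    fuzzy = {}
    for x in universe:
        grade = mu(x)
        if not 0.0 <= grade <= 1.0:
            raise ValueError(f"membership value {grade} for {x} is outside [0, 1]")
        fuzzy[x] = grade
    return fuzzy


U = [1, 2, 3, 4, 5, 6]

# Ordinary (crisp) set: the membership function only takes the values 0 or 1.
even = as_fuzzy_set(U, lambda x: 1.0 if x % 2 == 0 else 0.0)

# Hypothetical fuzzy subset "high": membership grows linearly with the outcome.
high = as_fuzzy_set(U, lambda x: (x - 1) / 5)
```

Here the outcome 4 belongs fully to `even` but only partially to `high` (membership value 0.6), which is the kind of graded judgment that a crisp set cannot express.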
Posted: 22/12/2013, 21:18
