Approaches for efficient tool condition monitoring based on support vector machine


Appendix A: TCM Graphs of Feature Selection

Figure A-1 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=215 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.277 mm and 0.271 mm.

Figure A-2 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=215 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.273 mm and 0.278 mm.

Figure A-3 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=170 m/min, f=0.3 mm/rev, d=1mm); detected wear: 0.332 mm and 0.312 mm.

Figure A-4 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=200 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.307 mm and 0.306 mm.

Figure A-5 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=170 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.305 mm and 0.300 mm.

Figure A-6 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB760, insert SNMN120408 of material A30, v=200 m/min, f=0.4 mm/rev, d=1mm).

Figure A-7 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB760, insert SNMN120408 of material A30, v=230 m/min, f=0.3 mm/rev, d=1mm).

Figure A-8 AE signals, tool wear and identification results of feature sets Y1 and Y3 (workpiece ASSAB705, insert SNMN120408 of material A30, v=230 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.387 mm and 0.306 mm.

Figure A-9 Force signals, tool wear and identification results of feature sets Z1 and Z2 (workpiece Ti-6Al-4V, insert SNMG120408 of material AC3000, v=60 m/min, f=0.2 mm/rev, d=0.75mm); Z1: feature extraction time 0.34 s, decision time 0.11 s; Z2: feature extraction time 0.22 s, decision time 0.09 s.

Figure A-10 Force signals, tool wear and identification results of feature sets Z1 and Z2 (workpiece Ti-6Al-4V, insert SNMG120408 of material AC3000, v=80 m/min, f=0.15 mm/rev, d=0.75mm); Z1: feature extraction time 0.32 s, decision time 0.10 s; Z2: feature extraction time 0.21 s, decision time 0.08 s.

Appendix B: TCM Graphs with Manufacturing Loss Consideration

Figure B-1 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMN120408 of material A30, v=170 m/min, f=0.2 mm/rev, d=1mm).

Figure B-2 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB760, insert SNMN120408 of material A30, v=200 m/min, f=0.4 mm/rev, d=1mm); detected wear: 0.42 mm and 0.267 mm.

Figure B-3 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB760, insert SNMN120408 of material A30, v=230 m/min, f=0.3 mm/rev, d=1mm); detected wear: 0.264 mm and 0.266 mm.

Figure B-4 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB760, insert SNMN120408 of material A30, v=200 m/min, f=0.2 mm/rev, d=2mm); detected wear: 0.208 mm and 0.206 mm.

Figure B-5 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMN120408 of material A30, v=170 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.312 mm and 0.307 mm.

Figure B-6 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMN120408 of material A30, v=170 m/min, f=0.3 mm/rev, d=1mm); detected wear: 0.283 mm and 0.289 mm.

Figure B-7 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMN120408 of material A30, v=200 m/min, f=0.2 mm/rev, d=1mm); detected wear: 0.302 mm and 0.286 mm.

Figure B-8 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=200 m/min, f=0.3 mm/rev, d=1mm).

Figure B-9 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=200 m/min, f=0.3 mm/rev, d=1mm).

Figure B-10 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=170 m/min, f=0.3 mm/rev, d=1mm).

Figure B-11 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=150 m/min, f=0.4 mm/rev, d=1mm).

Figure B-12 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=150 m/min, f=0.4 mm/rev, d=1mm).
Figure B-13 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=220 m/min, f=0.3 mm/rev, d=1mm).

Figure B-14 AE, tool wear and tool state prediction from the standard and revised SVM (workpiece ASSAB705, insert SNMG120408 of material AC3000, v=150 m/min, f=0.4 mm/rev, d=1mm).

Appendix D: TCM Graphs in Titanium Machining

Figure D-7 Cutting force, tool wear and tool state prediction from the standard and revised SVM (v=80 m/min, f=0.2 mm/rev, d=0.75mm).

Appendix E: SVM Theory in Classification Task

E.1 Basic theory of SVM

In the past few years there has been growing interest in the SVM, an optimization-based method for predicting the output of unseen data. In an SVM, the support vectors are those elements of the dataset that are specifically chosen for the classification task because they are important in separating the two classes from each other; the remaining input vectors are ignored because they play no role in constructing the optimal hyperplane. By applying the structural risk minimization principle, the SVM achieves good generalization performance on practical problems. The SVM was initially developed by Vapnik (1995) for classification with separable data; it was later extended to handle nonseparable data and adapted to regression problems. As a supervised method, the SVM can take advantage of prior knowledge of tool wear and construct a hyperplane as the decision surface so that the margin of separation between samples of different tool states is maximized. To explain the basic idea behind the SVM, we start with the simplest case: a linear machine trained on separable data.

E.1.1 Linear SVM

In the classification problem, a hyperplane is a linear function capable of separating the training data without error. The minimal distance from the hyperplane to the closest data point is called the margin, and a separating hyperplane with the maximum margin is called the optimal hyperplane. Intuitively, a larger margin corresponds to better generalization.

Consider a binary classification task with l training vectors x_i ∈ R^d (i = 1, ..., l) and corresponding class labels y_i = ±1. Suppose the training data can be separated by the hyperplane decision function (E.1) with appropriate coefficients w and b, where x is an input vector:

    g(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} - b    (E.1)

Finding an optimal hyperplane means finding the w and b that maximize the margin. Using the method of Lagrange multipliers, this quadratic optimization problem with linear constraints can be stated formally as follows. Given the training samples (x_i, y_i), i = 1, ..., l, find the multipliers α_i that maximize the objective function

    L_D(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i \cdot \mathbf{x}_j)    (E.2)

subject to the constraints

    \sum_{i=1}^{l} y_i \alpha_i = 0    (E.3)

    \alpha_i \ge 0, \quad i = 1, \ldots, l    (E.4)

This leads to a hyperplane of the form

    g(\mathbf{x}) = \sum_{i=1}^{l} \alpha_i^0 y_i (\mathbf{x} \cdot \mathbf{x}_i) - b_0    (E.5)

    \mathbf{w} = \sum_{i=1}^{l} \alpha_i y_i \mathbf{x}_i    (E.6)

where the α_i^0 are the Lagrange multipliers at the solution of the optimization problem (E.2)-(E.4) and the parameter b_0 at the solution is computed from the conditions on the support vectors.
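To make (E.2)-(E.6) concrete, the sketch below trains a linear SVM on toy data and recovers w and b from the dual coefficients, i.e. w = Σ α_i y_i x_i over the support vectors. The use of scikit-learn and the toy data are illustrative assumptions only; the thesis does not prescribe a particular implementation.

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data with labels y_i = ±1 (assumed for illustration).
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A linear SVM solves the dual problem (E.2)-(E.4) internally.
clf = SVC(kernel="linear", C=1e6)  # very large C approximates the hard-margin case
clf.fit(X, y)

# dual_coef_ holds alpha_i * y_i for the support vectors, so w = sum alpha_i y_i x_i (E.6).
w = (clf.dual_coef_ @ clf.support_vectors_).ravel()
b = -float(clf.intercept_[0])      # sklearn uses w.x + intercept; here g(x) = w.x - b (E.1)

print("support vectors:\n", clf.support_vectors_)
print("w =", w, " b =", b)
print("g(x) on the training points:", X @ w - b)  # signs match the labels y
```

For the hard-margin case described here, any point with α_i^0 > 0 is a support vector, and b_0 can equivalently be read off from w·x_i − y_i on those points, which is what "the conditions on the support vectors" refers to.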
E.1.2 Nonlinear SVM

Because the expressive power of linear functions is limited, extending the linear SVM to the nonlinear case is more practical for complex estimation problems. The idea is to map the original input space X into a high-dimensional feature space H through a function ϕ(x) and then to construct a linear function in that feature space, which corresponds to a nonlinear function in the original input space. Both the decision function g(x) and the optimization problem keep the same form, except that the input vector x is replaced by ϕ(x); in other words, the inner product (x_i · x_j) is replaced by ϕ(x_i) · ϕ(x_j). However, the computational cost grows with the dimension of the feature space. How can the linear function defined in the high-dimensional feature space be computed efficiently? The problem is solved with a kernel function K(x_i, x_j) = ϕ(x_i) · ϕ(x_j): every inner product ϕ(x_i) · ϕ(x_j) is replaced by K(x_i, x_j). The mapping ϕ(x) is never evaluated explicitly, so the cost does not grow with the dimension of ϕ(x). The use of kernel functions therefore makes it possible to map the data implicitly into a high-dimensional feature space. Any function satisfying Mercer's condition (Mercer, 1909) can be used as a kernel. The most commonly used kernel is the Gaussian kernel

    K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{\delta^2}\right)    (E.7)

With kernel functions, SVMs can construct many types of nonlinear functions efficiently and effectively.

E.1.3 Nonlinear SVM with soft decision boundary

For training data that cannot be separated by a hyperplane without error, it is desirable to separate the data with a minimal number of errors, as in Figure E.1. Positive slack variables ξ_i are introduced into the constraints to quantify the nonseparable data in the defining condition of the hyperplane (Cortes, 1995), so the constraints become

    y_i[(\mathbf{w} \cdot \mathbf{x}_i) - b] \ge 1 - \xi_i, \quad i = 1, \ldots, l    (E.8)

For a training sample x_i, the slack variable ξ_i is its deviation from the margin border of the class y_i. With the method of Lagrange multipliers, finding the optimal hyperplane for the linearly nonseparable case is again a quadratic optimization problem with linear constraints, stated formally as follows.

Figure E.1 Soft margin hyperplane.

Given the training data (x_i, y_i), i = 1, ..., l, find the multipliers α_i that maximize the objective function

    L_D(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)    (E.9)

subject to 0 ≤ α_i ≤ C and Σ_i y_i α_i = 0, where the hyperplane parameters w and b must satisfy

    y_i(\mathbf{w} \cdot \varphi(\mathbf{x}_i) - b) \ge 1 - \xi_i    (E.10)

    \xi_i \ge 0, \quad i = 1, \ldots, l    (E.11)

Here C is a user-specified parameter; a larger C assigns a higher penalty to training errors. The weight vector can be expressed as

    \mathbf{w} = \sum_{i=1}^{l} \alpha_i d_i \varphi(\mathbf{x}_i)    (E.12)

where d_i denotes the class label of x_i.
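The Gaussian kernel of (E.7) and the penalty parameter C can be exercised directly. The sketch below builds a kernel matrix with NumPy and feeds it to a soft-margin SVM as a precomputed kernel; the δ and C values and the use of scikit-learn are illustrative assumptions, not the settings used in the thesis.

```python
import numpy as np
from sklearn.svm import SVC

def gaussian_kernel(A, B, delta):
    """Kernel matrix K[i, j] = exp(-||a_i - b_j||^2 / delta^2), as in (E.7)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / delta**2)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))                 # stand-in feature vectors (not TCM data)
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # a nonlinearly separable labelling

delta, C = 1.0, 10.0                         # assumed values
K_train = gaussian_kernel(X, X, delta)

clf = SVC(kernel="precomputed", C=C)         # soft-margin SVM on the precomputed Gram matrix
clf.fit(K_train, y)

X_new = rng.normal(size=(5, 3))
K_new = gaussian_kernel(X_new, X, delta)     # rows: test points, columns: training points
print("predictions:", clf.predict(K_new))
print("support vectors per class:", clf.n_support_)
```

In practice δ and C are the two hyperparameters tuned by the cross-validation procedure of Section E.4.2.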
E.2 Network structure and training method

Figure E.2 shows the network structure of an SVM, which consists of an input layer, a hidden layer and an output layer (Haykin, 1999).

Figure E.2 SVM configuration.

The SVM training algorithm used here is sequential minimal optimization (SMO), which is widely used because of its fast learning speed even on large problems. Because it optimizes and updates only two Lagrange multipliers at each step, each subproblem can be solved easily and quickly. The improved SMO learning algorithm (Shevade et al., 2000) is summarized below. Define the following index sets at a given α:

    I_0 = \{i : 0 < \alpha_i < C\}; \quad I_1 = \{i : y_i = 1, \alpha_i = 0\}; \quad I_2 = \{i : y_i = -1, \alpha_i = C\}; \quad I_3 = \{i : y_i = 1, \alpha_i = C\}; \quad I_4 = \{i : y_i = -1, \alpha_i = 0\}    (E.13)

Define F_i = \sum_{j=1}^{N} \alpha_j y_j K(\mathbf{x}_i, \mathbf{x}_j) - y_i, and

    F_{i\_up} = b_{up} = \min\{F_i : i \in I_0 \cup I_1 \cup I_2\}; \quad F_{i\_low} = b_{low} = \max\{F_i : i \in I_0 \cup I_3 \cup I_4\}    (E.14)

The optimality conditions then hold at a given α if and only if

    b_{low} \le b_{up} + 2\tau    (E.15)

where τ is a positive tolerance parameter. Correspondingly, a violation exists at α if one of the following sets of conditions holds:

    i \in I_0 \cup I_3 \cup I_4, \; j \in I_0 \cup I_1 \cup I_2 \;\text{ and }\; F_i > F_j + 2\tau    (E.16)

    i \in I_0 \cup I_1 \cup I_2, \; j \in I_0 \cup I_3 \cup I_4 \;\text{ and }\; F_i < F_j - 2\tau    (E.17)

The algorithm always works on the worst violating pair, choosing i_2 = i_low and i_1 = i_up. If the target y_1 does not equal the target y_2, the following bounds apply to α_2:

    L = \max(0, \alpha_2 - \alpha_1), \quad H = \min(C, C + \alpha_2 - \alpha_1)    (E.18)

If y_1 equals y_2, the bounds are

    L = \max(0, \alpha_2 + \alpha_1 - C), \quad H = \min(C, \alpha_2 + \alpha_1)    (E.19)

The second derivative of the objective function along the diagonal line is

    \eta = K(\mathbf{x}_1, \mathbf{x}_1) + K(\mathbf{x}_2, \mathbf{x}_2) - 2K(\mathbf{x}_1, \mathbf{x}_2)    (E.20)

If η is positive, \alpha_2^{new} = \alpha_2 + y_2 (F_1 - F_2)/\eta, which is then clipped to

    \alpha_2^{new,clipped} = \begin{cases} H, & \alpha_2^{new} \ge H \\ \alpha_2^{new}, & L < \alpha_2^{new} < H \\ L, & \alpha_2^{new} \le L \end{cases}    (E.21)

Otherwise (η ≤ 0), SMO moves α_2 to the end point (L or H) that gives the lower value of the objective function. If |\alpha_2^{new,clipped} - \alpha_2| \ge eps \cdot (\alpha_2^{new,clipped} + \alpha_2 + eps), then

    \alpha_1^{new} = \alpha_1 + y_1 y_2 (\alpha_2 - \alpha_2^{new,clipped})    (E.22)

where the constant eps is a tolerance parameter; otherwise a different α_2 is selected and the whole process is repeated. After a successful step on the pair of indices (i_2, i_1), let \tilde{I} = I_0 \cup \{i_1, i_2\}; (i_low, b_low) and (i_up, b_up) are then computed using \tilde{I} only. The F values of i_1 and i_2 are updated by (E.23) and (E.24) and, together with y_i, determine the new index set of each i:

    F_1^{new} = F_1 + y_1(\alpha_1^{new} - \alpha_1)K(\mathbf{x}_1, \mathbf{x}_1) + y_2(\alpha_2^{new,clipped} - \alpha_2)K(\mathbf{x}_1, \mathbf{x}_2)    (E.23)

    F_2^{new} = F_2 + y_1(\alpha_1^{new} - \alpha_1)K(\mathbf{x}_1, \mathbf{x}_2) + y_2(\alpha_2^{new,clipped} - \alpha_2)K(\mathbf{x}_2, \mathbf{x}_2)    (E.24)

The algorithm first makes one pass over the entire training set and then makes repeated passes over the non-bound examples (0 < α_i < C) until all of them satisfy the optimality condition (E.15). It then alternates between single passes over the entire training set and multiple passes over the non-bound subset until no α_i in the entire training set needs to be changed. The convergence of this algorithm has been proved by Keerthi and Gilbert (2002).
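The two-multiplier update of (E.18)-(E.22) is small enough to write out directly. The sketch below is a minimal, self-contained illustration of a single SMO step under the notation above, with F_i as defined for (E.14); it is not the full Shevade et al. algorithm, which also maintains the index sets and the b_up/b_low caches.

```python
def smo_pair_update(a1, a2, y1, y2, F1, F2, K11, K22, K12, C, eps=1e-3):
    """One SMO step on a chosen pair (alpha_1, alpha_2), following (E.18)-(E.22).

    Returns the updated pair, or None if the step is rejected (eta <= 0 or the
    change in alpha_2 is below the eps tolerance)."""
    # Feasible interval [L, H] for alpha_2, (E.18)/(E.19).
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a2 + a1 - C), min(C, a2 + a1)
    if L >= H:
        return None

    # Second derivative along the constraint line, (E.20).
    eta = K11 + K22 - 2.0 * K12
    if eta <= 0:
        return None  # the full algorithm would evaluate the objective at L and H instead

    # Unconstrained optimum for alpha_2, then clip to [L, H], (E.21).
    a2_new = a2 + y2 * (F1 - F2) / eta
    a2_new = min(max(a2_new, L), H)

    if abs(a2_new - a2) < eps * (a2_new + a2 + eps):
        return None  # change too small to be worth taking

    # alpha_1 moves so that sum(alpha_i * y_i) stays zero, (E.22).
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new


# Tiny usage example with made-up numbers (a linear kernel on the 1-D points 1.0 and 2.0).
print(smo_pair_update(a1=0.0, a2=0.0, y1=1, y2=-1,
                      F1=-1.0, F2=1.0, K11=1.0, K22=4.0, K12=2.0, C=1.0))
```

Updating exactly two multipliers is what keeps each step analytic: the equality constraint Σ α_i y_i = 0 removes one degree of freedom, so the pair moves along a line and the optimum is the clipped value above.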
E.3 Generalization performance and evaluation criteria

Many practical time series contain a large amount of noise. If the number of free parameters in a network is chosen too large, the network captures not only the useful information contained in the data but also the unwanted noise; this is known as over-fitting. Such a network performs well on the training data but generalizes poorly to similar input-output patterns not used in training. To avoid over-fitting, generalization must be evaluated during the learning process. Generalization performance measures how well a network performs on unseen data after the training stage has been completed. The problem of generalization is analogous to curve fitting with regularization: a good fit must represent the underlying trend in the data rather than provide an arbitrarily close fit to the data points without regard to complexity. Generalization performance is evaluated here using the classification error. Three factors mainly influence the ability of a network to generalize: the size and efficiency of the training set, the architecture of the network, and the physical complexity of the problem. Of these three factors, the last is the only one over which we have no control. The other two, namely whether the training samples contain sufficient information for the network to generalize correctly and how to optimize the structure and architecture of the network, have been investigated in this practical application. In addition to the classification error, two further statistical metrics, the training time and the number of support vectors, are therefore used to characterize generalization performance.

The classification error, expressed as a percentage, measures the deviation between the actual and predicted outputs, so a smaller value indicates a better prediction. It can also be used to evaluate the quality of the training set and the performance of the network. In an SVM, the support vectors (SVs) are the training samples considered important in separating the two classes from each other, while the other input vectors are ignored because they do not contribute to constructing the classification hyperplane. For a similar classification error, a smaller number of SVs is preferred. For example, if a classification task can be modelled by two network structures built on SV sets A and B with similar classification error, and set A is smaller than set B, then the network based on set A is considered more effective and more powerful in describing this classification task; since the decision time is closely related to the number of SVs, the network based on set A also has a shorter decision time. Regarding training time, consider another example: under the same SVM network structure, two classification models with similar classification error are developed from training data sets C and D drawn from the same population. If the training time on data set C is smaller than that on D, the classification network based on C learns more quickly than that based on D, so training data set C is considered to offer better learning performance.
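The three criteria discussed above, classification error, number of support vectors, and training/decision time, are straightforward to collect for any trained classifier. The snippet below shows one way to do so with scikit-learn and the standard library timer; the data, kernel and parameter values are placeholders, not those used in the thesis.

```python
import time
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(200, 6)), rng.integers(0, 2, 200) * 2 - 1
X_test,  y_test  = rng.normal(size=(100, 6)), rng.integers(0, 2, 100) * 2 - 1

clf = SVC(kernel="rbf", C=10.0, gamma=0.5)   # placeholder hyperparameters

t0 = time.perf_counter()
clf.fit(X_train, y_train)
training_time = time.perf_counter() - t0

t0 = time.perf_counter()
y_pred = clf.predict(X_test)
decision_time = time.perf_counter() - t0

classification_error = 100.0 * np.mean(y_pred != y_test)   # error in percent
n_support_vectors = int(clf.n_support_.sum())              # fewer SVs -> faster decisions

print(f"classification error: {classification_error:.1f} %")
print(f"support vectors:      {n_support_vectors}")
print(f"training time:        {training_time:.4f} s")
print(f"decision time:        {decision_time:.4f} s")
```

These are the same numbers that Sections E.4.1 and E.4.2 use to compare training data sets and hyperparameter settings.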
E.4 Selecting training data and tuning parameters

E.4.1 Training data selection

When NNs and related methods are used for classification, the free parameters of the network are determined from the training data. The network performance therefore depends heavily on the quality of the training data, assuming an optimal topology for the network is known. In the past, training data were often selected arbitrarily. However, Reeves (1995) found that different training data sets gave substantially different classification errors, and Sollich (1995) observed that a smaller training set may produce equal or better generalization performance than a larger set containing redundant samples. Hence, an effective training data set should be selected, one that not only reduces the size of the training data but also improves the computational performance. Misclassification rate, decision time (i.e., the number of support vectors) and training time are the key factors used to evaluate the quality of a training data set in network learning.

E.4.2 Parameter tuning

Choosing optimal hyperparameter values is an important step in SVM design. This is usually done by minimizing either an estimate of the generalization error or some other related performance measure. In this research, the trade-off between the lowest misclassification rate and the smallest decision time (the smallest number of support vectors) is the key criterion for parameter selection. Several tuning methods have been proposed in recent years. The Xi-Alpha bound developed by Joachims (2000) estimates an upper bound on the error rate of the leave-one-out procedure. The approximate span bound introduced by Vapnik and Chapelle (1999) not only provides a good criterion for SVM hyperparameter selection but also reflects the actual error rate. A Vapnik-Chervonenkis (VC) bound was proposed (Burges, 1998) to approximate the VC dimension in the sum of the empirical risk and the VC confidence by a loose bound. However, none of these yields performance as good as k-fold cross validation (CV), which correlates well with the test error (Duan, 2003). In k-fold CV, the training data are randomly split into k mutually exclusive subsets (the folds) of approximately equal size. The error is obtained by training on k − 1 subsets and testing on the subset left out; this procedure is repeated k times so that each subset is used for testing once, and averaging the test error over the k trials gives an estimate of the expected generalization error. Note that each of the datasets has a large number of test samples, so the performance on the test set, the testing error, can be taken as an accurate reflection of generalization performance. For efficiency, it is useful to have estimates that, though crude, are very inexpensive to compute; in particular, k-fold CV does not require any matrix operations involving the kernel matrix. Although it is still not a fully structured way to select the optimal parameters, it is useful and efficient in practical applications (Haykin, 1999).

Appendix F: Performance Comparison between Standard and Modified SVM

r_SVM and s_SVM denote the revised SVM approach and the standard SVM approach, respectively.

1. Null hypothesis: L_{r_SVM} − L_{s_SVM} = 0. Alternative hypothesis: L_{r_SVM} − L_{s_SVM} < 0.
2. Level of significance: α = 0.1.
3. Criterion: reject the null hypothesis if Z < −1.895, where 1.895 is the critical value of t for 7 degrees of freedom and Z is given by

    Z = \frac{\eta(T_s) - \eta(T_0)}{\sqrt{S_{T_s}^2/n + S_{T_0}^2/n}}

where η and S denote the average value and the standard deviation of the generalization error, and n is the number of degrees of freedom.
4. Calculation: with η(T_s) = 16.9525, η(T_0) = 17.5313, S_{T_s} = 0.99504 and S_{T_0} = 1.39578, Z = (16.9525 − 17.5313)/√(0.99504²/7 + 1.39578²/7) ≈ −0.893.
5. Decision: since Z > −1.895, the null hypothesis cannot be rejected; that is, the data are not sufficient to substantiate the claim that η(T_s) − η(T_0) < 0 (i.e., the generalization error obtained with training set T_s is not significantly smaller than that obtained with T_0).
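As a quick numerical check of the test statistic above, the short script below recomputes Z from the reported summary values. It assumes that 0.99504 and 1.39578 are standard deviations of the generalization error and that n = 7; these assumptions are mine (they reproduce Z ≈ −0.893) and are not stated explicitly in the extract.

```python
import math

def z_statistic(mean_a, mean_b, std_a, std_b, n):
    """Two-sample statistic Z = (mean_a - mean_b) / sqrt(std_a^2/n + std_b^2/n)."""
    return (mean_a - mean_b) / math.sqrt(std_a**2 / n + std_b**2 / n)

# Summary values reported in Appendix F (n = 7 assumed).
z = z_statistic(mean_a=16.9525, mean_b=17.5313, std_a=0.99504, std_b=1.39578, n=7)
critical = -1.895  # one-sided criterion from step 3

print(f"Z = {z:.3f}")
print("reject H0" if z < critical else "cannot reject H0")
```

With these values the criterion of step 3 is not met, which matches the decision stated above.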