Doctoral Dissertation in Computer Science: Linear and Nonlinear Analysis for Transduced Current Curves of Electrochemical Biosensors


Contents

  • CHAPTER I. INTRODUCTION
    • 1.1 Statement of the Problem
    • 1.2 Objective and Approach
      • 1.2.1 Overview
      • 1.2.2 Approaches and Contributions
      • 1.2.3 Data Acquisition
    • 1.3 Organization
  • CHAPTER II. LITERATURE REVIEW
    • 2.1 Linear Models
      • 2.1.1 Overview
      • 2.1.2 Parameter Estimation in the Linear Model
    • 2.2 Feedforward Neural Networks
      • 2.2.1 Neural Networks and Feedforward Operation
      • 2.2.2 Gradient-descent based Learning Algorithms
      • 2.2.3 Practical Techniques for Improving Backpropagation
      • 2.2.4 Theoretical Foundations for Improving Backpropagation
      • 2.2.5 Approximation Capabilities of Feedforward Networks and SLFNs
    • 2.3 Support Vector Machine
  • CHAPTER III. TRAINING ALGORITHMS FOR SINGLE HIDDEN LAYER FEEDFORWARD NEURAL NETWORKS
    • 3.1 Single Hidden Layer Feedforward Neural Networks
    • 3.2 Extreme Learning Machine (ELM)
    • 3.3 Evolutionary Extreme Learning Machine (E-ELM)
    • 3.4 Least-Squares Extreme Learning Machine
      • 3.4.1 Least-Squares Extreme Learning Machine (LS-ELM)
      • 3.4.2 Online Training with LS-ELM
    • 3.5 Regularized Least-Squares Extreme Learning Machine (RLS-ELM)
    • 3.6 Evolutionary Least-Squares Extreme Learning Machine (ELS-ELM)
  • CHAPTER IV. OUTLIER DETECTION AND ELIMINATION
    • 4.1 Distance-based Outlier Detection
    • 4.2 Density-based Local Outlier Detection
    • 4.3 The Chebyshev Outlier Detection
    • 4.4 Area-descent-based Outlier Detection
    • 4.5 Two-stage Area-descent Outlier Detection
    • 4.6 ELM-based Outlier Detection and Elimination
  • CHAPTER V. HEMATOCRIT ESTIMATION FROM TRANSDUCED CURRENT CURVES
    • 5.1 Review of Hematocrit and Previous Measurement Methods
      • 5.1.1 Typical Methods for Measuring Hematocrit
      • 5.1.2 Hematocrit Determination from Impedance
      • 5.1.3 Hematocrit Measurement by Dielectric Spectroscopy
    • 5.2 Hematocrit Estimation from Transduced Current Curve
      • 5.2.1 Transduced Current Curve from Electrochemical Biosensor for Glucose Measurement
      • 5.2.2 Linear Models for Hematocrit Estimation
      • 5.2.3 Neural Network for Hematocrit Estimation
      • 5.2.4 Hematocrit Estimation by Using Support Vector Machine
  • CHAPTER VI. ERROR CORRECTION FOR GLUCOSE BY REDUCING EFFECTS OF HEMATOCRIT
    • 6.1 Effects of Hematocrit on Glucose Measurement
    • 6.2 Error Correction for Glucose Measured by a Handheld Device
    • 6.3 Error Correction for Glucose Computed Using a Single Transduced Current Point
  • CHAPTER VII. DIRECT ESTIMATION FOR GLUCOSE DENSITY
    • 7.1 Effects of Critical Care Variables
    • 7.2 Glucose Estimation from the Transduced Current Curve
  • CHAPTER VIII. EXPERIMENTAL RESULTS
    • 8.1 Experimental Results for Hematocrit Estimation
    • 8.2 Experimental Results for Glucose Correction
      • 8.2.1 Error Correction for Glucose Measured by the Handheld Device
    • 8.3 Experimental Results for Direct Estimation for Glucose from the Transduced Current Curve
  • CHAPTER IX. CONCLUSIONS AND FUTURE WORKS
    • 9.1 Conclusions
    • 9.2 Future Works
      • 9.2.1 Feature Selection
      • 9.2.2 Optical Biosensors
      • 9.2.3 Reducing Effects of Other Factors
      • 9.2.4 Applying Improvements of ELM in Medical Diagnosis


INTRODUCTION

Statement of the Problem

Glucose is a major component coming from carbohydrate foods, and it is used as a main source of energy in the body. The measurement of glucose in the blood plays an important role in diagnosis and treatment, and especially in the effective treatment of diabetes. Typically, there are two types of insulin treatment in diabetic therapy: basal and mealtime. Basal insulin, also called "background" insulin, refers to the continuous secretion of the pancreas; it is the insulin working behind the scenes and is often taken before bed. Mealtime insulin treatment is the injection of additional doses of faster-acting insulin to control the fluctuation of blood glucose levels, which results from different causes, including the metabolization of sugars and carbohydrates. Such fluctuation control requires accurate measurement of the blood glucose levels; failure to achieve it can result in extreme complications such as blindness and loss of circulation in the extremities. In addition, Krinsley [1] reported that even a modest degree of hyperglycemia occurring after intensive care unit admission was associated with a substantial increase in hospital mortality in patients with a wide range of medical and surgical diagnoses. Patients with glucose concentrations of 80 to 99 mg/dL had the lowest hospital mortality (9.6%), and mortality increased up to 27% for patients having glucose concentrations between 100 mg/dL and 119 mg/dL. Further increases in glucose concentration had a deleterious association, with the highest hospital mortality (42.5%) among patients with glucose concentrations exceeding 300 mg/dL.

Support Vector Machine

Here C > 0 is a constant determining the trade-off between the flatness of g and the amount up to which deviations larger than ε are tolerated. The formulation (2.45) should be switched to a Lagrangian formulation for two reasons. The first is that the constraints given in (2.45) are replaced by constraints on the Lagrange multipliers themselves, which are much easier to handle. The second is that the training data will only appear in the form of dot products between vectors after switching to the Lagrangian formulation. This crucial property allows us to generalize the procedure to the nonlinear case. A Lagrangian function constructed from (2.45) has the form:

L_P = (1/2)||w||^2 + C Σ_i (ξ_i + ξ_i*) − Σ_i (η_i ξ_i + η_i* ξ_i*) − Σ_i α_i (ε + ξ_i − t_i + <w, x_i> + b) − Σ_i α_i* (ε + ξ_i* + t_i − <w, x_i> − b),   (2.46)

where α_i, α_i*, η_i, η_i* ≥ 0 are the Lagrange multipliers. The Lagrangian function has to be minimized with respect to w, b, ξ_i and ξ_i*, and maximized with respect to α_i, α_i*, η_i and η_i*. Requiring that the gradient of L_P with respect to w, b and ξ_i^(*) vanish, we have

Σ_i (α_i* − α_i) = 0,   (2.47a)
w − Σ_i (α_i − α_i*) x_i = 0,   (2.47b)
C − α_i^(*) − η_i^(*) = 0.   (2.47c)

Note that α_i^(*) refers to α_i and α_i*, and η_i^(*) refers to η_i and η_i*. Substituting (2.47) into (2.46), the corresponding dual optimization problem is

maximize   −(1/2) Σ_{i,j} (α_i − α_i*)(α_j − α_j*) <x_i, x_j> − ε Σ_i (α_i + α_i*) + Σ_i t_i (α_i − α_i*),   (2.48)
subject to   Σ_i (α_i − α_i*) = 0   and   α_i, α_i* ∈ [0, C].

We can see that the dual variables η_i and η_i* are eliminated through condition (2.47c). From (2.47b), the solution for w is

w = Σ_i (α_i − α_i*) x_i.   (2.49)

Thus, w can be completely described as a linear combination of the training patterns x_i. The complexity of representing a function by SVs is independent of the dimensionality of the input space; it depends only on the number of SVs.

Finally, we can compute b based on the Karush-Kuhn-Tucker (KKT) conditions, which state that, at the point of the solution, the product between dual variables and constraints has to be zero [43, 44]. Thus, we have

(C − α_i) ξ_i = 0,   (C − α_i*) ξ_i* = 0,   (2.50)
α_i (ε + ξ_i − t_i + <w, x_i> + b) = 0,   α_i* (ε + ξ_i* + t_i − <w, x_i> − b) = 0.   (2.51)

This allows us to conclude that:

- Only patterns (x_i, t_i) corresponding to α_i^(*) = C lie outside the ε-insensitive tube around g (ξ_i^(*) ≠ 0).

- There can never be a set of dual variables α_i and α_i* which are both simultaneously nonzero.

Thus, we have

max{ −ε + t_i − <w, x_i> | α_i < C or α_i* > 0 } ≤ b ≤ min{ ε + t_i − <w, x_i> | α_i > 0 or α_i* < C }.

In the case where some α_i^(*) ∈ (0, C), the inequalities become equalities, and b can be computed as follows:

b = t_i − <w, x_i> − ε   for α_i ∈ (0, C),   or   b = t_i − <w, x_i> + ε   for α_i* ∈ (0, C).

The other methods for computing b can be seen in [45] and [46].

Regarding the sparsity of the SV expansion, we can also see from (2.51) that for all patterns inside the ε-insensitive tube, i.e., |g(x_i) − t_i| < ε, the second factor in (2.51) is nonzero, hence α_i^(*) has to be zero so that the KKT conditions are satisfied. These Lagrange multipliers may be nonzero only for patterns outside the ε-insensitive tube, i.e., for which |g(x_i) − t_i| ≥ ε. Thus, from (2.49) we do not need all patterns x_i to describe w, and we have a sparse expansion of w in terms of x_i.

In situations where nonlinear models are required to adequately model the data, we can use a nonlinear mapping to map the data into another feature space where the linear regression is performed. Equivalently, we can use an approach via kernels.

This allows us to rewrite the SV algorithm by replacing the dot products <x_i, x_j> in the dual problem (2.48) with a kernel function K(x_i, x_j).

The solution for w is then given in feature space by w = Σ_i (α_i − α_i*) φ(x_i), so that the regression function becomes g(x) = Σ_i (α_i − α_i*) K(x_i, x) + b.

Some common kernels can be used:

- Polynomial (homogeneous): K(x_i, x) = (<x_i, x>)^q

- Polynomial (inhomogeneous): K(x_i, x) = (<x_i, x> + 1)^q

- Radial basis function: K(x_i, x) = exp(−γ||x_i − x||^2), γ > 0

- Gaussian radial basis function: K(x_i, x) = exp(−||x_i − x||^2 / (2σ^2))

- Sigmoid: K(x_i, x) = tanh(k<x_i, x> + c), for some k > 0 and c < 0
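To make the kernel formulation concrete, the sketch below fits an ε-SVR with an RBF kernel using scikit-learn on synthetic one-dimensional data; the data set and the values of C, ε and γ are illustrative assumptions, not settings taken from this thesis.

```python
import numpy as np
from sklearn.svm import SVR

# toy 1-D regression data (assumed for illustration)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# epsilon-SVR with an RBF kernel; C trades flatness against deviations larger than epsilon
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=0.5)
model.fit(X, y)

print("number of support vectors:", len(model.support_))
print("prediction at x = 2.5:", model.predict([[2.5]])[0])
```

Only the support vectors appear in the fitted expansion, which is exactly the sparsity property discussed above.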

Extreme Learning Machine (ELM)

The ELM approach rests on the following result: given any small positive value ε > 0 and an activation function f in R which is infinitely differentiable in any interval, there exists Ñ ≤ N such that, for N arbitrary distinct patterns {(x_j, t_j) | x_j ∈ R^p, t_j ∈ R^c, j = 1, …, N} and for any w_m and b_m randomly assigned from any intervals of R^p and R, respectively, according to any continuous probability distribution, then with probability one ||HA − T|| < ε.

Evolutionary Extreme Learning Machine (E-ELM)

In E-ELM, the input weights and hidden layer biases are optimized by the differential evolution (DE) process:

(i) Mutation: a mutant vector v_{i,G+1} is generated from randomly chosen individuals of generation G.

(ii) Crossover: the trial vector μ_{i,G+1} is formed so that

μ_{ji,G+1} = v_{ji,G+1} if rand_b(j) ≤ CR or j = rnbr(i), and μ_{ji,G+1} = θ_{ji,G} if rand_b(j) > CR and j ≠ rnbr(i),

where rand_b(j) is the j-th evaluation of a uniform random number generator,

CR is the crossover constant, and rnbr(i) is a randomly chosen index which ensures that at least one parameter of the trial vector is taken from v_{i,G+1}.

(iii) Determine the output weights for each trial vector.

(iv) Evaluate the fitness of each individual.

(v) Selection: the new generation is determined by

θ_{i,G+1} = μ_{i,G+1},  if φ(θ_{i,G}) − φ(μ_{i,G+1}) > ε·φ(θ_{i,G});
θ_{i,G+1} = μ_{i,G+1},  if |φ(θ_{i,G}) − φ(μ_{i,G+1})| < ε·φ(θ_{i,G}) and ||A_{μ_{i,G+1}}|| < ||A_{θ_{i,G}}||;
θ_{i,G+1} = θ_{i,G},  otherwise,

where φ(·) is the fitness function and ε is a predefined tolerance rate. The DE process is repeated until the goal is met or a maximum number of learning epochs is reached. This algorithm can obtain good generalization performance with compact networks.

However, this algorithm is also slow due to the iteration of the DE process, especially for data sets with a very large number of input features. In addition, it does not obtain small input weight values as the original ELM algorithm does.

Least-Squares Extreme Learning Machine

The LS-ELM was proposed in our previous study [57]. The aim of this study was to develop an efficient learning algorithm for SLFNs with a smaller number of hidden units while producing better generalization capability and faster computation. Instead of randomly choosing or iteratively adjusting the input weights and biases as in ELM, OS-ELM or BP, the LS-ELM estimates them analytically. In LS-ELM, determining the weights and biases of SLFNs consists of two stages. In the first stage, the input weights and hidden layer biases are estimated based on a linear model. Then, the output weights are determined by a second linear model.

From (3.4), if we assume that the output weight matrix A is determined, then the hidden layer output matrix H can be estimated as

H = TA†,   (3.10)

where A† is the Moore-Penrose generalized inverse of A. For an invertible activation function f(·), we have

Π = f^-1[TA†],   (3.11)

where f^-1[TA†] is obtained by applying f^-1 entry-wise to TA†, and Π is the matrix of hidden-unit net inputs whose (j, m)-th element is w_m·x_j + b_m.   (3.12)

If we define the matrix B ∈ R^{c×Ñ} by

B = T† f^-1[TA†],   (3.13)

where T† is the pseudo-inverse of T, then (3.11) becomes

Π = TB.   (3.14)

Define the input matrix X ∈ R^{N×(p+1)} whose j-th row is [x_j^T, 1],   (3.15)

and let W ∈ R^{(p+1)×Ñ} be the matrix of input weights and biases whose m-th column is [w_m^T, b_m]^T,   (3.16)

so that Π = XW and the linear system to be solved is XW = TB.   (3.17)

The minimum norm solution for W among all possible solutions is

Ŵ = X†TB,   (3.18)

where X† is the MP generalized inverse of X.

At the beginning of learning, the matrix A is unknown. Therefore, instead of estimating the matrix B by equation (3.13), we can randomly assign values for B and then estimate the matrix of input weights and biases by equation (3.18). After estimating the input weights w_m and the biases b_m (m = 1, 2, …, Ñ) of the hidden units, we can calculate the hidden layer output matrix H and the output weight matrix Â by equation (3.5) and equation (3.9). In summary, our proposed least-squares extreme learning machine (LS-ELM) algorithm for training SLFNs can be described as follows:

Given a training set S = {(x_j, t_j) | j = 1, …, N}, an activation function f(·), and the number of hidden nodes Ñ:

1. Randomly assign values for the matrix B.

2. Estimate the input weights w_m and biases b_m by equation (3.18).

3. Calculate the hidden layer output matrix H by equation (3.5).

4. Determine the output weights by equation (3.9).
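As an illustration of these four steps, the following NumPy sketch trains an SLFN with LS-ELM on synthetic data; the sigmoid activation, the uniform range used for B, and the toy data set are assumptions made for this example only.

```python
import numpy as np

def lselm_train(X, T, n_hidden, rng=None):
    """Minimal LS-ELM sketch: X is (N, p), T is (N, c)."""
    rng = np.random.default_rng() if rng is None else rng
    N, p = X.shape
    c = T.shape[1]
    Xa = np.hstack([X, np.ones((N, 1))])             # rows [x_j^T, 1], shape (N, p+1)
    B = rng.uniform(-1.0, 1.0, size=(c, n_hidden))   # random matrix B in R^{c x Ntilde}
    W = np.linalg.pinv(Xa) @ T @ B                   # minimum-norm solution of X W = T B, cf. (3.18)
    H = 1.0 / (1.0 + np.exp(-(Xa @ W)))              # hidden layer output (sigmoid activation)
    A = np.linalg.pinv(H) @ T                        # output weights by the MP inverse, cf. (3.9)
    return W, A

def lselm_predict(X, W, A):
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    H = 1.0 / (1.0 + np.exp(-(Xa @ W)))
    return H @ A

# toy usage with synthetic regression data
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 5))
T = np.sin(X.sum(axis=1, keepdims=True)) + 0.05 * rng.standard_normal((200, 1))
W, A = lselm_train(X, T, n_hidden=10, rng=rng)
pred = lselm_predict(X, W, A)
print("training RMSE:", float(np.sqrt(np.mean((pred - T) ** 2))))
```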

Thus, the parameters of the network can be determined by a non-iterative procedure. They do not need to be adjusted iteratively, with an appropriate initialization of the network parameters (i.e., weights and biases) and proper choices of control parameters (i.e., epochs, learning rate, etc.). This algorithm results in a very fast training process compared to conventional iterative learning algorithms for SLFNs.

Although we select random values for the matrix B ∈ R^{c×Ñ}, in comparison with ELM or OS-ELM the number of random values required is significantly reduced from (p+1)×Ñ to c×Ñ when the number of outputs c is much smaller than the number of input features p, which is the usual case in most applications.

Furthermore, SLFNs trained by our LS-ELM algorithm can have a small number of hidden units, which can further reduce the number of random values. In addition, the solution for the input weights and hidden layer biases given by equation (3.18) is the minimum norm solution. As analyzed by Bartlett [58], networks tend to have better generalization performance with small weights. Therefore, the LS-ELM approach with small-norm weights can be expected to give better performance than the original ELM.

3.4.2 Online Training with LS-ELM

When the amount of training data is very large or when memory costs are very expensive, an online training method should be adopted. The output weight matrix is updated based on the recursive least-squares solution:

A^(k+1) = A^(k) + P_{k+1} H_{k+1}^T (T_{k+1} − H_{k+1} A^(k)),   (3.20)

where P_{k+1} is updated recursively from P_k as in the recursive least-squares (OS-ELM) scheme, and the initialization of P and A is given by P_0 = (H_0^T H_0)^-1 and A^(0) = P_0 H_0^T T_0.

Now, we must estimate the input weights and biases for the SLFN. From equation (3.18), the matrix B ∈ R^{c×Ñ} is randomly chosen and does not depend on the arriving data. Therefore, it is randomly chosen only once, and then the parameters of the SLFN are recursively adjusted. We consider the case where rank(X) = p+1, where p is the number of input features. The MP generalized inverse of X is then given by X† = (X^T X)^-1 X^T. Hence, the estimation of W in equation (3.18) is given by

Ŵ = (X^T X)^-1 X^T TB.   (3.21)

Thus, the minimum norm solution for the initial training subset S_0 = {(x_j, t_j) | j = 1, …, N_0} can be written as Ŵ^(0) = L_0^-1 X_0^T T_0 B, where L_0 = X_0^T X_0. Suppose now that there are N_1 observations in the second training subset

S_1 = {(x_j, t_j) | j = N_0+1, …, N_0+N_1}. The input weights and biases can then be estimated by applying (3.21) to the combined subsets S_0 and S_1, that is,

Ŵ^(1) = (L_0 + X_1^T X_1)^-1 (X_0^T T_0 + X_1^T T_1) B.   (3.22)

By expanding the terms on the right side of equation (3.22) and using L_0 Ŵ^(0) = X_0^T T_0 B, we obtain a recursive expression for Ŵ^(1) in terms of Ŵ^(0) and the new observations.

In general, for N_{k+1} observations of the (k+1)-th training subset S_{k+1} = {(x_j, t_j) | j = (Σ_{l=0}^{k} N_l)+1, …, Σ_{l=0}^{k+1} N_l}, the input weights and biases can be determined in the same recursive manner, with L_{k+1} = L_k + X_{k+1}^T X_{k+1}.

Let Q_{k+1} = L_{k+1}^-1; then Q_{k+1} can be expressed by the Woodbury identity [Golub1996] as

Q_{k+1} = Q_k − Q_k X_{k+1}^T (I + X_{k+1} Q_k X_{k+1}^T)^-1 X_{k+1} Q_k.

Finally, the update formula for W^(k+1) is given by

W^(k+1) = W^(k) + Q_{k+1} X_{k+1}^T (T_{k+1} B − X_{k+1} W^(k)).

In summary, the online training scheme for SLFNs with LS-ELM can be described as follows:

Given a training set S = {(x_j, t_j) | j = 1, …, N}, an activation function f(·), and the number of hidden nodes Ñ:

1. Initialization: for the initial training subset S_0 = {(x_j, t_j) | j = 1, …, N_0}:

i. Assign random values for the matrix B ∈ R^{c×Ñ}.

ii. Calculate the initial input weights and biases W^(0) = Q_0 X_0^T T_0 B, where Q_0 = (X_0^T X_0)^-1, X_0 is the input matrix built from S_0, and T_0 = [t_1 t_2 … t_{N_0}]^T.

iii. Calculate the initial hidden layer output matrix H_0 from X_0 and W^(0).

iv. Determine the initial output weights by A^(0) = P_0 H_0^T T_0 with P_0 = (H_0^T H_0)^-1.

2. Training process: for the (k+1)-th training subset S_{k+1} = {(x_j, t_j) | j = (Σ_{l=0}^{k} N_l)+1, …, Σ_{l=0}^{k+1} N_l}, we do:

i. Estimate the input weights and biases W^(k+1) = W^(k) + Q_{k+1} X_{k+1}^T (T_{k+1} B − X_{k+1} W^(k)).

ii. Calculate the partial hidden layer output matrix H_{k+1} from X_{k+1} and W^(k+1).

iii. Determine the output weights A^(k+1) by the recursive least-squares update (3.20).

This algorithm consists of two processes: the initialization process and the training process. In the initialization process, the matrix B is assigned randomly, and then the initial weights and biases are determined based on the initial training subset S_0. The number of patterns required for S_0 should be at least max{Ñ, p+1}. In the training process, the weights and biases of the SLFN are updated for each training subset S_{k+1} (k = 0, …, K−1), where x_N ∈ S_K, which implies that each input pattern is involved in training only once and that the total number of training subsets is K+1, including S_0.
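The following NumPy sketch illustrates the two processes under the assumptions used earlier (sigmoid activation, random B chosen once, N_0 ≥ max{Ñ, p+1}); it is a simplified rendering of the recursions above, not a verified reimplementation of the thesis code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_stage(X0, T0, B):
    """Initialization on subset S0: W from the linear system, A from the hidden layer."""
    X0a = np.hstack([X0, np.ones((X0.shape[0], 1))])
    Q = np.linalg.inv(X0a.T @ X0a)                 # Q_0 = (X_0^T X_0)^-1
    W = Q @ X0a.T @ T0 @ B                         # W^(0) = Q_0 X_0^T T_0 B
    H0 = sigmoid(X0a @ W)
    P = np.linalg.inv(H0.T @ H0)                   # P_0 = (H_0^T H_0)^-1
    A = P @ H0.T @ T0                              # A^(0) = P_0 H_0^T T_0
    return W, Q, A, P

def update_stage(Xk, Tk, B, W, Q, A, P):
    """Update on subset S_{k+1}; the Woodbury identity keeps both inverses recursive."""
    Xka = np.hstack([Xk, np.ones((Xk.shape[0], 1))])
    I = np.eye(Xka.shape[0])
    Q = Q - Q @ Xka.T @ np.linalg.inv(I + Xka @ Q @ Xka.T) @ Xka @ Q
    W = W + Q @ Xka.T @ (Tk @ B - Xka @ W)         # input weight/bias update
    Hk = sigmoid(Xka @ W)
    P = P - P @ Hk.T @ np.linalg.inv(np.eye(Hk.shape[0]) + Hk @ P @ Hk.T) @ Hk @ P
    A = A + P @ Hk.T @ (Tk - Hk @ A)               # output weight update, cf. (3.20)
    return W, Q, A, P
```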

Regularized Least-Squares Extreme Learning Machine (RLS-ELM)

In this section, we introduce a regularized least-squares extreme learning machine (RLS-ELM), which is an efficient approach for training SLFNs with a smaller number of hidden units [59]. The input weights and biases of the hidden units in the RLS-ELM are estimated by a simple regularized least-squares scheme, which is an approach to overcome the ill-posed problem in LS-ELM. The most commonly used method for the regularization of ill-posed problems is Tikhonov regularization [60], in which the solution for W of equation (3.17) is replaced by seeking the W that minimizes

||XW − TB||^2 + λ||W||^2,   (3.27)

where ||·|| is the Euclidean norm and λ is a positive constant. The solution for W is given by

Ŵ = (X^T X + λI)^-1 X^T TB.   (3.28)

The size of X^T X is (p+1)×(p+1). Therefore, when the number of input features is large, the size of X^T X becomes very large. In order to reduce the computation of the inverse matrix, equation (3.28) can be rewritten in the equivalent form

Ŵ = X^T (XX^T + λI)^-1 TB.   (3.29)

In addition, if we can express W as a linear combination of the x_j's for j = 1, 2, …, N, i.e., if there exists a matrix Y ∈ R^{N×Ñ} such that W = X^T Y, then equation (3.17) becomes XX^T Y = TB. Thus, we can obtain an indirect solution of (3.17) in the case of a large number of input features as

Ŵ = X^T Y,   (3.30)

where Y is the solution of the system XX^T Y = TB.

As in LS-ELM, the matrix B is assigned randomly and then used to estimate the input weights and biases of the hidden units. Once the input weights and biases of the hidden units are estimated, the output weights are analytically computed by the MP generalized inverse, as shown in equation (3.9). In summary, our proposed RLS-ELM algorithm for compact SLFNs can be described as follows:

Given a training set S = {(x_j, t_j) | j = 1, …, N}, an activation function f(·), and the number of hidden units Ñ:

● Randomly assign the values for the matrix B.

● Estimate the input weights w_m and biases b_m by using (3.28), (3.29) or (3.30), depending on the application.

● Calculate the hidden-layer output matrix H by using equation (3.5).

● Determine the output weight matrix A by using equation (3.9).

We can also see that the parameters of the network can be determined by a non-iterative procedure. It is simple and has low computational complexity, which results in extremely high speed for both training and testing. The solution for W estimated by using (3.28), (3.29) or (3.30) also has a small norm, which can lead the SLFNs to produce better generalization performance, as analyzed by Bartlett [58].
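A minimal sketch of the RLS-ELM input-weight estimation is given below; switching between the primal form (3.28) and a regularized dual form in the spirit of (3.29)-(3.30) based on the number of features, as well as the value of λ, are assumptions made for the example.

```python
import numpy as np

def rlselm_input_weights(X, T, n_hidden, lam=1e-2, rng=None):
    """Regularized estimate of the input weights and biases (RLS-ELM sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    N, p = X.shape
    c = T.shape[1]
    Xa = np.hstack([X, np.ones((N, 1))])             # input matrix with bias column
    B = rng.uniform(-1.0, 1.0, size=(c, n_hidden))   # random B in R^{c x Ntilde}
    if p + 1 <= N:
        # primal form: W = (X^T X + lambda I)^-1 X^T T B, cf. (3.28)
        W = np.linalg.solve(Xa.T @ Xa + lam * np.eye(p + 1), Xa.T @ T @ B)
    else:
        # dual form for many features: solve (X X^T + lambda I) Y = T B, then W = X^T Y
        Y = np.linalg.solve(Xa @ Xa.T + lam * np.eye(N), T @ B)
        W = Xa.T @ Y
    return W, B
```

After this step, the hidden layer output and the output weights are computed exactly as in the LS-ELM sketch shown earlier.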

Evolutionary Least-Squares Extreme Learning Machine (ELS-ELM)

It has been shown that the performance of SLFNs in function approximation can be improved by applying the DE process in which the initial generation is produced by the linear model proposed in our previous studies. A training algorithm called the evolutionary least-squares extreme learning machine (ELS-ELM) was proposed in our study [61]. Training SLFNs by ELS-ELM consists of two steps: the first step is the initialization of the population based on the linear system; in the second step, the input weights and hidden layer biases are estimated by the DE process, and the output weights are determined through the MP generalized inverse.

As done in E-ELM, suppose that each individual in the population is composed of a set of input weights and hidden layer biases: θ = {w_1^T, w_2^T, …, w_Ñ^T, b_1, b_2, …, b_Ñ}.

The initialization of the population is performed based on the linear system shown in (3.17). After initializing the population, the output weights corresponding to each individual are determined by using equation (3.9). Then the fitness of each individual is computed by using the root mean square error (RMSE) of the corresponding network outputs, RMSE = sqrt( Σ_{j} ||o_j − t_j||^2 / N ), where o_j is the network output for pattern x_j.

The DE process is then used to find the optimal set of input weights and hidden layer biases. We should note that networks tend to have better generalization performance with small weights [58]. Therefore, as done in E-ELM, we also add an additional criterion to the selection step, namely the norm of the output weights ||A||, by which an individual with a smaller norm ||A|| is selected if the difference in fitness between individuals is small. In summary, our proposed ELS-ELM for training SLFNs can be described as follows:

Given a training set S = {(x_j, t_j) | j = 1, …, N}, an activation function f(·), and the number of hidden units Ñ:

1. Initialization: generate the initial generation composed of parameter vectors {θ_{i,G} | i = 1, 2, …, NP} as the population, where NP is the population size.

For each individual θ in the population, we do:

● Randomly assign the values for the matrix B.

● Estimate the input weights w_m and biases b_m of θ by using the linear system (3.17).

● Calculate the hidden-layer output matrix H by using equation (3.5).

● Determine the output weights A by using equation (3.9).

2. At each generation G, we do:

i) Mutation: the mutant vector is generated as v_{i,G+1} = θ_{r1,G} + F(θ_{r2,G} − θ_{r3,G}), where r1, r2, r3 are distinct random indices and F is a constant factor used to control the amplification of the differential variation.

ii) Crossover: the trial vector μ_{i,G+1} is formed so that μ_{ji,G+1} = v_{ji,G+1} if rand_b(j) ≤ CR or j = rnbr(i), and μ_{ji,G+1} = θ_{ji,G} if rand_b(j) > CR and j ≠ rnbr(i), where rand_b(j) is the j-th evaluation of a uniform random number generator, CR is the crossover constant, and rnbr(i) is a randomly chosen index which ensures that at least one parameter of the trial vector is taken from v_{i,G+1}.

iii) Compute the hidden layer output matrix H by equation (3.5).

iv) Determine the output weights by equation (3.9).

v) Evaluate the fitness of each individual.

vi) Selection: the new generation is determined as in E-ELM.
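The sketch below shows such a DE loop, with the mutation, crossover and selection rules described above applied to a flattened vector of input weights and biases; for brevity the initial population is random rather than seeded by the linear system, so it is closer to E-ELM than to the full ELS-ELM initialization, and the parameter values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, p, n_hidden):
    """theta concatenates the input weights (p x n_hidden) and the biases (n_hidden,)."""
    W = theta[: p * n_hidden].reshape(p, n_hidden)
    return W, theta[p * n_hidden:]

def fitness_and_A(theta, X, T, p, n_hidden):
    """RMSE fitness; the output weights A come from the MP inverse, as in (3.9)."""
    W, b = unpack(theta, p, n_hidden)
    H = sigmoid(X @ W + b)
    A = np.linalg.pinv(H) @ T
    return np.sqrt(np.mean((H @ A - T) ** 2)), A

def de_train(X, T, n_hidden, NP=30, F=0.8, CR=0.9, eps=0.02, n_gen=50, seed=0):
    rng = np.random.default_rng(seed)
    N, p = X.shape
    dim = p * n_hidden + n_hidden
    pop = rng.uniform(-1, 1, size=(NP, dim))
    fit, norms = np.empty(NP), np.empty(NP)
    for i in range(NP):
        fit[i], A = fitness_and_A(pop[i], X, T, p, n_hidden)
        norms[i] = np.linalg.norm(A)
    for G in range(n_gen):
        for i in range(NP):
            r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], 3, replace=False)
            v = pop[r1] + F * (pop[r2] - pop[r3])      # mutation
            mask = rng.random(dim) <= CR
            mask[rng.integers(dim)] = True             # crossover keeps at least one mutant gene
            trial = np.where(mask, v, pop[i])
            f_trial, A_trial = fitness_and_A(trial, X, T, p, n_hidden)
            # selection: clearly better fitness wins; near-ties prefer the smaller ||A||
            if (fit[i] - f_trial) > eps * fit[i] or \
               (abs(fit[i] - f_trial) < eps * fit[i] and np.linalg.norm(A_trial) < norms[i]):
                pop[i], fit[i], norms[i] = trial, f_trial, np.linalg.norm(A_trial)
    best = int(np.argmin(fit))
    return pop[best], fit[best]
```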

OUTLIER DETECTION AND ELIMINATION

Distance-based outlier detection

A data point in a data set S is an outlier if its neighborhood contains less than pct% of the data set S. Two distance-based algorithms were proposed by Knorr et al. [67]: index-based algorithms and the nested-loop algorithm. In the index-based algorithms, a range search with a predefined radius is performed for each object. As soon as its pct% neighbors are found, the search stops and the object is not considered an outlier; otherwise, the object is considered an outlier. This algorithm is time-consuming. An algorithm that can overcome this problem is the nested-loop algorithm, which reduces the computational cost by using a block-oriented, nested-loop design. It is well suited to large datasets.
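A direct, nested-loop style sketch of this notion of an outlier follows; the function name and parameters are illustrative only.

```python
import numpy as np

def db_outliers(X, dmin, pct):
    """Distance-based outliers: a point is flagged when fewer than pct percent of the
    remaining points lie within distance dmin of it."""
    N = X.shape[0]
    flags = np.zeros(N, dtype=bool)
    for i in range(N):
        d = np.linalg.norm(X - X[i], axis=1)
        neighbours = np.sum(d <= dmin) - 1      # exclude the point itself
        flags[i] = neighbours < pct / 100.0 * (N - 1)
    return flags
```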

The distance-based methods can detect certain kinds of outliers. Because they take a global view of the dataset, these outliers can be viewed as global outliers.

Therefore, this approach only works well for a globally uniform density and cannot work well when subsets of the data have different densities.

Figure 4.1 A simple 2D dataset containing points belonging to two clusters C_1 and C_2. C_1 forms a denser cluster than C_2. Two additional points, o_1 and o_2, can be considered outliers.

To illustrate this fact, consider the example given in Fig. 4.1, which shows a simple 2D dataset containing points belonging to two clusters C_1 and C_2. C_1 forms a denser cluster than C_2, and there are two additional points o_1 and o_2 which can be called outliers. With our notion, we wish to label both o_1 and o_2 as outliers. However, in the distance-based framework, the distance between every point p in C_2 and its nearest neighbor is larger than the distance between o_1 and C_1. Hence, we cannot find an appropriate value of pct such that o_1 is an outlier but the points in C_2 are not.

This problem is overcome by a formal definition of local outliers and a density-based scheme proposed by Breunig et al. [65].

Density-based local outlier detection

This method uses the "local outlier factor" (LOF) to measure how strongly a point can be considered an outlier [65]. The definitions for local outliers are as follows:

Definition 4.1: (k-distance of a point p)

For any positive integer k, the k-distance of a point p, denoted k-dist(p), is defined as the distance d(p,o) between p and a point o ∈ S such that:

(1) For at least k points o’∈S\{p} it holds that d(p,o’)≤ d(p,o), and (2) For at most k-1 points o’ ∈ S\{p} it holds that d(p,o’) < d(p,o)

Definition 4.2: (k-distance neighborhood of a point p)

Given the k-distance of p, the k-distance neighborhood of p, N k (p), contains every point whose distance from p is not greater than the k-distance

Definition 4.3: (reachability distance of a point p with respect to point o)

Let k be a natural number The reachability distance of a point p with respect to point o is defined as reach_dist k (p,o)=max{k-distance(o), d(p,o)}

Definition 4.4: (local reachability density of a point p)

The local reachability density of a point p is defined as

lrd_MinPts(p) = 1 / [ ( Σ_{o ∈ N_MinPts(p)} reach_dist_MinPts(p, o) ) / |N_MinPts(p)| ],   (4.1)

where MinPts is the minimum number of points in a density-based cluster.

Definition 4.5: (local outlier factor of a point p)

The local outlier factor of a point p is defined as

LOF_MinPts(p) = ( Σ_{o ∈ N_MinPts(p)} lrd_MinPts(o) / lrd_MinPts(p) ) / |N_MinPts(p)|.   (4.2)

The LOF value of a point is obtained by comparing its density with those of the points in its neighborhood, to measure how strongly the point can be considered an outlier. It has the following properties:

i) For most objects p in a cluster, the LOF of p is approximately equal to 1.

ii) Let p be a point from the database S, 1 ≤ MinPts ≤ |S|, and let C_1, C_2, …, C_n be a partition of N_MinPts(p). Let ξ_i = |C_i| / |N_MinPts(p)| be the percentage of points in the neighborhood of p which are also in C_i, let direct_min^i(p) and direct_max^i(p) be the minimum and maximum reachability distances between p and a MinPts-nearest neighbor of p in C_i, and let indirect_min^i(p) and indirect_max^i(p) be the minimum and maximum reachability distances between a MinPts-nearest neighbor q of p and a MinPts-nearest neighbor of q in C_i. Then LOF_MinPts(p) is bounded below and above by ratios of these direct and indirect reachability distances weighted by the ξ_i (inequality (4.4) in [65]).

Based on the LOF values, the outliers are detected. However, this scheme may not work well for gradually sparse distributions or for density-based clusters that are very close to each other.
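In practice the LOF scores can be computed with an off-the-shelf implementation; the sketch below uses scikit-learn's LocalOutlierFactor on synthetic data resembling the two-cluster example of Fig. 4.1 (the cluster locations and the choice of n_neighbors are assumptions for illustration).

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
C1 = rng.normal(loc=0.0, scale=0.2, size=(100, 2))   # dense cluster
C2 = rng.normal(loc=4.0, scale=1.0, size=(100, 2))   # sparse cluster
O = np.array([[2.0, 2.0], [2.2, 2.1]])               # points intended as local outliers
X = np.vstack([C1, C2, O])

lof = LocalOutlierFactor(n_neighbors=20)              # n_neighbors plays the role of MinPts
labels = lof.fit_predict(X)                           # -1 marks outliers
scores = -lof.negative_outlier_factor_                # larger value = stronger outlier
print("flagged as outliers:", np.where(labels == -1)[0])
print("LOF of the two injected points:", scores[-2:])
```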

The Chebyshev outlier detection

Another method for outlier detection is based on the Chebyshev theorem, as proposed by Amidan et al. [68]. Chebyshev's inequality states that

P(|X − μ| ≤ kσ) ≥ 1 − 1/k²,   (4.5)

where μ is the data mean, σ is the standard deviation of the data, and k is the number of standard deviations from the mean. For example, with k = 2 it shows that at least 75% (3/4) of the data falls within two standard deviations. Thus, inequality (4.5) gives a lower bound on the percentage of data that lies within a certain number of standard deviations from the mean, and it does not depend on the distribution of the data. Furthermore, when restated in terms of the amount of data away from the mean, inequality (4.5) becomes

P(|X − μ| ≥ kσ) ≤ 1/k².   (4.6)

This gives an upper bound on the amount of data that lies more than k standard deviations away from the mean. Therefore, Amidan et al. [68] used this Chebyshev inequality to calculate an upper limit (ODV_u) and a lower limit (ODV_l) for an outlier detection value. An object with a value that is not within the range of the upper and lower limits is considered an outlier. This method allows detecting multiple outliers at a time.
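A simplified sketch of this idea is shown below; the method in [68] computes the outlier detection values in two passes with separate probabilities, whereas this version applies only the single bound μ ± kσ.

```python
import numpy as np

def chebyshev_outliers(x, k=3.0):
    """Flag values outside mu +/- k*sigma (simplified Chebyshev-based detector)."""
    mu, sigma = np.mean(x), np.std(x)
    odv_u, odv_l = mu + k * sigma, mu - k * sigma   # upper and lower outlier detection values
    return (x > odv_u) | (x < odv_l), (odv_l, odv_u)
```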

Area-descent-based outlier detection

Our work on outlier detection is based on the area descent of the convex-hull polygon [69].

It can detect outliers relying on only two adjacent points on the polygon, and it indicates their location relative to the dataset, from which a proper shift can be inferred to reduce their effects on linear regression. The symbols and notations used for reviewing our area descent method are shown in Table 4.1.

Suppose N measurements (attribute values) t_1, t_2, …, t_N (N ≥ 1) are made on the referential object x = {x_1, x_2, …, x_N}. Let P = {P_1, P_2, …, P_N} be the 2D point set corresponding to the measurements, i.e., P_i = (x_i, t_i). First, we determine the convex hull polygon, which is made up of the outermost points in the data set. Let us denote the polygon by K = {Q_1, Q_2, …, Q_k}, where K ⊂ P, Q_i ∈ P, and k ≪ N. Let us also denote the area of the convex hull polygon by S.

Figure 4.2 Detecting outliers by the area descent method

For each point Q_i on the polygon K, let S_i denote the convex-hull area of the point set without Q_i. If the difference between S and S_i is larger than a threshold θ, i.e.,

ΔS_i = S − S_i > θ,   (4.7)

then Q_i can be viewed as an outlier candidate. There may be many outlier candidates for each polygon K, i.e., many Q_i that satisfy condition (4.7). Depending on the application, all of these candidates can be treated as outliers, or only the candidate whose area descent is maximum, i.e., Q_h is viewed as an outlier if ΔS_h = max_i{ΔS_i} is larger than the threshold θ. An outline of the algorithm for detecting outliers is described as follows:

An algorithm for detecting outliers:

1. Detect the convex-hull polygon K and its area S from the point set P.

2. For each point Q_i ∈ K, compute P' = P − {Q_i}, S_i = area of the convex-hull polygon of P', and ΔS_i = S − S_i.

3. Compute ΔS_h = max_i{ΔS_i}.

4. If ΔS_h > θ then

- Q_h is an outlier.

- Remove Q_h from the point set P and go to step 1.
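A sketch of this loop using SciPy's ConvexHull is given below (in 2D, the hull's volume attribute is the polygon area); the stopping rule and parameter names are illustrative.

```python
import numpy as np
from scipy.spatial import ConvexHull

def area_descent_outliers(P, theta, max_iter=20):
    """Repeatedly remove the hull point whose removal shrinks the convex-hull area
    by the largest amount, as long as that descent exceeds theta.
    P is an (N, 2) array of points (x_i, t_i)."""
    P = P.copy()
    outliers = []
    for _ in range(max_iter):
        if P.shape[0] < 4:
            break
        hull = ConvexHull(P)
        S = hull.volume                        # polygon area in 2D
        best_drop, best_idx = 0.0, None
        for idx in hull.vertices:              # only hull points can change the area
            rest = np.delete(P, idx, axis=0)
            dS = S - ConvexHull(rest).volume   # area descent for this candidate
            if dS > best_drop:
                best_drop, best_idx = dS, idx
        if best_idx is None or best_drop <= theta:
            break                              # no candidate exceeds the threshold
        outliers.append(P[best_idx].copy())
        P = np.delete(P, best_idx, axis=0)
    return np.array(outliers), P
```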

This algorithm can accurately detect outliers that are solely isolated, even in data sets having a gradually sparse distribution. However, it cannot detect outliers that form even a small group, because their area descent is too small to pass the predefined threshold. For instance, the simple dataset shown in Fig. 4.3 consists of 100 points and two outliers o_1 and o_2 which are very close to each other, forming a small group.

Figure 4.3 A simple dataset with two closely spaced outliers o_1 and o_2. These outliers cannot be detected by the area-descent-based method.

The value of the area descent corresponding to o_1 will be smaller than that of the other points on the polygon K. Hence, o_1 cannot be detected as an outlier, and thus o_2 cannot be detected in subsequent iterations either. To overcome this problem, we propose a two-stage area-descent algorithm, which is detailed in the following section [70, 71].

Two-stage area-descent outlier detection

To overcome the problem of detecting outliers clustered together in the previous area-descent-based method, we apply a two-stage area-descent algorithm in which the first stage is the same as the algorithm described in the previous section [70, 71]. The second stage starts when there is no data point whose area descent is larger than the threshold θ. Thus, in the first stage, all solely isolated outliers are detected, and outliers clustered together are detected in the second stage. In the second stage, two procedures are repeated: determination of a convex hull polygon K_i and elimination of the data points involved in constructing K_i. These two procedures stop when the total number of removed data points exceeds the predefined threshold value pts. Let m denote the number of convex-hull polygons detected in this stage.

The polygons K_i, i = 1, 2, …, m, contain the outermost points of the dataset. Thus, all outliers detected in the second stage should lie on the polygons K_1, K_2, …, K_m. Detecting outliers in the second stage is similar to that in the first stage.

However, in this stage the polygons are first determined as previously explained, and we start detecting outliers from the polygon K_{m−1} down to K_1. An algorithm for detecting outliers consisting of two stages is described as follows:

The two-stage algorithm for detecting outliers:

1. Detect the convex-hull polygon K and its area S from the point set P.

2. For each point Q_i ∈ K, compute P' = P − {Q_i}, S_i = area of the convex-hull polygon of P', and ΔS_i = S − S_i.

- Remove Q_h from the point set P and go to step 1.

6. Find a convex-hull polygon K_1 from the point set P.

9. Repeat
a. Find a convex hull polygon K_i from P.
b. Compute P = P − K_i.
c. i = i + 1.

10. Until the number of data points on the polygons K_1, K_2, …, K_m is larger than pts% of the data set.

11. Set P_m = K_m.

12. For i = m−1 down to 1 do
a. Compute P_i = P_{i+1} ∪ K_i.
Repeat
b. Compute S = area of the convex-hull polygon of P_i.
c. For each point Q_j ∈ K_i, compute P' = P_i − {Q_j}, S_j = area of the convex-hull polygon of P', and ΔS_j = S − S_j.
d. Compute ΔS_h = max_j{ΔS_j}.

The outermost data points of the data set are removed gradually until the number of removed points is large enough. Outlier detection is then performed only on these data points, from the interior to the exterior. Hence, the method can detect outliers that are close to each other and form a small group.

ELM-based outlier Detection and Elimination

The use of neural networks for outlier detection and rejection has also been investigated by several researchers [72-77]. Reducing the effect of outliers in regression problems has also been addressed in our studies. In [75], an SLFN trained by the ELM algorithm is used to detect outliers. It is extremely fast, and the outlier detection does not depend on the distribution of the dataset. Identifying outliers is based on the outputs of the network: patterns that do not activate the output units should be considered outliers and must be eliminated to improve the regression performance. Training the network consists of two stages. In the first stage, the network is trained with ELM, and the patterns in the training set are evaluated based on their output values. If the output value of a pattern exceeds a threshold, then that pattern is considered an outlier and rejected from the training set. The output weights are re-estimated using the ELM algorithm after rejecting the outliers.

The threshold depends on the application. In the regression problem, for a predefined threshold θ, outliers are those patterns whose difference between the output value and the expected output is larger than this threshold. An outline of the two-stage ELM algorithm for regression is as follows:

Algorithm Two-stage ELM for regression:

1. Train the SLFN with S using ELM.

2. Compute the output values O corresponding to S, O = {o_i, i = 1, …, N}.

3. Compute the differences between the output values and the expected values, d_i = |t_i − o_i|.

4. Reject from S the patterns whose difference values are larger than the predefined threshold θ: S = S − {(x_k, t_k) | d_k > θ}.

5. Train the SLFN with S using ELM.
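A compact NumPy sketch of this two-stage procedure follows; the sigmoid activation and the per-pattern residual used as the rejection criterion are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden, rng):
    W = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))   # random input weights
    b = rng.uniform(-1, 1, size=n_hidden)                  # random biases
    H = sigmoid(X @ W + b)
    A = np.linalg.pinv(H) @ T                              # output weights by MP inverse
    return W, b, A

def two_stage_elm(X, T, n_hidden, theta, seed=0):
    """Two-stage ELM for regression: train, drop patterns whose residual exceeds theta,
    then retrain on the cleaned set."""
    rng = np.random.default_rng(seed)
    W, b, A = elm_fit(X, T, n_hidden, rng)
    O = sigmoid(X @ W + b) @ A
    keep = np.abs(O - T).max(axis=1) <= theta              # reject suspected outliers
    W, b, A = elm_fit(X[keep], T[keep], n_hidden, rng)
    return W, b, A, keep
```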

Another ELM-based approach that is robust to outliers is the weighted least-squares scheme [76]. This approach is similar to the least-squares approach for determining the output weights in ELM algorithms. However, instead of being weighted equally, the penalties corresponding to the training patterns are weighted so that patterns with larger penalty weights contribute more to the fit:

Σ_{j=1}^{N} β_j || Σ_{m=1}^{Ñ} α_m f(w_m·x_j + b_m) − t_j ||^2,   (4.8)

where β_j is the penalty weight corresponding to the pattern x_j. Equation (4.8) can be rewritten as a weighted linear least-squares problem in the output weights. Taking derivatives over all α_i's (i.e., over A) and setting the gradient to zero yields the weighted normal equations. In matrix notation, with the penalty weights absorbed into the weighted matrices H* and T*, the estimation of A = [α_1, α_2, …, α_c] is

Â = H*† T*,   (4.14)

where H*† is the Moore-Penrose generalized inverse of the matrix H*. The penalty weights β_j, j = 1, 2, …, N, are chosen so that patterns with small weights are candidate outliers which contribute less to the fit. The ELM-based algorithm has extremely high learning speed. Therefore, we propose to use the ELM-based algorithms to evaluate patterns. Thus, in our approach, the training process of SLFNs consists of two stages.

In the first stage, the SLFN is trained with the ELM-based algorithm, and the output values and penalties of the patterns are calculated. Let e_j be the penalty corresponding to the pattern x_j; in our approach, e_j is determined as e_j = ||t_j − o_j||. The penalty weights are determined based on these penalties. In the second stage, after computing the penalty weights, the SLFN is trained again based on the weighted least-squares scheme, in which the output weights of the network are estimated by using (3.9). An outline of the algorithm for training SLFNs while reducing the effects of outliers in regression is as follows:

Training algorithm for SLFNs with reduced effects of outliers:

Given a training set S = {(x_j, t_j) | j = 1, …, N} and an activation function f(·):

1. Train the SLFN with S using the original ELM algorithm.

2. Compute the output values O corresponding to S; O = {o_j, j = 1, …, N}.

3. Compute the penalties e_j = ||t_j − o_j||.

4. Determine the penalty weights β_j based on the values of e_j.

5. Train the SLFN again using the weighted least-squares scheme, with the output weights estimated by equation (3.9).
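The sketch below illustrates these two stages; the particular rule mapping the penalties e_j to weights β_j (a MAD-scaled, Huber-like down-weighting) is an assumption for the example and is not necessarily the rule used in [76].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_elm(X, T, n_hidden, seed=0):
    """Weighted least-squares ELM sketch with penalties from a first ELM pass."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    H = sigmoid(X @ W + b)
    A = np.linalg.pinv(H) @ T                        # first stage: ordinary ELM fit
    e = np.linalg.norm(H @ A - T, axis=1)            # penalties e_j = ||t_j - o_j||
    s = np.median(np.abs(e - np.median(e))) / 0.6745 + 1e-12
    beta = 1.0 / np.maximum(e / s, 1.0)              # small weight for large residuals (assumed rule)
    D = np.sqrt(beta)[:, None]
    A = np.linalg.pinv(D * H) @ (D * T)              # second stage: weighted least-squares fit
    return W, b, A
```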

HEMATOCRIT ESTIMATION FROM TRANSDUCED CURRENT CURVES

Review of Hematocrit and Previous Measurement Methods

Blood is made up of red blood cells (RBCs), white blood cells (WBCs), platelets, and plasma. The fractional level of red blood cells in whole blood is expressed as the hematocrit (HCT). For example, a hematocrit of 30% means that there are 30 milliliters of red blood cells in 100 milliliters of blood. The HCT is a very useful clinical indicator in surgical procedures and hemodialysis [78, 79]. It can affect the adenosine diphosphate-induced aggregation of human platelets [2] and produce a decreased bleeding time [80]. An increased hematocrit can be an indicator of an unfavorable outcome in the course of acute myocardial infarction [81].

In addition, many studies have shown that hematocrit variations can significantly affect the accuracy of glucose measurements [3, 5, 8]. The glucose results are overestimated at lower hematocrit levels and underestimated at higher hematocrit levels. Therefore, hematocrit estimation also plays an important role in improving the performance of glucose meters.

5.1.1 Typical Methods for Measuring Hematocrit

The hematocrit can be manually determined by the centrifugation method, in which a capillary tube called a micro-hematocrit tube is filled with blood. When the tube is centrifuged at 10,000 RPM for five minutes, the blood separates into layers. The RBCs, with the greatest weight, are forced to the bottom of the tube; the WBCs and platelets form a thin layer between the RBCs and the plasma, called the buffy coat; and the top layer is the liquid plasma. The hematocrit is measured as the percentage of the RBC column relative to the total blood column.

With modern lab equipment, the hematocrit is typically measured from a blood sample by an automated analyzer, which can make several other measurements at the same time.

In the automated machines, the hematocrit is not measured directly. It is calculated by multiplying the mean cell volume by the red blood cell count (the number of red blood cells per liter of blood):

HCT = Mean Corpuscular Volume × Red Blood Cell Count.   (5.1)

The mean corpuscular volume (MCV) is a measurement of the average RBC volume.
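As a quick illustration of (5.1) with assumed, typical values: an MCV of 90 fL (90×10^-15 L) and an RBC count of 5.0×10^12 cells per liter give HCT = 90×10^-15 L × 5.0×10^12 L^-1 = 0.45, i.e., a hematocrit of 45%.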

For patients with anemia, it allows identification of whether the anemia is microcytic (MCV below the normal range) or macrocytic (MCV above the normal range).

Table 8.6 Comparison results for different criteria of error tolerance: for an error tolerance of p = 10%, the proportion of SLFN results within the tolerance was 93.28%.

Figure 8.8 Comparison of glucose values between the neural network and the primary reference instrument, corresponding to the criterion of ±15 mg/dL for glucose levels ≤ 100 mg/dL and ±15% for glucose levels > 100 mg/dL.

From our experimental studies, the neural network can be a good method for computing the glucose values using input features taken from the transduced current curve.

Moreover, this approach produced better performance than the methods that use the intermediate step of hematocrit estimation. We believe that direct estimation or computation of the glucose value can cover all the cases of error occurrence, while a method using hematocrit estimation only considers the dependency on hematocrit.

CONCLUSIONS AND FUTURE WORKS

Conclusions

Our research has devised approaches for improving the accuracy of handheld devices for glucose measurement. They are not complicated chemical procedures; they are based on intelligent computing methods, which are simple and inexpensive for handheld devices. These contributions can significantly enhance medical diagnosis and treatment, especially the effective treatment of diabetes.

We have devised hematocrit estimation methods using glucose biosensors. They rely on the transduced anodic current curves produced from the enzyme reaction of glucose oxidase. The current points used in the estimation were sampled at a frequency of 10 Hz from the second period of the current curve. There are three proposed approaches: support vector machine, neural networks, and linear models. The neural networks, especially the single hidden layer feedforward neural network (SLFN), can be trained by efficient learning algorithms such as ELM and its improvements. They can use the whole current curve, composed of 59 sampled points, as the input features for estimation. The linear models can be used with two choices of input features: they can use the 59 sampled points, or a subset of sampled points together with two extra features representing the shape of the curve. Experiments showed that both linear models are better than the SLFN and the support vector machine (SVM) in terms of the two criteria RMSE and MPE. The results obtained from these methods are important for reducing or eliminating the effects of the hematocrit level in measuring the glucose value. In addition, they show that one of the important clinical indicators, hematocrit, can be estimated by a cheap handheld device with a fast measuring time.

The error corrections for glucose measurements have been described. The mapping functions from hematocrit density to the residuals, which are defined as the differences between the measured/estimated glucose values and the primary reference measurements, have been found. The return values of these functions are used to adjust the measured glucose values by reducing the effects of hematocrit, from which we can obtain more accurate glucose measurements.

A glucose measurement method using a single point of the current curve was also implemented, in which the informative point for glucose measurement was first discovered by using statistical methods. After being estimated using the single point, the glucose value is adjusted to a more accurate value by reducing the effects of hematocrit.

We have devised a method to obtain the glucose value directly from the transduced current curve. Single hidden layer feedforward neural networks (SLFNs) were used, with the whole current curve as the input features. The networks were trained by ELM and its improved algorithms. This approach not only reduces the effects of hematocrit but may also reduce the effects of other critical care factors.

Experimental results showed that the dependence of the residuals on hematocrit is reduced and that the accuracy of glucose measurements is significantly improved.

In our research, novel training algorithms for single hidden layer feedforward neural networks (SLFNs) were also proposed. Unlike the original ELM, which assigns random values, our approaches, including LS-ELM and RLS-ELM, try to find an optimal set of input weights and biases of the hidden units. Instead of using global search methods based on evolutionary algorithms (EA), such as E-ELM, the input weights and hidden layer biases can be determined in a single step based on the fast minimum-norm least-squares or regularized least-squares scheme. This analytical determination can lead the SLFNs to a compact network with a smaller number of hidden units, which results in extremely high training speed, quick reaction of the trained network to new observations, and better generalization performance. In addition, a combination of RLS-ELM and differential evolution (DE) was also proposed, called the evolutionary least-squares extreme learning machine (ELS-ELM), in which the input weights and hidden layer biases are estimated by the differential evolution (DE) process while the output weights are determined by the MP generalized inverse. However, unlike E-ELM, ELS-ELM initializes the individuals of the initial generation by the least-squares scheme. This method can obtain trained networks with a small number of hidden units, as E-ELM and LS-ELM do, while producing better RMSE for regression problems.

In outlier detection, our research proposed a new approach based on the area descent of convex-hull polygons. It can detect outliers that are solely isolated or clustered together. This approach is much simpler, does not depend on knowing the distribution of the data, and can provide a suitable direction for eliminating the effects of outliers in linear regression.

Future Works

In this research, we are interested in the improvement and application of linear and nonlinear systems for the analysis of the transduced current curve. Besides the results we obtained through this research, there are also some issues that can be topics for future research. In this section, we discuss topics and ideas concerning our future research.

9.2.1 Feature Selection

Feature selection has been an active research topic in application areas including pattern recognition, statistics, and data mining. The main goal of feature selection is to choose a subset of input features which can significantly improve the prediction performance of the predictors, provide faster and more cost-effective predictors, and give deeper insight into the underlying processes that generated the data. Furthermore, feature selection plays an important role in analyzing the transduced current curve by reducing the number of time points for data acquisition. The first advantage is the ability to save the battery of the handheld device. Another benefit is memory saving in the handheld device, which has limited hardware capacity.

9.2.2 Optical Biosensors

Our experiments were performed on electrochemical biosensors. However, the proposed approaches can be applied to optical biosensors, which are devices capable of measuring luminescence. Studies have reported that glucose measurements using portable devices with optical biosensors are also affected by the critical care variables. Therefore, reducing the effects of these variables plays an important role in improving the performance of portable devices.

9.2.3 Reducing Effects of Other Factors

Besides the critical care variables hematocrit, PCO2, pH, and PO2, there may exist other factors influencing the glucose measurements, such as age, disease, medication, etc. For this research, the whole blood samples should be carefully selected based on the donors' lifestyle and medical information. All of the other procedures for this study can be the same as those we have used.

9.2.4 Applying Improvements of ELM in Medical Diagnosis

Neural networks have been applied successfully in data mining, biomedicine, and biomedical applications due to their ability to resolve problems that are very difficult to handle by classical methods. Traditional training algorithms based on gradient descent have some problems, such as over-fitting and the choices of learning rate, number of epochs, etc.

The ELM algorithm can overcome these problems but often requires a large number of hidden units. Therefore, the improvements of ELM developed in our research can support effective applications of SLFNs in data mining, biomedicine, and biomedical applications.

REFERENCES

[1] J S Krinsley, "Association between hyperglycemia and increased hospital mortality in a heterogeneous population of critically ill patients," Mayo Clinic

[2] H L Goldsmith, E S Kaufer, and F A McIntosh, "Effect of hematocrit on adenosine diphosphate-induced aggregation of human platelets in tube flow,"

Biorheology, vol 32, pp 537-52, Sep-Oct 1995

[3] E S Kilpatrick, A G Rumley, H Myint, M H Dominiczak, and M Small,

"The effect of variations in haematocrit, mean cell volume and red blood cell count on reagent strip tests for glucose," Ann Clin Biochem, vol 30 ( Pt 5), pp

[4] E S Kilpatrick, A G Rumley, and E A Smith, "Variations in sample pH and pO2 affect ExacTech meter glucose measurements," Diabet Med, vol 11, pp

[5] R F Louie, Z P Tang, D V Sutton, J H Lee, and G J Kost, "Point-of-care glucose testing - Effects of critical care variables, influence of reference instruments, and a modular glucose meter design," Archives of Pathology &

Laboratory Medicine, vol 124, pp 257-266, Feb 2000

[6] Z Tang, R F Louie, J H Lee, D M Lee, E E Miller, and G J Kost,

"Oxygen effects on glucose meter measurements with glucose dehydrogenase- and oxidase-based test strips for point-of-care testing," Crit Care Med, vol 29, pp 1062-70, May 2001

[7] Z Tang, R F Louie, M Payes, K C Chang, and G J Kost, "Oxygen effects on glucose measurements with a reference analyzer and three handheld meters," Diabetes Technol Ther, vol 2, pp 349-62, Autumn 2000

[8] Z P Tang, J H Lee, R F Louie, and G J Kost, "Effects of different hematocrit levels on glucose measurements with handheld meters for point-of- care testing," Archives of Pathology & Laboratory Medicine, vol 124, pp

[9] E F Treo, C J Felice, M C Tirado, M E Valentinuzzi, and D O Cervantes,

"Hematocrit measurement by dielectric spectroscopy," Ieee Transactions on

Biomedical Engineering, vol 52, pp 124-127, Jan 2005

[10] G B Huang, Q Y Zhu, and C K Siew, "Extreme learning machine: Theory and applications," Neurocomputing, vol 70, pp 489-501, Dec 2006

[11] D G Kleinbaum, L L Kupper, K E Muller, and A Nizam, Applied

Regression Analysis and Other Multivariable Methods: Duxbury Press, 1997

[12] P J Brown, Measurement, Regression and Calibration: Oxford University Press, 1994

[13] J J Peterson, S Cahya, and E del Castillo, "A general approach to confidence regions for optimal factor levels of response surfaces," Biometrics, vol 58, pp

[14] G King, J Honaker, A Joseph, and K Scheve, "Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation," The

American Political Science Review, vol 95, pp 49-69, March 2001

[15] S M Kay, Fundamentals of Statistical Signal Processing: Estimation Theory vol 1: Prentice Hall, 1993

[16] J Hertz, A Krogh, and R G Palmer, Introduction to the theory of neural computation: Addison Wesley Publishing Company, 1991

[17] D R Hush and B G Horne, "Progress in Supervised Neural Networks: What’s New Since Lippmann?," in IEEE Signal Processing magazine vol 10, 1993, pp 8-39

[18] P J G Lisboa, E C Ifeachor, and P S Szczepaniak, Artificial neural networks in Biomedicine: Springer-Verlag London Berlin Heidelberg, 2000

[19] R N G Naguib and G V Sherbet, Artificial Neural Networks in Cancer

Diagnosis, Prognosis, and Patient Management Washington DC: CRC Press,

[20] R J Schalkoff, Artificial Neural Networks: MIT Press and McGraw-Hill Companies, 1997

[21] P K Simpson, Artificial Neural Systems: Foundations, Paradigms,

Applications, and Implementations New York: Pergamon Press, 1990

[22] J M Zurada, Introduction to Artificial Neural Systems: West Publishing Company, 1992

[23] D Nguyen and B Widrow, "Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights," in Proceedings of the International Joint Conference on Neural Networks vol 3, 1990, pp 21-

[24] A A Suratgar, M B Tavakoli, and A Hoseinabadi, "Modified Levenberg- Marquardt Method for Neural Networks Training," in Proceedings of World academy of Science, engineering and technology vol 6, 2005, pp 46-48

[25] R Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in The First IEEE Annual International Conference on Neural Networks vol

[26] G Cybenko, "Approximation by superpositions of a sigmoidal function,"

Mathematics of Control, Signals, and Systems (MCSS), vol 2, pp 303-314,

[27] K Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol 2, pp 183-192, 1989

[28] K Hornik, M Stinchcombe, and H White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol 2, pp 359-366, 1989

[29] K Hornik, "Approximation capabilities of multilayer feedforward networks,"

[30] M Leshno, V Y Lin, A Pinkus, and S Schocken, "Multilayer feedforward networks with a nonpolynomial activation function can approximate any function," Neural Networks, vol 6, pp 861-867, 1993

[31] T P Chen, H Chen, and R W Liu, "Approximation Capability in C((R)over- Bar(N)) by Multilayer Feedforward Networks and Related Problems," Ieee

Transactions on Neural Networks, vol 6, pp 25-30, Jan 1995

[32] Y Ito, "Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory," Neural Networks, vol

[33] Y Ito, "Approximation of continuous functions on R d by linear combinations of shifted rotations of a sigmoid function with and without scaling," Neural

[34] G.-B Huang and H A Babri, "General Approximation Theorem on Feedforward Networks," in Proc of Int’l Conf on Information,

Communications and Signal Processing (ICIS’97), 1997, pp 698-702

[35] S.-C Huang and Y.-F Huang, "Bounds on the number of hidden neurons in multilayer perceptrons," IEEE transactions on neural networks, vol 2, pp 47- 55, 1991

[36] M A Sartori and P J Antsaklis, "A simple method to derive bounds on the size and to train multilayer neural networks," IEEE Transactions on Neural

[37] G B Huang and H A Babri, "Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions," Ieee Transactions on Neural Networks, vol 9, pp 224-229, Jan

[38] V Vapnik, The Nature of Statistical Learning Theory: Springer, 1999

[39] V N Vapnik, Statistical Learning Theory: Wiley-Interscience, 1998

[40] V N Vapnik and A Y Chervonenkis, "On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities," Theory of Probability and its Applications, vol 16, pp 264-280, 1971

[41] V Vapnik, S E Golowich, and A Smola, "Support vector method for function approximation, regression estimation, and signal processing," in Advances in

Neural Information Processing Systems 9 Cambridge, MA: MIT Press, 1997, pp 281-287

[42] A J Smola, "Learning with Kernels," PhD thesis, Technische Universität Berlin, 1998

[43] W Karush, "Minima of Functions of Several Variables with Inequalities as Side Constraints," in Department of Mathematics vol MSc Thesis: University of

[44] H W Kuhn and A W Tucker, "Nonlinear programming," in Proceedings of the

Second Berkeley Symposium on mathematical Statistics and Probability

Berkeley: University of California Press, 1951, pp 481-492

[45] S S Keerthi, S K Shevade, C Bhattacharyya, and K R K Murthy,

"Improvements to Platt's SMO algorithm for SVM classifier design," Neural

[46] A J Smola and B Scholkopf, "A tutorial on support vector regression,"

Statistics and Computing, vol 14, pp 199-222, Aug 2004

[47] B Scholkopf, A J Smola, R C Williamson, and P L Bartlett, "New support vector algorithms," Neural Computation, vol 12, pp 1207-1245, May 2000

[48] G B Huang and L Chen, "Convex incremental extreme learning machine,"

[49] G B Huang, L Chen, and C K Siew, "Universal approximation using incremental constructive feedforward networks with random hidden nodes,"

Ieee Transactions on Neural Networks, vol 17, pp 879-892, Jul 2006

[50] G B Huang, "Learning capability and storage capacity of two-hidden-layer feedforward networks," Ieee Transactions on Neural Networks, vol 14, pp

[51] C R Rao and S K Mitra, Generalized Inverse of Matrices and Its

Applications: John Wiley & Sons Inc, 1971

[52] D Serre, Matrices: Theory and Applications New York: Springer, 2002

[53] S Ferrari and R F Stengel, "Smooth function approximation using neural networks," Ieee Transactions on Neural Networks, vol 16, pp 24-38, Jan 2005

[54] N Y Liang, G B Huang, P Saratchandran, and N Sundararajan, "A fast and accurate online sequential learning algorithm for feedforward networks," Ieee

Transactions on Neural Networks, vol 17, pp 1411-1423, Nov 2006

[55] Q Y Zhu, A K Qin, P N Suganthan, and G B Huang, "Evolutionary extreme learning machine," Pattern Recognition, vol 38, pp 1759-1763, Oct 2005

[56] R Storn and K Price, "Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces," Journal of Global

[57] Hieu Trung Huynh and Yonggwan Won, "Small number of hidden units for ELM with two-stage liner model," Ieice Transactions on Information and

[58] P L Bartlett, "The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network," Ieee Transactions on Information Theory, vol 44, pp 525-536, Mar

[59] Hieu Trung Huynh, Jung-ja Kim, and Y Won, "An Improvement of Extreme Learning Machine for Compact Single-Hidden-Layer Feedforward Neural Networks," International Journal of Neural Systems, vol 18, pp 433-441,

[60] A N Tikhonov and V Y Arsenin, Solutions of ill-posed problems

[61] Hieu Trung Huynh and Yonggwan Won, "Evolutionary Algorithm for Training Compact Single Hidden Layer Feedforward Neural Networks," in The IEEE

2008 Int’l Joint Conference on Neural networks (IJCNN2008), 2008, pp 3027-

[62] V Barnett and T Lewis, Outliers in Statistical Data, 3rd ed New York: John Wiley, 1994

[63] R K Pearson, "Outliers in process modeling and identification," Ieee

Transactions on Control Systems Technology, vol 10, pp 55-63, Jan 2002

[64] D Hawkins, Identification of Outliers, 1st ed.: Springer, 1980

[65] M M Breunig, H.-P Kriegel, R T Ng, and J Sander, "LOF: Identifying Density- Based Local Outliers," in Proc ACM SIGMOD 2000 Int’l Conference on

Management of Data, Texas, 2000, pp 427-438

[66] X Lu, Y Li, and X Zhang, "A simple strategy for detecting outlier samples in microarray data," in the 8 th Int’l Conference on Control, Automation, Robotics and Vision Kunming, 2004, pp 1331-1335

[67] E M Knorr and R T Ng, "Algorithms for mining distance-based outliers in large datasets," in Proceedings of the 24th International Conference on Very

[68] B G Amidan, T A Ferryman, and S K Cooley, "Data Outlier Detection using the Chebyshev Theorem," in the 2005 IEEE Aerospace Conference, 2005, pp

[69] Huynh Trung Hieu and Yonggwan Won, "A Method for Outlier Detection based on Area Descent," in Proc 21st Int’l Conference on Circuit/Systems,

Computers and Communications vol 1, 2006, pp 193-196

[70] Hieu Trung Huynh, M T T Hoang, N H Vo, and Yonggwan Won, "Outlier Detection with Two-Stage Area-Descent Method for Linear Regression," in The

6 th WSEAS Int’l Conf on Applied computer science (ACS’06), 2006, pp 463-

[71] Hieu Trung Huynh, N H Vo, M T T Hoang, and Yonggwan Won, "An Improvement of Outlier Detection in Linear Regression based on Area- Descent," WSEAS Trans on Computer Research, vol 1, pp 174-180, 2006

[72] Hieu Trung Huynh, N H Vo, M.-T T Hoang, and Yonggwan Won,

"Performance enhancement of RBF networks in classification by removing outliers in the training phase," in Lecture Note in Artificial Intelligence,

LNAI4617: Springer-Verlag Berlin Heidelberg, 2007, pp 341-350

[73] Hieu Trung Huynh, N H Vo, M.-T T Hoang, and Yonggwan Won, "Outlier Treatment for SLFNs in Classification," in the 5th Int’l conf on computational science and applications (ICCSA2007): IEEE, 2007, pp 104 -109

[74] Hieu Trung Huynh and Yonggwan Won, "Performance Enhancement of SLFNs in Classification by Reducing Effect of Outliers," in Proc 22th Int’l

Conference on Circuit/Systems, Computers and Communications vol 1, 2007, pp 159-160

[75] Hieu Trung Huynh and Yonggwan Won, "Two-Stage Extreme Learning Machine for SLFNs in Regression," in Proc 22th Int’l Conference on

Circuit/Systems, Computers and Communications vol 1, 2007, pp 161-162

[76] Hieu Trung Huynh and Yonggwan Won, "Weighted Least Squares Scheme for Reducing Effects of Outliers in Regression Based on Extreme Learning Machine," International Journal of Digital Content Technology and its Applications, in press, 2008

[77] J Liu and P Gader, "Outlier rejection with MLPs and Variants of RBF Networks," in Proc 15 th Int’l Conf on Pattern Recognition vol 2, 2000, pp

[78] A Aris, J M Padro, J O Bonnin, and J M Caralps, "Prediction of hematocrit changes in open-heart surgery without blood transfusion," Journal of

[79] J Z Ma, J Ebben, H Xia, and A J Collins, "Hematocrit level and associated mortality in hemodialysis patients," J Am Soc Nephrol, vol 10, pp 610-619, Mar 1999

[80] F Fernandez, C Goudable, P Sie, H Ton-That, D Durand, J M Suc, and B

Boneu, "Low haematocrit and prolonged bleeding time in uraemic patients: effect of red cell transfusions," Br J Haematol, vol 59, pp 139-48, Jan 1985

[81] J Fuchs, I Weinberger, A Teboul, Z Rotenberg, H Joshua, and J Agmon,

"Plasma viscosity and haematocrit in the course of acute myocardial infarction," Eur Heart J, vol 8, pp 1195-200, Nov 1987

[82] K Maeda, T Shinzato, F Yoshida, Y Tsuruta, M Usuda, K Yamada, T

Ishihara, F Inagaki, I Igarashi, and T Kitano, "Newly developed circulating blood volume-monitoring system and its clinical application for measuring changes in blood volume during hemofiltration," Artif Organs, vol 10, pp 452- 9, Dec 1986

[83] K R Foster and H P Schwan, "Dielectric properties of tissues and biological materials: a critical review," Crit Rev Biomed Eng, vol 17, pp 25-104, 1989

[84] Hieu Trung Huynh and Yonggwan Won, "Hematocrit Estimation from Transduced Current Patterns Using Single Hidden Layer Feedforward Neural Networks," in the 2007 Int’l Conference on Convergence Information

[85] Hieu Trung Huynh and Yonggwan Won, "Neural Networks for the Estimation of Hematocrit from Transduced Current Curves," in Proc of the 2008 IEEE

Int’l Conf on Networking, Sensing and Control, 2008, pp 1517-1520

[86] Hieu Trung Huynh and Yonggwan Won, "Hematocrit Estimation from Compact Single Hidden Layer Feedforward Neural Networks Trained by Evolutionary Algorithm," in Proc of the 2008 IEEE World Congress on Computational

[87] World Health Organisation Department of Noncommunicable Disease Surveillance, "Definition, Diagnosis and Classification of Diabetes Mellitus and its Complications," 1999

[88] M H Kutner, C Nachtsheim, J Neter, and W Li, Applied Linear Statistical

[89] S W Looney and T R G Jr, "Use of the Correlation Coefficient with

Normal Probability Plots," The American Statistician, vol 39, pp 75-79, 1985

[90] D Oberg and C G Ostenson, "Performance of glucose dehydrogenase-and glucose oxidase-based blood glucose meters at high altitude and low temperature," Diabetes Care, vol 28, p 1261, May 2005

Linear and Nonlinear Analysis of Transduced Current Curves Generated from Electrochemical Biosensors

Department of Computer and Information Communications Engineering, Graduate School, Chonnam National University

(Advisor: Professor Yonggwan Won)

Thanks to advances in science and technology, a wide range of diagnostic tests can now be performed quickly and conveniently without sophisticated laboratory equipment, and biosensors play a key technical role in this. Biosensors are very useful not only in the chemical and biochemical industries but also in the medical and healthcare fields for the analysis and determination of complex mixtures and compounds. In general, biosensors are classified and evaluated according to their design and functional characteristics, such as accuracy, cost, availability, range, and simplicity. On this basis, electrochemical biosensors are preferred because of their accuracy, cost, and availability.

As a result of extensive effort, electrochemical biosensors for blood glucose measurement have become the most widely commercialized biosensors.

Together with handheld devices, they are used to monitor blood glucose levels in order to keep the blood glucose concentration of diabetic patients close to normal values, which can reduce the complications caused by diabetes. Although handheld devices are conveniently used to monitor and control blood glucose levels, their accuracy is strongly affected by interference from uric acid, ascorbic acid, PO2, PCO2, pH, hematocrit, and so on; among these, hematocrit has the largest effect on measurements made by handheld devices. The interference from such oxidizable substances can be reduced by chemical methods, but only a few practical solutions have been proposed to reduce the interfering effect of hematocrit. Moreover, these solutions increase the complexity and cost of the manufacturing process for biosensors and are also difficult to implement in handheld devices.

This study focuses on developing intelligent computing methods to improve the accuracy of handheld meters for blood glucose measurement using electrochemical biosensors. The analytical principle of electrochemical biosensors for measuring blood glucose concentration is based on a biological interaction process: an electrochemical current signal called the transduced current is generated by the interaction between blood glucose and glucose oxidase and by the oxidation of the reduced form of the enzyme at the electrode. This transduced current varies over time and is represented as a curve called the transduced current curve (TCC). The TCC has, in one way or another, been used to determine the glucose concentration with handheld meters. This study, however, starts from the belief that the shape of the TCC carries not only information about glucose but also information about various other factors, including interferences. Therefore, analysis of the TCC can play a decisive role in improving the performance of measurements using electrochemical biosensors.

In this study, linear and nonlinear models, including Support Vector Machines (SVM) and neural networks, are investigated for analyzing the transduced current curve.
