Research on representation and recognition of moving objects based on conformal geometric algebra and machine learning (English summary)

MINISTRY OF EDUCATION AND TRAINING
THE UNIVERSITY OF DANANG

NGUYEN NANG HUNG VAN

RESEARCH ON REPRESENTATION AND RECOGNITION OF MOVING OBJECTS BASED ON CONFORMAL GEOMETRIC ALGEBRA AND MACHINE LEARNING

Specialization: COMPUTER SCIENCE
Code: 62 48 01 01

DOCTORAL THESIS SUMMARY

DaNang, 2021

The dissertation is completed at: THE UNIVERSITY OF DANANG
Supervisors: Assoc. Prof. Kanta Tachibana, Dr. Phạm Minh Tuấn
Reviewer 1: ………………………………………………
Reviewer 2: ………………………………………………
Reviewer 3: ………………………………………………
The dissertation will be defended before the approval committee at The University of Danang. Time , date month year
The dissertation can be found at:
- Vietnam National Library;
- The Center for Learning Information Resources and Communication, The University of Danang

INTRODUCTION

Today, the development of science and technology has created big data from electronic transaction systems, multimedia data storage systems, and sensor applications in the Internet of Things. This development has prompted researchers to move from low-level data collection and reception to high-level integrated research that enables them to analyze, identify, and predict possible problems in the future. As a result, there are more and more practical problems that need to be solved, especially the recognition of moving objects in space to support security systems, smart homes, smart hospitals, etc., and Artificial Intelligence.

Data generated from systems are getting increasingly large and complicated, while machine learning algorithms normally use linear transformations to represent the data and assume that the data is distributed on a plane. When the data is instead distributed on a hyper-plane or a hyper-sphere, as is the case for objects that move and rotate in space, the results of data processing are not accurate. Therefore, the thesis proposes to study the representation of moving objects based on Conformal Geometric Algebra (CGA) in order to facilitate the recognition of human actions. CGA is extended from the real m-dimensional space
by adding two base vectors and using transformations to convert vectors in real space into a set of points in CGA space. Complex data distributions are handled by the hyper-plane or hyper-sphere data approximation method. A vector in CGA space represents a point, a hyper-plane, or a hyper-sphere.

The goals of the thesis

The objective of the thesis is to study the representation of moving objects based on conformal geometric algebra. On that basis, it proposes to incorporate CGA into machine learning to recognize moving objects and human actions. In particular, the thesis focuses on the following research objectives:
- The Gaussian probability density function clusters the data by calculating the variance based on the distance from each point to the mean of the data, so it cannot accurately represent data with a complex distribution. The thesis proposes a new data clustering method that combines the Gaussian probability density function with CGA to calculate the variance based on the hyper-sphere data approximation method in CGA space.
- The vector quantization method based on k-means clustering uses the center of each cluster as its representative. It is hence not possible to cluster data with hyper-sphere distributions. The thesis proposes vector quantization based on a data clustering method using CGA to cluster data of moving objects and to create discrete data sequences for HMM training.
- Taking advantage of CGA to represent moving objects in space, the thesis proposes to use CGA instead of PCA to optimize data in the PCR machine learning model and to apply it to Human Activity Recognition (HAR).
- The thesis proposes the reduction of data dimension by a feature extraction method using CGA in combination with RNN to recognize human actions.
- The thesis researches and proposes a method of pre-processing that creates the input data for the clustering and feature extraction of the object; From the advantages of using CGA to represent moving objects in
3D space, the thesis proposes the method of clustering and feature extraction using conformal geometric algebra (CGA).
- The thesis proposes a model that combines CGA with machine learning models such as GMM, HMM, PCR, and Recurrent Neural Network (RNN) to recognize human actions, and selects parameters to propose improvements to the models.

The thesis researches and builds an experimental model from the proposed methods to select the models that yield the best results in training and object recognition. The thesis has tested the proposed models on the CMU motion capture data set with 8 human actions. The experiments were conducted with numerous different methods and parameters to compare and evaluate the results, thereby giving an appropriate research direction for the thesis.

The objects and scope of the thesis

The thesis covers the following objects:
- Several probability models in machine learning
- Geometric algebra and Conformal Geometric Algebra
- Some machine learning models such as HMM, PCR, and RNN
- Data clustering and feature extraction methods using CGA for moving objects and HAR

Given the above-mentioned objectives and objects, the scope of the study is as follows:
- CGA is studied on geometrical data such as points, lines, planes, spheres, and hyper-spheres for application in machine learning.
- Research on the CGA model combined with machine learning includes two main parts: (1) using CGA to cluster and extract features of moving objects; (2) combining CGA with some machine learning models in training and moving object recognition.

The thesis proposes the use of CGA to represent moving objects and the combination of CGA with other machine learning models to facilitate moving object representation and HAR.

The research methodology of the thesis

The research method in the thesis is based on the inheritance of fundamental knowledge in science and engineering, including:
- Models in Machine Learning
- Techniques, and models in
Machine Learning
- Geometric Algebra and Conformal Geometric Algebra

The research method used in the thesis combines theory and experiment to evaluate the results of the proposed model:
- Explore studies related to Machine Learning and Geometric Algebra. On that basis, the advantages and disadvantages of each method are evaluated in order to propose new methods and data analysis models for objects moving in space. The methods are evaluated on recognition rate and processing speed.
- Extract features of moving objects in 3D space and use the CMU data set in the experiments. Develop models based on the proposed methods to experiment and evaluate results.

The structure of the thesis

Based on the research contents, to achieve the objectives and ensure the logic of the research problem, in addition to the introduction, conclusion, and future work sections, the thesis is structured into three main chapters:

Chapter 1: Method of representing moving objects in machine learning. Overview of data representation methods in space and data representation methods in machine learning, with a focus on the method of representing moving objects using Conformal Geometric Algebra.

Chapter 2: Method of representing moving objects based on Conformal Geometric Algebra. Presentation of Geometric Algebra with the operators, reflections, and rotations used to solve problems in multidimensional spaces, and introduction to the hyper-plane or hyper-sphere approximation method in CGA. Presentation of proposals for applying Conformal Geometric Algebra to the representation of moving objects in space. The focus is on proposing the combination of Conformal Geometric Algebra with the Gaussian mixture model and vector quantization for clustering data, the combination of Conformal Geometric Algebra with PCR to classify data, and a feature extraction method based on Conformal Geometric Algebra. Finally, we conclude and evaluate the advantages and disadvantages of the proposed models as well as the need for combining Conformal Geometric Algebra
with Machine Learning.

Chapter 3: Experiment deployment and result evaluations. Presentation of the method of building an experimental model based on the proposed model and the data of the moving object: human activity recognition based on clustering data using CGA and HMM, human activity recognition based on a combination of PCR and CGA, and human activity recognition based on the feature extraction method using CGA in combination with RNN. Finally, the test results are summarized and evaluated to suggest the next research direction.

The research results of the thesis

The thesis has covered the initially set objectives. The thesis "Research on representation and recognition of moving objects based on Conformal Geometric Algebra and Machine Learning" has achieved the following results:
- The thesis has studied the basic problems of data representation in space and data representation methods in machine learning, resulting in the proposal to apply Conformal Geometric Algebra to the representation of moving objects in space.
- The thesis proposes a data clustering method that combines Conformal Geometric Algebra with GMM to discretize data for HMM-based recognition.
- The thesis proposes a vector quantization method that clusters data using Conformal Geometric Algebra to optimize the distance function in CGA space, and combines CGA with HMM for HAR. The correct recognition rate of the proposed model is 81.9%.
- The thesis proposes to build a data training model by combining PCR with CGA for human activity recognition. The correct recognition rate of the proposed model is 88.9%.
- The thesis proposes a method to extract features of the object and reduce the data dimension by using CGA, combining CGA with RNN for human activity recognition. The correct recognition rate of the best proposed model is 92.5%.

Finally, the thesis applied the proposed methods to the CMU data set. The experimental results show that the proposed method achieves significantly high accuracy. It is possible to
apply the proposed method in human activity recognition systems to detect falls, theft, etc.

The contributions of this thesis

This thesis has focused on clustering and feature extraction using CGA, and on combining CGA with machine learning models for HAR. The contributions of this thesis include:
- The thesis proposes a method using CGA combined with the Gaussian distribution to represent moving objects in space with complex distributions such as spheres or hyper-spheres.
- The thesis proposes the combination of CGA and vector quantization to cluster data of objects moving in space.
- The thesis proposes models combining CGA with HMM, PCR, and Recurrent Neural Network for training and object recognition.
- The thesis proposes a feature extraction method using CGA.
- The thesis proposes a pre-processing method based on coordinate transformation or marker selection.

The thesis is oriented towards the application of technology in life. The research results have opened up a new research direction in spatial data analysis based on CGA and on CGA combined with machine learning models for training and human activity recognition.

Chapter 1
METHODS OF REPRESENTATION OF MOVING OBJECTS IN MACHINE LEARNING

In Chapter 1, the thesis addresses issues related to methods of spatial data representation and of moving object representation based on machine learning techniques, with a focus on the obstacles in representing objects moving in space that this thesis studies.

1.1 Methods of data representation in space

1.1.1 Method of data representation by vector

The method of representing data by vector maps a data set to a multidimensional vector space and is applied in almost all fields of computer science. A vector space contains sequences of numbers. The values in the sequence are called the elements of the vector, and $x_i$ can be written to indicate the $i$th element of the vector. In math, a column vector is represented as

$$\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

and a row vector is represented as $\mathbf{x} = [x_1, \dots, x_n]$, where $x_1, \dots, x_n$ are the elements of the vector; they can be denoted $\mathbf{x} = \{x_i \in \mathbb{R}\}$, $i \in \{1, \dots, n\}$.

1.1.2 Method of representing data by matrix

The method of representing data by matrix is commonly used in image processing and recognition [12, 73]. A matrix generalizes the vector by representing data along two axes, called rows and columns. An $\mathbb{R}^{m \times n}$ matrix consists of $m$ rows and $n$ columns whose elements are real values; each element $a_{ij}$ lies in row $i \in \{1, \dots, m\}$ and column $j \in \{1, \dots, n\}$ of the matrix. In math, a matrix can be represented as follows:

$$A = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \dots & a_{mn} \end{bmatrix}$$

Any matrix $A \in \mathbb{R}^{m \times n}$ with the number of rows equal to the number of columns ($m = n$) is called a square matrix. In calculations, it is possible to swap the rows and columns of the matrix to get the transposed matrix, denoted $B = A^{T}$; if $B = A^{T}$ then $b_{ij} = a_{ji}$ for any $i$ and $j$. Two methods that represent input data by matrix are the Convolutional Neural Network [85] and the algorithm of P. Viola [72] in face recognition.

1.1.3 Method of representing data by Tensor

The tensor [1] is a popular concept used to represent multidimensional data in machine learning. A tensor of real numbers of order $p$ has the general form $T \in \bigotimes^{p} \mathbb{R}^{n}$ in Euclidean space $\mathbb{R}^{n}$, with vectors ($p = 1$) and matrices ($p = 2$) as special cases. Elements of a tensor are identified by indexes on each dimension of the tensor. In the case of a three-dimensional tensor, denoted $\mathbb{R}^{m \times n \times d}$, each element lies in row $i \in \{1, \dots, m\}$, column $j \in \{1, \dots, n\}$, and depth $k \in \{1, \dots, d\}$ of the tensor.

1.2 Methods of representing moving objects in Machine Learning

Machine Learning [14] has many different methods and algorithms. There are also many ways to classify machine learning algorithms. The most common way is to divide machine learning into two basic categories: supervised learning and unsupervised learning.

1.2.1 Methods of representing data based on probability model

In conventional machine learning models, the input data is very large. If we process all of this input, it will cost a lot of computation
and storage space. Therefore, using the parameters of a probability model, such as the mean, variance, and standard deviation, instead of the full data reduces computation and storage costs.

1.2.1.1 Gaussian Mixture Model

The Gaussian Mixture Model (GMM) [25, 40, 43] is a very important probability distribution model and is widely used in studies of image recognition, speech recognition, and action recognition [51, 89]. GMM is represented by the weighted sum of $M$ Gaussian probability density functions [1]:

$$p(x) = \sum_{i=1}^{M} w_i \, \mathcal{N}(x \mid \mu_i, \Sigma_i) \quad (1.1)$$

where $x \in \mathbb{R}^{D}$ is the feature vector of the object to be represented in $D$-dimensional space and $w_i$, $i \in \{1, \dots, M\}$, are the weights of the mixture, satisfying the conditions $0 \le w_i \le 1$ and $\sum_{i=1}^{M} w_i = 1$.

1.2.1.2 Hidden Markov Model

The Hidden Markov Model (HMM) [32, 53] was introduced in the 1960s. It is a statistical model for time-ordered and sequential data. The parameters of an HMM are not known; the task is to determine the hidden parameters from the observed ones. As HMM can change structure easily and accurately during training, it has been widely used in handwriting recognition [70], voice and speech recognition [27, 79], human activity recognition [74], natural language recognition [63], and the analysis of biological sequences such as proteins and DNA [10, 22].

1.2.2 Dimension reduction method

Dimension reduction converts data from a space with a large number of dimensions to a space with fewer dimensions in order to reduce computation and storage.

1.2.2.1 Principal Components Analysis

Principal Components Analysis (PCA) [36, 55, 64, 68] is usually used to convert a data set from a multi-dimensional space into a space with fewer dimensions, while still ensuring that the variance of the input data along each new dimension is the largest.

Figure 1.1: Data representation method using PCA algorithm

In the initial space of Figure 1.1 a) (the blue points), observed on the original coordinate axes, the variance in each
direction is large. In the new space of Figure 1.1 b) (the red points), along the new coordinate axes, the variance in the second dimension is very small compared to that in the first dimension. This means that when projecting the data onto that coordinate axis, we get points that are very close to each other and close to the expectation in that direction.

1.2.2.2 Multi-class Linear Discriminant Analysis

Multi-class Linear Discriminant Analysis (multi-class LDA) [6] is a linear discriminant method for multi-class classification problems, built as an improvement of Linear Discriminant Analysis (LDA).

Figure 1.2: Linear Discriminant Analysis method of two classes

1.2.3 Method of dimension increase

In reality, data is very intricately distributed in space. If we use linear methods or assume that the data is distributed on a plane, then it is impossible to separate the data into different classes. Therefore, we need to map the original data set to a new space with more dimensions to represent the data. The method of increasing data dimensionality, also known as the kernel method, is commonly applied in Support Vector Machine and Neural Network.

1.2.3.1 Support Vector Machine

1.2.3.2 Artificial Neural Network

1.3 Method of representing moving objects using CGA

1.3.1 Geometric Algebra

Geometric algebra is also a representation method that increases the number of data dimensions, by defining two more base vectors and redefining operators such as the geometric product, inner product, outer product, reflections, and rotations to represent data in space.

1.3.2 Conformal Geometric Algebra

Conformal Geometric Algebra (CGA) [18, 95] is part of Geometric Algebra and is extended from the real space of m dimensions by adding two dimensions: if the real space has m dimensions, then the CGA space has m + 2 dimensions to represent data. In CGA space, the optimal distance function is determined from a point to a vector that can be a point, a plane, or a hyper-sphere. The hyper-spherical approximation method
is to find a hyper-sphere [82, 97] that minimizes the sum of squared errors with respect to the original data set.

1.4 Conclusion chapter

The main contribution of Chapter 1 is the analysis and evaluation of the advantages and disadvantages of data representation methods. In particular, CGA is used to represent moving objects in space, and this is the basis for the orientation of the subsequent research in the thesis.

Chapter 2
METHOD OF REPRESENTATION OF MOVING OBJECTS BASED ON CONFORMAL GEOMETRIC ALGEBRA

In Chapter 2, the thesis proposes methods of representing moving objects based on Conformal Geometric Algebra to solve problems with complex spatial data distributions. At the same time, it proposes models that combine CGA with machine learning for human activity recognition.

2.1 Conformal Geometric Algebra

2.1.1 Geometric Algebra

2.1.2 Conformal Geometric Algebra

2.2 Proposing data clustering method using CGA

Data clustering is an important unsupervised machine learning technique in data mining.

The HMM is defined by the following parameter set:

$$\lambda = (A, B, \pi) \quad (2.49)$$

where $A = \{a_{ij}\}$ is the state transition probability distribution, $B = \{b_j(k)\}$ is the probability distribution of the codebook index, and $\pi$ is the starting probability of each state. However, HMM is a model for estimating parameters and predicting time sequences, so when action recognition is performed, it is necessary to observe a series of actions before determining the result (the recognized action). Meanwhile, the practical requirement is to quickly identify the action taking place at the time of observation, so the PCR or RNN models can be used instead of HMM for training and recognizing activities.

2.3 Propose a feature extraction method using CGA

2.3.1 Feature extraction method using PCA

2.3.2 Feature extraction method using CGA

The PCA feature extraction method uses only the largest variances, determined from the distance from a point to the mean (2.52) in real space. However, the thesis proposes a method of
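feature extraction using PCA combined with CGA, which finds the largest variance by determining the distance from a point to a vector in CGA space.

Given the data set (2.50) with vectors $x_i \in \mathbb{R}^{m}$, $i \in \{1, \dots, n\}$, each vector is represented in CGA space as a point (2.20), as follows:

$$P_i = x_i + \tfrac{1}{2}\|x_i\|^{2} e_\infty + e_0$$

The distance from a point $P_i$ to a vector $S = s + s_\infty e_\infty + s_0 e_0$ of CGA space is measured by their inner product:

$$d(P_i, S) \propto |P_i \cdot S|, \qquad P_i \cdot S = x_i \cdot s - s_\infty - \tfrac{1}{2}\|x_i\|^{2} s_0 \quad (2.58)$$

This means that $S$ can be estimated by least squares, minimizing the error function

$$E = \sum_{i=1}^{n} \left( x_i \cdot s - s_\infty - \tfrac{1}{2}\|x_i\|^{2} s_0 \right)^{2} \quad (2.59)$$

where $s$ can be limited by $\|s\|^{2} = 1$:

$$\min \sum_{i=1}^{n} \left( x_i \cdot s - s_\infty - \tfrac{1}{2}\|x_i\|^{2} s_0 \right)^{2}, \qquad \|s\|^{2} = 1 \quad (2.60)$$

Therefore, the previous problem can be expressed, using a non-negative Lagrange multiplier $\lambda$, as the minimization of

$$L(s, \lambda) = \sum_{i=1}^{n} \left( x_i \cdot s - s_\infty - \tfrac{1}{2}\|x_i\|^{2} s_0 \right)^{2} - \lambda \left( \|s\|^{2} - 1 \right) \quad (2.61)$$

Following the optimization process of Pham [61], the function $f(x)$ is defined as follows:

$$f(x) = x - f_\infty - \tfrac{1}{2}\|x\|^{2} f_0 \in \mathbb{R}^{m} \quad (2.62)$$

where $f_\infty$ and $f_0$ are vectors determined from the data set such that $s_\infty = f_\infty \cdot s$ and $s_0 = f_0 \cdot s$. Formula (2.61) can then be rewritten as

$$L(s, \lambda) = s^{T} A s - \lambda \left( \|s\|^{2} - 1 \right) \quad (2.63)$$

The optimal result can be obtained by solving the eigenvalue problem

$$A s = \lambda s \quad (2.64)$$

where $A$ is the variance matrix of the training set in CGA space:

$$A = \sum_{i=1}^{n} f(x_i) f(x_i)^{T} \quad (2.65)$$

Figure 2.2: Methods of data representation in CGA

2.3.2 Method of combining PCR with CGA

2.3.2.1 Principal Components Regression (PCR)

2.3.2.2 Method of combining PCR with CGA

Given the training set

$$X = \{ (x_i, y_i) \mid i \in \{1, \dots, n\} \} \quad (2.66)$$

where $C$ is the number of actions, $x_i \in \mathbb{R}^{m}$, $y_i \in \{1, \dots, C\}$ is the label of the action, $n$ is the number of frames of the data set, and $m$ is the number of dimensions of the action. The problem now is to find a weight vector $w$ such that the variance of the transformed data is the largest. The function to optimize for each class $c$ is given by the following formula:

$$\max_{w} \sum_{i:\, y_i = c} \left( w \cdot (x_i - \bar{x}_c) \right)^{2}, \qquad \|w\| = 1 \quad (2.67)$$

where $\bar{x}_c$ is the mean vector of class $c$ and $n_c$ is its number of frames:

$$\bar{x}_c = \frac{1}{n_c} \sum_{i:\, y_i = c} x_i \quad (2.68)$$

PCR uses the coordinate axes with the minimum eigenvalues for each class, and finally the features chosen are

$$f_c(x) = \left( w_{c,1} \cdot (x - \bar{x}_c), \dots, w_{c,d} \cdot (x - \bar{x}_c) \right) \quad (2.69)$$

where $d$ ($1 \le d \le \min\{n_c - 1, m\}$) is the number of eigenvectors and $\min\{n_c - 1, m\}$ is the degree of freedom of the data set of class $c$. Then, the new vector is identified from these features according to formula (2.70).

PCR assumes that the data is distributed on a plane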
or hyper-plane. As a result, PCR cannot accurately represent the case where the data is distributed on a hyper-sphere, as for objects rotating in space. The thesis hence proposes a classification method using PCR in combination with CGA to represent more accurately the cases where data is distributed on a hyper-sphere, as shown in Figure 2.3.

Figure 2.3: Data distributed in two mixed classes of objects moving in space

Then, we can replace formula (2.67) with formula (2.59) to calculate the variance in each class. Finally, the feature to be extracted within each class is defined by formula (2.71), with the number of components again bounded by the degree of freedom of the data set, and the new vector in the subclass is given by formula (2.72).

PCR determines the smallest variance using the PCA algorithm. However, the thesis proposes to use the hyper-sphere approximation method in CGA space to determine the variance in each class of PCR. Therefore, this proposal can represent moving objects whose data has a complex distribution very accurately.

2.3.2.3 Method of combining PCR with CGA in Human activity recognition

To recognize activities, the PCR method can be used to classify the object and determine the action at the time of observation. Figure 2.4 is the proposed model of the PCR method combined with CGA to recognize human activity. First, the training data is preprocessed by selecting only the important markers. Next, a training model is built by combining PCR and CGA to classify objects.

Figure 2.4: Proposed model of HAR based on PCR combined with CGA

2.3.3 Feature extraction method using CGA in combination with RNN

Figure 2.5 is the proposed model for human activity recognition based on the feature extraction method using CGA in combination with RNN. The model has three steps: the first step is to build a data pre-processing method, the second step is feature extraction using CGA, and finally the RNN model is used for training and activity recognition.

Figure 2.5: Feature extraction method uses CGA in
combination with RNN for HAR

2.3.3.1 Transformation method of coordinates

2.3.3.2 Model combining CGA with RNN

The purpose of feature extraction using CGA is to extract the key components $f(x)$ from formula (2.62) and use the data set $f(x)$ to generate the input data for training the RNN model. Figure 2.18 shows the RNN data representation model, in which each square is a state. The input of each of these states includes the output of the previous state. $x = \{x_1, \dots, x_T\}$ is the sequence of input feature vectors from CGA, $h = \{h_1, \dots, h_T\}$ is the sequence of hidden vectors, and $y$ is the output vector. The output $h_t$ defined in the RNN is

$$h_t = f\left( W_{xh} x_t + W_{hh} h_{t-1} \right) \quad (2.73)$$

where $f$ is the activation function, often sigmoid or tanh, $W_{xh}$ is the matrix of conversion coefficients from $x$ to $h$, and $W_{hh}$ is the matrix of conversion coefficients from $h$ to $h$. Because there is only one output value, $y$ can be specified through an activation function such as softmax:

$$y = \mathrm{softmax}\left( W_{hy} h_T \right) \quad (2.74)$$

where $W_{hy}$ is the matrix of conversion coefficients from $h$ to $y$.

2.5 Conclusion chapter

In this chapter, the thesis proposes three methods of representing moving objects based on CGA and combining CGA with machine learning models for clustering, feature extraction for training, and object recognition. The main contributions of the chapter include:
- A data clustering method using CGA in combination with GMM, and a vector quantization method using CGA for HMM that can train and recognize actions.
- A combination of PCR with CGA to build a data training and action recognition model.
- A moving object recognition model based on the feature extraction method using CGA and combining CGA with RNN for action recognition.

Chapter 3
EXPERIMENT AND RESULTS EVALUATION

The experiments based on the proposed models are as follows:
- Experiment 1: Human activity recognition based on clustering data using CGA in conjunction with HMM. The purpose of this experiment is to compare the correct rate of human activity recognition between the clustering model using CGA in combination with HMM and the clustering model using
k-means in combination with HMM.
- Experiment 2: Human activity recognition based on PCR combined with CGA. The purpose of this experiment is to consider the applicability of CGA in machine learning and to compare the correct rate of human activity recognition between the model using PCR in combination with CGA and the method using PCR in combination with PCA.
- Experiment 3: Human activity recognition based on the feature extraction method using CGA combined with RNN. The purpose of this experiment is to compare the correct rate of human activity recognition between feature extraction using CGA combined with RNN and feature extraction using PCA combined with RNN.

Experiments were conducted on the CMU data set [95] to adjust the parameters of the proposed models and select the models with the best human activity recognition ability.

3.1 Experimental data

3.1.1 Objects moving in space

3.1.2 Motion dataset of CMU

3.1.3 Database experiment

The thesis used the CMU data set to conduct experiments on the proposed models. Table 3.1 describes the specific data set, with 8 actions (dance, jump, kick, placing tee, putt, run, swing, walk) and a total of 19,862 frames; in each frame there are 41 markers, each with coordinates (x, y, z). 60% of the data was used for training and 40% for testing.

Table 3.1: Database experiment of human activities (number of frames)

Action         Training   Testing    Total
Dance             3,305     1,577    4,882
Jump              1,198       846    2,044
Kick              1,605     1,163    2,768
Placing Tee       1,487     1,096    2,583
Putt              1,534       974    2,508
Run                 452       322      774
Swing             1,324       977    2,301
Walk              1,074       928    2,002
Total            11,979     7,883   19,862

3.2 Action recognition based on CGA clustering combined with HMM

3.2.1 Experimental results

3.2.1.1 The parameters of the model

During the quantization process, the vectors produce quantization errors, which decrease the recognition accuracy. Therefore, we need to optimize the parameters of the model so that the quantization error is as small as possible. This problem can be solved by gradually increasing the number of clusters ($k$). In
this experiment, the number of clusters is increased gradually until the best results are obtained, and the number of hidden states is set to 5. Besides, to save memory, the thesis used a scale parameter that determines the number of frames selected for the experiment (for example, when using a camera with a frequency of 120 Hz, the number of frames per second = 120 / scale). In the experiment, this parameter is also increased, from scale = 1 to scale = 24. To ensure accurate results, 200 executions are conducted for each number of clusters (each k value). The recognition rate reported is the average over those 200 executions.

3.2.1.2 Experiment results

The main purpose of this experiment is to evaluate the correct recognition rate of the proposed model in human activity recognition. There are two main experiments:
- Experiment 1: Vector quantization based on k-means data clustering (kmeans_HMM) combined with HMM for human activity recognition.
- Experiment 2: Vector quantization based on data clustering using CGA (CGA_Clustering_HMM) combined with HMM for human activity recognition.

The experiment is conducted with increasing scale; the most positive results were obtained for scale values up to 20 with the number of clusters increasing gradually up to 5. Table 3.2 shows the results after each experiment.

Table 3.2: Comparison of action recognition results using CGA clustering and the k-means algorithm in HMM

Cluster number   Scale value   k-means-HMM (%)   CGA clustering-HMM (%)
                                    59.30                65.75
                                    72.57                77.30
                                    49.25                69.75
      5                             51.75                86.95
                      10            59.85                54.80
                      10            61.80                63.75
                      10            49.75                65.75
                      10            70.35                63.30

Table 3.2 shows that the recognition rate of the k-means clustering method reaches its highest value of 72.57%, while the proposed CGA_Clustering method (CGA clustering-HMM) reaches its highest value of 86.95% when the cluster number is 5. This is 14.38 percentage points higher than k-means clustering.

3.2.2 Result evaluation
The proposed method and the experimental results show the advantage of vector quantization based on the data clustering method, in which each data cluster has a center and clustering proceeds by optimizing the distance from the data in the cluster to that center according to formula (2.45). The k-means algorithm optimizes the squared distance between each vector and the center of the kth cluster, that is, a distance from one point to another point. Meanwhile, the proposed CGA_Clustering method uses the hyper-spherical approximation method to optimize the distance function from each point to the cluster's representative vector, which can be a point, a plane, or a hyper-sphere in CGA space. Therefore, the CGA_Clustering method represents moving objects more accurately, while the complexity of the two algorithms is the same.

3.3 Action recognition based on PCR combined with CGA

3.3.1 Experiment methods

The experimental scenario increases the number of markers (up to 41) to compare the recognition results of the proposed models:
- Experiment 1: Use the non-marker selection method in combination with PCR for training.
- Experiment 2: Use the proposed marker selection method in combination with PCR.
- Experiment 3: Use the non-marker selection method in combination with the proposed model of PCR and CGA.
- Experiment 4: Use the proposed marker selection method in combination with the proposed model of PCR and CGA.

These experiments are performed to compare the results of feature extraction using CGA in PCR with those of feature extraction using PCA in PCR.

3.3.2 Experimental results

The experiment is conducted by gradually increasing the number of markers up to 41 in the four experimental methods. The results of each method are as follows:
- Experiment 1: Use the non-marker selection method in combination with PCR for training. In other words, PCR is trained on data sets that do not use the marker selection pre-processing method. In this experiment, the recognition results are quite
low, only 54.3% - Experiment 2: Use the proposed marker selection method in combination with PCR In this 19 experiment, the recognition results have been improved, but only 63.1% - Experiment 3: Use the non-marker selection method in combination with PCR proposed model and CGA Figure 3.4: Marker model and lhumerus marker data distribution density - Figure 3.4 shows a very complex data distribution of the lhumerus joint in the left arm while performing a dance action If PCA is used for data representation, it would be inaccurate when this joint is doing a lot of activities and making a lot of rotations When CGA is used for data representation, it is more accurate for this data distribution Compared with experiment 1, the use of CGA feature extraction makes the result of experiment increase a lot, from 54.3% to 81.2% - Experiment 4: Use the proposed model of proposed marker selection combined with PCR and CGA This experiment is a combination of experimental and experiment 3, i.e the selection of markers for the PCR and CGA combination model to train and recognize Figure 3.5 shows that the proposed method only needs 34 markers to give a result with the highest rate of 88.9% while using all the markers, the highest result reaches only 81.2% non marker selection marker selection 88,9% Recognition rate 0,8 81,2% 0,6 0,4 0,2 10 11 12 16 20 21 25 26 28 33 34 36 39 41 Number of markers Figure 3.5: Results of the proposed method using PCR combined with CGA for HAR 3.3.3 Result evaluation Fig 3.6 shows that the proposed method of PCR combined with CGA has the best recognition 20 result of 88.9%, higher than other methods The proposed method shows that using only 34 markers Recognition rate (%) still gives better results when using all markers 100 88,9 81,2 80 54,3 60 63,1 40 20 nosl sl nosl_cga_pcr sl_cga_pcr The results of the experimental proposed method Figure 3.6: Comparison of results of experimental proposed methods 3.4 Action recognition based on CGA combined with RNN 
An experiment was carried out to verify and compare two approaches to human activity recognition: feature extraction using PCA combined with RNN, and feature extraction using CGA combined with RNN. The experimental models are shown in Figure 2.17.

3.4.1 Experimental results
3.4.1.1 Predict with RNN

Table 3.3: Comparison of results with and without the preprocessing method on the RNN model (results using RNN)
Data                 Train     Test
Original data        72.57%    70.97%
Preprocessed data    87.11%    84.45%

The purpose of this experiment is to compare training the RNN model with the coordinate displacement preprocessing method against training it without preprocessing. Table 3.3 shows that the coordinate displacement method achieves a recognition rate of 84.45%, much higher than the 70.97% obtained when the coordinates are left unchanged.

3.4.1.2 Predict with PCA_RNN
The experiment was performed in the following stages: preprocessing, PCA feature extraction, and training with RNN. As the number of dimensions increases, so does the recognition rate. When the number of dimensions is 25, the training result (PCA_Train) is 98.0% and the recognition result (PCA_Test) is 90.5%.

Figure 3.1: Results of combining PCA and RNN (accuracy rate of PCA_Train and PCA_Test versus number of dimensions; highest values 0.97998 and 0.905216)

3.4.1.3 Predict with CGA_RNN
The experiment is conducted with the proposed model in Figure 2.17. As the number of dimensions increases, the results in Figure 3.9 also increase. Unlike the PCA feature extraction method, however, the recognition rate already reaches its highest value of 92.52% when the number of dimensions is 5. It can therefore be concluded that feature extraction using CGA converges very quickly once the attributes of the object are fully captured. The highest training rate (CGA_Train) is 98.1% and the highest recognition rate (CGA_Test) is 92.52%.

Figure 3.9: Results of combining CGA and RNN (accuracy rate of CGA_Train and CGA_Test versus number of dimensions; highest values 0.9807 and 0.925296)

3.4.2 Result evaluation
Table 3.4 shows the average results after five experiments. There are also some cases in which the results reach the absolute (maximum) value. The recognition rate of the proposed method, feature extraction using CGA combined with RNN, is about 2 percentage points higher than that of feature extraction using PCA combined with RNN (92.52% versus 90.5%).

Table 3.4: Comparison of the results of the two proposed methods
Proposed method   Train     Test
PCA_RNN           98%       90.5%
CGA_RNN           98.1%     92.52%

3.5 Conclusion chapter

CONCLUSION AND FUTURE WORKS
The results of the thesis
The thesis has met the objectives that were initially set. The thesis "Research on representation and recognition of moving objects based on conformal geometric algebra and machine learning" has achieved the following results:
- The thesis has studied the basic problems of data representation in space and data representation methods in machine learning, leading to the proposal to apply conformal geometric algebra to the representation of moving objects in space.
- The thesis proposes a data clustering method that combines conformal geometric algebra with the Gaussian probability density function to generate the HMM data sequence.
- The thesis proposes a vector quantization method that uses conformal geometric algebra to optimize the distance function, and combines CGA with HMM for human activity recognition.
- The thesis proposes a data training model that combines PCR with CGA for human activity recognition.
- The thesis proposes a method of feature extraction and dimension reduction that uses CGA in combination with RNN for human activity recognition.
Finally, the thesis evaluates the proposed methods on the CMU data set. The experimental results show that the proposed methods achieve significantly high accuracy.
The proposed methods can be applied in systems for human activity recognition.

Result evaluation
The thesis presents research results on using conformal geometric algebra to represent moving objects, combined with machine learning models such as HMM, PCR, and RNN, to recognize human activity. The research is highly applicable because present-day devices readily produce images and video, and it is well suited to monitoring and control applications in which systems are gradually automated to replace humans. The thesis proposes new methods of data clustering and feature extraction for moving objects, opening up a new research direction as an alternative to earlier techniques such as k-means, the Gaussian probability density function, and PCA. Compared with the objectives initially set, the results of the thesis can be assessed as follows:
- The overview study has fully presented the problems related to geometric algebra, conformal geometric algebra, and several machine learning models, as well as their applicability. This shows the reasons for, and the advantages of, applying conformal geometric algebra to the representation of moving objects in space.
- The Gaussian probability density function optimizes a point-to-point distance function and its distribution has a mountain shape, so it can only approximate data clustered around a point. It cannot correctly represent rotation data or data distributed on hyper-spheres. Therefore, the thesis combines the Gaussian density function with CGA to cluster the data of moving objects, using hyperplane and hyper-sphere approximation to determine the variance in clustering.

Table 3.5: Summary of results of proposed methods
No.   Method         Accuracy
1     k-means HMM    72.57%
2     CGA HMM        86.95%
3     PCR            63.1%
4     CGA_PCR        88.9%
5     PCA_RNN        90.5%
6     CGA_RNN        92.52%
Experiment data: experiments were conducted on the CMU data set with human activities (dance, jump, kicking, placingTee, putt, run, swing, walk). Each action has multiple frames, and each frame has 41 markers with 3D coordinates.

- For the vector quantization method, some previous studies used the k-means algorithm to optimize the distance and took the center as the representative of each cluster. Consequently, for moving or rotating objects the recognition rate is only 72.57%. With the proposed CGA_Clustering method, the center of each cluster is a vector that can be a point, a plane, or a hyper-sphere. As a result, objects moving in space are represented very accurately and the recognition rate reaches 86.95%.
- The PCR classification method uses a PCA algorithm to reduce the number of dimensions in each class, so it has limitations when representing rotating objects, and the recognition rate reaches only 63.1%. The proposed model combining PCR with CGA in classification has the best result of 88.9%.
- Human activity recognition is performed on the activities listed above. The thesis proposes using CGA to represent the moving object and PCR to reduce the number of dimensions in the data. The experiment was conducted on 41 markers, that is, 123 dimensions (41 markers with 3 coordinates each). With a dimension of 50, the recognition result reaches the highest rate of 88.6%.
- For data with a complex distribution and a large number of dimensions, previous studies often use the PCA feature extraction method and the multi-class LDA method to reduce the number of data dimensions, reduce storage requirements and computation time, and increase accuracy. The thesis proposes using CGA for feature extraction of the object and then combining it with RNN for training. The highest recognition result of the proposed method is 92.52%, whereas the PCA feature extraction method combined with RNN reaches at most 90.5%.
Each study and published result includes comparisons and evaluations of the results and advantages of the research methods. It can be said that this is highly feasible research with good results. This research has
opened up a new direction in image processing and human activity recognition.

Further research directions
In addition to the results achieved in the thesis, several problems can be raised for further research:
- The thesis studies only some machine learning models in combination with CGA. Other machine learning models should be combined with CGA in future work so that the results can be evaluated more specifically.
- The thesis has conducted verification and experiments on the CMU sample data set. It is necessary to experiment on other data sets and to deploy on real systems in order to evaluate future models.
- The research applies only to low-level actions such as dance, jump, kicking, placingTee, putt, run, swing, and walk, and not to high-level actions of long duration such as house cleaning or sightseeing.
In conclusion, the thesis "Research on representation and recognition of moving objects based on Conformal Geometric Algebra and Machine Learning" has achieved the set objectives, and the proposed models have been evaluated on the CMU data set with very high results. In the future, it is necessary to carry out further research and apply the methods to real data in order to recognize more human behaviors or gestures, such as sleepy, happy, sad, or stealing, fighting, etc.

PUBLICATIONS
[1] Nguyễn Năng Hùng Vân, Phạm Minh Tuấn, Tachibana Kanta, "Nhận dạng chuyển động quay dựa mô hình Markov ẩn Conformal Geometric Algebra", Tạp chí Khoa học Công nghệ Đại học Đà Nẵng, no. 1(72), 2, 2014.
[2] Nguyễn Năng Hùng Vân, Phạm Minh Tuấn, Tachibana Kanta, "CGA clustering based vector quantization approach for Human activity recognition using discrete Hidden Markov Model", Tạp chí Khoa học Công nghệ Đại học Đà Nẵng, no. 12(85), 1, 2014.
[3] Nguyễn Năng Hùng Vân, Phạm Minh Tuấn, Ung Nho Dãi, "Mô hình trọng số kết hợp phương pháp trích chọn đặc tính nhận dạng hành động người", Kỷ yếu Hội thảo quốc gia Điện tử, Viễn thông Công nghệ Thông tin REV2015, 2015.
[4] Nang Hung Van NGUYEN, Minh Tuan PHAM, Phuc Hao DO, "Marker Selection for Human Activity Recognition Using Combination of Conformal Geometric Algebra and Principal Component Regression", in Proceedings of the Seventh International Symposium on Information and Communication Technology (SoICT 2016), pp. 274-379, December 08-09, ISBN 978-1-4503-4815-7, DOI: 10.1145/3011077.3011133 (ACM ICPS, ACM Digital Library and DBLP).
[5] Nguyen Nang Hung Van, Pham Minh Tuan, Do Phuc Hao, Pham Cong Thang, and Kanta Tachibana, "Human action recognition method based on Conformal Geometric Algebra and Recurrent Neural Network", Journal of Information and Control Systems, ISSN 1684-8853 (print); ISSN 2541-8610 (online), no. (108)/2020, DOI: 10.31799/1684-8853-2020-5-2-11 (Scopus).
