Machine Learning Final Project Report: Face Recognition


PHENIKAA UNIVERSITY FACULTY OF ELECTRICAL AND ELECTRONICS ENGINEERING

MACHINE LEARNING FINAL PROJECT REPORT:

FACE RECOGNITION

Students:

1 Nguyen Van Khai

2 Tran Quang Thanh

3 Tran Anh Quan

Lecturer: Le Minh Huy


WORKLOAD DISTRIBUTION

Nguyen Van Khai: Write report + code Naive Bayes in midterm

Tran Quang Thanh: Model completion and real-time deployment

Tran Anh Quan: Make PowerPoint + code KNN in midterm


1 INTRODUCTION

2 WORKFLOW

3 DATA PREPARATION

3.1 Data collection + data cleaning

3.2 Data Visualization

4 CHOOSE MODEL

5 EVALUATE MODEL

6 CONCLUSION

Table of figures

Figure 1 Workflow

Figure 2 Data acquisition

Figure 3 Preprocessed data

Figure 4 Data Visualize 3D using PCA Algorithm

Figure 5 Some pictures of datasets

Figure 6 How does SVM work?

Figure 7 Different models' accuracies

Figure 8 Different parameters' accuracies

Figure 9 Result


1 INTRODUCTION

Face recognition using machine learning is a field of study that involves training a computer to recognize and identify human faces from images or videos. With advances in machine learning and computer vision technology, it is now possible to develop systems that recognize faces in real time.

In a face recognition machine learning project, the goal is to develop a system that can accurately identify individuals from their facial features, such as the distance between the eyes, the shape of the nose, and the contours of the face. This technology has various practical applications, including security systems, surveillance, and biometric authentication.

In this project, a large amount of data is required to train the machine learning model to recognize different faces accurately. The process involves selecting an appropriate algorithm, preprocessing the images, and training the model on a dataset of labeled images. The project also requires testing the model to evaluate its accuracy and making the necessary adjustments to improve its performance. Overall, face recognition machine learning projects have tremendous potential to revolutionize various industries by providing an efficient and reliable way of recognizing individuals in real time.

We recommend the following links for a better understanding of our project:

Presentation:

‘Presentation YouTube’

Real-time:

‘Real-time YouTube’

Code:

‘GitHub’

Let’s get started!


2 WORKFLOW

Figure 1 Workflow


3 DATA PREPARATION

3.1 Data collection + data cleaning

We use OpenCV-Python to collect data for 4 faces: Dong, Khai, Quan, and Thanh. To avoid noise as much as possible, we draw a rectangle on the display in which people place their face. We then normalize the data by dividing by 255 and extract the face inside the rectangle.

Figure 2 Data acquisition
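The cropping and normalization step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `crop_and_normalize`, the rectangle coordinates, and the synthetic frame (used here in place of a real camera image captured with OpenCV) are all assumptions.

```python
import numpy as np

def crop_and_normalize(frame, x, y, w, h):
    """Crop the rectangle (x, y, w, h) from an 8-bit frame and scale pixels to [0, 1]."""
    face = frame[y:y + h, x:x + w]           # rows index y, columns index x
    return face.astype(np.float32) / 255.0   # normalize by dividing by 255

# Synthetic 8-bit grayscale frame standing in for a camera image (assumption).
frame = np.full((480, 640), 255, dtype=np.uint8)

# Hypothetical rectangle position and size on the display.
face = crop_and_normalize(frame, x=220, y=140, w=200, h=200)
print(face.shape, face.dtype, face.max())
```

Converting to floating point before dividing avoids the integer truncation that would occur on the original 8-bit array.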

Finally, we obtain the preprocessed data shown below:


3.2 Data Visualization

To make choosing a model more convenient, we use PCA for data visualization. Principal Component Analysis (PCA) is a statistical technique that reduces the dimensionality of high-dimensional data by identifying the components that explain most of the variance in the data. PCA is widely used in data analysis, pattern recognition, and machine learning. The algorithm, combined with Singular Value Decomposition (SVD), consists of the following steps:

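Computing the covariance matrix:

$$\Sigma = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}_{n \times 1} \left( x^{(i)}_{n \times 1} \right)^{T}$$

Finding the eigenvectors from SVD:

$$A_{n \times p} = U_{n \times n} \, S_{n \times p} \, V^{T}_{p \times p}$$

Finding the eigenvalues by solving:

$$\det(\Sigma - \lambda I) = 0$$

Representation in k dimensions:

$$z^{(i)}_{k \times 1} = U^{T}_{k \times n} \, x^{(i)}_{n \times 1}$$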
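The steps above can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the function name `pca_project` and the random data are hypothetical, the data are assumed to be mean-normalized already, and the SVD is applied to the covariance matrix itself (for a symmetric positive semi-definite matrix, the singular vectors and singular values coincide with the eigenvectors and eigenvalues).

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (m samples x n features) onto the top-k principal directions."""
    m = X.shape[0]
    sigma = (X.T @ X) / m              # n x n covariance matrix, (1/m) * sum of x x^T
    U, S, _ = np.linalg.svd(sigma)     # columns of U: eigenvectors, S: eigenvalues (descending)
    U_k = U[:, :k]                     # keep the first k eigenvectors (n x k)
    Z = X @ U_k                        # each row is z = U_k^T x (m x k)
    return Z, U_k, S

# Random data standing in for flattened face images (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)                 # mean normalization

Z, U_k, S = pca_project(X, k=3)
print(Z.shape)
```

With k = 3, each sample becomes a point in three dimensions, which is what a 3D scatter plot of the dataset would display.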

One downside of PCA is that it involves a trade-off: we trade a reduction in the number of dimensions against the ability to retain the properties of the data. Fortunately, we can measure the fraction of variance lost by comparing the data before and after applying PCA, where $\hat{x}^{(i)}$ is the reconstruction of $x^{(i)}$ from its reduced representation:

$$\frac{\sum_{i=1}^{m} \left\| x^{(i)} - \hat{x}^{(i)} \right\|^{2}}{\sum_{i=1}^{m} \left\| x^{(i)} \right\|^{2}}$$
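As a rough illustration of this ratio, the sketch below reconstructs the data from k components and computes the fraction of variance lost. The function name `variance_lost` and the random data are assumptions made for the example, and the data are assumed to be mean-normalized.

```python
import numpy as np

def variance_lost(X, k):
    """Return sum ||x - x_hat||^2 / sum ||x||^2 after keeping k principal components."""
    m = X.shape[0]
    sigma = (X.T @ X) / m                  # covariance matrix
    U, _, _ = np.linalg.svd(sigma)
    U_k = U[:, :k]
    X_hat = (X @ U_k) @ U_k.T              # project to k dimensions, then map back
    num = np.sum(np.linalg.norm(X - X_hat, axis=1) ** 2)
    den = np.sum(np.linalg.norm(X, axis=1) ** 2)
    return num / den

# Random data standing in for flattened face images (assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
X = X - X.mean(axis=0)

for k in (1, 3, 6):
    print(k, variance_lost(X, k))
```

The ratio shrinks as k grows and reaches zero (up to rounding) when all components are kept, so it can guide the choice of k.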


Figure 5 Some pictures of datasets


4 CHOOSE MODEL

Because the data are well distributed, the SVM algorithm is well suited to this problem.

Support Vector Machine (SVM) is one of the most popular supervised learning algorithms and is used for both classification and regression problems, though primarily for classification. The goal of the SVM algorithm is to find the best line or decision boundary that separates the n-dimensional space into classes, so that new data points can easily be placed in the correct category. This best decision boundary is called a hyperplane.

SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, which is why the algorithm is called a Support Vector Machine. Consider the diagram below, in which two different categories are separated by a decision boundary or hyperplane:

Figure 6 How does SVM work? (the diagram marks the margin and the decision function $f(x) = w^{T} x + b$)


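Optimization problem:

For any data pair $(x_n, y_n)$, where $y_n \in \{-1, +1\}$ is the class label, the distance from the point to the separating hyperplane is:

$$\frac{y_n (w^{T} x_n + b)}{\|w\|_2}$$

The margin is the smallest such distance over all points in both classes:

$$\text{margin} = \min_{n} \frac{y_n (w^{T} x_n + b)}{\|w\|_2}$$

The optimization problem in SVM is to find w and b so that this margin reaches its maximum value:

$$(w, b) = \arg\max_{w, b} \left\{ \min_{n} \frac{y_n (w^{T} x_n + b)}{\|w\|_2} \right\} = \arg\max_{w, b} \left\{ \frac{1}{\|w\|_2} \min_{n} y_n (w^{T} x_n + b) \right\}$$

Because rescaling w and b by the same positive constant does not change this distance, we can fix $y_n (w^{T} x_n + b) = 1$ for the points closest to the hyperplane. This leads to the following constrained optimization problem:

$$(w, b) = \arg\max_{w, b} \frac{1}{\|w\|_2} \quad \text{subject to: } y_n (w^{T} x_n + b) \ge 1, \ \forall n = 1, 2, \dots, N$$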
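To make the margin definition concrete, the following sketch evaluates it for a hand-picked hyperplane and four toy points. The function name `margin`, the data, and the values of w and b are all illustrative assumptions, with labels taken to be -1 and +1.

```python
import numpy as np

def margin(w, b, X, y):
    """Smallest value of y_n * (w^T x_n + b) / ||w||_2 over all points."""
    distances = y * (X @ w + b) / np.linalg.norm(w)
    return distances.min()

# Toy data: two points per class, labels in {-1, +1} (assumption).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Hand-picked separating hyperplane x1 + x2 - 2 = 0 (assumption).
w = np.array([1.0, 1.0])
b = -2.0

print(margin(w, b, X, y))            # distance of the closest point
print(margin(2 * w, 2 * b, X, y))    # unchanged when w and b are rescaled together
```

The second call illustrates why the scale of w and b can be fixed freely: multiplying both by the same positive constant leaves the margin unchanged.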

One thing to note: the regularization parameter C is inversely proportional to the strength of the regularization. It also controls the width of the margin: the smaller C is, the larger the margin.

Overall, when choosing a good model, we mainly consider the regularization parameter C and the type of kernel.
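The relationship between C and the margin can be checked with a small experiment. The sketch below uses scikit-learn's `SVC` with a linear kernel on synthetic, overlapping two-class data. The data, the C values, and the helper name `margin_width` are assumptions, and the margin is taken as 1/‖w‖, following the derivation above.

```python
import numpy as np
from sklearn.svm import SVC

# Two overlapping Gaussian blobs standing in for real features (assumption).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-1.0, size=(40, 2)),
    rng.normal(loc=1.0, size=(40, 2)),
])
y = np.array([0] * 40 + [1] * 40)

def margin_width(C):
    """Fit a linear SVM with the given C and return the margin 1 / ||w||_2."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]                 # weight vector of the linear decision function
    return 1.0 / np.linalg.norm(w)

for C in (0.01, 1.0, 100.0):
    print(C, margin_width(C))
```

On data like this, the margin for C = 0.01 should come out wider than for C = 100, in line with the statement that a smaller C gives a larger margin.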


5 EVALUATE MODEL

To confirm the performance of SVM, we compare it with several other models. Here is the result:

Figure 7 Different models' accuracies

Therefore, SVM is a good technique for this problem.
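A comparison of this kind can be sketched with cross-validation in scikit-learn. The choice of Naive Bayes and KNN as baselines (both named in the workload distribution), the synthetic four-class dataset, and all parameter values are assumptions for illustration; the scores it prints say nothing about the results on the real face data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic 4-class dataset standing in for the face data (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=10,
    n_classes=4, random_state=0,
)

# Candidate models; hyperparameters here are illustrative, not the report's.
models = {
    "SVM": SVC(kernel="linear", C=1.0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Using the same folds and the same metric for every model keeps the comparison fair.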

After trying several parameter settings for the SVM model, we obtained the following results:


In general, all settings work well. However, we choose C = 1.0 with the “linear” kernel not only because of its accuracy but also for a second reason. Although our dataset is well distributed, it is not linearly separable, so a small C may not work well in real time: a small value of C can produce a very wide margin, which causes wrong results for some data points seen in real time.

Therefore, by choosing C = 1.0 and kernel = “linear”, we get the final model for face recognition. Here are the results on the test data:


As expected, the system performs well on the test data.
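The parameter search and final evaluation described in this section can be sketched with `GridSearchCV` from scikit-learn. The parameter grid, the train/test split, and the synthetic dataset below are assumptions; the report does not list the exact values that were tried.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic 4-class dataset standing in for the face data (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=10,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y,
)

# Hypothetical grid over the two parameters the report focuses on.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)

# Final model with the setting chosen in the report: C = 1.0, linear kernel.
final_model = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)
print("test accuracy:", test_accuracy)
```

Keeping the test split out of the grid search means the reported accuracy reflects data the model has not seen during parameter selection.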


6 CONCLUSION

SVM shows its advantages in classification: it is fast to compute, accurate, and easy to use and understand, although its flexibility is low. Our project also shows promising results in real-time application. See the links in Section 1 (Introduction) for more information.

Face recognition machine learning projects have tremendous potential to revolutionize various industries by providing an efficient and reliable way of recognizing individuals in real time. This project is just a starting point for solving more real-life problems, and we will continue to improve it.


[1] Main lesson, Minhhuy Le

[2] https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

[3] https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
