USTH
VIETNAM FRANCE UNIVERSITY
University of Science and Technology of Hanoi
Machine Learning & Data Mining II Labwork 3 Report
BI12-389 Nguyen Son
BI12-447 An Minh Tri
Academic Year 2 - Data Science February 2023
Contents
1 K-nearest Neighbor Classification
1.1 Iris dataset
1.2 Digits dataset
1.3 Wine dataset
2 Perceptron classifier
2.1 Iris dataset
2.2 Wine dataset
2.3 Digits dataset
3 References
1 K-nearest Neighbor Classification
1.1.1 Apply k-nn on Iris dataset, compute classification error by comparing predicted and original labels of test data
Confusion Matrix and classification report for the Iris dataset (figure)
Here we set k = 3 (3 nearest neighbors) to calculate the confusion matrix, precision, recall, and F1-score for the 3 classes in the Iris dataset.
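A minimal sketch of how these numbers can be reproduced with scikit-learn follows; the 70/30 split and the random_state are assumptions, not the labwork's exact settings.

# Minimal k-nn sketch on Iris (the split ratio and random_state are assumptions)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print("Classification error:", 1 - accuracy_score(y_test, y_pred))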
Confusion Matrices and classification reports for k = 3 and k = 4 (figures)
For example, with k = 4 compared to k = 3, almost every metric changes slightly.
Confusion Matrix, classification report, and classification error for the normalized Iris dataset (figure)
By normalizing the Iris dataset before applying k-nn, the performance improves considerably, and the accuracy can sometimes reach 100% (k = 3).
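A sketch of the normalization step, continuing from the split above. StandardScaler (zero mean, unit variance) is an assumption; the report does not state which normalization was used.

# Normalization sketch (StandardScaler is an assumption; min-max scaling would work similarly).
# Continues from the train/test split above; fit the scaler on the training data only.
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm = scaler.transform(X_test)

knn_norm = KNeighborsClassifier(n_neighbors=3).fit(X_train_norm, y_train)
print("Accuracy (normalized):", knn_norm.score(X_test_norm, y_test))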
We apply PCA to reduce the Iris dataset to 2 components and run k-nn again, with an explanation of the results:
The PCA-transformed data has 150 rows × 2 columns.
Confusion Matrix and classification report after PCA (figure)
In the case of PCA, the accuracy drops to about 90%, along with the other metrics, compared to running k-nn on the original features.
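A sketch of this step, assuming PCA is fit on the full dataset (giving the 150 × 2 matrix above) before splitting; the labwork's exact order of operations may differ.

# PCA sketch: reduce Iris to 2 components, then run k-nn on the reduced data.
# Assumes PCA is fit on the full dataset before splitting (the labwork may differ).
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)            # 150 rows x 2 columns
Xp_train, Xp_test, yp_train, yp_test = train_test_split(X_2d, y, test_size=0.3, random_state=42)

knn_pca = KNeighborsClassifier(n_neighbors=3).fit(Xp_train, yp_train)
print("Accuracy after PCA:", knn_pca.score(Xp_test, yp_test))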
Trang 7e In order to confidently say that we can achieve consistently high accuracy on future unseen data, testing the model on unseen data is essential
• In cross-validation, instead of splitting the data into two parts, we split it into three: training data, cross-validation data, and test data.
• To use cross-validation within the training data, we randomly split the training data into k equal parts; 1/k of the data is used for cross-validation and the rest for training. This step is repeated k times so that each part serves as the validation set once.
• With cross-validation, we can also find the optimal number of neighbors (k) for the best accuracy, as shown in the plot and the sketch below.
Accuracy vs. number of neighbors (plot)
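A sketch of this search over k, continuing from the Iris data loaded above; the fold count (cv=5) and the range of k values are assumptions.

# Cross-validation sketch to find the optimal k (cv=5 and the k range are assumptions).
from sklearn.model_selection import cross_val_score
import numpy as np

k_values = range(1, 21)
cv_scores = [cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
             for k in k_values]
best_k = k_values[int(np.argmax(cv_scores))]
print("Best k:", best_k, "with mean CV accuracy:", max(cv_scores))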
Implementing the leave-one-out method, we get a classification score of 0.9833333333333333, which corresponds to a classification error of about 1.7%.
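A sketch of the leave-one-out evaluation, in which every sample is held out once and predicted by a model trained on all the remaining samples:

# Leave-one-out sketch on the Iris data loaded above.
from sklearn.model_selection import LeaveOneOut, cross_val_score

loo_scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=LeaveOneOut())
print("Leave-one-out classification score:", loo_scores.mean())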
1.2.1 Apply k-nn on Digits dataset, compute classification error by comparing predicted and original labels of test data
Confusion Matrix and classification report for the Digits dataset (figure)
Here we set k = 3 (3 nearest neighbors) to calculate the confusion matrix, precision, recall, and F1-score for the 10 classes in the Digits dataset.
With k = 5:
Confusion Matrix and classification report (figure)
As expected, the results vary as k changes.
Confusion Matrix and classification report for the normalized Digits dataset (figure)
By normalizing the dataset before applying k-nn, we obtain much better results.
We apply PCA for 2 components and run k-nn again, with an explanation of the results:
Confusion Matrix
` i
ho 0¬
8
75) -=
_>v,
và, -88
ay 7]
vụ,
ae
ay | _-=
Sa,
51
51
Br
82
22
77
65
19
30
ở
1
2
3
4
=
6
7
8
9
The results drop drastically, to only about 51% accuracy.
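One way to explain this drop is to look at how little of the variance in the 64 Digits features two principal components retain. The following check is an illustrative addition, not output from the labwork.

# Illustrative check (not labwork output): variance retained by 2 components on Digits.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_dig, y_dig = load_digits(return_X_y=True)
pca2 = PCA(n_components=2).fit(X_dig)
print("Variance explained by 2 components:", pca2.explained_variance_ratio_.sum())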
Accuracy vs. number of neighbors (plot)
The optimal k is equal to 1.
1.3.1 Apply k-nn on Wine dataset, compute classification error by comparing predicted and original labels of test data
Confusion Matrix, classification report, and classification error for the Wine dataset (figure)
Here we set k = 3 (3 nearest neighbors) to calculate the confusion matrix, precision, recall, and F1-score for the 3 classes in the Wine dataset.
Confusion Matrix and classification report for a different value of k (figure)
The results vary as k changes.
Confusion Matrix and classification report for the normalized Wine dataset (figure)
By normalizing the dataset before applying k-nn, the performance yields a small improvement.
We apply PCA for 2 components and run k-nn again, with an explanation of the results:
Confusion Matrix and classification report after PCA (figure)
For the Wine dataset, after applying PCA, the k-nn results improve by a large margin, from about 75% to 94% accuracy.
Accuracy vs. number of neighbors (plot)
The optimal k is equal to 12.
2 Perceptron classifier
Perceptron on Iris: scatter plot of the data in the principal-component plane (PC1 on the horizontal axis) (figure)
The weight vector w is initialized to a NumPy array of zeros in the perceptron function as follows:
n_samples, n_features = X.shape
w = np.zeros(n_features)   # weight vector initialized to zeros
b = 0                      # bias term initialized to zero
The learning rate α is set to 0.01 by default in the function signature:
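Such a function could look like the following sketch, which matches the initialization and defaults described above (zero-initialized weights, α = 0.01 by default, 10 epochs); the actual implementation in the labwork's source code may differ.

import numpy as np

def perceptron(X, y, learning_rate=0.01, max_epochs=10):
    """Sketch of a perceptron for labels y in {-1, +1}; the labwork's code may differ."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)   # weight vector initialized to zeros
    b = 0.0                    # bias term
    for _ in range(max_epochs):
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:    # misclassified (or on the boundary)
                w += learning_rate * yi * xi      # perceptron update rule
                b += learning_rate * yi
    return w, b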
2.1.2 Plot linear classifiers for Iris dataset (using PCA or SVD to reduce the number of dimensions to 2D):
Perceptron Classifier for Iris (figure)
Is the Perceptron converging, and how can we make it converge faster?
The convergence rate of the Perceptron on the Iris dataset depends on the data distribution and the learning rate (α) used. The Perceptron algorithm is guaranteed to converge to a solution if the data is linearly separable, but the number of iterations required to converge may vary depending on the data distribution.
In the case of the Iris dataset, we set the maximum number of epochs to 10, which means that the algorithm iterates over the entire dataset 10 times. If the data is not well separated, the algorithm may not converge within 10 epochs.
To make the algorithm converge faster, we can do the following:
• Scale the input features: rescaling the input features to have zero mean and unit variance can improve the convergence rate of the algorithm.
• Change the learning rate: the learning rate determines the step size of the weight update. A larger learning rate can result in faster convergence, but it may also cause the algorithm to overshoot the optimal weights and diverge. On the other hand, a smaller learning rate may converge more slowly but more stably. Therefore, it is important to choose an appropriate learning rate based on the problem and the data. A sketch combining this and the previous suggestion follows after this list.
• Use a more advanced algorithm: there are many advanced algorithms for linear classification, such as Support Vector Machines (SVMs) and Logistic Regression, which can converge faster and have better performance than the Perceptron.
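A sketch of the first two suggestions (feature scaling and a different learning rate), reusing the perceptron function sketched above; the one-vs-rest labelling and the learning rate of 0.1 are illustrative choices, and the effect on convergence depends on the data.

# Standardize features and try a larger learning rate with the perceptron sketched above.
# The one-vs-rest labels and learning_rate=0.1 are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
import numpy as np

X, y = load_iris(return_X_y=True)
y_bin = np.where(y == 0, 1, -1)                # class 0 vs the rest
X_scaled = StandardScaler().fit_transform(X)   # zero mean, unit variance

w, b = perceptron(X_scaled, y_bin, learning_rate=0.1, max_epochs=10)
print("Training accuracy:", np.mean(np.sign(X_scaled @ w + b) == y_bin))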
Perceptron classifier plots for the Wine dataset (figures)
Perceptron classifier plots for the Digits dataset (figures)
3 References
Scikit-learn (n.d.). datasets.load_iris. Retrieved from https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html
Scikit-learn (n.d.). datasets.load_wine. Retrieved from https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html
Scikit-learn (n.d.). datasets.load_digits. Retrieved from https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html
The link to all the source code for this labwork: https://drive.google.com/drive/folders/1oWdt974E_5SvHOvK69nBgAd18FxMDpG2?usp=sharing