NAM CAN THO UNIVERSITY
FACULTY OF ENGINEERING AND TECHNOLOGY

PHẠM HỮU DƯỢC
Student ID: 177088

APPLYING THE DECISION TREE METHOD TO HANDWRITTEN DIGIT RECOGNITION

INTERNSHIP PROJECT
Major: Information Technology
Major code: 7480201

ADVISOR
Dr. NGÔ HỒ ANH KHÔI

05 - 2021

COMMITTEE APPROVAL

The final internship project "Implementing the decision tree algorithm for handwritten digit recognition" was carried out by student Phạm Hữu Dược under the supervision of Dr. Ngô Hồ Anh Khôi.

The internship project was presented to and approved by the project examination committee on day …… month …… year ……

Member: (title, full name)
Secretary: (title, full name)
Reviewer 1: (title, full name)
Reviewer 2: (title, full name)
Advisor: (title, full name)
Committee Chair: (title, full name)

ADVISOR'S COMMENTS

Cần Thơ, day … month … 2021
Advisor (signature)
Dr. Ngô Hồ Anh Khôi

REVIEWER'S COMMENTS

Cần Thơ, day … month … 2021
Reviewer (signature)
MSc. Huỳnh Bá Lộc

ACKNOWLEDGEMENTS

During this final internship (CNTT), I received generous help from my teachers, which allowed me to complete the internship within the required time. I would therefore like to express my deep gratitude to the lecturers of the Faculty of Engineering and Technology at Nam Can Tho University, who taught me and equipped me with valuable knowledge that gave me a solid foundation for completing this internship project.

In particular, I send my best wishes for good health and my most sincere thanks to my lecturer, Dr. Ngô Hồ Anh Khôi, whose help and dedicated guidance allowed me to define my goals and complete this final internship (CNTT) successfully.

Although I have tried hard and put in a great deal of effort, my limited experience was an obstacle, so this internship project inevitably has shortcomings and limitations. I hope for my teachers' understanding, comments, and further guidance so that I can fill the gaps in my knowledge and do better in my future work.

Thank you sincerely!

Cần Thơ, day … month … 2021
Author

DECLARATION

I declare that this thesis was completed on the basis of my own research results, and that these results have not been used for any other thesis at the same level.

Cần Thơ, day … month … 2021
Author

TABLE OF CONTENTS

COMMITTEE APPROVAL
ADVISOR'S COMMENTS
REVIEWER'S COMMENTS
ACKNOWLEDGEMENTS
DECLARATION
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION TO THE INTERNSHIP HOST
  1.1 About the company
  1.2 Company information
  1.3 Head office
  1.4 Legal representative
  1.5 Contact information
CHAPTER 2: INTRODUCTION
  2.1 Research problem statement
  2.2 Research objectives
  2.3 Introduction to the dataset
  2.4 Research methods
    2.4.1 Theoretical research method
    2.4.2 Experimental research method
    2.4.3 Survey method
CHAPTER 3: THEORETICAL BASIS
  3.1 Theoretical foundation
  3.2 Introduction to the Decision Tree algorithm
    3.2.1 General introduction
    3.2.2 The C4.5 decision tree
    3.2.3 The entropy function
    3.2.4 The algorithm in C4.5
    3.2.5 Stopping conditions
    3.2.6 Pruning
    3.2.7 Knowledge format (Tri thức định dạng)
    3.2.8 Python programming for C4.5
  3.3 Introduction to the Python language
CHAPTER 4: THE DECISION TREE ALGORITHM IN HANDWRITTEN DIGIT RECOGNITION
  4.1 The Decision Tree recognition method
  4.2 The handwritten digit recognition process
    4.2.1 Image input
    4.2.2 Preprocessing
    4.2.3 Recognition with Decision Tree
  4.3 Program use case diagram
CHAPTER 5: EXPERIMENTS AND RESULTS
  5.1 Research results
  5.2 User interface
  5.3 User guide
  5.4 Installation guide
CHAPTER 6: CONCLUSION

LIST OF TABLES

Table 5.1: Parameter table
Table 5.2: Comparison of algorithms

LIST OF FIGURES

Figure 2.1: Introduction to the MNIST dataset
Figure 2.2: Pixels in an MNIST image
Figure 3.1: Nodes of a binary tree
Figure 3.2: Estimation on a decision tree
Figure 3.3: Chart
Figure 3.4: Illustration of how information gain is computed
Figure 4.1: Program use case diagram
Figure 5.1: Model test results screen
Figure 5.2: Program interface
Figure 5.3: Use case diagram for the user guide
Figure 5.4: Training screen
Figure 5.5: Screen after training
Figure 5.6: Test screen
Figure 5.7: Screen for testing a model file
Figure 5.8: Accuracy (%) screen for the list of models
Figure 5.9: Report screen for the models
Figure 5.10: Screen for testing a single model
Figure 5.11: Screen after testing a single model
Figure 5.12: Report chart screen for a single-model test
Figure 5.13: Handwritten digit recognition screen
Figure 5.14: Use case diagram for the installation guide

LIST OF ABBREVIATIONS

DT      Decision Tree
CNTT    Công nghệ thông tin (Information Technology)
MNIST   Modified National Institute of Standards and Technology