Future Work

8 Conclusion

8.2.2 Future Work

Based on the limitations discussed above, I plan to continue developing the model along the following lines in order to complete and improve it:

• Enrich the existing dataset and correct samples that are mislabeled or poorly annotated, then retrain the model and re-evaluate the system.

• Build a complete system and application from the existing modules.

• Research and further develop the OCR and information extraction modules to complete a system for extracting information from handwritten documents.

Appendix A

Thesis Work Plan

From the early days of the proposal stage until now, I have drawn up a specific plan for each stage to ensure that the thesis is completed as well as possible. Although study and work commitments, together with the tense Covid-19 situation, slowed the planned schedule, the plan was still carried through to completion. The thesis work plan is presented in Figure A.1 below.

References

[1] Convolution - Tích chập giải thích bằng code thực tế. https://techmaster.vn/posts/35474/convolution-tich-chap-giai-thich-bang-code-thuc-te. Accessed: 2020-12-10.

[2] Scale Space Technique for Word Segmentation in Handwritten Documents. http://ciir.cs.umass.edu/pubfiles/mm-27.pdf. Accessed: 2021-05-11.

[3] Support vector machines - CS229 lecture notes. http://cs229.stanford.edu/notes2019fall/cs229-notes3.pdf. Accessed: 2021-04-12.

[4] Transposed Convolution - How transposed convolution works. https://towardsdatascience.com/transposed-convolution-demystified-84ca81b4baba. Accessed: 2021-02-17.

[5] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. arXiv preprint arXiv:1409.4842v1, 2014.

[6] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497, 2016.

[7] Evan Shelhamer, Jonathan Long, and Trevor Darrell. Fully convolutional networks for semantic segmentation. arXiv preprint arXiv:1605.06211, 2016.

[8] Ross Girshick. Fast R-CNN. arXiv preprint arXiv:1504.08083, 2015.

[9] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.

[10] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. arXiv preprint arXiv:1703.06870, 2018.

[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.

[12] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. arXiv preprint arXiv:1505.04597v1, 2015.

[13] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.

[14] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv preprint arXiv:1311.2524v5, 2014.

[15] Sofia Ares Oliveira, Benoit Seguin, and Frederic Kaplan. dhSegment: A generic deep-learning approach for document segmentation. arXiv preprint arXiv:1804.10371, 2019.

[16] Yichuan Tang. Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239, 2013.

[17] Vincent Dumoulin and Francesco Visin. A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285, 2018.

[18] Lilian Weng. Object detection for dummies part 3: R-CNN family. lilianweng.github.io/lil-log, 2017.

[19] Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, and Jiajun Liang. EAST: An efficient and accurate scene text detector. arXiv preprint arXiv:1704.03155, 2017.

[20] Zhi Tian, Weilin Huang, Tong He, Pan He, and Yu Qiao. Detecting text in natural image with connectionist text proposal network. arXiv preprint arXiv:1609.03605, 2016.

Faster R-CNN object detection model [6]

Mask R-CNN detection results [10]