On a Model for Recognizing Martial Arts Postures Based on Depth Images

MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

ON A MODEL FOR RECOGNIZING MARTIAL ARTS POSTURES BASED ON DEPTH IMAGES

DOCTORAL THESIS IN ELECTRONIC ENGINEERING

Major: Electronic Engineering
Code: 9520203

SCIENTIFIC SUPERVISORS:
Dr. Lê Dũng
Dr. Phạm Thành Công

Hanoi, 2020

DECLARATION

I hereby declare that the thesis "On a Model for Recognizing Martial Arts Postures Based on Depth Images" is my own research work. Part of the data and results presented in the thesis are truthful and have been published in specialized scientific journals and in the proceedings of national and international scientific conferences. The remainder of the thesis has not been published in any other research work, in Vietnam or abroad.

Hanoi, 18 May 2020
PhD CANDIDATE: Nguyễn Tường Thành
SUPERVISORS: Dr. Lê Dũng, Dr. Phạm Thành Công

ACKNOWLEDGMENTS

This doctoral thesis was carried out at the School of Electronics and Telecommunications (Viện Điện tử Viễn thông), Hanoi University of Science and Technology, under the scientific supervision of Dr. Lê Dũng and Dr. Phạm Thành Công. The candidate expresses deep gratitude to the supervisors for their scientific guidance throughout the research.

The candidate sincerely thanks the scientists and authors of the published works cited here, which provided valuable source material during the completion of the thesis.

The candidate sincerely thanks the School of Electronics and Telecommunications; the Training Office of Hanoi University of Science and Technology; the lecturers of the School of Electronics and Telecommunications; the colleagues in the PhD student group; the martial arts masters Hồ Minh Mộng Hùng, Phạm Đình Khiêm, Phạm Ngọc Dương, Bùi Thị Lành, and Nguyễn Quốc Tiễn; and the Bình Định Traditional Martial Arts Center, Quy Nhơn City, Bình Định Province, for their attention, encouragement, and help, for providing favorable conditions in terms of time, research location, and equipment, and for the personnel support that allowed the candidate to collect data and run the experiments behind the research results.

The candidate thanks Dr. Lê Văn Hùng, researcher at the MICA International Research Institute, Hanoi University of Science and Technology, and at Tân Trào University, for his technical support and for helping, as co-author, to carry out the research in this thesis.

Finally, the candidate is grateful to the Board of Rectors of Quy Nhơn University; the Board of the Faculty of Engineering and Technology; and family, friends, and colleagues for their encouragement and for creating favorable conditions so that the candidate could work and study with peace of mind.

Hanoi, May 2020
PhD CANDIDATE: Nguyễn Tường Thành

CONTENTS

DECLARATION
ACKNOWLEDGMENTS
CONTENTS
SYMBOLS AND ABBREVIATIONS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION

Chapter 1: OVERVIEW
1.1 Machine learning, deep learning, and applications
  1.1.1 Machine learning
  1.1.2 Deep learning
1.2 Systems for reconstructing human activity in 3-D space and for scoring martial arts
  1.2.1 Systems for reconstructing human activity in 3-D space
  1.2.2 Martial arts scoring systems
1.3 Human skeleton estimation in 2-D space
  1.3.1 Skeleton estimation on color images
  1.3.2 Skeleton estimation on depth images
  1.3.3 Pose estimation based on objects and activity context
  1.3.4 Remarks
1.4 Human skeleton and pose estimation in 3-D
  1.4.1 Recovering 3-D human pose from images
  1.4.2 Recovering 3-D human pose
    1.4.2.1 Recovering human skeleton and pose in 3-D space from a single image
    1.4.2.2 Recovering human skeleton and pose in 3-D space from an image sequence
  1.4.3 Remarks
1.5 Datasets for evaluating skeleton estimation in 3-D space
  1.5.1 Introduction to the Kinect and the data capture setup
  1.5.2 Calibrating data captured by the Kinect sensor
1.6 Chapter summary

Chapter 2: HUMAN SKELETON ESTIMATION FROM TRADITIONAL MARTIAL ARTS DATA IN 3-D SPACE
2.1 Skeleton estimation in 2-D space
  2.1.1 Introduction
  2.1.2 Related work
  2.1.3 Using deep learning to estimate traditional martial arts actions in 2-D space
    2.1.3.1 Method
    2.1.3.2 Traditional martial arts dataset
    2.1.3.3 Evaluation method
    2.1.3.4 Rotating and translating data in 3-D space
    2.1.3.5 Estimation results and remarks
  2.1.4 Conclusion
2.2 Recovering occluded human skeleton and pose in 3-D space
  2.2.1 Introduction
  2.2.2 Related work
  2.2.3 Recovering human skeleton and pose in 3-D space
    2.2.3.1 Comparative study of human skeleton recovery in 3-D space
    2.2.3.2 Experiments and results of 3-D skeleton estimation
  2.2.4 Skeleton and pose estimation for
occluded people
2.3 Chapter summary

Chapter 3: RECOGNIZING AND SCORING VIETNAMESE TRADITIONAL MARTIAL ARTS MOVEMENTS
3.1 Introduction
3.2 Related work
3.3 Theoretical basis for recognizing attack movements and scoring martial arts movements
  3.3.1 Recognizing attack movements
    3.3.1.1 Data processing
    3.3.1.2 Extracting human body features with the Kinect camera
  3.3.2 Scoring model for traditional martial arts movements
    3.3.2.1 Describing human movements
    3.3.2.2 Scoring formula
3.4 Experiments
  3.4.1 Recognizing attack movements
    3.4.1.1 Recognizing attack movements with classifiers
    3.4.1.2 Recognizing attack movements with a neural network
  3.4.2 Scoring Vietnamese traditional martial arts movements
3.5 Conclusion
3.6 Chapter summary

CONCLUSION AND FUTURE WORK
LIST OF PUBLICATIONS OF THE THESIS
REFERENCES
APPENDIX

LIST OF SYMBOLS AND ABBREVIATIONS

No. Abbreviation: expansion (meaning, where it adds to the expansion)
1. AD: Average Deviation
2. AP: Average Precision
3. APM: Articulated Part-based Model
4. CPM: Convolutional Pose Machines
5. CPU: Central Processing Unit
6. CNN: Convolutional Neural Network
7. CNNs: Convolutional Neural Networks (multi-layer convolutional neural networks)
8. DPM: Deformable Part Model
9. DTW: Dynamic Time Warping
10. DV: Digital Video
11. fps: frames per second
12. GPU: Graphics Processing Unit
13. HMMs: Hidden Markov Models
14. HOG: Histogram of Oriented Gradients
15. HRNet: High-Resolution Network
16. IR: InfraRed camera
17. JI: Jaccard Index
18. LSTM: Long Short-Term Memory (network)
19. MADS: Martial Arts, Dancing and Sports
20. MOCAP: MOtion CAPture
21. MPJPE: Mean Per Joint Position Error
22. MS: Microsoft
23. MSE: Mean Squared Error
24. OCR: Optical Character Recognition
25. OKS: Object Keypoint Similarity
26. OpenCV: Open Computer Vision (open-source computer vision library)
27. OpenNI: Open Natural Interaction (library with multi-language support)
28. PCA: Principal Component Analysis
29. PCL: Point Cloud Library
30. RAM: Random Access Memory
31. RDF: Random Decision Forests
32. RGB: Red Green Blue
33. SDK: Software Development Kit
34. SVM: Support Vector Machine
35. TOF: Time-Of-Flight sensor
36. V1: Version 1
37. V2: Version 2
38. VE: Vector Estimation (predicted vector)
39. VG: Vector Ground truth (annotated ground-truth vector)
40. VNMA: VietNam Martial Arts (Vietnamese traditional martial arts)

LIST OF TABLES

Table 1.1 Summary of studies on 3-D human skeleton estimation that are evaluated on the Human3.6M dataset [85], with their estimation results
Table 1.2 Survey of 3-D human pose estimation from a single image
Table 1.3 Survey of 3-D human skeleton estimation from image sequences
Table 2.1 Number of frames of martial arts postures in the VNMA dataset
Table 2.2 Number of frames of martial arts postures in the SVNMA dataset
Table 2.3 Average joint estimation results (AP), angular deviation between ground-truth joints and estimated joints (AD), and average distance between estimated keypoints and ground-truth keypoints, corresponding to
Table 2.4 Skeleton estimation results on images projected into 3-D space with 14 skeleton joints on the VNMA data. Results are evaluated with the MPJPE measure in millimeters (mm)
Table 2.5 Number of frames evaluated on the VNMA data
Table 2.6 Skeleton estimation results on images after projection into 3-D space on the MADS dataset with 14 skeleton joints
Table 2.7 Number of frames used to evaluate skeleton estimation on images after projection into 3-D space on the MADS dataset
Table 2.8 Skeleton estimation results on images after projection into
3-D space on the VNMA dataset with 15 skeleton joints
Table 2.9 Skeleton estimation results on images after projection into 3-D space on the MADS dataset with 15 skeleton joints
Table 3.1 Representation of the eight limb vectors

REFERENCES

[1] MJ Rantz, TS Banerjee, E Cattoor, SD Scott, M Skubic, and M Popescu. Automated fall detection with quality improvement "rewind" to reduce falls in hospital rooms. J Gerontol Nurs, 40(1):13–17, 2014.
[2] Mingsong Dou, Sameh Khamis, Yury Degtyarev, Philip L Davidson, Sean Ryan Fanello, Adarsh Kowdle, Sergio Orts, Christoph Rhemann, David Kim, Jonathan Taylor, Pushmeet Kohli, Vladimir Tankovich, and Shahram Izadi. Fusion4D: real-time performance capture of challenging scenes. ACM Transactions on Graphics, 35(4), 2016.
[3] Koldo de Miguel, Alberto Brunete, Miguel Hernando, and Ernesto Gambao. Home Camera-Based Fall Detection System for the Elderly. Journal of Sensors, 17(12), 2017.
[4] Moiz Ahmed, Nadeem Mehmood, Nadeem Adnan, Amir Mehmood, and Kashif Rizwan. Fall Detection System for the Elderly Based on the Classification of Shimmer Sensor Prototype Data. Healthc Inform Res, 23(3):147–158, 2017.
[5] Raul Igual, Carlos Medrano, and Inmaculada Plaza. Challenges, Issues and Trends in Fall Detection Systems. BioMedical Engineering OnLine, 12(1):147–158, 2013.
[6] Tinh Binh Dinh. Bao ton va phat huy vo co truyen Binh dinh: Tiep tuc ho tro cac vo duong tieu bieu. http://www.baobinhdinh.com.vn/viewer.aspx?macm=12&macmp=12&mabb=88043, 2017. [Accessed April 2019].
[7] Meier. Kung Fu Motion. https://www.djaquet.info/blog/2018/5/20/ive-been-to-kung-fu-motion-artlab-epfl-april-august-2018, 2018. [Accessed April 2019].
[8] Tinh Binh Dinh. Ai ve Binh Dinh ma coi, Con gai Binh Dinh bo roi di quyen. http://www.seagullhotel.com.vn/du-lich-binh-dinh/vo-cotruyen-binh-dinh-5, 2019. [Accessed April 2019].
[9] Japanese. Leading Japanese Martial Arts. https://jpninfo.com/10410, 2015. [Accessed 21 May 2019].
[10] China. Tai chi v MMA: The 20-second fight that left China reeling. https://www.bbc.com/news/world-asia-china-39853374, 2017. [Accessed 21 May 2019].
[11] PRESERVING. PRESERVING LIFE THROUGH THE MARTIAL WAY – AT MYOFU AN BUJUTSU DOJO, NEW HAMPSHIRE. https://myo-fu-an.com/tai-chi-chuan/, 2019. [Accessed 21 May 2019].
[12] Chinese. Chinese Kung Fu (Martial Arts). https://www.travelchinaguide.com/intro/martial_arts/, 2019. [Accessed April 2019].
[13] ECCV2018. ECCV 2018 Joint COCO and Mapillary Recognition. http://cocodataset.org/#home, 2018. [Accessed 18 April 2019].
[14] MSCOCO 2017. MSCOCO Keypoints Challenge 2017. https://places-coco2017.github.io/, 2017. [Accessed 18 April 2019].
[15] Tinh Binh Dinh. Preserving traditional martial arts. http://www.baobinhdinh.com.vn/culture-sport/2011/8/114489/, 2011. [Accessed 18 April 2019].
[16] Chinese. Traditional Chinese martial arts and the transmission of intangible cultural heritage. https://www.academia.edu/18641528/Fighting_modernity_traditional_Chinese_martial_arts_and_the_transmission_of_intangible_cultural_heritage., 2012. [Accessed 18 April 2019].
[17] Microsoft. Kinect for Windows SDK v1.8. https://www.microsoft.com/en-us/download/details.aspx?id=40278, 2012. [Accessed 18 April 2019].
[18] Opencv. Opencv library. https://opencv.org/, 2018. [Accessed 19 April 2019].
[19] MICA. International Research Institute MICA. http://mica.edu.vn/, 2019. [Accessed 19 April 2019].
[20] Karate. Karate Rules - Kumite Scoring System. https://www.youtube.com/watch?v=c6r8JwEFowY, 2018. [Accessed 19 April 2019].
[21] openpose. openpose. https://github.com/CMU-Perceptual-Computing-Lab/openpose, 2019. [Accessed 23 April 2019].
[22] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime Multi Person Pose Estimation. https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation. [Accessed 23 April 2019].
[23] COCO. Observations on the calculations of COCO metrics. https://github.com/cocodataset/cocoapi/issues/56, 2019. [Accessed 24 April 2019].
[24] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose
estimation using part affinity field. 2017.
[25] Xu Tao and Zhou Yun. Fall prediction based on biomechanics equilibrium using Kinect. International Journal of Distributed Sensor Networks, 13(4), 2017.
[26] Varun Ganapathi, Christian Plagemann, Daphne Koller, and Sebastian Thrun. Real-time human pose tracking from range data. In ECCV, 2012.
[27] Umer Rafi, Juergen Gall, and Bastian Leibe. A semantic occlusion model for human pose estimation from a single depth image. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2015.
[28] Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Peter Gehler, and Bernt Schiele. Deepcut: Joint subset partition and labeling for multi person pose estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 2016.
[29] Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. Convolutional pose machines. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 2016.
[30] Burrus Nicolas. Calibrating the depth and color camera. http://nicolas.burrus.name/index.php/Research/KinectCalibration, 2018. [Online; accessed 10 January 2018].
[31] Weichen Zhang, Zhiguang Liu, Liuyang Zhou, Howard Leung, and Antoni B Chan. Martial Arts, Dancing and Sports dataset: a Challenging Stereo and Multi-View Dataset for 3D Human Pose Estimation. Image and Vision Computing, Volume 61, 2017.
[32] Leonid Sigal, Alexandru O Balan, and Michael J Black. HUMANEVA: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion. International Journal of Computer Vision, Volume 87(1), 2010.
[33] Istvan Sarandi, Timm Linder, Kai O Arras, and Bastian Leibe. How Robust is 3D Human Pose Estimation to Occlusion. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'18) - Workshop on Robotic Co-workers 4.0: Human Safety and Comfort in Human-Robot Interactive Social Environments, 2018.
[34] PCL. How to use random sample consensus model. http://pointclouds.org/documentation/tutorials/random_sample_consensus.php, 2014.
[35] Jared St Jean. Kinect hacks. Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472, 2013.
[36] Qifei Wang, Gregorij Kurillo, Ferda Ofli, and Ruzena Bajcsy. Evaluation of pose tracking accuracy in the first and second generations of microsoft Kinect. In Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015, pages 380–389, 2015.
[37] Hao-Cheng Mo, Jin-Jang Leou, and Cheng-Shian Lin. Human Behavior Analysis Using Multiple 2D Features and Multicategory Support Vector Machine. In Proceedings Conference on Machine Vision Applications MVA, 2009.
[38] C Nakajima, M Pontil, and T Poggio. People recognition and pose estimation in image sequences. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, number June 2014, pages 189–194 vol.4, 2002.
[39] Cem Keskin, Furkan Kirac, Yunus Emre Kara, and Lale Akarun. 3D hand pose estimation and classification using depth sensors. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pages 1–4, 2012.
[40] Sijin Li, Weichen Zhang, and Antoni B Chan. Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation. International Journal of Computer Vision, 122(1):149–168, 2017.
[41] Haiyong Zhao and Zhijing Liu. Human Action Recognition Based on Non-linear SVM Decision Tree. Journal of Computational Information Systems, 7, 2011.
[42] Le Thi-Lan, Nguyen Minh-Quoc, and Nguyen Thi-Thanh-Mai. Human posture recognition using human skeleton provided by Kinect. In 2013 International Conference on Computing, Management and Telecommunications (ComManTel), 2013.
[43] Patsadu Orasa, Nukoolkit Chakarida, and Watanapa Bunthit. Human gesture recognition using Kinect camera. In 2012 Ninth International Conference on Computer Science and Software Engineering (JCSSE), 2012.
[44] Sriparna Saha, Shreya
Ghosh, Amit Konar, and Atulya K Nagar. Gesture Recognition from Indian Classical Dance Using Kinect Sensor. In 2013 Fifth International Conference on Computational Intelligence, Communication Systems and Networks, 2013.
[45] A Agarwal and B Triggs. 3D human pose from silhouettes by relevance vector regression. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004.
[46] Min Sun, Pushmeet Kohli, and Jamie Shotton. Conditional regression forests for human pose estimation. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
[47] Angel Martinez-Gonzalez, Michael Villamizar, Olivier Canevet, and Jean Marc Odobez. Real-time Convolutional Networks for Depth-based Human Pose Estimation. In IEEE International Conference on Intelligent Robots and Systems, pages 41–47, 2018.
[48] Wenjuan Gong, Xuena Zhang, Jordi Gonzàlez, Andrews Sobral, Thierry Bouwmans, Changhe Tu, and El Hadi Zahzah. Human Pose Estimation from Monocular Images: A Comprehensive Survey. Sensors (Basel, Switzerland), 16(12):1–39, 2016.
[49] Enrique Martinez Berti, Antonio Jose Sánchez Salmerón, and Carlos Ricolfe Viala. 4-Dimensional deformation part model for pose estimation using Kalman filter constraints. International Journal of Advanced Robotic Systems, 14(3):1–13, 2017.
[50] Tao Hu, Xinyan Zhu, Wei Guo, and Kehua Su. Efficient Interactions Recognition through Positive Action based Representation. Mathematical Problems in Engineering, 2013.
[51] Pedro Felzenszwalb, David Mcallester, and Deva Ramanan. A Discriminatively Trained, Multiscale, Deformable Part Model. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[52] Min Sun and Silvio Savarese. Articulated part-based model for joint object detection and pose estimation. In Proceedings of the IEEE International Conference on Computer Vision, pages 723–730, 2011.
[53] Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, and Bernt Schiele. Strong Appearance and Expressive Spatial Models for Human Pose Estimation. In Proceedings of the 2013 IEEE International Conference on Computer Vision, 2013.
[54] Mykhaylo Andriluka, Stefan Roth, and Bernt Schiele. Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 1014–1021, 2009.
[55] Alex Smola and S.V.N Vishwanathan. Introduction to machine learning. Cambridge University Press, 2010.
[56] Ho Sy Hung. Tom luoc lich su phat trien cua nganh machine learning. https://techmaster.vn/posts/33923/lich-su-phat-trien-machinelearning, 2016. [Accessed 21 June 2019].
[57] Maher. Which machine learning model to use. https://towardsdatascience.com/which-machine-learning-model-to-use-db5fdf37f3dd, 2019. [Accessed 21 June 2019].
[58] Nguyen Van Hieu. Tong quan ve machine learning. https://nguyenvanhieu.vn/machine-learning-la-gi/, 2019. [Accessed 21 June 2019].
[59] wikipedia. Hoc sau. https://vi.wikipedia.org/wiki/H%E1%BB%8Dc_s%C3%A2u, 2019. [Accessed 21 June 2019].
[60] Microsoft. Kinect for Windows SDK v1.8. https://www.microsoft.com/en-us/download/details.aspx?id=40278, 2018. [Online; accessed 10 January 2018].
[61] Christian Plagemann. Real Time Motion Capture Using a Single Time-Of-Flight Camera. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
[62] Jamie Shotton and Toby Sharp. Real-Time Human Pose Recognition in Parts from Single Depth Images. In CVPR 2011, 2011.
[63] Christian Plagemann and Daphne Koller. Real-time Identification and Localization of Body Parts from Depth Images. In 2010 IEEE International Conference on Robotics and Automation, 2010.
[64] Himanshu Prakash Jain and Anbumani Subramanian. Real-Time Upper-Body Human Pose Estimation. In Lecture Notes in Computer Science book series (LNCS, volume 6930), pages 227–238, 2011.
[65] Chaitanya Desai and Deva Ramanan. Detecting actions, poses, and objects with relational phraselets. Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7575 LNCS(PART 4):158–172, 2012.
[66] Vivek Kumar Singh and Ram Nevatia. Multiple Pose Context Trees for estimating Human Pose in Object Context. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, pages 17–24, 2010.
[67] Bangpeng Yao and Li Fei-Fei. Recognizing human-object interactions in still images by modeling the mutual context of objects and human poses. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(9):1691–1703, 2012.
[68] JK Aggarwal and MS Ryoo. Human activity analysis: A review (Vision-based). ACM Computing Surveys (CSUR), 43(3):16:1–16:43, 2011.
[69] Mao Ye, Qing Zhang, Liang Wang, Jiejie Zhu, Ruigang Yang, and Juergen Gall. A survey on human motion analysis from depth data. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8200 LNCS:149–187, 2013.
[70] Mohamed E Hussein, Marwan Torki, Mohammad A Gowayyed, and Motaz El-Saban. Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations. IJCAI International Joint Conference on Artificial Intelligence, pages 2466–2472, 2013.
[71] Fengjun Lv and Ramakant Nevatia. Recognition and segmentation of 3-D human action using HMM and multi-class AdaBoost. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3954 LNCS:359–372, 2006.
[72] Yaser Sheikh, Mumtaz Sheikh, and Mubarak Shah. Exploring the space of a human action. In Proceedings of the IEEE International Conference on Computer Vision, volume I, pages 144–149, 2005.
[73] Jiang Wang, Zicheng Liu, Ying Wu, and Junsong Yuan. Mining actionlet ensemble for action recognition with depth cameras. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1290–1297, 2012.
[74] Xiaodong Yang and Yingli Tian. EigenJoints-based Action Recognition Using Naive-Bayes-Nearest-Neighbor. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2012.
[75] Yu Zhu, Wenbin Chen, and Guodong Guo. Fusing spatiotemporal features and joints for 3D action recognition. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pages 486–491, 2013.
[76] Yaser Yacoob and Michael J Black. Parameterized Modeling and Recognition of Activities. Computer Vision and Image Understanding, 73(2):232–247, 1999.
[77] Rizwan Chaudhry, Ferda Ofli, Gregorij Kurillo, and Ruzena Bajcsy. Bio-inspired Dynamic 3D Discriminative Skeletal Features. In 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2013.
[78] Ferda Ofli, Rizwan Chaudhry, Gregorij Kurillo, René Vidal, and Ruzena Bajcsy. Sequence of the most informative joints (SMIJ): A new representation for human skeletal action recognition. Journal of Visual Communication and Image Representation, 25(1):24–38, 2014.
[79] Eshed Ohn-bar and Mohan M Trivedi. Joint Angles Similarities and HOG for Action Recognition. In IEEE Conference on Computer Vision and Pattern Recognition Workshops: Human Activity Understanding from 3D Data, pages 1–6, 2013.
[80] Raviteja Vemulapalli, Felipe Arrate, and Rama Chellappa. Human action recognition by representing 3D skeletons as points in a lie group. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 588–595, 2014.
[81] Ho Viet Ha, Tran Anh Vu, Ngo Van Sy, Huynh Huu Hung, and Dang Van Hung. Phan tich dang di dua tren thong tin sau. In Ky yeu Hoi nghi khoa hoc quoc gia lan thu IX, Nghien cuu co ban va ung dung cong nghe thong tin, pages 553–558, 2017.
[82] Pham Nguyen Khang and Huynh Nhat Minh. Ung dung Camera Kinect de thiet ke bai tap ho tro
phuc hoi chuc nang van dong. Tap chi khoa hoc truong Dai hoc Can Tho, pages 25–31, 2015.
[83] Nikolaos Sarafianos, Bogdan Boteanu, Bogdan Ionescu, and Ioannis A Kakadiaris. 3D Human Pose Estimation: A Review of the Literature and Analysis of Covariates. Computer Vision and Image Understanding, Volume 152(vii), pages 1–20, 2016.
[84] Denis Tome, Chris Russell, and Lourdes Agapito. Lifting from the deep: Convolutional 3D pose estimation from a single image. In Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, volume 2017-Janua, pages 5689–5698, 2017.
[85] Catalin Ionescu, Dragos Papava, Vlad Olaru, and Cristian Sminchisescu. Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(7):1325–1339, July 2014.
[86] Xing Zhou. A Study of Microsoft Kinect Calibration. Technical report, Dept of Computer Science, George Mason University, 2012.
[87] Bouguet Jean-Yves. Camera calibration toolbox for matlab. http://www.vision.caltech.edu/bouguetj/calib_doc/, 2018. [Online; accessed 10 January 2018].
[88] Catalin Ionescu, Fuxin Li, and Cristian Sminchisescu. Latent structured models for human pose estimation. In International Conference on Computer Vision, 2011.
[89] Hao-shu Fang, Yuanlu Xu, Wenguan Wang, Xiaobai Liu, and Song-chun Zhu. Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[90] Istvan Sarandi, Timm Linder, Kai Oliver Arras, and Bastian Leibe. How robust is 3d human pose estimation to occlusion? CoRR, abs/1808.09316, 2018.
[91] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015.
[92] Mir Rayat Imtiaz Hossain and James J Little. Exploiting temporal information for 3D human pose estimation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 11214 LNCS, pages 69–86, 2018.
[93] Linwan Liu, Xiaoyu Wu, Linglin Wu, and Tianchu Guo. Static Human Gesture grading based on Kinect. In 2012 5th International Congress on Image and Signal Processing, CISP 2012, pages 1390–1393, 2012.
[94] Zoe Marquardt, João Beira, Natalia Em, Isabel Paiva, and Sebastian Kox. Super Mirror: A Kinect Interface for Ballet Dancers. CHI '12 Extended Abstracts on Human Factors in Computing Systems (CHI EA '12), pages 1619–1624, 2012.
[95] Michalis Raptis, Darko Kirovski, and Hugues Hoppe. Real-Time Classification of Dance Gestures. In SCA '11 Proceedings of the 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 147–156, 2011.
[96] D Kim, D Kim, and J Paik. Gait recognition using active shape model and motion prediction. IET Computer Vision, 4(1):25, 2009.
[97] Kem-Laurin Kramer. Learn more about Natural User Interface. In Natural User Interface, 2012.
[98] Ashleigh Johnstone and Paloma Marí-Beffa. The effects of martial arts training on attentional networks in typical adults. Frontiers in Psychology, 9(FEB):1–9, 2018.
[99] Mitola J and Maguire G.Q. Cognitive Radio: Making Software Radios More Personal. IEEE Personal Communications, 6(4), 1999.
[100] Tin Kam Ho. Random Decision Forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, volume 47, pages 4–8, 1995.
[101] Ilana Rapp. Motion Capture Actors: Body Movement Tells The Story. https://www.nycastings.com/motion-capture-actors-body-movementtells-the-story/, 2019. [Accessed 21 June 2019].
[102] Leonid Sigal,
Alexandru O Balan, and Michael J Black. HUMANEVA: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion. International Journal of Computer Vision, 87(1):4–27, 2010.
[103] Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele.
[104] Sudharshan Chandra Babu. A 2019 guide to Human Pose Estimation with Deep Learning. https://blog.nanonets.com/human-pose-estimation-2d-guide/, 2019. [Accessed 21 June 2019].
[105] Fabien Baradel, Christian Wolf, and Julien Mille. Pose-conditioned spatiotemporal attention for human action recognition. CoRR, abs/1703.10106, 2017.
[106] Yi Yang and Deva Ramanan. Articulated pose estimation with flexible mixtures-of-parts. In CVPR, pages 1385–1392, 2011.
[107] Matthias Dantone, Juergen Gall, and Christian Leistner. Human Pose Estimation using Body Parts Dependent Joint Regressors. In CVPR, 2013.
[108] Alexander Toshev and Christian Szegedy. DeepPose: Human Pose Estimation via Deep Neural Networks. In CVPR, 2014.
[109] Jonathan Tompson, Ross Goroshin, Arjun Jain, Yann Lecun, and Christoph Bregler. Efficient Object Localization Using Convolutional Networks. In CVPR, 2015.
[110] Shih-en Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. Convolutional Pose Machines. In CVPR, 2016.
[111] Joao Carreira, Pulkit Agrawal, Katerina Fragkiadaki, and Jitendra Malik. Human Pose Estimation with Iterative Error Feedback. In CVPR, 2016.
[112] Alejandro Newell, Kaiyu Yang, and Jia Deng. Stacked Hourglass Networks for Human Pose Estimation. In ECCV, 2016.
[113] Bin Xiao, Haiping Wu, and Yichen Wei. Simple Baselines for Human Pose Estimation and Tracking. In ECCV, pages 1–16, 2018.
[114] Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep High-Resolution Representation Learning for Human Pose Estimation. In CVPR, 2019.
[115] Kyle Brown. Stereo Human Keypoint Estimation. Stanford University, Stanford Intelligent Systems Laboratory, 2017.
[116] Daniil Osokin. Real-time 2d multi-person pose estimation on CPU: lightweight openpose. CoRR, abs/1811.12004, 2018.
[117] Manuel J Mar and Rafael Medina-carnicer. 3D human pose estimation from depth maps using a deep combination of poses. Journal of Visual Communication and Image Representation, Volume 55:627–639, 2018.
[118] Albert Haque, Boya Peng, Zelun Luo, Alexandre Alahi, Serena Yeung, and Fei-Fei Li. Viewpoint invariant 3d human pose estimation with recurrent error feedback. CoRR, abs/1603.07076, 2016.
[119] Alireza Shafaei and James J Little. Real-Time Human Motion Capture with Multiple Depth Cameras. In 13th Conference on Computer and Robot Vision (CRV), 2016.
[120] Fabrice Atrevi, Damien Vivet, Florent Duculty, and Bruno Emile. 3D Human Poses Estimation from a single 2D silhouette. In 11th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, ISBN: 978-989-758-175-5, 2017.
[121] N Dalal and B Triggs. Histograms of oriented gradients for human detection. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), ISBN: 0-7695-2372-2, 2005.
[122] P.-T Yap, R Paramesran, and Seng-Huat Ong. Image analysis by Krawtchouk moments. IEEE Transactions on Image Processing, 12, 2003.
[123] Helge Rhodin, Mathieu Salzmann, and Pascal Fua. Unsupervised geometry-aware representation for 3D human pose estimation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 11214 LNCS, pages 765–782, 2018.
[124] Berthold K P Horn. Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America, 4(4):629–642, 1987.
[125] Jeff Kramer, Nicolas Burrus, Florian Echtler, Herrera C Daniel, and Matt Parker. Hacking the Kinect. Apress, 2012.
[126] MReza Naeemabadi, Birthe Dinesen, Ole Kæseler Andersen, Samira Najafi, and John Hansen. Evaluating accuracy and usability of microsoft kinect sensors and wearable sensor for tele knee rehabilitation after knee operation. In Proceedings of the 11th
International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIODEVICES, pages 128–135. INSTICC, SciTePress, 2018.
[127] Dushyant Mehta, Srinath Sridhar, Oleksandr Sotnychenko, Helge Rhodin, Mohammad Shafiei, Hans-Peter Seidel, Weipeng Xu, Dan Casas, and Christian Theobalt. Vnect: Real-time 3d human pose estimation with a single rgb camera. Volume 36, 2017.
[128] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9908 LNCS, pages 630–645, 2016.
[129] Junho Kim. ResNet-Tensorflow. https://github.com/taki0112/ResNetTensorflow, 2019. [Accessed 18 April 2019].
[130] Sam Johnson and Mark Everingham. Clustered pose and nonlinear appearance models for human pose estimation. In Proceedings of the British Machine Vision Conference, pages 12.1–12.11. BMVA Press, 2010. doi:10.5244/C.24.12.
[131] Sam Johnson and Mark Everingham. Learning effective human pose estimation from inaccurate annotation. In IEEE Proc CVPR, 2011.
[132] Dushyant Mehta, Helge Rhodin, Dan Casas, Oleksandr Sotnychenko, Weipeng Xu, and Christian Theobalt. Monocular 3d human pose estimation using transfer learning and improved CNN supervision. CoRR, abs/1611.09813, 2016.
[133] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation. https://github.com/CMU-Perceptual-Computing-Lab/openpose. [Accessed 23 April 2019].
[134] Dario Pavllo, Christoph Feichtenhofer, David Grangier, and Michael Auli. 3d human pose estimation in video with temporal convolutions and semi-supervised training. In Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[135] Aiden Nibali, Zhen He, Stuart Morgan, and Luke Prendergast. 3D human pose estimation with 2D marginal heatmaps. In Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019, pages 1477–1485, 2019.
[136] Márton Véges, Viktor Varga, and András Lőrincz. 3d human pose estimation with siamese equivariant embedding. arXiv preprint arXiv:1809.07217, 2018.
[137] Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, and Pengxu Wei. 3d human pose machines with self-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[138] Julieta Martinez, Rayat Hossain, Javier Romero, and James J Little. A Simple Yet Effective Baseline for 3d Human Pose Estimation. In Proceedings of the IEEE International Conference on Computer Vision, volume 2017-Octob, pages 2659–2668, 2017.
[139] Márton Véges and András Lőrincz. Absolute human pose estimation with depth prediction network. CoRR, abs/1904.05947, 2019.
[140] Dushyant Mehta, Oleksandr Sotnychenko, Franziska Mueller, Weipeng Xu, Srinath Sridhar, Gerard Pons-Moll, and Christian Theobalt. Single-shot multiperson 3d pose estimation from monocular rgb. In 2018 Sixth International Conference on 3D Vision (3DV). IEEE, September 2018.
[141] Xiao Sun, Chuankang Li, and Stephen Lin. An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge. In ECCV, pages 1–5, 2018.
[142] Georgios Pavlakos, Xiaowei Zhou, Konstantinos G Derpanis, and Kostas Daniilidis. Coarse-to-fine volumetric prediction for single-image 3D human pose. In Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, volume 2017-Janua, pages 1263–1272, 2017.
[143] Luyang Wang, Yan Chen, Zhenhua Guo, Keyuan Qian, Mude Lin, Hongsheng Li, and Jimmy S Ren. Generalizing monocular 3d human pose estimation in the wild. arXiv preprint arXiv:1904.05512, 2019.
[144] Chen Li and Gim Hee Lee. Generating multiple hypotheses for 3d human pose estimation with mixture density network. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[145] Dushyant Mehta, Helge Rhodin, Dan Casas, Pascal Fua, Oleksandr Sotnychenko, Weipeng Xu, and
Christian Theobalt Monocular 3d human pose estimation in the wild using improved cnn supervision In 3D Vision (3DV), 2017 Fifth International Conference on, 2017 [146] Karim Iskakov, Egor Burkov, Victor S Lempitsky, and Yury Malkov Learnable triangulation of human pose CoRR, abs/1905.05754, 2019 [147] Bugra Tekin, Pablo Marquez-Neila, Mathieu Salzmann, and Pascal Fua Learning to Fuse 2D and 3D Image Cues for Monocular Body Pose Estimation In Proceedings of the IEEE International Conference on Computer Vision, volume 2017-Octob, pages 3961–3970, 2017 [148] Magnus Burenius, Josephine Sullivan, and Stefan Carlsson 3D Pictorial Structures for Multiple View Articulated Pose Estimation In 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013 130 [149] Sam Johnson and Mark Everingham Clustered pose and nonlinear appearance models for human pose estimation In Proc BMVC, pages 12.1–11, 2010 doi:10.5244/C.24.12 [150] Geometric Geometric Transformations https://pages.mtu.edu/~shene/ COURSES/cs3621/NOTES/geometry/geo-tran.html, 2019 [Accessed; April, 2019] [151] geeks forgeeks Linear Regression (Python Implementation) https: //www.geeksforgeeks.org/linear-regression-python-implementation/, 2019 [Accessed; April, 2019] [152] Linear Linear Regression https://machinelearningcoban.com/2016/12/28/ linearregression/, 2019 [Accessed; April, 2019] [153] Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele [154] Ching Hang Chen and Deva Ramanan 3D human pose estimation = 2D pose estimation + matching In Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, volume 2017-Janua, pages 5759– 5767, 2017 [155] Mohamed Omran, Christoph Lassner, Gerard Pons-Moll, Peter Gehler, and Bernt Schiele Neural body fitting: Unifying deep learning and model based human pose and shape estimation In Proceedings - 2018 International Conference on 3D Vision, 3DV 2018, pages 484–494, 2018 [156] Dario Pavllo, David Grangier, and Michael Auli 
Quaternet: A quaternionbased recurrent model for human motion In British Machine Vision Conference (BMVC), 2018 [157] Bastian Wandt and Bodo Rosenhahn Repnet: Weakly supervised training of an adversarial reprojection network for 3d human pose estimation CoRR, abs/1902.09868, 2019 [158] Muhammed Kocabas, Salih Karagoz, and Emre Akbas Self-Supervised Learning of 3D Human Pose using Multi-view Geometry In IEEE Computer Vision and Pattern Recognition, 2019 [159] Albert Haque, Boya Peng, Zelun Luo, Alexandre Alahi, Serena Yeung, and Li FeiFei Towards viewpoint invariant 3D human pose estimation In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9905 LNCS, pages 160–177, 2016 131 PHỤ LỤC Toàn liệu phần code up lên trang https://github.com/ PandaThanh/3d_code Ngồi liệu cịn up lên Googledrive: https://drive.google.com/drive/u/3/folders/1GxPHcM_7-6eoloRDUy7vF7IzNbK7o Toàn mã nguồn việc xoay dịch chuyển liệu https://drive.google com/file/d/1Opluv8IRMMFuxKweDgLwrvHR2qhY4BIO/view?usp=sharing Chi tiết các kết ước lượng thể đường dẫn này: https: //drive.google.com/drive/u/3/folders/1RyYjlDkCCWiKkzvXNRZ14YdwoH-rVbn2 Để thực luận án phát triển cơng cụ nhỏ có tên "groundtruth.m" https://drive.google.com/file/d/1Opluv8IRMMFuxKweDgLwrvHR2qhY4BIO/ view?usp=sharing Video thể thời gian thực với hình ảnh đầu vào ảnh màu đầu kết ước lượng đầy đủ khớp xương không gian 3-D, trường hợp nhìn thấy khớp xương số khớp xương bị che khuất https:// drive.google.com/drive/u/3/folders/1Nv10vamq0ENmj1RAJUECbHdqvBKoUjc5 132 ... ước lượng tư người dựa hình ảnh 2-D Để ước lương tư thế, khung xương người dựa độ sâu, báo cung cấp hình ảnh độ sâu dựa liệu hướng nhìn Tất video hiệu chỉnh đồng hóa với liệu gốc tư tương ứng... tư? ??ng cảnh Bài báo đề xuất ngữ cảnh cho việc nối mơ hình tư đối tư? ??ng người tư? 
?ng tác Để ước lượng tư hình ảnh, báo trình bày mơ hình Bayes để tối ưu ghép nối cách tối đa hóa khả nhiều ngữ cảnh...BỘ GIÁO DỤC VÀ ĐÀO TẠO TRƯỜNG ĐẠI HỌC BÁCH KHOA HÀ NỘI VỀ MƠ HÌNH NHẬN DẠNG TƯ THẾ VÕ DỰA TRÊN ẢNH CHIỀU SÂU Ngành: Kỹ thuật điện tử Mã số : 9520203 LUẬN ÁN TIẾN SĨ KỸ THUẬT ĐIỆN
