A multifunctional embedded system based on deep learning for assisting the cognition of visually impaired people

Document information

Pages: 106
File size: 4.14 MB

Contents

Feng Chia University (逢甲大學)
Department of Information Engineering and Computer Science (資訊工程學系)
Doctoral Dissertation (博士論文)

基於深度學習來輔助視覺障礙者認知之多功能嵌入式系統
A Multifunctional Embedded System Based on Deep Learning for Assisting the Cognition of Visually Impaired People

Advisors (指導教授): 竇其仁 (Chyi-Ren Dow), 林峰正 (Feng-Cheng Lin)
Graduate student (研究生): 吳友輝
January 2021 (中華民國一百一十年一月)

Acknowledgement

First and foremost, I would like to express my sincere gratitude to my advisor, Prof. Chyi-Ren Dow, for his motivation, extensive experience, and immense knowledge. I am very grateful for all his ideas, time, and funding contributions, which laid the foundation for my research experience. His passion and enthusiasm for research were inspiring and motivating to me, especially during tough periods in the Ph.D. pursuit. I am also thankful for the excellent advice he has offered as an outstanding professor. This advice holds valuable lessons for me, both throughout my research and in the future. It is an honor for me to be one of his Ph.D. students. Again, I would like to convey my heartfelt gratitude to him.

I would like to express my sincere gratitude to my co-advisor, Prof. Feng-Cheng Lin, for his scientific advice, knowledge, and valuable guidance. I am very grateful for all his ideas, his time, and the devices he provided. He always encouraged and helped me to build on the strengths of my research, and he valued our research results. I am also thankful for his excellent advice. It is an honor for me to be his first Ph.D. student. From the bottom of my heart, I would like to express my sincere gratitude to him again.

I would like to thank my dissertation committee, Prof. Hsiao-Hsi Wang, Prof. Tsung-Chuan Huang, Prof. Lin-Huang Chang, Prof. Cheng-Min Lin, and Prof. Hsi-Min Chen, for their meaningful suggestions, which helped me continue to improve and develop my research.

In addition, I am indebted to Szu-Yi Ho (Toni), who offered me numerous insightful discussions and suggestions. She supported me and helped me resolve the challenges I faced in related research issues, especially in publishing papers.

FCU e-Theses & Dissertations (2021)

I will always remember my fellow labmates at the Mobile Computing lab for the inspiring discussions, unconditional support, friendship, and all the fun we have had in the last four years. In particular, my gratitude goes to Ms. Yu-Yun Chang (Amber) and Mr. Kuan-Chieh Wang (Rich) for providing essential local support during these years.

Last but not least, I am grateful to my family members for all their encouragement and faith in me. They gave me the moral support, encouragement, and motivation to accomplish my personal goals. Most of all, I thank my parents, who raised me with unconditional love and gave me unlimited support in every decision I have made.

摘要 (Chinese Abstract)

People with visual impairment face many difficulties in daily life, such as unassisted navigation, access to information, and context awareness. Although many smart devices can help visually impaired people, most only provide navigation assistance and obstacle avoidance. In this study, we focus on context awareness and recognition of surrounding objects. Unlike most studies, which rely on a client-server architecture or on computation on a single desktop computer, we propose a multifunctional embedded system based on deep learning to assist the cognition of visually impaired people with respect to their surroundings. The proposed system also overcomes the restriction on usage area and enhances the capabilities of navigation tasks. We use an embedded device (NVIDIA Jetson AGX Xavier) as the main processor module, connected to external peripheral devices (such as a webcam, Bluetooth speaker, monitor, mouse, and Bluetooth audio adapter). It performs almost all the system functions a host should provide, including image collection, image processing, and result presentation. First, the system's webcam captures the user's current scene. Then, the image is processed by the function selected through the remote controller. Finally, the system converts the result description of the current scene from text to speech and delivers it to the user through the Bluetooth speaker. The three main functions of the system are face recognition and emotion classification (the first function), age and gender classification (the second function), and object detection (the third function). The system is built on different deep learning models, which may make it challenging for visually impaired people to use. Therefore, we also propose a process for selecting functions efficiently, to reduce the complexity of system control for visually impaired people. Finally, a prototype was designed, fabricated, and tested for experimental validation. Experimental results obtained on the prototype demonstrate the reliability of the proposed system's performance. Results on recognition and classification accuracy, computing time, and practical applicability show that the system is feasible and can be used effectively to help visually impaired people.

Keywords: age classification, emotion classification, face recognition, gender classification, object detection.

Abstract

Individuals with visual impairment confront many difficulties in daily life, for example, unassisted navigation, access to information, and context awareness. Although many smart devices have been designed to assist visually impaired people, most of them aim to provide navigation assistance and obstacle avoidance. In this study, we focus on context awareness and surrounding object recognition. Unlike most studies, which were implemented on servers or laptop computers, we propose a multifunctional embedded system based on deep learning for assisting the cognition of visually impaired people. The proposed system also overcomes the restriction on usage area and enhances the capabilities of navigation tasks.

An embedded device (NVIDIA Jetson AGX Xavier) is employed as the central processor module of the system and is connected to peripheral devices (webcam, speaker, monitor, mouse, and Bluetooth audio transmitter adapter). It performs almost all the system functions, including image collection, image processing, and result description. First, the webcam of the system captures the current scene of the user. Then, this image is processed by the function that the user selects through a remote controller. Lastly, the system converts the result description of the current scene from text to voice and delivers it to the user through the speaker.

The three main functions of this system are face recognition and emotion classification (the first function), age and gender classification (the second function), and object detection (the third function). Because the system is built on several different deep learning models, operating it may be challenging for visually impaired people. Therefore, we also propose a process that selects functions efficiently to ease the complexity of system control for visually impaired people.

Finally, a prototype is designed, fabricated, and tested for experimental validation. The performance of the proposed system is demonstrated using results obtained from the experiments on the prototype. Results based on recognition and classification accuracy, computing time, and practical applicability show that the proposed system is feasible and can be effectively used to assist visually impaired people.

Keywords: Age Classification, Emotion Classification, Face Recognition, Gender Classification, Object Detection

Table of Contents

Acknowledgement
摘要
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Overview of Research
  1.3 Dissertation Organization
Chapter 2 Related Work
  2.1 Face Recognition
  2.2 Gender, Age and Emotion Classification
  2.3 Object Detection
  2.4 Smart Healthcare
Chapter 3 System Overview
  3.1 System Architecture
  3.2 Function Selection
    3.2.1 Remote Controller
    3.2.2 Function Selection Process
  3.3 NVIDIA Jetson AGX Xavier
    3.3.1 NVIDIA Jetson Family Introduction
    3.3.2 Technical Specification of NVIDIA Jetson AGX Xavier
Chapter 4 Face Recognition Function
  4.1 Overview of Face Recognition Function
  4.2 Dataset Collection
  4.3 Model Architectures
  4.4 Enrolling a New Person
Chapter 5 Gender, Age and Emotion Classification Function
  5.1 Overview of Gender, Age and Emotion Classification Function
  5.2 Gender Classification Schemes
  5.3 Age Classification Schemes
  5.4 Emotion Classification Schemes
Chapter 6 Object Detection Function
  6.1 Overview of Object Detection Function
  6.2 Object Detection Schemes
    6.2.1 Two-Stage Detectors
    6.2.2 One-Stage Detectors
  6.3 Arrangement of Result Description
Chapter 7 System Prototype and Implementation
  7.1 Devices in System Implementation
  7.2 Initialization Program in Embedded System
  7.3 Dataset Collection
Chapter 8 Experimental Results
  8.1 Evaluation of Face Recognition Results
    8.1.1 Results Evaluation in Terms of Precision and Recall
    8.1.2 Analysis Results of Face Recognition
    8.1.3 Results Comparison in Multiple Standard Datasets
  8.2 Examination Results of Gender, Age and Emotion Classification
    8.2.1 Evaluation Results of Gender Classification
  ...

1.1 Motivation

Visually impaired people confront numerous visual challenges every day, such as...

(2) Gender, age and emotion classification issues

Gender classification
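
The abstract describes a three-step flow: the webcam captures the scene, the function chosen on the remote controller processes the image, and the resulting text description is spoken to the user. The following is only a minimal Python sketch of that control flow, not the dissertation's implementation. The function names, the key-to-function mapping, and the placeholder strings are all assumptions made for illustration; the deep learning models, camera access, and text-to-speech are replaced by stubs so the example runs with the standard library alone.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the three functions named in the abstract.
# Each takes a captured image (an opaque object here) and returns a text
# description. The returned strings are placeholders, not model output.
def describe_faces_and_emotions(image: object) -> str:
    return "face recognition and emotion classification result"

def describe_age_and_gender(image: object) -> str:
    return "age and gender classification result"

def describe_objects(image: object) -> str:
    return "object detection result"

# Assumed mapping from a remote-controller key to a function; the real
# key layout is not given in the abstract.
FUNCTIONS: Dict[int, Callable[[object], str]] = {
    1: describe_faces_and_emotions,
    2: describe_age_and_gender,
    3: describe_objects,
}

def run_once(
    capture: Callable[[], object],
    selected_key: int,
    speak: Callable[[str], None],
) -> str:
    """Capture one image, run the selected function, and voice the result."""
    handler = FUNCTIONS.get(selected_key)
    if handler is None:
        # Unmapped key: report it instead of raising, so the user hears feedback.
        description = "unknown function"
    else:
        description = handler(capture())
    speak(description)  # stand-in for text-to-speech over the speaker
    return description

if __name__ == "__main__":
    # Fake camera and a print-based "speaker" for demonstration.
    run_once(lambda: b"fake-image", 3, print)
```

Passing the camera and speaker in as callables keeps the selection logic testable without hardware; on a real device they would be replaced by actual capture and speech routines.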

Date posted: 11/11/2021, 11:01
