Research, design and integration of an intelligent robot capable of exploiting multimedia information

MINISTRY OF EDUCATION AND TRAINING
MINISTRY OF SCIENCE AND TECHNOLOGY

INDEPENDENT STATE-LEVEL PROJECT

SUMMARY REPORT ON THE SCIENTIFIC AND TECHNOLOGICAL RESULTS OF THE PROJECT

RESEARCH, DESIGN AND INTEGRATION OF AN INTELLIGENT ROBOT CAPABLE OF EXPLOITING MULTIMEDIA INFORMATION

CODE: ĐTĐL.2009G/42

Project leader: (signature) Dr. Nguyễn Quốc Cường
Host organization: (signature and seal)
Ministry of Science and Technology: (signature and seal upon archiving)

Hà Nội - 2012

HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness
Hà Nội, 2012 (day and month left blank)

STATISTICAL REPORT ON PROJECT IMPLEMENTATION RESULTS

I. GENERAL INFORMATION

Project title: Research, design and integration of an intelligent robot capable of exploiting multimedia information
Project code: ĐTĐL.2009G/42
Category: Independent project

Project leader:
  Full name: Nguyễn Quốc Cường
  Date of birth: 22/11/1974
  Gender: Male
  Academic degree: Doctorate
  Scientific title: (not stated)
  Position: Researcher
  Telephone: office 043 868 3087; home 043 863 7795; mobile 0912265621
  Fax: 38 68 35 51
  E-mail: Quoc-Cuong.Nguyen@mica.edu.vn
  Affiliation: the international research unit for Multimedia Information, Communication and Applications (MICA), Hanoi University of Science and Technology
  Office address: 8th floor, Building B1, Hanoi University of Science and Technology, Đại Cồ Việt, Hà Nội
  Home address: No. 2, Lane 296, Bạch Mai Street, Hà Nội

Host organization:
  Name: Hanoi University of Science and Technology (Trường Đại học Bách Khoa Hà Nội)
  Telephone, fax, e-mail: (not stated)
  Website: http://www.hut.edu.vn
  Address: Đại Cồ Việt, Hai Bà Trưng District, Hà Nội
  Head of organization: Prof. Dr. Nguyễn Trọng Giảng
  Account number: 93101062
  Bank: State Treasury, Hai Bà Trưng District

Governing body of the project: Ministry of Education and Training

II. IMPLEMENTATION STATUS

Project duration:
  - According to the signed contract: July 2009 to June 2011
  - Actual: July 2009 to December 2011
  - Extension (if any): one extension in 2011 (starting month not stated), running to December 2011

Funding and use of funding:

a) Total funding: 2,100 million VND, of which:
  + Support from the state science budget (SNKH): 2,100 million VND
  + Funding from other sources: (not stated)
  + Funding recovery rate for projects (if any): (not stated)

b) Allocation and use of SNKH funding (unit: million VND):

  No.  Planned period     Planned   Actual period      Actual      Note (amount proposed for settlement)
  1    7/2009 - 6/2010    1,000     7/2009 - 6/2010    991.386     991.386
  2    7/2010 - 12/2011   1,100     7/2010 - 12/2011   1,108.614   1,108.614
       Total              2,100                        2,100.000   2,100.000

  Total funding proposed for settlement: 2,100.000
  Remaining funding: (not stated)

c) Use of funding by expenditure item, for research projects (unit: million VND):

  No.  Item                                              Planned            Actual
                                                         Total    SNKH      Total      SNKH
  1    Labor (scientific and general)                    1,310    1,310     1,310.00   1,310.00
  2    Materials and energy                              20       20        15.33      15.33
  3    Equipment and machinery                           426      426       425.55     425.55
  4    Construction and minor repairs                    0                  0
  5    Electricity and water for the host organization                      28.076     28.076
  6    Other expenses                                    344      344       321.04     321.04
       Total                                             2,100    2,100     2,100.00   2,100.00

  No amounts are recorded under "Other sources" in either the planned or the actual columns.

Administrative documents issued during the project:
(Decisions and documents of the managing bodies, from task definition, selection, funding approval and contract through to adjustments of time, content and funding, if any; and documents of the host organization, such as requests and proposals for adjustment, if any.)

  1. No. 426/QĐ-BKHCN, 27/03/2009. Approval of the list of independent state-level science and technology projects assigned directly and starting in 2009. Note: list of science and technology projects attached.
  2. No. 565/QĐ-BKHCN, 08/04/2009. Establishment of the state-level science and technology advisory council for selecting the organizations and individuals to lead directly assigned independent state-level projects under the 2009 plan. Note: list of council members attached.
  3. No. 565/QĐ-BKHCN, 04/05/2009. Approval of funding for state-level science and technology tasks under the 2009 plan. Note: list of projects and approved funding attached.
  4. No. 42/2009G/HĐĐTĐL, 10/07/2009. Contract for Scientific Research and Technology Development. Note: 04 appendices attached.
  5. No. 03/ĐTĐL.2009G/42/CV-MICA, 18/11/2010. Organization of a scientific workshop of independent project ĐTĐL.2009G/42. Note: workshop (ordinal not stated).
  6. No. 04/ĐTĐL.2009G/42/CV-MICA, 09/05/2011. Organization of a scientific workshop of independent project ĐTĐL.2009G/42. Note: workshop (ordinal not stated).
  7. No. 66/CV-ĐHBK-KHCN, 08/04/2010. Implementation of the delegation plan of the independent project.
  8. No. 722/QĐ-BKHCN, 04/05/2010. Organization of a working delegation to the French Republic for the independent state-level project "Research, design and integration of an intelligent robot capable of exploiting multimedia information".
  9. No. 69/CV-ĐHBK-KHCN, 12/04/2010. Request to change equipment of project ĐTĐL.2009G/42. Note: appendix attached.
  10. No. 2087/BGDĐT-KHCNMT, 20/04/2010. Adjustment of the equipment list of independent state-level project ĐTĐL.2009G/42. Note: appendix attached.
  11. No. 1034/BKHCN-CNN, 11/05/2010. Proposal to change the quantity of equipment and adjust the funding of some items of the directly assigned independent state-level project. Note: appendix attached.
  12. No. 174/TTr-ĐHBK-KHCN, 20/07/2010. Approval of the bidding plan for the procurement packages of the independent state-level project "Research, design and integration of an intelligent robot capable of exploiting multimedia information".
  13. No. 3051/QĐ-BGDĐT, 27/07/2010. Approval of the bidding plan for the procurement packages of the independent state-level project "Research, design and integration of an intelligent robot capable of exploiting multimedia information".
  14. No. 260/CV-ĐHBK-KHCN, 18/05/2011. Request to adjust funding and extend the implementation period of the independent state-level project. Note: appendix attached.
  15. No. 3650/BGDĐT-KHCNMT, 03/06/2011. Adjustment of independent project ĐTĐL.2009G/42.
  16. No. 1393/BKHCN-CNN, 20/06/2011. Adjustment of funding and implementation period of the independent state-level project. Note: appendix attached.
  17. 30/03/2010, 29/09/2010, 14/03/2011. Periodic reports on project implementation status (periods 1, 2, ...).

Organizations cooperating in the project:

  1. Vietnam Museum of Ethnology (Bảo tàng Dân tộc học Việt Nam)
     Main contribution:
       - Worked with the project to record audio and video of guided tours for visitors
       - Provided information about the artifacts
     Main products:
       - Audio and video database of conversations and exchanges between guides and visitors
       - 500 artifact data records
     Reason for changes (if any): (not stated)

Individuals participating in the project:
(Participants from the host organization and cooperating bodies, no more than 10 people including the project leader.)

  1. Dr. Nguyễn Quốc Cường
     Main contribution:
       - Overall responsibility for the project
       - Head of the speech recognition branch
     Main products:
       - Speech recognition module
       - Speech database for speech recognition
  2. Dr. Lê Thị Lan
     Main contribution:
       - Head of the branch on emotion recognition from images
     Main products:
       - Module for emotion recognition from images
       - Image and video database for emotions
       - Program for managing the image and video database
  3. Assoc. Prof. Dr. Phạm Thị Ngọc Yến
     Main contribution:
       - Head of the branch developing the kinematic and control model of the robot
     Main products:
       - Kinematic and control model of the robot
       - Options for integrating additional modules into the robot

[...]

Figure 0-3. Some microphone directivity patterns

We can see that an omnidirectional microphone requires the shortest distance between the sound source and the microphone; in return, it accepts sound arriving from any direction.

In this project, two microphone configurations were deployed. In the first, a microphone is attached to the user. In the second, two microphones are mounted on the robot, which lets the speaker give commands from a distance (within about 1 m).

In the first case, an omnidirectional microphone is required so that the microphone is easy to set up. The user can clip the microphone anywhere near the mouth. Because the distance from the microphone to the mouth is short, the SNR is adequate as long as the user speaks louder than a given threshold. This matches natural human conversation: when there is a lot of noise, speakers automatically raise their voice.

In the second case, either a directional or an omnidirectional microphone can be used. For ease of deployment, omnidirectional microphones are used.

Frequency response

The frequency response of a microphone depends on the distance to the source and on the shape of the microphone.

Output sensitivity

Sensitivity is the output voltage measured when a sound pressure of 1 Pa at 1 kHz is applied; the unit is V/Pa at 1 kHz. In practice it is usually quoted in dB, where 0 dB corresponds to 1 V/Pa at 1 kHz. For common microphones, the minimum output sensitivity lies between -40 and -46 dB. Within this project, for
testing purposes, the microphone system is connected to an adjustable preamplifier. This makes the captured sound loud enough for speech recognition when working with common microphones whose sensitivity lies between -40 and -46 dB.

Output impedance

This is an electrical characteristic of the microphone, measured in ohms (Ω). Microphones fall into three classes according to its value. Low-impedance microphones usually have an impedance below 600 Ω. Medium-impedance microphones range from 600 Ω to 10 kΩ. High-impedance microphones are above 10 kΩ. Typical low-impedance microphones have an output impedance between 50 and 200 Ω. For microphones of the same power, a higher impedance gives a higher output voltage.

Figure 0-4. Effect of distance on sound quality

Because the project uses an adjustable preamplifier, the output impedance of the microphone is not a limiting factor. Care is needed to set the gain of the preamplifier so that it matches the input of the ADC that captures the audio. One effect to watch for is the proximity effect: the amplitude of low-frequency sound rises sharply when the sound source is placed close to the microphone.

Appendix - Functional blocks available on the PC-BOT 914 robot

As mentioned in the previous section, we use the PC-BOT 914 robot, a simple wheeled robot without arms that interacts mainly through multimedia channels (sound, images) and through its own movement. The robot is equipped with an industrial computer running Windows, which allows additional programs to be deployed. The robot is divided into the following main blocks:
- Central processing block: the industrial computer (host computer)
- Central control block: M3 (Machine Management Module)
- Actuator block: stepper motors, wheels and loudspeakers
- Data acquisition block: infrared sensors, camera, microphone

Figure 0-5. The blocks of the PC-BOT 914

The data acquisition block consists of:
- Webcam: a Logitech QuickCam Communicate STX, which provides the robot's computer vision functions
- Microphone, used for speech recognition
- Infrared sensors, used to detect and avoid obstacles

The eight infrared sensors are arranged in two groups: (i) sensors on the robot body at a height of about 370 mm, angled downward, which give the 914 PC-BOT an overall view and let it detect steps; (ii) sensors on the base, aimed horizontally (Figure 0-6).

Figure 0-6. Layout of the infrared sensors on the robot: (a) IR sensors on the robot body, (b) IR sensors on the robot base
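The microphone conventions described in the microphone appendix (sensitivity quoted in dB relative to 1 V/Pa at 1 kHz, three output-impedance classes, and preamplifier gain matched to the ADC input) can be sketched in a few lines. This is only an illustration: the function names, the 1 Pa test pressure and the 1 V ADC full-scale value are assumptions made for the example and do not come from the project.

```python
import math


def sensitivity_db_to_v_per_pa(sens_db: float) -> float:
    """Convert a sensitivity in dB (0 dB = 1 V/Pa at 1 kHz) to V/Pa."""
    return 10.0 ** (sens_db / 20.0)


def sensitivity_v_per_pa_to_db(sens_v_per_pa: float) -> float:
    """Convert a sensitivity in V/Pa to dB relative to 1 V/Pa."""
    return 20.0 * math.log10(sens_v_per_pa)


def impedance_class(z_ohm: float) -> str:
    """Classify output impedance with the ranges given in the text:
    below 600 ohm is low, 600 ohm to 10 kohm is medium, above is high."""
    if z_ohm < 600.0:
        return "low"
    if z_ohm <= 10_000.0:
        return "medium"
    return "high"


def required_preamp_gain_db(sens_db: float,
                            pressure_pa: float,
                            adc_full_scale_v: float) -> float:
    """Gain (dB) that brings the microphone output for a given sound
    pressure up to the ADC full-scale voltage.
    Both pressure_pa and adc_full_scale_v are hypothetical inputs."""
    mic_voltage = sensitivity_db_to_v_per_pa(sens_db) * pressure_pa
    return 20.0 * math.log10(adc_full_scale_v / mic_voltage)


if __name__ == "__main__":
    for sens in (-40.0, -46.0):
        mv_per_pa = 1000.0 * sensitivity_db_to_v_per_pa(sens)
        gain = required_preamp_gain_db(sens, pressure_pa=1.0,
                                       adc_full_scale_v=1.0)
        print(f"{sens:.0f} dB -> {mv_per_pa:.2f} mV/Pa, "
              f"gain for 1 Pa into 1 V: {gain:.1f} dB")
    print(impedance_class(150.0))
```

Under these assumptions, the -40 to -46 dB range quoted in the text corresponds to roughly 10 mV/Pa down to 5 mV/Pa, which is why an adjustable preamplifier is useful before the ADC.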
