A study on deep learning techniques for human action representation and recognition with skeleton data

MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

PHAM DINH TAN

A STUDY ON DEEP LEARNING TECHNIQUES FOR HUMAN ACTION REPRESENTATION AND RECOGNITION WITH SKELETON DATA

DOCTORAL DISSERTATION IN COMPUTER ENGINEERING

Hanoi - 2022
MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

PHAM DINH TAN

A STUDY ON DEEP LEARNING TECHNIQUES FOR HUMAN ACTION REPRESENTATION AND RECOGNITION WITH SKELETON DATA

Major: Computer Engineering
Code: 9480106

DOCTORAL DISSERTATION IN COMPUTER ENGINEERING

SUPERVISORS:
Assoc. Prof. Vu Hai
Assoc. Prof. Le Thi Lan

Hanoi - 2022
DECLARATION OF AUTHORSHIP

I, Pham Dinh Tan, declare that the dissertation titled "A study on deep learning techniques for human action representation and recognition with skeleton data" has been entirely composed by myself. I confirm the following points:

• This work was done wholly or mainly while in candidature for a Ph.D. research degree at Hanoi University of Science and Technology.
• The work has not been submitted for any other degree or qualification at Hanoi University of Science and Technology or any other institution.
• Appropriate acknowledgment has been given within this dissertation where reference has been made to the published work of others.
• The dissertation submitted is my own, except where work done in collaboration has been included. The collaborative contributions have been indicated.

Hanoi, May 08, 2022
Ph.D. Student
Pham Dinh Tan

SUPERVISORS
Assoc. Prof. Vu Hai
Assoc. Prof. Le Thi Lan

ACKNOWLEDGEMENT

This dissertation was composed during my Ph.D. at the Computer Vision Department, MICA Institute, Hanoi University of Science and Technology. I am grateful to all the people who contributed in different ways to my Ph.D. journey. First, I would like to express sincere thanks to my supervisors, Assoc. Prof. Vu Hai and Assoc. Prof. Le Thi Lan, for their guidance and support. I would like to thank all MICA members for their help during my Ph.D. study. My sincere thanks to Dr. Nguyen Viet Son, Assoc. Prof. Dao Trung Kien, and Assoc. Prof. Tran Thi Thanh Hai for giving me a lot of support and valuable advice. Many thanks to Dr. Nguyen Thuy Binh, Nguyen Hong Quan, Hoang Van Nam, Nguyen Tien Nam, Pham Quang Tien, and Nguyen Tien Thanh for their support. I would like to thank my colleagues at the Hanoi University of Mining and Geology for their support during my Ph.D. study. Special thanks to my family for understanding my hours glued to the computer screen.

Hanoi, May 08, 2022
Ph.D. Student
ABSTRACT

Human action recognition (HAR) from color and depth (RGB-D) sensors, especially from derived information such as skeleton data, receives the research community's attention due to its wide applications. HAR has many practical uses, such as abnormal event detection in camera surveillance, gaming, human-machine interaction, elderly monitoring, and virtual/augmented reality. In addition to the advantages of fast computation, low storage, and invariance to human appearance, skeleton data have shortcomings, including pose estimation errors, skeleton noise in complex actions, and incompleteness due to occlusion. Moreover, action recognition remains challenging due to the diversity of human actions, intra-class variations, and inter-class similarities. The dissertation focuses on improving action recognition performance using skeleton data. The proposed methods are evaluated on public skeleton datasets collected by RGB-D sensors: MSR-Action3D and MICA-Action3D, datasets with high-quality skeleton data; CMDFALL, a challenging dataset with noise in the skeleton data; and NTU RGB+D, a worldwide benchmark among the large-scale datasets. These datasets therefore cover different dataset scales as well as different levels of skeleton data quality.

To overcome the limitations of the skeleton data, the dissertation presents techniques in different approaches. First, as joints have different levels of engagement in each action, techniques for selecting the joints that play an important role in human actions are proposed, including both preset joint subset selection and automatic joint subset selection. Two frameworks are evaluated to show the performance of using a subset of joints for action representation: the first employs Dynamic Time Warping (DTW) and the Fourier Temporal Pyramid (FTP), while the second uses covariance descriptors extracted on joint positions and velocities. Experimental results show that joint subset selection helps improve action recognition performance on datasets with noise in the skeleton data. However, HAR using handcrafted feature extraction cannot exploit the inherent graph structure of the human skeleton. Recent Graph Convolutional Networks (GCNs) are studied to handle these issues. Among GCN models, the Attention-enhanced Adaptive Graph Convolutional Network (AAGCN) is used as the baseline model. AAGCN achieves state-of-the-art performance on large-scale datasets such as NTU RGB+D and Kinetics, but it employs only joint information. Therefore, a Feature Fusion (FF) module is proposed in this dissertation, and the new model is named FF-AAGCN. The performance of FF-AAGCN is evaluated on the large-scale dataset NTU RGB+D and on CMDFALL.
The evaluation results show that the proposed method is robust to noise and invariant to skeleton translation. In particular, FF-AAGCN achieves remarkable results on challenging datasets. Finally, as the computing capacity of edge devices is limited, a lightweight deep learning model is desirable for application deployment. A lightweight GCN architecture is proposed to show that the complexity of the GCN architecture can be further reduced depending on the dataset's characteristics. The proposed lightweight model is suitable for application development on edge devices.

Hanoi, May 08, 2022
Ph.D. Student

CONTENTS

DECLARATION OF AUTHORSHIP
ACKNOWLEDGEMENT
ABSTRACT
CONTENTS
ABBREVIATIONS
SYMBOLS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
CHAPTER 1. LITERATURE REVIEW
1.1 Introduction
1.2 An overview of action recognition
1.3 Data modalities for action recognition
1.3.1 Color data
1.3.2 Depth data
1.3.3 Skeleton data
1.3.4 Other modalities
1.3.5 Multi-modality
1.4 Skeleton data collection
1.4.1 Data collection from motion capture systems
1.4.2 Data collection from RGB+D sensors
1.4.3 Data collection from pose estimation
1.5 Benchmark datasets
1.5.1 MSR-Action3D
1.5.2 MICA-Action3D
1.5.3 CMDFALL
1.5.4 NTU RGB+D
1.6 Skeleton-based action recognition methods
1.6.1 Handcraft-based methods
1.6.1.1 Joint-based action recognition
1.6.1.2 Body part-based action recognition
1.6.2 Deep learning-based methods
1.6.2.1 Convolutional Neural Networks
1.6.2.2 Recurrent Neural Networks
1.7 Research on action recognition in Vietnam
1.8 Conclusion of the chapter
CHAPTER 2. JOINT SUBSET SELECTION FOR SKELETON-BASED HUMAN ACTION RECOGNITION
2.1 Introduction
2.2 Proposed methods
2.2.1 Preset Joint Subset Selection
2.2.1.1 Spatial-Temporal Representation
2.2.1.2 Dynamic Time Warping
2.2.1.3 Fourier Temporal Pyramid
2.2.2 Automatic Joint Subset Selection
2.2.2.1 Joint weight assignment
2.2.2.2 Most informative joint selection
2.2.2.3 Human action recognition based on MIJ joints
2.3 Experimental results
2.3.1 Evaluation metrics
2.3.2 Preset Joint Subset Selection
2.3.3 Automatic Joint Subset Selection
2.4 Conclusion of the chapter
CHAPTER 3. FEATURE FUSION FOR THE GRAPH CONVOLUTIONAL NETWORK
3.1 Introduction
3.2 Related work on Graph Convolutional Networks
3.3 Proposed method
3.4 Experimental results
3.5 Discussion
3.6 Conclusion of the chapter
CHAPTER 4. THE PROPOSED LIGHTWEIGHT GRAPH CONVOLUTIONAL NETWORK
4.1 Introduction
4.2 Related work on Lightweight Graph Convolutional Networks
4.3 Proposed method
4.4 Experimental results
4.5 Application demonstration
4.6 Conclusion of the chapter
CONCLUSION AND FUTURE WORKS
PUBLICATIONS
BIBLIOGRAPHY

ABBREVIATIONS

No  Abbreviation  Meaning
1   2D            Two-Dimensional
2   3D            Three-Dimensional
3   AAGCN         Attention-enhanced Adaptive Graph Convolutional Network
4   AGCN          Adaptive Graph Convolutional Network
5   AMIJ          Adaptive number of Most Informative Joints
6   AS            Action Set
7   AS-GCN        Actional-Structural Graph Convolutional Network
8   BN            Batch Normalization
9   BPL           Body Part Location
10  CAM           Channel Attention Module
11  CCTV          Closed-Circuit Television
12  CNN           Convolutional Neural Network
13  CovMIJ        Covariance Descriptor on Most Informative Joints
14  CPU           Central Processing Unit
15  CS            Cross-Subject
16  CV            Cross-View
17  DFT           Discrete Fourier Transform
18  DTW           Dynamic Time Warping
19  FC            Fully Connected
20  FF            Feature Fusion
21  FLOP          Floating Point OPeration
22  FMIJ          Fixed number of Most Informative Joints
23  fps           frames per second
24  FTP           Fourier Temporal Pyramid
25  GCN           Graph Convolutional Network
26  GCNN          Graph-based Convolutional Neural Network
27  GPU           Graphical Processing Unit
28  GRU           Gated Recurrent Unit
29  HAR           Human Action Recognition
30  HCI           Human-Computer Interaction

Figure 4.8: Distribution of 20 action classes obtained by (a) AAGCN and (b) the proposed LW-FF-AAGCN, visualized using t-SNE.

Table 4.8: Accuracy (%) comparison of LW-FF-AAGCN with other methods on MSR-Action3D. The reference [58] refers to the paper with the correction available at: http://ravitejav.weebly.com.

No  Method                              AS1    AS2    AS3
1   Action Graph, 2010 [17]             72.9   71.9   79.2
2   Histogram, 2012 [32]                87.98  85.48  63.46
3   EigenJoints, 2012 [44]              74.5   76.1   96.4
4   Cov3DJ, 2013 [45]                   88.04  89.29  94.29
5   Joint Position (JP), 2014 [58]      93.36  85.53  99.55
6   Relative JP (RJP), 2014 [58]        95.77  86.90  99.28
7   Joint Angle (JA), 2014 [58]         84.51  68.05  96.17
8   Absolute SE(3), 2014 [58]           90.3   83.91  95.39
9   LARP, 2014 [58]                     94.72  86.83  99.02
10  Spline Curve, 2015 [51]             83.08  79.46  93.69
11  Multi-fused, 2017 [61]              90.8   93.4   95.7
12  CovP3DJ, 2018 [57]                  93.48  84.82  94.29
13  CovMIJ, 2018 [C1]                   93.48  90.18  97.14
14  Lie Algebra with VFDT, 2020 [60]    94.66  85.08  96.76
15  Preset JSS (Chapter 2)              95.86  91.27  99.47
16  Preset JSS w/ Covariance            95.7   91.1   96.2
17  FMIJ (Chapter 2)                    95.7   92.9   98.1
18  AMIJ (Chapter 2)                    96.7   92.9   99.0
19  ST-GCN [62]                         22.86  33.04  36.04
20  AAGCN [104]                         73.33  68.75  90.09
21  FF-AAGCN (Chapter 3)                77.14  77.68  92.79
22  Proposed (LW-FF-AAGCN)              80.95  91.96  93.69
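Figure 4.8 above visualizes the learned class distributions with t-SNE. As a rough illustration of how such a projection can be generated, the sketch below assumes that per-sample embeddings (e.g., the pooled output of the last GCN block) have already been exported; the file names, array shapes, and t-SNE settings are assumptions, not values taken from the dissertation.

```python
# Hedged sketch: project learned action embeddings to 2-D with t-SNE.
# "features.npy" (N, D) and "labels.npy" (N,) are hypothetical export files.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.load("features.npy")   # (N, D) per-sample embeddings
labels = np.load("labels.npy")       # (N,) action class indices

# Perplexity/init are common defaults, not the settings used for Figure 4.8.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedded = tsne.fit_transform(features)   # (N, 2) projection

plt.figure(figsize=(6, 6))
plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, s=5, cmap="tab20")
plt.title("t-SNE of skeleton action embeddings (sketch)")
plt.savefig("tsne_actions.png", dpi=200)
```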
The proposed lightweight model is suitable for HAR implementation on edge devices. Evaluation is performed on the MICA-Action3D dataset, as shown in Table 4.9. When using AAGCN, the high throw action is highly confused with the hammer action. These actions are performed by the right hand with sudden stops when moving, so the poses are similar. However, this confusion is reduced by the proposed method, as shown in Figure 4.9. The proposed method gains an F1-score of up to 99.49% on MICA-Action3D, and recognition results are high across all action classes. The reason is that the subjects stand in a fixed position in MICA-Action3D, so actions in MICA-Action3D are more discriminative than in CMDFALL.

Table 4.9: Performance evaluation on MICA-Action3D. Performance scores are in percent.

No  Method                              Precision  Recall  F1-score
1   AAGCN [104]                         96.82      96.45   96.36
2   LW-AAGCN                            98.71      98.64   98.66
3   FF-AAGCN (Chapter 3)                98.62      98.52   98.54
4   Proposed (LW-FF-AAGCN)              99.52      99.49   99.49
5   Proposed (LW-FF-AAGCN with JSS-A)   99.19      99.16   99.16
6   Proposed (LW-FF-AAGCN with JSS-B)   99.19      99.16   99.16

On the large-scale NTU RGB+D, the baseline AAGCN has 3.76 million parameters, whereas the proposed model has 0.67 million, 5.6 times fewer than the baseline. The proposed method achieves a performance trade-off compared with AAGCN, as shown in Table 4.10. STAR-64 requires only 0.42 million parameters, while the proposed LW-FF-AAGCN requires 0.67 million; the drawback of STAR-64 is that it reaches only 81.9% on the cross-subject benchmark, while LW-FF-AAGCN reaches 86.9%. For the cross-subject benchmark, the accuracy of the proposed method is 86.9%, whereas that of AAGCN is 88.0%. For the cross-view benchmark, the accuracy of the proposed method is 92.7%, and that of AAGCN is 95.1%.

The performance of the baseline AAGCN and the proposed method on different subsets of the NTU RGB+D dataset is shown in Table 4.13. The subsets are randomly formed using the same percentage of samples from each action type; the same percentage is applied to all 60 action classes to avoid class imbalance. When the dataset is large enough, the baseline AAGCN with more parameters outperforms the lightweight model. This can be explained by the fact that when large-scale data is available, a network with a large number of trainable parameters can learn better. The results in Table 4.13 are consistent with [125]: the performance of the graph-based model increases as the dataset size grows. Further study on dataset sizes and model sizes should focus on the double descent phenomenon described in [126].
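The class-balanced subsets described above keep the same fraction of samples from every action class. A minimal sketch of such stratified subsampling is given below; the (sequence_id, class_label) list format is an assumption for illustration and does not reflect the dissertation's actual NTU RGB+D data loader.

```python
# Hedged sketch: draw a class-balanced subset (e.g. 5% of every action class).
# `samples` is assumed to be a list of (sequence_id, class_label) pairs.
import random
from collections import defaultdict

def stratified_subset(samples, fraction, seed=0):
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample_id, label in samples:
        by_class[label].append(sample_id)

    subset = []
    for label, ids in by_class.items():
        k = max(1, round(len(ids) * fraction))   # same fraction kept per class
        subset.extend((i, label) for i in rng.sample(ids, k))
    return subset

# Example: a 5% subset with the per-class ratio preserved, avoiding imbalance.
# subset_5 = stratified_subset(train_samples, fraction=0.05)
```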
4.5 Application demonstration

In this section, a demo is developed to assess a person's performance on the actions in MSR-Action3D. Due to time limitations, only the action recognition module is introduced; the related modules, such as action spotting and human pose evaluation for scoring/assessment, are out of the scope of this study.

Figure 4.9: Confusion matrix on MICA-Action3D using LW-FF-AAGCN.

As discussed in Chapter 1, the limitation of pose estimation using RGB-D sensors is their working range of only a few meters. To overcome this limitation of depth sensors, 3D pose estimation using Google MediaPipe is implemented on data collected by a regular camera. As shown in Figure 4.11, the MediaPipe skeleton model provides 33 landmarks for pose estimation. Pose detection and pose tracking are used in combination: pose detection is triggered for the first frame, and the estimation task is then handed over to pose tracking for faster estimation. Once the estimation quality drops below a certain threshold, the pose detector is triggered again to improve estimation quality. Skeleton data from MediaPipe are fed to the pre-trained LW-FF-AAGCN for action recognition.
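The pipeline described above (camera frames, MediaPipe pose estimation, then the pre-trained classifier) can be sketched as follows. The MediaPipe calls are the library's public Pose API; the classifier side (the window length, the (frames, 33, 3) tensor layout, and the `model` object standing in for the pre-trained LW-FF-AAGCN) is a simplified assumption rather than the dissertation's exact implementation.

```python
# Hedged sketch of the real-time demo loop: webcam -> MediaPipe Pose -> classifier.
# MediaPipe switches between detection and tracking internally via the two
# confidence thresholds; `WINDOW` and `model` are placeholders.
import collections
import cv2
import numpy as np
import mediapipe as mp

WINDOW = 60                                   # assumed clip length in frames
buffer = collections.deque(maxlen=WINDOW)     # rolling window of skeletons

pose = mp.solutions.pose.Pose(
    static_image_mode=False,                  # video mode: detect once, then track
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.pose_landmarks:
        # 33 landmarks, each with normalised x, y and a relative depth z.
        joints = np.array([[lm.x, lm.y, lm.z]
                           for lm in result.pose_landmarks.landmark],
                          dtype=np.float32)   # shape (33, 3)
        buffer.append(joints)

    if len(buffer) == WINDOW:
        clip = np.stack(buffer)               # (WINDOW, 33, 3)
        # label = model.predict(clip)         # pre-trained LW-FF-AAGCN (placeholder)
cap.release()
```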
Table 4.10: Comparison of model parameters, FLOPs, and accuracy (%) on NTU RGB+D.

No  Method                       Year  Param   FLOPs    CS    CV
1   LSTM-CNN [127]               2017  60M     -        82.9  90.1
2   SR-TSL [74]                  2018  19.1M   4.2G     84.8  92.4
3   HCN [68]                     2018  2.64M   -        86.5  91.1
4   ST-GCN [62]                  2018  3.1M    16.32G   81.5  88.3
5   DCM [128]                    2019  10M     -        84.5  91.3
6   AS-GCN [99]                  2019  7.1M    35.92G   86.8  94.2
7   RA-GCNv1 [97]                2019  6.21M   32.8G    85.9  93.5
8   AGCN [101]                   2019  3.47M   18.66G   87.3  93.7
9   RA-GCNv2 [113]               2020  6.21M   32.8G    87.3  93.6
10  AS-RAGCN [111]               2020  4.88M   -        87.7  92.9
11  AAGCN [104]                  2020  3.76M   16.43G   88.0  95.1
12  FF-AAGCN (Chapter 3)         2021  3.76M   16.44G   88.2  94.8
Lightweight models
13  SAR-NAS [117]                2020  1.3M    10.2G    86.4  94.3
14  STAR-64 [118]                2021  0.42M   31.16G   81.9  88.9
15  STAR-128 [118]               2021  1.26M   146.66G  83.4  89.0
16  Tripool [120]                2021  3.6M    23.52G   88.0  95.3
17  Lightweight-KA-AGTN [121]    2022  2.7M    -        88.0  -
Proposed
18  LW-FF-AAGCN                  -     0.67M   14.26G   86.9  92.7
19  LW-FF-AAGCN JSS-A            -     0.66M   7.42G    84.1  90.1
20  LW-FF-AAGCN JSS-B            -     0.66M   7.42G    83.5  90.1

Figure 4.10: System diagram of the demo.

The system diagram of the real-time demo is shown in Figure 4.10. Screen captures of the demo for the actions side kick and golf swing are shown in Figure 4.12. The demo interface is designed using PyQt to configure the application functionalities.
2.77 3.09 Table 4.13: Performance evaluation on subsets of NTU RGB+D with CS benchmark No 4.1 4.2 Subset Training samples Testing samples Total samples Accuracy (%) AAGCN [104] LW-FF-AAGCN 1% 400 164 564 2% 801 329 1,130 5% 2,004 824 2,828 10% 4,009 1,648 5,657 20% 8,018 3,297 11,315 50% 20,045 8,243 28,288 100% 40,091 16,487 56,578 24.4 28.9 44.7 46.5 63.8 64.6 75.1 73.8 81.0 78.6 86.1 84.0 88.0 86.9 100 luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep Figure 4.11: MediaPipe pose estimation with a 33-landmark model [129] 4.6 Conclusion of the chapter In this chapter, a lightweight model LW-FF-AAGCN is proposed Layer pruning for the deep learning network AAGCN is proposed with a Preset JSS module and a Feature Fusion module Once Preset JSS is enabled, two graph topologies (JSS-A and JSS-B) are defined for the selected joints The graph type B (JSS-B) with the edges connecting symmetrical joints achieves excellent performance on CMDFALL with fewer model parameters and FLOPs The number of parameters is reduced using Preset JSS and layer pruning Experimental results show that the lightweight model with graph type B (JSS-B) outperforms the baseline AAGCN on challenging datasets with trainable parameters 5.6 times fewer than the baseline The computation complexity in FLOPs of the proposed model is 3.5 times lower than that of the baseline on CMDFALL A study is conducted to evaluate the performance of LW-FF-AAGCN with different dataset sizes LW-FF-AAGCN is an efficient deep learning model for HAR implementation in real-world applications A demo is presented using the proposed method for human action recognition Results in this chapter have been submitted to the Multimedia Tools and Applications (MTAP) journal in the paper "A Lightweight Graph Convolutional Network for Skeleton-based Action Recognition" 101 luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot 
nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep (a) side kick action (b) golf swing action Figure 4.12: Screen captures from the demo 102 luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep CONCLUSION AND FUTURE WORKS Conclusions Noise in the skeleton data can degrade the performance of action recognition Joint subset selection (JSS), feature combining, and the use of graph-based deep learning networks are proposed to improve representation efficiency and recognition performance It is found in the first contribution of the dissertation is that joint subset selection with both preset and automatic schemes help improve the performance of action recognition In the second contribution, a Feature Fusion module is coupled with AAGCN to form FF-AAGCN The Feature Fusion is a simple and efficient data pre-processing module for graph-based deep learning, especially for noisy skeleton data The proposed method FF-AAGCN outperforms the baseline AAGCN on CMDFALL, a challenging dataset with noise in skeleton data FF-AAGCN obtains competitive results compared to AAGCN on the large-scale dataset NTU RGB+D The third contribution is a lightweight model LW-FF-AAGCN The number of model parameters in LW-FFAAGCN is 5.6 times less than the baseline The proposed lightweight model is suitable for application development for edge devices with limited computation capacity LWFF-AAGCN outperforms both AAGCN and FF-AAGCN on CMDFALL A trade-off in the performance of the lightweight model is observed on the large-scale dataset NTU RGB+D These results suggest directions for future research from both short-term and long-term perspectives Future work Short-Term Perspectives • Study on noise in the skeleton data caused by pose estimation errors A standard high-precision Mocap system is required to collect reference skeleton data Poses estimated from color/depth images are compared with data from the Mocap system The performance of the proposed method at different noise levels needs to be evaluated • Study different statistical metrics for Joint Subset Selection, such as the variance of joint angles Other JSS methods or handcrafted features should be implemented in combination with deep learning networks for action recognition • Develop graph-based lightweight models for application development on edge devices As computation capacity is limited on edge devices, lightweight models 103 luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep 
docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep are required for real-time applications Further research will aim at designing lightweight models from the garden of deep learning models in the literature • Study on the interpretability of action recognition using graph-based deep learning Deep learning approaches, dominant in the present literature, have excellent performance at the expense of the learning process’s understandability Handcrafted learning can be deemed less generalizable and more data-type specific in general They are, however, more intelligible from a human standpoint The optimum trade-off strategy is still an open question [24] • For skeleton-based action recognition, performance is strongly determined by the quality of the skeleton data Improving the quality of pose estimation is important for high-performance action recognition • Evaluate the proposed methods on other datasets such as NTU RGB+D 120 [37], UAV-Human [38] • Study on key frame selection for action recognition The combination of key frame selection and JSS should be considered • Further extend the study of graph theory on action recognition, such as graph node prediction for handling noise and incompleteness in the skeleton data Long-Term Perspectives • Extend the proposed methods to continuous skeleton-based human action recognition The proposed methods are currently evaluated on datasets with segmented skeleton sequences Action segmentation is required for continuous action recognition • Extend the study of Graph Convolutional Networks to Geometric Deep Learning There is a garden of deep learning models in the literature, including CNNs, RNNs, GCNs, and many others This leads to a requirement to construct a general mathematical framework for all these models Geometric Deep Learning is an approach to unifying these deep learning models by exploring the common mathematics in these models • Develop applications using the proposed methods for human action recognition such as elderly remote monitoring in healthcare or camera surveillance for abnormal behavior detection 104 luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 123docz luan van hay luan van tot nghiep an to nghiep docx 
• Develop graph-based lightweight models for application development on edge devices. As computation capacity is limited on edge devices, lightweight models are required for real-time applications. Further research will aim at designing lightweight models from the wide range of deep learning models in the literature.

• Study the interpretability of action recognition using graph-based deep learning. Deep learning approaches, dominant in the present literature, achieve excellent performance at the expense of the understandability of the learning process. Handcrafted learning can be deemed less generalizable and more data-type specific in general; it is, however, more intelligible from a human standpoint. The optimum trade-off strategy is still an open question [24].

• For skeleton-based action recognition, performance is strongly determined by the quality of the skeleton data. Improving the quality of pose estimation is important for high-performance action recognition.

• Evaluate the proposed methods on other datasets such as NTU RGB+D 120 [37] and UAV-Human [38].

• Study key frame selection for action recognition. The combination of key frame selection and JSS should be considered.

• Further extend the study of graph theory for action recognition, such as graph node prediction for handling noise and incompleteness in the skeleton data.

Long-Term Perspectives

• Extend the proposed methods to continuous skeleton-based human action recognition. The proposed methods are currently evaluated on datasets with segmented skeleton sequences; action segmentation is required for continuous action recognition.

• Extend the study of Graph Convolutional Networks to Geometric Deep Learning. There is a wide range of deep learning models in the literature, including CNNs, RNNs, GCNs, and many others, which calls for a general mathematical framework covering all of them. Geometric Deep Learning is an approach to unifying these deep learning models by exploring their common mathematics.

• Develop applications using the proposed methods for human action recognition, such as elderly remote monitoring in healthcare or camera surveillance for abnormal behavior detection.

PUBLICATIONS

Conferences

[C1] Tien-Nam Nguyen, Dinh-Tan Pham, Thi-Lan Le, Hai Vu, and Thanh-Hai Tran (2018). Novel Skeleton-based Action Recognition Using Covariance Descriptors on Most Informative Joints. Proceedings of the International Conference on Knowledge and Systems Engineering (KSE 2018), IEEE, Vietnam, ISBN: 978-1-5386-6113-0, pp. 50-55, 2018.

[C2] Dinh-Tan Pham, Tien-Nam Nguyen, Thi-Lan Le, and Hai Vu (2019). Analyzing Role of Joint Subset Selection in Human Action Recognition. Proceedings of the NAFOSTED Conference on Information and Computer Science (NICS 2019), IEEE, Vietnam, ISBN: 978-1-7281-5163-2, pp. 61-66, 2019.

[C3] Dinh-Tan Pham, Tien-Nam Nguyen, Thi-Lan Le, and Hai Vu (2020). Spatio-Temporal Representation for Skeleton-based Human Action Recognition. Proceedings of the International Conference on Multimedia Analysis and Pattern Recognition (MAPR 2020), IEEE, Vietnam, ISBN: 978-1-7281-6555-4, pp. 1-6, 2020.

Journals

[J1] Dinh-Tan Pham, Quang-Tien Pham, Thi-Lan Le, and Hai Vu (2021). An Efficient Feature Fusion of Graph Convolutional Networks and Its Application for Real-Time Traffic Control Gestures Recognition. IEEE Access, ISSN: 2169-3536, pp. 121930-121943, 2021 (ISI, Q1).

[J2] Van-Toi Nguyen, Tien-Nam Nguyen, Thi-Lan Le, Dinh-Tan Pham, and Hai Vu (2020). Adaptive most joint selection and covariance descriptions for a robust skeleton-based human action recognition. Multimedia Tools and Applications (MTAP), Springer, DOI: 10.1007/s11042-021-10866-4, pp. 1-27, 2021 (ISI, Q1).

[J3] Dinh Tan Pham, Thi Phuong Dang, Duc Quang Nguyen, Thi Lan Le, and Hai Vu (2021). Skeleton-based Action Recognition Using Feature Fusion for Spatial-Temporal Graph Convolutional Networks. Journal of Science and Technique, Le Quy Don Technical University (LQDTU-JST), ISSN: 1859-0209, pp. 7-24, 2021.

Bibliography

[1] Hoang V.N., Le T.L., Tran T.H., Nguyen V.T., et al. (2019). 3D skeleton-based action recognition with convolutional neural networks. In 2019 International Conference on Multimedia Analysis and Pattern Recognition (MAPR), pp. 1–6. IEEE.

[2] Johansson G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics, 14(2):pp. 201–211.
https://github.com/google/mediapipe [Online; accessed 01-October-2021].
[4] Chen X. et al. (2014). Real-time Action Recognition for RGB-D and Motion Capture Data. Ph.D. thesis, Aalto University.
[5] Zhang J., Li W., Ogunbona P.O., Wang P., and Tang C. (2016). RGB-D-based action recognition datasets: A survey. Pattern Recognition, 60: pp. 86–105.
[6] Herath S., Harandi M., and Porikli F. (2017). Going deeper into action recognition: A survey. Image and Vision Computing, 60: pp. 4–21.
[7] Wang Q. (2016). A survey of visual analysis of human motion and its applications. arXiv preprint arXiv:1608.00700, pp. 1–6.
[8] Presti L.L. and La Cascia M. (2016). 3D skeleton-based human action classification: A survey. Pattern Recognition, 53: pp. 130–147.
[9] Kong Y. and Fu Y. (2022). Human action recognition and prediction: A survey. International Journal of Computer Vision, 130(5): pp. 1366–1401.
[10] Biliński P.T. (2014). Human action recognition in videos. Ph.D. thesis, Université Nice Sophia Antipolis.
[11] Bux A. (2017). Vision-based human action recognition using machine learning techniques. Ph.D. thesis, Lancaster University (United Kingdom).
[12] Beddiar D.R., Nini B., Sabokrou M., and Hadid A. (2020). Vision-based human activity recognition: a survey. Multimedia Tools and Applications, 79(41): pp. 30509–30555.
[13] Agarwal P. and Alam M. (2020). A lightweight deep learning model for human activity recognition on edge devices. Procedia Computer Science, 167: pp. 2364–2373.
[14] Das S. (2020). Spatio-Temporal Attention Mechanism for Activity Recognition. Ph.D. thesis, Université Côte d’Azur.
[15] Koperski M. (2017). Human action recognition in videos with local representation. Ph.D. thesis, COMUE Université Côte d’Azur (2015-2019).
[16] Wang P. (2017). Action recognition from RGB-D data. Ph.D. thesis, University of Wollongong.
[17] Li W., Zhang Z., and Liu Z. (2010). Action recognition based on a bag of 3D points. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 9–14. IEEE.
[18] Ren B., Liu M., Ding R., and Liu H. (2020). A survey on 3D skeleton-based action recognition using learning method. arXiv preprint arXiv:2002.05907, pp. 1–8.
[19] Sun Z., Liu J., Ke Q., Rahmani H., Bennamoun M., and Wang G. (2020). Human action recognition from various data modalities: A review. arXiv preprint arXiv:2012.11866, pp. 1–20.
[20] Tran T.H., Le T.L., Pham D.T., Hoang V.N., Khong V.M., Tran Q.T., Nguyen T.S., and Pham C. (2018). A multi-modal multi-view dataset for human fall analysis and preliminary investigation on modality. In 2018 24th International Conference on Pattern Recognition
(ICPR), pp. 1947–1952. IEEE.
[21] Adama D.O.A. (2020). Fuzzy Transfer Learning in Human Activity Recognition. Ph.D. thesis, Nottingham Trent University (United Kingdom).
[22] Shotton J., Fitzgibbon A., Cook M., Sharp T., Finocchio M., Moore R., Kipman A., and Blake A. (2011). Real-time human pose recognition in parts from single depth images. In CVPR 2011, pp. 1297–1304. IEEE.
[23] Al-Akam R. (2021). Human Action Recognition in Video Data using Color and Distance. Ph.D. thesis, University of Koblenz and Landau.
[24] Angelini F. (2020). Novel methods for posture-based human action recognition and activity anomaly detection. Ph.D. thesis, Newcastle University.
[25] Cao Z., Simon T., Wei S.E., and Sheikh Y. (2017). Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299.
[26] Google Mediapipe pose. https://google.github.io/mediapipe/solutions/pose [Online; accessed 15-September-2021].
[27] Fang H.S., Xie S., Tai Y.W., and Lu C. (2017). RMPE: Regional multi-person pose estimation. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2334–2343.
[28] Vonstad E.K., Su X., Vereijken B., Bach K., and Nilsen J.H. (2020). Comparison of a deep learning-based pose estimation system to marker-based and Kinect systems in exergaming for balance training. Sensors, 20(23): pp. 1–16.
[29] Yokoyama N. Pose estimation and person description using convolutional neural networks. http://naoki.io/portfolio/person_descrip.html [Online; accessed 15-September-2021].
[30] Kuehne H., Jhuang H., Garrote E., Poggio T., and Serre T. (2011). HMDB: a large video database for human motion recognition. In 2011 International Conference on Computer Vision, pp. 2556–2563. IEEE.
[31] Soomro K., Zamir A.R., and Shah M. (2012). UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402, pp. 1–6.
[32] Xia L., Chen C.C., and Aggarwal J.K. (2012). View invariant human action recognition using histograms of 3D joints. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 20–27. IEEE.
[33] Ofli F., Chaudhry R., Kurillo G., Vidal R., and Bajcsy R. (2013). Berkeley MHAD: A comprehensive multimodal human action database. In 2013 IEEE Workshop on Applications of Computer Vision (WACV), pp. 53–60. IEEE.
[34] Oreifej O. and Liu Z. (2013). HON4D: Histogram of oriented 4D normals for activity recognition from depth sequences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 716–723.
[35] Shahroudy
A., Liu J., Ng T.T., and Wang G. (2016). NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1010–1019.
[36] Kay W., Carreira J., Simonyan K., Zhang B., Hillier C., Vijayanarasimhan S., Viola F., Green T., Back T., Natsev P., et al. (2017). The Kinetics human action video dataset. arXiv preprint arXiv:1705.06950, pp. 1–22.
[37] Liu J., Shahroudy A., Perez M., Wang G., Duan L.Y., and Kot A.C. (2019). NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): pp. 2684–2701.
[38] Li T., Liu J., Zhang W., Ni Y., Wang W., and Li Z. (2021). UAV-Human: A large benchmark for human behavior understanding with unmanned aerial vehicles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16266–16275.
[39] Bloom V. (2015). Multiple Action Recognition for Video Games (MARViG). Ph.D. thesis, Kingston University.
[40] Mazari A. (2020). Deep Learning for Action Recognition in Videos. Ph.D. thesis, Sorbonne Université.
[41] Wang L. (2021). Analysis and evaluation of Kinect-based action recognition algorithms. arXiv preprint arXiv:2112.08626, pp. 1–22.
[42] Xing Y. and Zhu J. (2021). Deep learning-based action recognition with 3D skeleton: A survey. CAAI Transactions on Intelligence Technology, pp. 80–92.
[43] Wang J., Liu Z., Wu Y., and Yuan J. (2012). Mining actionlet ensemble for action recognition with depth cameras. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1290–1297. IEEE.
[44] Yang X. and Tian Y.L. (2012). Eigenjoints-based action recognition using Naive-Bayes-Nearest-Neighbor. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 14–19. IEEE.
[45] Hussein M.E., Torki M., Gowayyed M.A., and El-Saban M. (2013). Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp. 2466–2472.
[46] Gaglio S., Re G.L., and Morana M. (2014). Human activity recognition process using 3-D posture data. IEEE Transactions on Human-Machine Systems, 45(5): pp. 586–597.
[47] Müller M. and Röder T. (2006). Motion templates for automatic classification and retrieval of motion capture data. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 137–146.
[48] Ofli F., Chaudhry R., Kurillo G., Vidal R., and Bajcsy R. (2014). Sequence of the most informative joints (SMIJ):
A new representation for human skeletal action recognition. Journal of Visual Communication and Image Representation, 25(1): pp. 24–38.
[49] Wang J., Liu Z., Wu Y., and Yuan J. (2013). Learning actionlet ensemble for 3D human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(5): pp. 914–927.
[50] Zanfir M., Leordeanu M., and Sminchisescu C. (2013). The moving pose: An efficient 3D kinematics descriptor for low-latency action recognition and detection. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2752–2759.
[51] Ghorbel E., Boutteau R., Boonaert J., Savatier X., and Lecoeuche S. (2015). 3D real-time human action recognition using a spline interpolation approach. In 2015 International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 61–66. IEEE.
[52] Wang C., Wang Y., and Yuille A.L. (2013). An approach to pose-based action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 915–922.
[53] Wei P., Zheng N., Zhao Y., and Zhu S.C. (2013). Concurrent action detection with structural prediction. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3136–3143.
[54] Zhou L., Li W., Zhang Y., Ogunbona P., Nguyen D.T., and Zhang H. (2014). Discriminative key pose extraction using extended LC-KSVD for action recognition. In 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–8. IEEE.
[55] Eweiwi A., Cheema M.S., Bauckhage C., and Gall J. (2014). Efficient pose-based action recognition. In Asian Conference on Computer Vision, pp. 428–443. Springer.
[56] Cippitelli E., Gasparrini S., Gambi E., and Spinsante S. (2016). A human activity recognition system using skeleton data from RGBD sensors. Computational Intelligence and Neuroscience, 2016: pp. 1–15.
[57] El-Ghaish H.A., Shoukry A.A., and Hussein M.E. (2018). CovP3DJ: Skeleton-parts-based-covariance descriptor for human action recognition. In VISIGRAPP (5: VISAPP), pp. 343–350.
[58] Vemulapalli R., Arrate F., and Chellappa R. (2014). Human action recognition by representing 3D skeletons as points in a Lie group. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 588–595.
[59] Cai X., Zhou W., Wu L., Luo J., and Li H. (2015). Effective active skeleton representation for low latency human action recognition. IEEE Transactions on Multimedia, 18(2): pp. 141–154.
[60] Boujebli M., Drira H., Mestiri M., and Farah I.R. (2020). Rate-invariant modeling in Lie algebra for activity recognition. Electronics, 9(11): pp. 1–16.