3D object detection and recognition to assist visually impaired people in daily activities =

MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

LE VAN HUNG

3D OBJECT DETECTIONS AND RECOGNITIONS: ASSISTING VISUALLY IMPAIRED PEOPLE IN DAILY ACTIVITIES

Major: Computer Science
Code: 9480101

DOCTORAL DISSERTATION OF COMPUTER SCIENCE

SUPERVISORS:
Dr. Vu Hai
Assoc. Prof. Dr. Nguyen Thi Thuy

Hanoi – 2019

DECLARATION OF AUTHORSHIP

I, Le Van Hung, declare that this dissertation titled "3-D Object Detections and Recognitions: Assisting Visually Impaired People in Daily Activities", and the work presented in it, are my own. I confirm that:

This work was done wholly or mainly while in candidature for a Ph.D. research degree at Hanoi University of Science and Technology.
Where any part of this thesis has previously been submitted for a degree or any other qualification at Hanoi University of Science and Technology or any other institution, this has been clearly stated.
Where I have consulted the published work of others, this is always clearly attributed.
Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this dissertation is entirely my own work.
I have acknowledged all main sources of help.
Where the dissertation is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Hanoi, January 2019
Ph.D. Student
Le Van Hung

SUPERVISORS
Dr. Vu Hai
Assoc. Prof. Dr. Nguyen Thi Thuy

ACKNOWLEDGEMENT

This dissertation was written during my doctoral course at the International Research Institute Multimedia, Information, Communication and Applications (MICA), Hanoi University of Science and Technology (HUST). It is my great pleasure to thank all the people who supported me in completing this work. First, I
would like to express my sincere gratitude to my advisors, Dr. Hai Vu and Assoc. Prof. Dr. Thi Thuy Nguyen, for their continuous support, patience, motivation, and immense knowledge. Their guidance helped me throughout the research and the writing of this dissertation. I could not imagine better advisors and mentors for my Ph.D. study.

Besides my advisors, I would like to thank Assoc. Prof. Dr. Thi-Lan Le, Assoc. Prof. Dr. Thanh-Hai Tran, and the members of the Computer Vision Department at MICA Institute. These colleagues assisted me a great deal in my research and co-authored the published papers. Moreover, attending scientific conferences has always been a great experience, at which I received many useful comments.

During my Ph.D. course, I received much support from the Management Board of MICA Institute. My sincere thanks go to Prof. Yen Ngoc Pham, Prof. Eric Castelli, and Dr. Son Viet Nguyen, who gave me the opportunity to join research work and permission to join the laboratory at MICA Institute. Without their precious support, it would have been impossible to conduct this research.

As a Ph.D. student of the 911 program, I would like to thank this program for its financial support. I also gratefully acknowledge the financial support for attending conferences from the Nafosted-FWO project (FWO.102.2013.08) and the VLIR project (ZEIN2012RIP19). I would like to thank the College of Statistics for its support over the years, both in my career and outside of work.

Special thanks to my family, particularly my mother and father, for all the sacrifices they have made on my behalf. I would also like to thank my beloved wife for all her support.

Hanoi, January 2019
Ph.D. Student
Le Van Hung

CONTENTS

DECLARATION OF AUTHORSHIP
ACKNOWLEDGEMENT
CONTENTS
SYMBOLS
LIST OF TABLES
LIST OF FIGURES

LITERATURE REVIEW
  1.1 Aided-systems for supporting visually impaired people
    1.1.1 Aided-systems for navigation services
    1.1.2 Aided-systems for obstacle detection
    1.1.3 Aided-systems for locating the interested objects in scenes
    1.1.4 Discussions
  1.2 3-D object detection and recognition from point cloud data
    1.2.1 Appearance-based methods
      1.2.1.1 Discussion
    1.2.2 Geometry-based methods
    1.2.3 Intelligent Robotics System for grasping 3-D objects
    1.2.4 Datasets for 3-D object recognition
    1.2.5 Discussions
  1.3 Fitting primitive shapes
    1.3.1 Linear fitting algorithms
    1.3.2 Robust estimation algorithms
    1.3.3 RANdom SAmple Consensus (RANSAC) and its variations
    1.3.4 Discussions

POINT CLOUD REPRESENTATION AND THE PROPOSED METHOD FOR TABLE PLANE DETECTION
  2.1 Point cloud representations
    2.1.1 Capturing data by an MS Kinect sensor
    2.1.2 Point cloud representation
  2.2 The proposed method for table plane detection
    2.2.1 Introduction
    2.2.2 Related Work
    2.2.3 The proposed method
      2.2.3.1 The proposed framework
      2.2.3.2 Plane segmentation
      2.2.3.3 Table plane detection and extraction
    2.2.4 Experimental results
      2.2.4.1 Experimental setup and dataset collection
      2.2.4.2 Table plane detection evaluation method
      2.2.4.3 Results
  2.3 Separating the interested objects on the table plane
    2.3.1 Coordinate system transformation
    2.3.2 Separating table plane and the interested objects
    2.3.3 Discussions

PRIMITIVE SHAPES ESTIMATION BY A NEW ROBUST ESTIMATOR USING GEOMETRICAL CONSTRAINTS
  3.1 Fitting primitive shapes by GCSAC
    3.1.1 Introduction
    3.1.2 Related work
    3.1.3 The proposed new robust estimator
      3.1.3.1 Overview of the proposed robust estimator (GCSAC)
      3.1.3.2 Geometrical analyses and constraints for qualifying good samples
    3.1.4 Experimental results of robust estimator
      3.1.4.1 Evaluation datasets of robust estimator
      3.1.4.2 Evaluation measurements of robust estimator
      3.1.4.3 Evaluation results of a new robust estimator
    3.1.5 Discussions
  3.2 Fitting objects using the context and geometrical constraints
    3.2.1 The proposed method of finding objects using the context and geometrical constraints
      3.2.1.1 Model verification using contextual constraints
    3.2.2 Experimental results of finding objects using the context and geometrical constraints
      3.2.2.1 Descriptions of the datasets for evaluation
      3.2.2.2 Evaluation measurements
      3.2.2.3 Results of finding objects using the context and geometrical constraints
    3.2.3 Discussions

DETECTION AND ESTIMATION OF A 3-D OBJECT MODEL FOR A REAL APPLICATION
  4.1 A Comparative study on 3-D object detection
    4.1.1 Introduction
    4.1.2 Related Work
    4.1.3 Three different approaches for 3-D object detection in a complex scene
      4.1.3.1 Geometry-based method for Primitive Shape detection Method (PSM)
      4.1.3.2 Combination of Clustering objects and Viewpoint Features Histogram, GCSAC for estimating 3-D full object models (CVFGS)
      4.1.3.3 Combination of Deep Learning based and GCSAC for estimating 3-D full object models (DLGS)
    4.1.4 Experiments
      4.1.4.1 Data collection
      4.1.4.2 Evaluation method
      4.1.4.3 Setup parameters in the evaluations
      4.1.4.4 Evaluation results
    4.1.5 Discussions
  4.2 Deploying an aided-system for visually impaired people
    4.2.1 Environment and material setup for the evaluation
    4.2.2 Pre-built script
    4.2.3 Performances of the real system
      4.2.3.1 Evaluation of finding 3-D objects
    4.2.4 Evaluation of usability and discussion

CONCLUSION AND FUTURE WORKS
  5.1 Conclusion
  5.2 Limitations
  5.3 Future works

Bibliography
PUBLICATIONS

ABBREVIATIONS

No.  Abbreviation  Meaning
1    API           Application Programming Interface
2    ASKC          Adaptive Scale Kernel Consensus
3    ASSC          Adaptive Scale Sample Consensus
4    CNN           Convolutional Neural Network
5    COCO          Common Objects in Context
6    CPU           Central Processing Unit
7    CVFGS         Combination of Clustering objects, Viewpoint Features Histogram, and GCSAC
8    CVFH          Clustered Viewpoint Feature Histogram
9    DLGS          Deep Learning + GCSAC
10   EM            Expectation Maximization
11   FN            False Negative
12   FP            False Positive
13   FPFH          Fast Point Feature Histogram
14   fps           frames per second
15   GCSAC         Geometrical Constraint SAmple Consensus
16   GPS           Global Positioning System
17   GT            Ground Truth
18   HT            Hough Transform
19   HUST          Hanoi University of Science and Technology
20   ICP           Iterative Closest Point
21   IMU           Inertial Measurement Unit
22   IR            InfraRed
23   ISS           Intrinsic Shape Signatures
24   JI            Jaccard Index
25   KDE           Kernel Density Estimation
26   KDES          Kernel DEScriptors
27   KNN           K-Nearest Neighbor
28   KNNs          K-Nearest Neighbors
29   LBP           Local Binary Patterns
30   LMNN          Large Margin Nearest Neighbor
31   LMS           Least Mean Squares
32   LOSAC         Locally Optimized RANSAC
33   LRF           Local Receptive Fields
34   LS            Least Squares
35   LSM           Least Squares Method
36   MAPSAC        Maximum A Posteriori SAmple Consensus
37   MICA          Multimedia, Information, Communication and Applications
38   MIT           Massachusetts Institute of Technology
39   MLESAC        Maximum Likelihood Estimation SAmple Consensus
40   MS            Microsoft
41   MSAC          M-estimator SAmple Consensus
42   MSI           Modified Plessey
43   MSS           Minimal Sample Set
44   NAPSAC        N-Adjacent Points SAmple Consensus
45   NARF          Normal Aligned Radial Features
46   NN            Nearest Neighbor
47   NNDR          Nearest Neighbor Distance Ratio
48   NYU           New York University
49   OCR           Optical Character Recognition
50   OPENCV        OPEN source Computer Vision Library
51   PC            Personal Computer
52   PCA           Principal Component Analysis
53   PCL           Point Cloud Library
54   PFH           Point Feature Histogram
55   PFH-RGB       Point Feature Histogram + RGB
56   PROSAC        PROgressive SAmple Consensus
57   PSM           Primitive Shape Detection Method
58   QR code       Quick Response Code
59   RAM           Random Access Memory
60   RANSAC        RANdom SAmple Consensus
61   R-CNN         Region-based Convolutional Neural Network
62   RFID          Radio-Frequency IDentification
63   RGB           Red Green Blue
64   RPN           Region Proposal Network
65   R-RANSAC      Recursive RANdom SAmple Consensus
66   SDK           Software Development Kit
67   SHOT          Signature of Histograms of OrienTations
68   SIFT          Scale-Invariant Feature Transform
69   SQ            SuperQuadric
70   SURF          Speeded Up Robust Features
71   SVM           Support Vector Machine
72   TN            True Negative
73   TP            True Positive
74   TTS           Text To Speech
75   UPC           Universal Product Code
76   URL           Uniform Resource Locator
77   USAC          A Universal Framework for Random SAmple Consensus
78   VE            Virtual Environment
79   VFH           Viewpoint Feature Histogram
80   VIP           Visually Impaired Person
81   VIPs          Visually Impaired People
82   YOLO          You Only Look Once
PUBLICATIONS OF DISSERTATION

[1] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2015), Table plane detection using geometrical constraints on depth image, The 8th Vietnamese Conference on Fundamental and Applied IT Research (FAIR), Hanoi, Vietnam, ISBN: 978-604-913-397-8, pp. 647-657.

[2] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, Thi-Thanh-Hai Tran, Michiel Vlaminck, Wilfried Philips, and Peter Veelaert (2015), 3D Object Finding Using Geometrical Constraints on Depth Images, The 7th International Conference on Knowledge and Systems Engineering, HCM City, Vietnam, ISBN: 978-1-4673-8013-3, pp. 389-395.

[3] Van-Hung Le, Thi-Lan Le, Hai Vu, Thuy Thi Nguyen, Thanh-Hai Tran, Tran-Chung Dao, and Hong-Quan Nguyen (2016), Geometry-based 3-D Object Fitting and Localization in Grasping Aid for Visually Impaired People, The 6th International Conference on Communications and Electronics (IEEE-ICCE), Ha Long, Vietnam, ISBN: 978-1-5090-1802-4, pp. 597-603.

[4] Van-Hung Le, Michiel Vlaminck, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, Thanh-Hai Tran, Quang-Hiep Luong, Peter Veelaert, and Wilfried Philips (2016), Real-time table plane detection using accelerometer and organized point cloud data from Kinect sensor, Journal of Computer Science and Cybernetics, Vol. 32, No. 3, ISSN: 1813-9663, pp. 243-258.

[5] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2017), Fitting Spherical Objects in 3-D Point Cloud Using the Geometrical Constraints, Journal of Science and Technology, Section on Information Technology and Communications, Number 11, 12/2017, ISSN: 1859-0209, pp. 5-17.

[6] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2018), Acquiring qualified samples for RANSAC using geometrical constraints, Pattern Recognition Letters, Vol. 102, ISSN: 0167-8655, pp. 58-66 (ISI).

[7] Van-Hung Le, Hai Vu, and Thuy Thi Nguyen (2018), A Comparative Study on Detection and Estimation of a 3-D Object Model in a Complex Scene, 10th International Conference on Knowledge and Systems Engineering (KSE 2018), pp. 203-208.

[8] Van-Hung Le, Hai Vu, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2018), GCSAC: geometrical constraint sample consensus for primitive shapes estimation in 3D point cloud, International Journal of Computational Vision and Robotics, Accepted (SCOPUS).

[9] Van-Hung Le, Hai Vu, and Thuy Thi Nguyen (2018), A Framework Assisting the Visually Impaired People: Common Object Detection and Pose Estimation in Surrounding Environment, 5th NAFOSTED Conference on Information and Computer Science (NICS 2018), pp. 218-223.

[10] Hai Vu, Van-Hung Le, Thuy Thi Nguyen, Thi-Lan Le, and Thanh-Hai Tran (2019), Fitting Cylindrical Objects in 3-D Point Cloud Using the Context and Geometrical Constraints, Journal of Information Science and Engineering, ISSN: 1016-2364, Vol. 35, No. 1 (ISI).

... sphere). The location and description of an estimated spherical object in the scene are x = -0.4 m, y = -0.45 m, z = 1.77 m, radius = 0.098 m.

Figure 4.17 (a), (b) Illustrating the...

... (3-D data) are calculated as follows:

\[
\begin{aligned}
X_p &= \frac{(x_a - c_x)\,\mathrm{depthvalue}(x_a, y_a)}{f_x} \\
Y_p &= \frac{(y_a - c_y)\,\mathrm{depthvalue}(x_a, y_a)}{f_y} \\
Z_p &= \mathrm{depthvalue}(x_a, y_a) \\
C(r, g, b) &= \mathrm{colorvalue}(x_a, y_a)
\end{aligned}
\tag{2.3}
\]

where depthvalue(x_a...

... the image obtained from the MS Kinect sensor has 640 × 480 pixels, then i = 1, ..., row; j = 1, ..., col; normally (row, col) = (480, 640). Matrix P represents the organized point cloud data of a scene.
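The back-projection in Equation (2.3) and the row × col layout of the organized point cloud can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function names `back_project` and `organized_cloud` are hypothetical, the intrinsic parameters (fx, fy, cx, cy) must come from the sensor's calibration, indices are 0-based rather than the 1-based i, j used in the text, and treating a depth of 0 as "no measurement" is an assumption. The colour lookup C(r, g, b) is omitted for brevity.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def back_project(xa: float, ya: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point:
    """Map pixel (xa, ya) with a depth value to a 3-D point (Xp, Yp, Zp)."""
    xp = (xa - cx) * depth / fx
    yp = (ya - cy) * depth / fy
    zp = depth
    return (xp, yp, zp)


def organized_cloud(depth_image: List[List[float]],
                    fx: float, fy: float, cx: float, cy: float
                    ) -> List[List[Optional[Point]]]:
    """Build a row x col grid of 3-D points that mirrors the image layout.

    Pixels whose depth is not positive are stored as None (an assumption
    made here to mark missing measurements).
    """
    cloud: List[List[Optional[Point]]] = []
    for ya, row in enumerate(depth_image):        # ya: row index (0-based)
        cloud_row: List[Optional[Point]] = []
        for xa, depth in enumerate(row):          # xa: column index (0-based)
            if depth > 0:
                cloud_row.append(back_project(xa, ya, depth, fx, fy, cx, cy))
            else:
                cloud_row.append(None)
        cloud.append(cloud_row)
    return cloud
```

Because the grid keeps the image's row and column structure, neighbouring pixels stay neighbours in the cloud, which is what makes the data "organized". For a 480 × 640 depth image, `organized_cloud` returns 480 rows of 640 entries each.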

Format
Pages	166
File size	9.75 MB