A smart shopping cart with automated payment based on artificial intelligence

MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION THESIS
ELECTRONICS AND TELECOMMUNICATION ENGINEERING TECHNOLOGY

Project title: A SMART SHOPPING CART WITH AUTOMATED PAYMENT BASED ON ARTIFICIAL INTELLIGENCE

ADVISOR: BUI HA DUC, PhD
STUDENTS:
TRAN NGO MINH TRI - 19146033
NGUYEN HOAI NAM - 19146219
NGUYEN VAN TONG - 19146279
CLASS: 19146CL2B
ACADEMIC YEAR: 2019 - 2023

Ho Chi Minh City, July 2023

SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness

GRADUATION PROJECT ASSIGNMENT

Semester [...] / academic year 2022-2023
Advisor: Dr. Bùi Hà Đức
Students:
Trần Ngô Minh Trí, student ID: 19146033, training program: CLV
Nguyễn Hoài Nam, student ID: 19146219, training program: CLV
Nguyễn Văn Tòng, student ID: 19146279, training program: CLV
Project code: -
Project title: A smart supermarket cart that automatically follows a person, combined with automated payment, using artificial intelligence technology
Initial data and documents: the hardware model and data completed in the previous project
Project content: design and fabricate a smart supermarket cart that can follow the customer on its own and supports pricing and payment for supermarket products
Expected deliverables: a full-size product that operates in real conditions
Assignment date: 15/03/2023
Submission date: 15/07/2023
Presentation language:
Report: English / Vietnamese
Defense presentation: English / Vietnamese
Note: students in the English-language high-quality program write the report in English.

DEAN (signature and full name)
HEAD OF PROGRAM (signature and full name)
ADVISOR (signature and full name)

COMMITMENT

- Project title: A smart supermarket cart that automatically follows a person, combined with automated payment, using artificial intelligence technology
- Advisor: Dr. Bùi Hà Đức
- Student information:
Trần Ngô Minh Trí, student ID: 19146033, class: 19146CL2B. Address: 46/3, Tân Hòa 2, phường Hiệp Phú, TP Thủ Đức. Phone: 0343455542. Email: 19146033@student.hcmute.edu.vn
Nguyễn Hoài Nam, student ID: 19146219, class: 19146CL2B. Address: phường Linh Xuân, TP Thủ Đức. Phone: 0906732772. Email: 19146219@student.hcmute.edu.vn
Nguyễn Văn Tòng, student ID: 19146279, class: 19146CL2B. Address: 2a Đường số 37, Phường Linh Đông, TP Thủ Đức. Phone: 0375843385. Email: 19146279@student.hcmute.edu.vn
- Thesis submission date: 21/07/2023
- Commitment: "I hereby declare that this graduation thesis is my own research work. I have not copied from any published work without citing the source. If there is any violation, I take full responsibility."

Ho Chi Minh City, 21 July 2023
Signature

ACKNOWLEDGEMENTS

First, the group sincerely thanks our advisor, Dr. Bùi Hà Đức. Throughout the project he guided us enthusiastically and suggested ways to resolve the technical difficulties we encountered. He also taught the group members many important skills that an engineer needs. We thank him for giving us direction for the future when we faced difficulties in our studies and careers. From the Mechatronics project through the graduation project, he let the group use the laboratory and borrow equipment that we could not afford to buy, so that we could carry out the research and complete the project. All of us feel fortunate and grateful to have been guided by him.

Second, I would like to thank Nguyễn Hoài Nam and Nguyễn Văn Tòng for working side by side, from the time the idea existed only on paper until the system was completed and running. Thank you for your enthusiasm and passion for science and engineering, which I took as motivation to keep going when things were difficult.

Finally, we thank our families, especially the parents of the group members, for trusting and supporting all of us through the university program and up to the final stage of completing the graduation project. They are the greatest source of motivation for completing this project.

On behalf of the group
Trần Ngô Minh Trí

ABSTRACT

Project title: A smart shopping cart with automated payment based on Artificial Intelligence

With the growing population in large cities such as Ho Chi Minh City and Ha Noi, supermarkets and shopping malls are under increasing pressure to improve the in-store shopping experience, especially during peak hours. This thesis proposes a mobile robot that gives customers a more convenient and efficient way to shop at supermarkets, hypermarkets, and shopping centers. The robot uses a depth camera to capture 2D and 3D images of the customer, determines the distance and deviation angle between the customer and the robot, and then follows the customer. In addition, the robot is equipped with an RGB camera that captures images when the customer puts a product into the cart. These images are processed by a deep learning model trained on datasets collected and labeled by the research team. By analyzing the images, the model identifies the product name and tracks the direction in which the product moves in order to update the bill. The product can be deployed in large supermarkets, and in the future it can be extended with further functions such as customer behavior analysis.

TABLE OF CONTENTS

GRADUATION PROJECT ASSIGNMENT
COMMITMENT
ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1: INTRODUCTION
1.1 Motivations
1.2 Objective
1.3 Research task
1.4 Limitations
1.5 Research subjects and scopes
1.6 Outline
CHAPTER 2: LITERATURE REVIEW
2.1 Service robot
2.2 Introducing 2D images
2.3 Deep learning and Convolutional Neural Networks
2.4 Object detection
2.4.1 Metrics used for the object detection task
2.4.2 Choosing a model to deploy on Jetson Nano
2.4.3 Understanding SSD (Single Shot MultiBox Detector)
2.4.4 Understanding MobileNetV2
2.5 Introducing TensorRT
2.6 Object tracking
2.7 OCR (Optical Character Recognition)
2.8 3D image processing
2.9 PID controller
CHAPTER 3: HARDWARE AND MECHANICAL DESIGN
3.1 Technical Requirements
3.2 Design Proposal
3.3 3D Structural Design of the Robot
3.4 Building the robot base
3.4.1 Calculations and motor selection
3.4.2 Calculations and selection of the belt transmission system
3.4.3 Selection of bearing supports
3.4.4 Selection of omnidirectional wheels
3.4.5 Designing the base plate
3.5 Calculating the kinematics of the robot
3.6 Calculating the dynamics of the robot
CHAPTER 4: ELECTRICAL DESIGN
4.1 Technical requirements
4.2 Block diagram and overview of the electrical system
4.3 Power supply block
4.3.1 Calculating and selecting the power supply
4.3.2 Buck converter circuits
4.4 Main data processing block
4.5 Sensor block
4.6 Control block
4.6.1 STM32F103C8T6 microcontroller
4.6.2 H-Bridge driver
4.7 Actuator block
4.7.1 DC motor
4.7.2 Encoder
CHAPTER 5: ALGORITHM DESIGN
5.1 Designing 2D and 3D image processing algorithms
5.2 Following person module
5.3 Automatic checkout module
5.3.1 OCR (Optical Character Recognition) block
5.3.2 Semantic Entity Recognition module
5.4 Algorithm for Robot navigation
5.5 The control algorithm on STM32
CHAPTER 6: EXPERIMENTS AND RESULTS
6.1 PID Controller for Motor Speed Control
6.1.1 The structure of a PID controller
6.1.2 Finding the transfer function of the motor from experimentation
6.1.3 Finding the parameters of the PID controller for speed control of the motor
6.1.4 The PID control diagram of each motor
6.1.5 The experimental results of the PID controller on two motors
6.2 Deformation testing
6.2.1 The main base plate deformation testing
6.2.2 The cargo compartment deformation testing
6.2.3 The base frame deformation testing
6.3 Training the semantic entity recognition model
6.3.1 Preparing data
6.3.2 Data labeling
6.3.3 Training the model
6.4 Model inference
6.5 Recognizing user actions
6.6 User interface designing
6.7 The result of the tracking model when the person is occluded

[...] the strength and deformation of the entire base frame, we decided to apply a load of 1000 N, which is significantly higher than the total weight of the vehicle plus the maximum payload it can carry. The SolidWorks simulation shows a maximum von Mises stress of 3.555e+07 N/m² and a minimum von Mises stress of 2.155e+03 N/m². The maximum is about 64% of the allowable stress for Aluminum Alloy A6061, 5.515e+07 N/m² (a safety factor of roughly 1.55), so we conclude that the base frame maintains its structural integrity under this load.

Figure 6.27 Stress simulation result of the base frame

Figure 6.28 Displacement simulation result of the base frame

6.3 Training the semantic entity recognition model

6.3.1 Preparing data

Our group collected real-world data at the Coopmart supermarket in District [...]. The dataset contains a total of 2800 images showing product information. The products collected and labeled are:
• Carabao Energy Drink
• True Care Dishwashing Liquid
• Poca Shaking Beef Crackers
• Clear Shampoo
• Milo Milk
• Sensodyne Toothpaste
• Hao Hao Noodles
• Tiger Beer
• Dalat Milk
• Sting Energy Drink

Figure 6.29 The list of products collected and labeled

After collection, the initial raw dataset was distributed as follows:

Figure 6.30 Chart of the dataset before preprocessing

Because the raw dataset was imbalanced, the group filtered and processed the data to avoid overfitting when training the model. The processed data is summarized as follows:

Figure 6.31 Chart of the dataset after preprocessing

After filtering, the group split the dataset into a training set (80%) and a validation set (20%).

6.3.2 Data labeling

After preprocessing, the group labeled the dataset before training. The team used the PPOCRLabel tool, which is maintained by the PaddleOCR team, a well-known research group in Computer Vision and particularly in Optical Character Recognition (OCR). Labeling produced a TXT file that lists all text bounding boxes, the text content of each box, and the class of each box. Each row in the file represents one image in the dataset. Here is an example for one image:

Figure 6.32 Labels of data for LayoutXLM model

6.3.3 Training the model

After labeling, the team trained the model. In terms of hardware, we used an NVIDIA GeForce RTX 3090 Ti 24GB graphics card. Training ran for 300 epochs. The loss, H-mean, precision, and recall at each training step are shown below:

Figure 6.33 Loss function during LayoutXLM training process

Figure 6.34 H-mean (F1-score) during LayoutXLM training process

Figure 6.35 Precision during LayoutXLM training process

Figure 6.36 Recall during LayoutXLM training process

6.4 Model inference

The team ran the model in real time on a laptop with the following hardware configuration:
• CPU: Intel Core i7-9750H
• GPU: NVIDIA GeForce GTX 1050 Ti with 4GB of memory
• RAM: 8GB

On this laptop, the entire product recognition system achieves a performance of [...] frames per second (FPS). The model can detect products with cylindrical packaging, such as water bottles. Furthermore, when a product is partially obstructed, the model can still recognize the product name accurately.

Figure 6.37 Result of LayoutXLM model on an Aquafina product

Figure 6.38 Result of LayoutXLM model on a Lavie product

6.5 Recognizing user actions

The objective of this section is to identify whether the user is putting a product into the cart or taking one out, so that the invoice can be updated accordingly.

Because the movement of the items is relatively simple, our team decided to use the CSRT tracking algorithm provided by the OpenCV library. The tracked object is the product name previously detected by the LayoutXLM model. The system compares the coordinates of the bounding box center when the object enters the Region of Interest (ROI) and when it exits the ROI. If the object moves downward in the frame, meaning its Y-coordinate increases, the customer is placing the item into the cart. Conversely, if the Y-coordinate decreases, the customer is removing the item from the cart.

6.6 User interface designing

After completing the basic functions of the system, the team designed an interactive user interface that lets users add or remove products that were not detected.

Figure 6.39 GUI (Graphical User Interface) running on an Apple iPad (6th generation)

The interface is divided into the following regions:
• Region 1: Data Input Area - where customers enter the name of the product they want to interact with.
• Region 2: Search Button - searches the database for information about the product entered in the Data Input Area.
• Region 3: Add to Invoice Button - adds the product entered in the Data Input Area to the invoice. This button is used when the model fails to detect the product.
• Region 4: Remove from Invoice Button - removes a product from the invoice. This button is used when the model fails to detect the product.
• Region 5: Total Amount - displays the total amount of the invoice, that is, the accumulated cost of all products.
• Region 6: Invoice Information - displays detailed information about the invoice, including product type, price, and quantity.

6.7 The result of the tracking model when the person is occluded

We tested the STARK algorithm in two cases: the followed person is occluded by another person, and the followed person leaves the frame and then returns. The model works well in both cases and does not misidentify the target.

Figure 6.40 Results of the model in the case that the followed person is obscured by another person

Figure 6.41 Results of the model in the case that the followed person is out of screen and returns

When the followed person is out of the frame, the model still outputs a bounding box at a random location. This problem can be solved with the template matching function provided by the OpenCV library, which returns a similarity score (from [...] to 1) between the template and the current frame. We initialized the template from the output of the object detection model and then compared it with the tracked frames; if the similarity score fell below 0.3, the robot stopped until the score rose above 0.3 again.

CHAPTER 7: CONCLUSION AND FUTURE DEVELOPMENTS

We have largely accomplished our objectives: building the robot's mechanical and electrical systems and developing the AI application, the image processing algorithms, and the 3D processing that let the robot distinguish people from the background and control its movement. We also designed a user-friendly graphical interface through which customers interact with the robot; the GUI runs on several types of devices, including mobile phones, tablets, desktops, and Jetson devices.

However, the product recognition model is in general quite slow; a more powerful system than the NVIDIA GeForce GTX 1050 Ti 4GB is needed for a better customer experience. It would also be better if the robot could automatically find its way back to the charging dock after completing an operation.

We believe this product will significantly change the shopping experience at large supermarkets by eliminating payment at fixed counters, and customers will no longer have to exert much effort when shopping for heavy items. This will contribute significantly to the business efficiency of supermarkets and shopping centers.

REFERENCES

[1] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen, MobileNetV2: Inverted Residuals and Linear Bottlenecks, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510-4520.
[2] Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu, Learning Spatio-Temporal Transformer for Visual Tracking, Computer Vision and Pattern Recognition (cs.CV), 2021.
[3] Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei, LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, Computation and Language (cs.CL), 2021.
[4] Rich Tech Robotics, Autonomous food service robots, https://www.richtechrobotics.com/matradee
[5] Relay Robotics, The World's First Hospitality Service Robot, https://www.relayrobotics.com/blog/2020/2/25/the-worlds-first-hospitalityservice-robotdoubled-in-room-dining-in-one-month-emc2-chicago
[6] Notebook Check, Xiaomi launches a cheaper robot vacuum, the Mijia Robot Vacuum Cleaner 3C, https://www.notebookcheck.net/Xiaomi-launches-a-cheaper-robot vacuumthe-MijiaRobot-Vacuum-Cleaner-3C.609385.0.html
[7] Following Inspiration, Wii go retail - The ultimate Customer's in-store experience, https://followinspiration.pt/index.php/pt/autonomous-robots/wii-go
[8] Robotis e-Manual, Turtlebot3, https://emanual.robotis.com/docs/en/platform/turtlebot3/overview/
[9] Wikipedia, Differential wheeled robot, https://en.wikipedia.org/wiki/Differential_wheeled_robot
[10] Juan Angel Gonzalez-Aguirre, Ricardo Osorio-Oliveros, Karen L. Rodríguez-Hernández, Javier Lizárraga-Iturralde, Rubén Morales-Menendez, Ricardo A. Ramírez-Mendoza, Mauricio Adolfo Ramírez-Moreno and Jorge de Jesús Lozoya-Santos, Service Robots: Trends and Technology, 2021.
[11] Wikipedia, PID controller, https://en.wikipedia.org/wiki/PID_controller
[12] PGS.TS Trịnh Chất, TS Lê Văn Uyển, Tính toán thiết kế hệ dẫn động cơ khí, Nhà xuất bản Giáo dục, 2006.
[13] Michal Siwek, Jaroslaw Panasiuk, Leszek Baranowski, Wojciech Kaczmarek, Piotr Prusaczyk and Szymon Borys, Identification of Differential Drive Robot Dynamic Model Parameters, Faculty of Mechatronics, Armament and Aerospace, Military University of Technology, Kaliskiego Street, 00-908 Warsaw, Poland, 2023.
[14] Eka Maulana, M. Aziz Muslim, Akhmad Zainuri, Inverse Kinematics of a Two-Wheeled Differential Drive an Autonomous Mobile Robot, 2014 Electrical Power, Electronics, Communications, Controls and Informatics Seminar (EECCIS), 2014.
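As a companion to Section 6.5, the add/remove decision can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: the function names, the `min_shift` dead band, and the use of a `Counter` as the invoice are all assumptions made here. It takes the Y-coordinate of the bounding-box center at ROI entry and at ROI exit (as a tracker such as CSRT would report them) and relies on image coordinates growing downward.

```python
from collections import Counter


def classify_action(y_enter: float, y_exit: float, min_shift: float = 1.0) -> str:
    """Classify a tracked item as added to or removed from the cart.

    y_enter / y_exit: Y of the bounding-box center when the item enters and
    leaves the ROI. Image Y grows downward, so a larger Y at exit means the
    item moved down into the cart.
    min_shift: hypothetical dead band (pixels) to ignore jitter.
    """
    shift = y_exit - y_enter
    if shift > min_shift:
        return "add"
    if shift < -min_shift:
        return "remove"
    return "unknown"


def update_invoice(invoice: Counter, product: str, action: str) -> Counter:
    """Apply an 'add' or 'remove' action to a quantity-per-product invoice."""
    if action == "add":
        invoice[product] += 1
    elif action == "remove" and invoice[product] > 0:
        invoice[product] -= 1
        if invoice[product] == 0:
            del invoice[product]  # drop lines whose quantity reaches zero
    return invoice


if __name__ == "__main__":
    invoice = Counter()
    update_invoice(invoice, "Milo Milk", classify_action(120.0, 310.0))
    print(dict(invoice))
```

Returning "unknown" for very small shifts leaves the invoice untouched, so an ambiguous movement could be resolved by the customer through the Add and Remove buttons described in Section 6.6.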
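The stop rule from Section 6.7 can likewise be sketched as a small gate on the similarity score. The sketch below is an assumption-laden illustration: `follow_commands` is a hypothetical name, the scores are taken as given (in practice they might come from the maximum of the result of OpenCV's `cv2.matchTemplate`), and a score exactly equal to the threshold is treated as "follow", which the text does not specify.

```python
def follow_commands(scores, threshold=0.3):
    """Map per-frame template similarity scores to robot commands.

    The robot stops while the similarity between the stored template and the
    tracked region is below the threshold and resumes following once the
    score is back at or above it.
    """
    commands = []
    for score in scores:
        commands.append("follow" if score >= threshold else "stop")
    return commands


if __name__ == "__main__":
    # Hypothetical scores: the person leaves the frame, then returns.
    print(follow_commands([0.8, 0.25, 0.1, 0.45]))
```

Keeping the gate separate from the tracker means the threshold can be tuned without touching the tracking model itself.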

Posted: 14/11/2023, 10:10
