


VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY

UNIVERSITY OF INFORMATION TECHNOLOGY

ADVANCED PROGRAM IN INFORMATION SYSTEMS

NGUYEN MINH TU - DO THANH XUAN

AN ANNOTATION TOOL FOR MACHINE LEARNING

BACHELOR OF ENGINEERING IN INFORMATION SYSTEMS

Ho Chi Minh City, 2021


NGUYEN MINH TU - 16521839

DO THANH XUAN - 16521479

AN ANNOTATION TOOL FOR MACHINE LEARNING

BACHELOR OF ENGINEERING IN INFORMATION SYSTEMS

THESIS ADVISOR

DR. DO TRONG HOP

Ho Chi Minh City, 2021

ASSESSMENT COMMITTEE

First of all, the authors want to thank the University of Information Technology and all of the lecturers in the Information Systems department for teaching important knowledge, valuable not only in university but also in real life, throughout four years, so that the authors were qualified to complete this thesis.

The authors send their best regards and extend their thanks to Dr. Do Trong Hop and Mr. Nguyen Thanh Binh for directly giving advice and corrections and for helping and supporting the authors throughout the time spent implementing and working on the thesis. All this encouragement, feedback, and input was treasured motivation while the authors struggled to research and implement this project.

The authors also extend their thanks to Mr. Ngo Duc Thanh for giving input and advice so that the report and thesis could be completed.

Finally, the authors want to thank family and friends for their encouragement throughout the research project. Because the authors' knowledge and experience are still limited, mistakes and shortcomings are inevitable. Hence, the authors sincerely look forward to receiving helpful contributions, advice, and feedback from the lecturers, both to complete this work and as a prerequisite for implementing other projects in the future. Again, thank you for all your help.

Thank you sincerely.

UNIVERSITY OF INFORMATION TECHNOLOGY
Advanced Education Program

ADVANCED PROGRAM IN INFORMATION SYSTEMS

THESIS PROPOSAL

THESIS TITLE: OBJECT DETECTION

Advisor: Do Trong Hop

Duration: (From 14th Aug 2020 to 11th Jan 2021)

Students: Nguyen Minh Tu - 16521839

Do Thanh Xuan - 16521479

Creating datasets is a very time-consuming job. Thus, it requires the effective provision of tools and secondary algorithms.

2. Scope:

Main functionalities:

- Manage videos, images, and training datasets
- Label observations on every single picture, video, and frame
- Export types of datasets from the labelled sources

3. Objectives:

- Learn how the dataset works and the structure of the common algorithm platforms.
- Create an Object Detection dataset using YOLO and Tensorflow to manage and generate datasets, training sets, and test sets.

4. Methodologies: 3 main steps

Programming languages: C#, Python

- Survey the main machine learning frameworks:
  1. Investigate, train, test, and run the provided datasets (COCO, TINY)
  2. Analyze sample datasets
- Develop an application that supports managing datasets, labeling objects, and exporting training datasets compatible with the analyzed platforms
- Use the exported results to verify data (images and videos)

Research timeline: (Plan of action: describe the planning and the assigned tasks for each student)

| Task | Start Date | End Date | Assigned to |
| Meet advisor Mr. Binh to get an idea about the project | 01/08/2020 | | Tu, Xuan |
| Research the technology and frameworks for working | 03/08/2020 | 09/08/2020 | Tu, Xuan |
| Survey the main machine learning frameworks: investigate, train, test, and run the provided datasets (COCO, TINY) | 10/08/2020 | 16/08/2020 | Tu |
| Analyze sample datasets | 17/08/2020 | 15/09/2020 | Xuan |
| Develop an application that supports managing images and videos, labeling objects, and exporting training datasets compatible with the analyzed platforms | 17/08/2020 | 25/11/2020 | Tu |
| Use the exported results to verify data (images and videos) | 25/11/2020 | 02/12/2020 | Tu, Xuan |
| Test and complete the application | 02/12/2020 | 09/12/2020 | Tu, Xuan |
| Write documentation and report | 09/12/2020 | 15/12/2020 | Tu, Xuan |

Approved by the advisor(s)

Signature(s) of advisor(s)

Ho Chi Minh City, 18/9/2020

Signature(s) of student(s)

TABLE OF CONTENTS

LIST OF ACRONYMS
ABSTRACT
OPENING

CHAPTER 1: PROJECT OVERVIEW
1.1 Problems and questions
1.1.1 Problem and statement
1.1.2 Annotation Tools Research
1.1.2.1 LabelImg
1.1.2.2 CVAT
1.1.3 Comments
1.2 Project's Scope
1.3 Report Layout

CHAPTER 2: THEORETICAL BASIS
2.1 Annotation Overview
2.1.1 Manual annotation
2.1.2 Auto annotation
2.1.3 Annotation's features
2.2 Object Detection Platforms and Algorithms
2.2.1 Yolo
2.2.1.1 Yolo's framework: DARKNET
2.2.1.2 Yolo
2.2.1.3 YoloV4 requirements
2.2.1.4 YoloV4 dataset
2.2.1.5 Set up environment on Darknet/YoloV4
2.2.1.6 Pros and Cons
2.2.2 Tensorflow/SSD_resnet
2.2.2.1 Tensorflow
2.2.2.2 Tensorflow requirements
2.2.2.3 Tensorflow dataset
2.2.2.4 Set up environment on Tensorflow/SSD_resnet
2.2.2.5 Pros and Cons

CHAPTER 3: GROUNDTRUTH SYSTEM DESIGN AND ANALYSIS
3.1 Groundtruth Overview
3.2 Groundtruth Main Functions
3.2.1 Function Requirements
3.2.2 Non-Function Requirements
3.3 Use Case Diagrams
3.3.1 Use case diagram list
3.3.2 Use case diagram description
3.3.2.1 Manage dataset use case diagram
3.3.2.2 Manage data source use case diagram
3.3.2.3 Manage labels use case diagram
3.3.2.4 Export dataset use case diagram
3.4 Sequence diagrams
3.4.1 Add dataset sequence diagram
3.4.2 Delete dataset/data source sequence diagram
3.4.3 Add data source sequence diagram
3.4.4 Delete data source sequence diagram
3.4.5 Edit data source sequence diagram
3.5 Database

CHAPTER 4: TESTING ON YOLO AND SSD_RESNET
4.1 Deploy Groundtruth For Generating Datasets For Yolo And Tensorflow
4.1.1 Overview
4.1.2 Yolo training and testing deployment
4.1.2.1 Training
4.1.2.2 Testing
4.1.3 SSD_Resnet Training And Testing Deployment
4.1.3.1 Training
4.1.3.2 Testing
4.1.3.3 Auto Image Annotation for Tensorflow

CHAPTER 5: CONCLUSION AND FUTURE DEVELOPMENT

LIST OF FIGURES

Figure 1-1: User Interface of labelImg application
Figure 1-2: Graphic User Interface of CVAT
Figure 2-1: Example of Image Annotation
Figure 2-2: A Manual Annotation Tool For Machine Learning
Figure 2-3: Auto Image Annotation Process
Figure 2-4: Bounding Box Annotation Feature
Figure 2-5: Cuboid Box Annotation Feature
Figure 2-6: Line Annotation Feature
Figure 2-7: Darknet Framework Logo
Figure 2-8: Performance Of Yolov4 Compared With Other Object Detection
Figure 2-10: Yolo Dataset
Figure 2-11: Coordinate Text File Explanation
Figure 2-12: Folder Obj of Yolo Dataset
Figure 2-13: File Obj.Data Of Yolo Dataset
Figure 2-14: Obj.Names Of Yolo Dataset
Figure 2-15: Train.txt Of Yolo Dataset
Figure 2-16: Alexeyab's Darknet Github
Figure 2-17: Download And Install CUDA Toolkit
Figure 2-18: Download And Install OpenCV
Figure 2-19: Cmake Configuration With CUDA Toolkit
Figure 2-20: Configuration File And Pre-Trained Model
Figure 2-21: Config Cmake To Unzip The Environment From Alexeyab
Figure 2-22: Building Darknet/Yolo Environment
Figure 2-23: Darknet/Yolo Environment
Figure 2-24: Tensorflow Library Logo
Figure 2-25: VOC Dataset For Tensorflow
Figure 2-26: Tensorflow metadata files
Figure 2-27: Data from XML file
Figure 2-28: Indicator Files For Environment To Training
Figure 2-29: Data Sources Formatted With MD5
Figure 2-30: Labels From Tensorflow Dataset
Figure 2-31: Annotation File For Tensorflow
Figure 2-32: Data Stored In Annotated File
Figure 2-33: Label_Map.Pbtxt File
Figure 2-34: Data Source Directory Stored In Train.txt File
Figure 2-35: Data Source Directory Stored In Val.txt File
Figure 2-36: Checking GPU Support
Figure 2-37: Install Python
Figure 2-38: Installing Virtualenv
Figure 2-39: Import Tensorflow Library
Figure 3-1: Groundtruth user interface
Figure 3-2: Main Use Case Diagram
Figure 3-3: Manage Dataset Use Case Diagram
Figure 3-4: Manage Datasource Use Case Diagram
Figure 3-5: Manage Labels Use Case Diagram
Figure 3-6: Export Dataset Use Case Diagram
Figure 3-7: Add Dataset Sequence Diagram
Figure 3-8: Delete Dataset/Data Source Sequence Diagram
Figure 3-9: Add Data Source Sequence Diagram
Figure 3-10: Delete Data Source Sequence Diagram
Figure 3-12: Groundtruth Database
Figure 3-13: Srcdb.Xml File
Figure 3-14: Setdb.Xml Files
Figure 3-15: Mediasource Folder
Figure 3-16: Groundtruth Folder
Figure 4-2: Set Up Dataset Into Environment
Figure 4-3: Input Commands For Training Yolo
Figure 4-4: Training Process Of Yolo
Figure 4-5: Indicators To Notice When Training
Figure 4-6: Training Result
Figure 4-7: Configuration For Yolov4-Obj.Cfg
Figure 4-8: Input Commands For Testing
Figure 4-9: Machine Detecting Object
Figure 4-10: YOLO Detection Result Number 1
Figure 4-11: YOLO Detection Result Number 2
Figure 4-12: Json File
Figure 4-13: Activate Environment For Tensorflow And Create "Train.Record" And "Val.Record"
Figure 4-14: Complete Dataset Of Tensorflow
Figure 4-15: Configuration For Training
Figure 4-16: Input Command For Training SSD_Resnet On Tensorflow
Figure 4-17: Training Process
Figure 4-18: Export Saved Model Environment Commands
Figure 4-19: Export Saved Model Complete
Figure 4-20: Command Line For Validating Detection On Tensorflow/SSD_Resnet
Figure 4-21: Tensorflow Detection Result Number 1
Figure 4-22: Tensorflow Detection Result Number 2
Figure 4-23: Result Of Validating Data
Figure 4-24: Data Source's Metadata As An Xml File
Figure appendix 1: YOLO Test Result For Standard Test
Figure appendix 2: SSD_resnet Standard Test Result
Figure appendix 3: YOLO Test Result For Contrast Boosted Test
Figure appendix 4: SSD_resnet Contrast Boosted Test Result
Figure appendix 5: YOLO Result Of Low-Resolution Test
Figure appendix 6: YOLO Low-Resolution Test Result

LIST OF TABLES

Table 3-1: Use case diagram list

LIST OF ACRONYMS

CUDA: Compute Unified Device Architecture

GPU: Graphics processing unit

CC: Compute Capability

GCC: GNU Compiler Collection

MSVC: Microsoft Visual C++

CuDNN: CUDA Deep Neural Network library

CVAT: Computer Vision Annotation Tool

YOLO: You Only Look Once


ABSTRACT

The thesis project "An image annotation tool" focuses on users who demand a tool that generates and manages datasets for training machine learning models in a fast, convenient, and accurate way.

This project builds and develops an annotation tool that helps to create, generate, and manage datasets, responding to the demand for basic functions such as annotating and labeling objects and managing datasets and their data sources (pictures and videos) in a convenient way for the users.

After this brief about the project, the authors planned to implement the project as below:

- Research the steps of how to detect an object.
- Research Darknet/YOLO.
- Research Tensorflow/SSD_resnet.
- Develop and build up the Groundtruth application to label, annotate, and manage datasets and data sources.
- Set up environments for Darknet/YOLO and Tensorflow/SSD_resnet.
- Use datasets generated from the Groundtruth application to train and test on both environments and algorithms.

OPENING

With the development of AI and machine learning all over the globe, especially object detection, the demand for accessing, exploiting, and using them in real life has become larger than ever, especially in Big-Tech companies.

Object detection's scope applies not only to individual users and small-scale businesses but also to big corporations. For small-scale businesses, one of the hardest tasks is to create their own annotation tool. There are two popular kinds of annotation tools: one is powerful but not open source, the other is free but lacks many functionalities.

Our software provides a free-of-charge annotation tool with better features than some current tools. Our targeted users are:

- Individual users
- Small-scale business companies

CHAPTER 1: PROJECT OVERVIEW

This chapter gives a general view of the current status of annotation tools and an overview of object detection algorithms in the machine learning and deep learning fields. Thereby, it describes the problems, the research surveys, and a solution to the problems.

1.1 Problems and questions:

1.1.1 Problem and statement:

In this part, the authors state a problem: a user wants to use object detection to detect an object such as a car or a bike. Below are the steps the user would take:

1. Collect the object's images for detection.
2. Choose the algorithm the user wants to use for detection.
3. Study/investigate the algorithm: the algorithm's framework, pre-trained models, and dataset; studying how the algorithm works internally is optional.
4. Build an environment to run the detection algorithm.
5. Prepare a dataset for training the detection model.
6. Run detection on the trained model.

The first four steps and step 6 are easy to do, but step 5 is quite difficult: a normal user will usually get stuck or struggle at step 5. The user will first search the internet for an annotation and labeling tool to create a training dataset. Let us surf the internet with the user and draw conclusions about the pros and cons of annotation tools.

1.1.2 Annotation Tools Research:

1.1.2.1 LabelImg:

Figure 1-1: User Interface of labelImg application [1]

Pros:

- Open source
- Friendly user interface
- Easy to install and to use
- Supports 2 types of dataset profiles: VOC and YOLO

Cons:

- Missing dataset-management features
- Supports an old VOC version, the VOC2007 dataset profile
- Cannot annotate and label on videos

1.1.2.2 CVAT:

Figure 1-2: Graphic User Interface of CVAT [2]

Pros:

- One of the top annotation tools at this time
- Professional user interface, easy to use
- Lots of strong functions to support annotating and labeling, such as dataset management, annotation and labeling on video, image editing and filtering, and video editing
- Supports almost every type of dataset profile
- Supports different features of annotation

Cons:

- Not open source
- One of the biggest cons is low accessibility: this annotation tool is not public for individuals, and you cannot rent or pay to use it. You must be authorized/given access to use this tool.

1.1.3 Comments:

After researching annotation tools on the internet, the authors and the user come to these conclusions:

- A friendly, easy-to-use user interface is always the number 1 criterion.
- Annotation tools that are open source lack mandatory functions such as dataset management or video annotating and labeling.
- Some tools only support early versions of dataset profiles.
- Annotation tools that are fully capable of creating a training dataset, such as CVAT, have the big con of accessibility for an individual user.

Understanding the pros and cons of the annotation tools on the internet, the authors saw an opportunity to overcome those tools' disadvantages. So we decided to develop an annotation tool to meet the needs of the users: an annotation tool for object detection that is easy to use, open source, fully capable of creating and managing datasets, data sources, and labels, supports different types of dataset profiles, and supports the latest versions of those profiles.

1.2 Project's Scope:

The authors focus on researching the items below to build an application for this project:

- Research Darknet/YOLO.
- Research Tensorflow/SSD_resnet.
- Research and develop an Auto Annotate function for the Tensorflow environment.
- Research and develop an open-source annotation tool to support creating and managing datasets, data sources, and labels; it also supports annotation and labeling on videos, can export the latest versions of dataset profiles, and has some basic filter functions for data sources.
- Build environments for YOLO and Tensorflow.
- Use the dataset generated from our annotation tool as input to train models for each algorithm, then validate the trained models.

1.3 Report Layout:

The thesis comes with 5 chapters with the content below:

Chapter 1: Project overview

- Introduces and gives an overview of this thesis project. Contents: the status quo and problems we are facing, short descriptions of other annotation tools, the project scope and results, and the report layout.

Chapter 2: Theoretical basis

- Definitions of annotation, types of annotation, and their features. Definitions and descriptions of the algorithms and technologies used in this project: C#, Python, Darknet/YOLO, Tensorflow/SSD_resnet, and auto image annotation.

Chapter 3: GroundTruth system design and analysis

- Introduces Groundtruth, its functions, system analysis, database analysis, and user interface design.

Chapter 4: Testing on Darknet/YOLO and Tensorflow/SSD_resnet

- In this chapter, the authors deploy Groundtruth to export datasets for training and testing on both algorithms, YOLO and SSD_RESNET. After training and testing, the authors compare performance between the Groundtruth and CVAT tools.

Chapter 5: Conclusions and future development

- Conclusions about the project, its pros and cons, and improvements and upgrades in the future.

CHAPTER 2: THEORETICAL BASIS

This chapter discusses the theoretical basis of the project, which includes the definition of annotation, the types of annotation, and their features. In this chapter, the authors also explain the algorithms, YOLO and SSD_resnet, and guide how to set up the environments to train and test these algorithms.

2.1 Annotation Overview:

Annotated data sources are becoming an important part of machine learning, training computers to recognize/detect various types of objects on roads or in other places. Image annotation highlights and labels a particular object by outlining it using a special tool.

In machine learning and deep learning, image annotation is the process of labeling or classifying an image using text, annotation tools, or both, to show the data features you want your model to recognize on its own. When you annotate an image, you are adding metadata to a dataset. [7]

Image annotation is a type of data labeling that is sometimes called tagging, transcribing, or processing. You can also annotate videos continuously, as a stream, or frame by frame.

Image annotation marks the features you want your machine learning system to recognize, and you can use the images to train your model using unsupervised learning. Once your model is deployed, you want it to be able to identify those features in images that have not been annotated and, as a result, make a decision or take some action. [3]

Image annotation is most commonly used to detect objects and boundaries and to segment images for instance, meaning, or image understanding. For each of these uses, it takes a significant amount of data to train, validate, and test a machine learning model to achieve the desired outcome.

- Simple image annotation may involve labeling an image with a phrase that describes the objects pictured in it. For example, you might annotate an image of a cat with the label "domestic house cat." This is also called image classification, tagging, or annotating.
- Complex image annotation can be used to identify, count, or track multiple objects or areas in an image. For example, you might annotate the difference between breeds of cat: perhaps you are training a model to recognize the difference between a Maine Coon cat and a Siamese cat. Both are unique and can be labeled as such. The complexity of your annotation will vary based on the complexity of your project.
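The two styles can be made concrete with a small sketch in Python. The record layout and helper below are illustrative assumptions for this thesis's discussion, not any tool's real schema:

```python
# Simple annotation: the whole image gets one descriptive tag.
simple = {"image": "cat_01.jpg", "label": "domestic house cat"}

# Complex annotation: every object instance gets its own class and box
# (x, y, width, height in pixels), so objects can be identified and counted.
complex_ann = {
    "image": "cats_02.jpg",
    "objects": [
        {"class": "maine_coon", "box": (34, 50, 120, 90)},
        {"class": "siamese", "box": (210, 40, 100, 85)},
    ],
}

def count_per_class(annotation):
    """Count labeled instances of each class in a complex annotation."""
    counts = {}
    for obj in annotation["objects"]:
        counts[obj["class"]] = counts.get(obj["class"], 0) + 1
    return counts
```

With the complex record, a question like "how many cats of each breed are labeled?" is answered directly by `count_per_class`; the simple record can only say what the image as a whole depicts.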

2.1.1 Manual annotation:

Manual annotation is one of the basic tasks in computer vision technology. Annotated images are needed to train machine learning algorithms to recognize the objects contained in visuals and give computers the ability to 'see' almost like we humans do. But manual annotation has its weaknesses: manual image annotation can be time-consuming and quite expensive, especially when the set of images that need annotation is extremely large. It is the human-powered task of adding labels to an image (annotating) to create training datasets for computer vision algorithms. AI and machine learning engineers usually predetermine these labels manually using special image annotation software or tools: they define regions in an image and create text-based descriptions for them.

Figure 2-2: A Manual Annotation Tool For Machine Learning

2.1.2 Auto Annotation:

Auto image annotation has become an integral part of AI development, used to create training data for machine learning. It helps to make objects in images recognizable to machines, and the more annotated images are available as training data for the machines, the higher the accuracy level of prediction will be, allowing AI developers to make the right model.

Figure 2-3: Auto Image Annotation Process (train an initial model, annotate automatically, select segments with low confidence for manual correction, then retrain the model)

In the race to supply such training data, companies are using the automatic route to annotate images by machine and get a high volume of data. AI-based image annotator tools and software have been developed for such needs, and they can produce a large amount of annotated images in less time, fulfilling the needs of machine learning.
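The selection step at the heart of Figure 2-3 can be sketched in a few lines: a trained model proposes labels automatically, and only predictions below a confidence threshold are routed to a human for manual correction before retraining. This is a minimal sketch; the threshold value and prediction record format are assumptions for illustration:

```python
def split_by_confidence(predictions, threshold=0.8):
    """Partition model predictions into auto-accepted annotations and
    low-confidence ones that need manual correction (cf. Figure 2-3)."""
    accepted, needs_review = [], []
    for pred in predictions:
        (accepted if pred["confidence"] >= threshold else needs_review).append(pred)
    return accepted, needs_review

# Example: label/confidence records produced by an initial model.
preds = [
    {"label": "car", "confidence": 0.95},
    {"label": "bike", "confidence": 0.42},  # routed to a human annotator
    {"label": "car", "confidence": 0.88},
]
auto, manual = split_by_confidence(preds)
```

Only the `manual` list costs human time; everything in `auto` flows straight back into the training set, which is why the loop scales better than purely manual annotation.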

BOUNDING BOX ANNOTATION:

It is one of the most common and important methods of image annotation, mainly used to outline the object in the image. In this thesis, this is the annotation feature the authors aim for.

Figure 2-4: Bounding Box Annotation Feature

CUBOID ANNOTATION:

This is called 3D cuboid annotation; it involves a high-quality labeling and marking technique to highlight objects in third-dimension sketching formats. It helps to calculate the depth or distance of various objects like gadgets, buildings, and vehicles, and also of humans, distinguishing the volume and space of the object. 3D cuboid annotation is basically used in the construction and building-structure fields, including radiology imaging in medical fields.

LINE ANNOTATION:

Line annotation is used to draw lines on roads or streets to make them identifiable for training vehicle-perception computer models to detect lanes. It is different from other types of annotation like bounding boxes and cuboid annotation. It is suitable for drawing attention to important areas like roads or streets, decoration, or diagramming process flow, to provide a clear view of streets to a machine like a self-driving car. [8]

2.2 Object Detection Platforms and Algorithms:

2.2.1 Yolo:

2.2.1.1 Yolo's framework: DARKNET

An open-source neural network framework in C.

Darknet is an open-source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. Darknet was developed first by Pjreddie, but after Pjreddie dropped the project, AlexeyAB continued its development.

Darknet is a framework environment that supports the training and testing process of the YOLO algorithm. Besides supporting YOLO, Darknet also supports other models and tasks such as RNNs, Tiny Darknet, and classification on CIFAR-10 and ImageNet.

2.2.1.2 Yolo:

YOLO is short for You Only Look Once. It is a real-time object recognition system that can recognize multiple objects in a single frame. YOLO recognizes objects more precisely and faster than other recognition systems. It can predict up to 9000 classes, even unseen classes. The real-time recognition system will recognize multiple objects in an image and also draw a bounding box around each object. It can be easily trained and deployed in a production system. [4]

The first version of YOLO was created by Pjreddie; over time and across versions, YOLO improved its training/testing accuracy and shortened its training times. At version 3.0, Pjreddie decided to drop YOLO. At present, YOLO is at version 4.0, developed by AlexeyAB, which promises further improvements in performance and accuracy and less time consumed by training. In this project, the authors decided to use YoloV4 as the experiment for the Groundtruth application.

Figure: Performance of YoloV4 compared with other object detectors (MS COCO Object Detection)

2.2.1.4 YoloV4 dataset:

YOLO uses the COCO dataset format, which helps to improve training performance and accuracy and to reduce the time consumed by training.

Performance on the COCO Dataset:

| Model | Train | Test | mAP | FLOPS | FPS |
| SSD300 | COCO trainval | test-dev | 41.2 | | 46 |
| SSD500 | COCO trainval | test-dev | 46.5 | | 19 |
| YOLOv2 608x608 | COCO trainval | test-dev | 48.1 | | 40 |
| Tiny YOLO | COCO trainval | test-dev | 23.7 | | |
| SSD321 | COCO trainval | test-dev | 45.4 | | |
| DSSD321 | COCO trainval | test-dev | 46.1 | | |
| R-FCN | COCO trainval | test-dev | 51.9 | | |
| SSD513 | COCO trainval | test-dev | 50.4 | | |
| DSSD513 | COCO trainval | test-dev | 53.3 | | |
| FPN FRCN | COCO trainval | test-dev | 59.1 | | |
| RetinaNet-50-500 | COCO trainval | test-dev | 50.9 | | |
| RetinaNet-101-500 | COCO trainval | test-dev | 53.1 | | |
| RetinaNet-101-800 | COCO trainval | test-dev | 57.5 | | |
| YOLOv3-320 | COCO trainval | test-dev | 51.5 | 38.97 Bn | |
| YOLOv3-416 | COCO trainval | test-dev | 55.3 | 65.86 Bn | |
| YOLOv3-608 | COCO trainval | test-dev | 57.9 | 140.69 Bn | |
| YOLOv3-tiny | COCO trainval | test-dev | 33.1 | 5.56 Bn | |
| YOLOv3-spp | COCO trainval | test-dev | 60.6 | 141.45 Bn | |

Figure 2-10: Yolo Dataset

The YOLO dataset consists of the components shown below:

Obj folder:

This folder stores the data sources, whose names are converted to MD5 format, and, for each data source, a text file consisting of the labeled class, x coordinate, y coordinate, width, and height. The text file's name and the data source's name are matched, so users will not make mistakes when working on the dataset.

This is where the data sources (pictures and coordinate text files) are stored.

A coordinate text file comes as below:

- Class number, followed by the coordinates as array-type values
- X coordinate: set at the center of the rectangular annotated field, referring to the Oxy coordinate axis
- Y coordinate: set at the center of the rectangular annotated field, referring to the Oxy coordinate axis
- Width: the width of the annotation field
- Height: the height of the annotation field

Figure 2-11: Coordinate Text File Explanation
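The coordinate line described above can be produced from an ordinary pixel bounding box. Below is a minimal sketch (the function name is ours; YOLO itself only consumes the resulting text line) that normalizes a top-left pixel box into the `class x_center y_center width height` format, each value relative to the image size:

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a top-left pixel box (x, y, w, h) into a YOLO label line:
    class number, then center coordinates and size, all normalized to [0, 1]."""
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center,
                                       w / img_w, h / img_h)

# A 100x200 box whose top-left corner is at pixel (150, 100)
# inside a 640x480 image, labeled as class 0:
line = to_yolo_line(0, 150, 100, 100, 200, 640, 480)
```

For the example box this yields `0 0.312500 0.416667 0.156250 0.416667`, matching the layout explained in Figure 2-11.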

Figure 2-12: Folder Obj of Yolo Dataset

Obj.data file: this file contains the dataset's information.

The 'obj.data' file contains the number of classes labeled in a specific dataset. For example, a dataset containing dogs and cats has the number of classes set to 2; if there are only dogs or only cats, then classes is set to 1.

Besides that, 'obj.data' also contains the directories of other config files used as inputs to the environment, such as 'train.txt' and 'obj.names', and the backup folder directory where the trained '.weights' files are stored.
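For the two-class dog/cat example above, the whole 'obj.data' file is only a handful of key-value lines in the Darknet convention (the paths here are illustrative placeholders):

```text
classes = 2
train = data/train.txt
valid = data/val.txt
names = data/obj.names
backup = backup/
```

Darknet reads this file at training time to locate the image lists, the class names, and the folder where it should periodically save '.weights' checkpoints.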

Obj.names: the file where the labeled names are stored, one name per line (e.g. 'Cat').

Figure 2-14: Obj.Names Of Yolo Dataset

Train.txt: this file stores the data sources' directories once set up in the environment, one path per line (e.g. 'data/obj/3d085617356640289721140a6fca308e_1.jpg').

Figure 2-15: Train.txt Of Yolo Dataset

2.2.1.5 Set up environment on Darknet/YoloV4:

This section is a guide on how to install and set up the environment for Darknet and YoloV4. These are the main steps to follow to get an environment that supports training and testing. [4]

- Step 1: Go to AlexeyAB's GitHub and clone the source code to your local machine.

Figure 2-16: Alexeyab's Darknet Github

- Step 2: Download and install the CUDA toolkit.

Note: please select the proper CUDA toolkit version compatible with the GPU you are using.

Note 2: please check that your GPU has a Compute Capability (CC) >= 3.0, per the Darknet/YOLO requirements.

Figure 2-17: Download And Install CUDA Toolkit

- Step 3: Download and install OpenCV with the CMake library.

Note: please check your CUDA toolkit and OpenCV versions for compatibility.

Figure 2-18: Download And Install OpenCV

Figure 2-19: Cmake Configuration With CUDA Toolkit

- Step 4: Download the config file and pre-trained model.

Figure 2-20: Configuration File And Pre-Trained Model

- Step 5: Configure CMake together with AlexeyAB's environment.

Figure 2-21: Config Cmake To Unzip The Environment From Alexeyab

- Step 6: Build the Darknet solution in Visual Studio.

Figure 2-22: Building Darknet/Yolo Environment

- Step 7: After the build in Visual Studio, you can use the Darknet/YOLO environment for training and testing with object detection.
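Once the build succeeds, training and testing are single command lines run against the dataset files described in section 2.2.1.4. A sketch of the canonical invocations (file names follow the AlexeyAB README; your config and weights paths may differ, and on Linux the binary is ./darknet):

```text
rem Train YOLOv4 on the custom dataset, starting from pre-trained conv weights
darknet.exe detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137

rem Test a trained model on a single image
darknet.exe detector test data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_final.weights image.jpg
```

The train command reads 'obj.data' to find 'train.txt', 'obj.names', and the backup folder, and writes '.weights' checkpoints there as training progresses.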


2.2.1.6 Pros and Cons:

Pros:

- Fast: good for real-time processing.
- Predictions (object locations and classes) are made from one single network; it can be trained end-to-end to improve accuracy.
- Processes frames at a rate of 45 fps (larger network) to 150 fps (smaller network), which is better than real time.
- The network is able to generalize the image better.

2.2.2 Tensorflow/SSD_resnet:

2.2.2.1 Tensorflow:

Tensorflow was developed by Google's Google Brain team to pursue the purpose of research and application in machine learning and other fields, such as logistics and AI, in an effective way. Tensorflow was released as a library for computation on 9/11/2015.

Tensorflow can be separated into the definitions below:

- Tensor: defined as the data-structure types gathered in the main library, tensorflow. Inside this structure, there are 3 basic elements: level (rank), dimension (shape), and type.
- In other words, fundamentally Tensorflow is defined not as an environment for AI/machine learning but as a computing library. [5]
