rapid object detection using a boosted cascade of


Page 1

Object Recognition

Using the Adaptive Boosting Algorithm

Original Authors

Paul Viola & Michael Jones

Presenter: Nguyễn Đăng Bình

Rapid: moving or acting with great speed

Boost: increase the strength or value of something

Page 2

• Introduction

• The boosting algorithm for classifier learning

• Feature selection

• Weak learner constructor

• The strong classifier

• Difficulties (a tremendously difficult problem)

• Results

• Conclusion

Page 3

What have we done?

• A machine learning approach to visual object detection and recognition

• The ability to process images extremely rapidly

• Achieving high detection rates

• A new image representation: the integral image

• A learning algorithm (based on AdaBoost [5])

• A method for combining classifiers: the cascade

of the image

Page 4

Working only with a single grey-scale image

A demonstration on face detection

• A frontal face detection system

• The detector runs at 15 frames per second without resorting to image differencing or skin color detection

Image differencing: the difference between frames in a video sequence

384 x 288 pixel images on a 700 MHz Pentium III

Page 5

The broad practical applications for an extremely fast face detector

Teleconferencing

Small low-power devices

Compaq iPaq: 2 frames/sec

Page 6

Training process for the classifier

Train the classifier to detect examples of a particular class: a supervised training process

In the domain of face detection

Page 7

Cascaded detection process

A sequence of classifiers, each slightly more complex than the last

If any classifier rejects the sub-window, no further processing is performed

A degenerate decision tree
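The early-rejection behaviour of the cascade can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each stage is a callable that returns True to pass a sub-window on and False to reject it. The names `cascade_accepts` and `make_stage`, and the use of a plain number in place of a sub-window, are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical stage type: returns True to pass the sub-window on, False to reject it.
Stage = Callable[[object], bool]


def cascade_accepts(stages: Sequence[Stage], sub_window: object) -> bool:
    """Run the stages in order and stop at the first rejection."""
    for stage in stages:
        if not stage(sub_window):
            return False  # rejected: no further processing is performed
    return True  # the sub-window survived every stage


# Toy stages operating on a number that stands in for a sub-window.
calls = []


def make_stage(name, threshold):
    def stage(value):
        calls.append(name)  # record which stages actually ran
        return value > threshold
    return stage


stages = [make_stage("stage1", 0), make_stage("stage2", 5), make_stage("stage3", 9)]
result = cascade_accepts(stages, 3)  # stage2 rejects, so stage3 never runs
```

Because most sub-windows in an image are rejected by the early, cheap stages, the later and more complex stages run on only a small fraction of the input.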

Page 8

Our object detection framework

Original Image

Integral Image

In order to compute features rapidly at many

Page 9

Feature Selection

The detection process is based on features rather than on the pixels directly.

Two reasons:

• Features can encode ad-hoc domain knowledge that is difficult to learn using a finite quantity of training data.

• The feature-based system operates much faster than a pixel-based system.

The features are reminiscent of the Haar basis functions which have been used by Papageorgiou et al. [9]

Page 10

Rectangle Features

Three kinds of features:

• Two-rectangle feature: the difference between the sums of the pixels within two rectangular regions

• Three-rectangle feature: the sum within two outside rectangles subtracted from the sum in a center rectangle

• Four-rectangle feature: the difference between the diagonal pairs of rectangles

The base resolution is 24x24

The exhaustive set of rectangle features is large, over 180,000.

Page 11

The integral image:

$$ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')$$

where $ii(x, y)$ is the integral image and $i(x, y)$ is the original image.

The pair of recurrences for computing the integral image in one pass:

$$s(x, y) = s(x, y - 1) + i(x, y)$$

$$ii(x, y) = ii(x - 1, y) + s(x, y)$$

where $s(x, y)$ is a cumulative sum, $s(x, -1) = 0$, and $ii(-1, y) = 0$.

Page 12

Calculating any rectangle sum with integral image
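A minimal Python sketch of the integral image and the rectangle-sum lookup follows. It assumes the image is a list of rows of numbers indexed as `img[y][x]`. The function names `integral_image`, `rect_sum` and `two_rect_feature` are hypothetical, and the left-minus-right sign convention of the two-rectangle feature is an assumption made for illustration.

```python
def integral_image(img):
    """One-pass integral image using the recurrences
    s(x, y) = s(x, y-1) + i(x, y) and ii(x, y) = ii(x-1, y) + s(x, y)."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    s = [0] * width  # s(x, y-1) for each x; starts at s(x, -1) = 0
    for y in range(height):
        left = 0  # ii(x-1, y); starts at ii(-1, y) = 0
        for x in range(width):
            s[x] += img[y][x]  # s(x, y)
            left += s[x]       # ii(x, y)
            ii[y][x] = left
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle whose top-left pixel is (x, y),
    computed from four integral-image lookups."""
    def at(cx, cy):
        # Positions outside the image on the top or left contribute zero.
        return ii[cy][cx] if cx >= 0 and cy >= 0 else 0

    x2, y2 = x + w - 1, y + h - 1
    return at(x2, y2) - at(x - 1, y2) - at(x2, y - 1) + at(x - 1, y - 1)


def two_rect_feature(ii, x, y, w, h):
    """Left rectangle sum minus the adjacent right rectangle sum (assumed sign)."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
bottom_right = rect_sum(ii, 1, 1, 2, 2)     # 5 + 6 + 8 + 9 = 28
feature = two_rect_feature(ii, 0, 0, 1, 3)  # (1 + 4 + 7) - (2 + 5 + 8) = -3
```

Once the integral image exists, the cost of a rectangle sum does not depend on the rectangle's size, which is what makes evaluating features at many scales cheap.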

Page 13

AdaBoost learning algorithm

• Is used to do the feature selection task

• Over 180,000 rectangle features are associated with each 24 x 24 sub-image

Weak Learner 1

Weak Learner 2

The final strong classifier

Page 14

Step 1: Given example images $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_i$ is an image and $y_i = 0, 1$ for negative and positive examples respectively.

Step 2: Initialize the weights

$$w_{1,i} = \frac{1}{2m}, \frac{1}{2l} \quad \text{for } y_i = 0, 1 \text{ respectively,}$$

where $m$ and $l$ are the numbers of negatives and positives.

For $t = 1, \ldots, T$:

1 Normalize the weights,

$$w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}}$$

so that $w_t$ is a probability distribution.

2 For each feature $j$, train a classifier $h_j$ which is restricted to using a single feature. The error is evaluated with respect to $w_t$:

$$\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$$

Choose the classifier $h_t$ with the lowest error $\epsilon_t$.

3 Update the weights:

$$w_{t+1,i} = w_{t,i} \, \beta_t^{1 - e_i}$$

where $e_i = 0$ if example $x_i$ is classified correctly, $e_i = 1$ otherwise, and $\beta_t = \frac{\epsilon_t}{1 - \epsilon_t}$.

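Page 15

Over 180,000 features for each sub-image

Each candidate classifier $h_1, \ldots, h_{180,000}$ is scored by its weighted error:

$$\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$$

Choose the classifier $h_t$ with the lowest error $\epsilon_t$.

Update the weights:

$$w_{t+1,i} = w_{t,i} \, \beta_t^{1 - e_i}$$

where $e_i = 0$ if example $x_i$ is classified correctly, $e_i = 1$ otherwise, and $\beta_t = \frac{\epsilon_t}{1 - \epsilon_t}$.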
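The selection loop can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: the weak classifiers come from a fixed, pre-built pool of callables rather than being trained per feature in each round, and the final decision rule (weights $\alpha_t = \log(1/\beta_t)$, threshold at half their sum) follows the original paper, since it does not appear in the recovered slide text. The names `adaboost`, `strong_classify` and `pool` are hypothetical.

```python
import math


def adaboost(examples, labels, candidates, rounds):
    """Pick `rounds` weak classifiers from `candidates` by lowest weighted error.

    examples: list of inputs; labels: list of 0/1 values;
    candidates: callables returning 0 or 1 (stand-ins for single-feature classifiers).
    Returns a list of (alpha_t, h_t) pairs.
    """
    m = labels.count(0)  # number of negatives
    l = labels.count(1)  # number of positives
    weights = [1 / (2 * m) if y == 0 else 1 / (2 * l) for y in labels]
    chosen = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]  # normalize to a distribution
        # Weighted error of each candidate: sum_i w_i * |h_j(x_i) - y_i|
        errors = [
            sum(w * abs(h(x) - y) for w, x, y in zip(weights, examples, labels))
            for h in candidates
        ]
        best = min(range(len(candidates)), key=errors.__getitem__)
        # Clamp the error so beta and log(1 / beta) stay finite (an added safeguard).
        eps = min(max(errors[best], 1e-10), 1 - 1e-10)
        beta = eps / (1 - eps)
        h_t = candidates[best]
        # w_{t+1,i} = w_{t,i} * beta^(1 - e_i): only correct examples are scaled down.
        weights = [
            w * (beta if h_t(x) == y else 1.0)
            for w, x, y in zip(weights, examples, labels)
        ]
        chosen.append((math.log(1 / beta), h_t))
    return chosen


def strong_classify(chosen, x):
    """Return 1 if sum_t alpha_t * h_t(x) >= 0.5 * sum_t alpha_t, else 0."""
    score = sum(alpha * h(x) for alpha, h in chosen)
    return 1 if score >= 0.5 * sum(alpha for alpha, _ in chosen) else 0


# Toy one-dimensional data; the pool holds three threshold rules.
xs = [1, 2, 3, 6, 7, 8]
ys = [0, 0, 0, 1, 1, 1]
pool = [lambda x: int(x > 4), lambda x: int(x > 7), lambda x: int(x < 2)]
model = adaboost(xs, ys, pool, rounds=2)
predictions = [strong_classify(model, x) for x in xs]
```

Scaling down the weights of correctly classified examples means that, after normalization, the misclassified examples carry relatively more weight in the next round.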

Page 16

Training the weak learner

$$h_j(x) = \begin{cases} 1 & \text{if } p_j f_j(x) < p_j \theta_j \\ 0 & \text{otherwise} \end{cases}$$

where $f_j$ is a feature, $\theta_j$ is a threshold, and $p_j$ is a parity indicating the direction of the inequality sign.

Diagram labels: false positive, false negative
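A brute-force Python sketch of choosing the threshold and parity for a single feature is given below. It assumes the feature has already been evaluated on every training example, and it tries every midpoint between sorted feature values; this exhaustive search is a simplification chosen for clarity, not a claim about how the original system was implemented. The name `train_stump` is hypothetical.

```python
def train_stump(feature_values, labels, weights):
    """Find (error, theta, parity) minimizing the weighted error of
    h(x) = 1 if parity * f(x) < parity * theta else 0."""
    values = sorted(set(feature_values))
    # Candidate thresholds: below the minimum, between neighbours, above the maximum.
    thresholds = (
        [values[0] - 1]
        + [(a + b) / 2 for a, b in zip(values, values[1:])]
        + [values[-1] + 1]
    )
    best = None
    for theta in thresholds:
        for parity in (1, -1):
            # Weighted error: total weight of the misclassified examples.
            error = sum(
                w
                for f, y, w in zip(feature_values, labels, weights)
                if (1 if parity * f < parity * theta else 0) != y
            )
            if best is None or error < best[0]:
                best = (error, theta, parity)
    return best


# Toy data: positives have small feature values, so parity +1 is expected.
features = [1.0, 2.0, 3.0, 4.0]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]
error, theta, parity = train_stump(features, labels, weights)
```

Flipping the parity to -1 turns the rule into "feature value greater than threshold", so one search covers both directions of the inequality.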

Page 17

• The most weight is placed on the examples most often misclassified by the preceding weak rules

• This forces the base learner to focus its attention on the “hardest” examples

Page 18

The Boost algorithm for classifier learning

Step 1: Given example images $(x_1, y_1), \ldots, (x_n, y_n)$

Step 2: Initialize the weights

$$w_{1,i} = \frac{1}{2m}, \frac{1}{2l} \quad \text{for } y_i = 0, 1 \text{ respectively,}$$

where $m$ and $l$ are the numbers of negatives and positives.

For $t = 1, \ldots, T$:

2 For each feature $j$, train a classifier $h_j$ which is restricted to using a single feature

Weak learner constructor

$$\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$$

Page 19

The Big Picture on testing

Page 20

A tremendously difficult problem

• The number of classifier stages

• The number of features in each stage

• The threshold of each stage

Page 21

Cascade diagram: a sequence of AdaBoost learners. Each learner passes faces on to the next learner; a False result rejects the sub-window as a non-face.

Each stage: 100% detection rate, 50% false positive rate
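To see why a 50% false positive rate per stage is acceptable, the per-stage rates can be multiplied along the cascade. The Python sketch below assumes every stage has the same rates and that stages act independently on the sub-windows that reach them; both are simplifying assumptions, and the choice of 10 stages is purely illustrative. The function name `cascade_rates` is hypothetical.

```python
def cascade_rates(stage_detection_rate, stage_false_positive_rate, num_stages):
    """Overall rates of a cascade whose stages all share the same rates,
    assuming the stages behave independently."""
    overall_detection = stage_detection_rate ** num_stages
    overall_false_positive = stage_false_positive_rate ** num_stages
    return overall_detection, overall_false_positive


# Ten stages, each with 100% detection and 50% false positives.
detection, false_positive = cascade_rates(1.0, 0.5, 10)
```

Under these assumptions the false positive rate halves with every added stage while the detection rate stays unchanged, which is the trade-off that choosing each stage's threshold has to balance.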

Page 22

trained to detect frontal upright faces

24x24.

Page 24

Outline

Set

Page 25

Speed of the final Detector

• The speed is directly related to the number of features evaluated per scanned sub-window.

• On the MIT+CMU test set, an average of 10 features out of a total of 6061 are evaluated per sub-window.

• On a 700 MHz Pentium III, the detector processes a 384 x 288 pixel image in about 0.067 seconds (using a starting scale of 1.25 and a step size of 1.5)

Page 26

Image Processing

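Page 27

Scanning the Detector

• The detector is scanned across the image at multiple scales and locations

• The scales are a factor of 1.25 apart

• Subsequent locations are obtained by shifting the window some number of pixels Δ; at scale s the window is shifted by [sΔ], where [] is the rounding operation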
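The scanning scheme can be sketched as a Python generator that enumerates sub-windows. This is an illustration under assumptions: the default parameter values, the square window, and the minimum step of one pixel are choices made for the example, and Python's built-in `round` (which rounds halves to the nearest even integer) stands in for the rounding operation. The name `scan_windows` is hypothetical.

```python
def scan_windows(image_width, image_height, base_size=24,
                 scale_factor=1.25, start_scale=1.0, delta=1.0):
    """Yield (x, y, size) for every sub-window position at every scale."""
    scale = start_scale
    while True:
        size = int(round(base_size * scale))
        if size > image_width or size > image_height:
            break  # the window no longer fits in the image
        # Shift by [scale * delta] pixels, never less than one pixel.
        step = max(1, int(round(scale * delta)))
        for y in range(0, image_height - size + 1, step):
            for x in range(0, image_width - size + 1, step):
                yield x, y, size
        scale *= scale_factor  # successive scales are a fixed factor apart


# A tiny 30 x 30 image: 49 windows of size 24, then one window of size 30.
windows = list(scan_windows(30, 30))
```

Larger values of `delta` reduce the number of sub-windows and so speed up detection, at the cost of coarser localization.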

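Page 28

Integration of Multiple Detections

• Multiple detections usually occur around each face and around some types of false positives

• The detected sub-windows are post-processed in order to combine overlapping detections into a single detection

• Two detections are in the same subset if their bounding regions overlap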
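One way to combine overlapping detections is to partition them into groups with a union-find structure and emit one box per group, as in the Python sketch below. Boxes are assumed to be `(x1, y1, x2, y2)` tuples. Averaging the corners of each group follows the rule described in the original paper; it is not spelled out in the recovered slide text, so treat it as an assumption here. The names `overlaps` and `merge_detections` are hypothetical.

```python
def overlaps(a, b):
    """True if boxes (x1, y1, x2, y2) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_detections(boxes):
    """Group boxes that overlap (directly or through a chain of overlaps)
    and return one averaged box per group."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)  # put both boxes in the same subset

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Each corner of the final box is the average over the boxes in the group.
    return [
        tuple(sum(coords) / len(group) for coords in zip(*group))
        for group in groups.values()
    ]


detections = [(0, 0, 10, 10), (2, 2, 12, 12), (50, 50, 60, 60)]
merged = merge_detections(detections)
```

The pairwise comparison is quadratic in the number of detections, which is acceptable for the handful of detections typically left after the cascade has run.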

Page 29

Experiments on a Real-World Test Set

Result

The MIT+CMU frontal face test set consists of 130 images with 507 labeled frontal faces

Page 30

Experiments on a Real-World Test Set

Result

Table: detection rates for various numbers of false positives on the MIT+CMU test set containing 130 images and 507 faces.

Our detector

Page 31

ROC curve for the face detector on the MIT+CMU test set

The detector was run using a step size of 1.0 and a starting scale of 1.0

Page 32

A simple voting scheme to further improve results

Result

• The 38-layer detector described above plus two similarly trained detectors

• Output the majority vote of the three detectors

The improvement would be greater if the detectors were more independent
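The voting rule itself is small enough to show directly. The Python sketch below assumes each detector has already produced a True/False decision for the same sub-window; how decisions from different detectors are matched to one another is not covered by the slide and is left out here. The name `majority_vote` is hypothetical.

```python
def majority_vote(votes):
    """Return True if more than half of the detectors voted 'face'."""
    return sum(votes) * 2 > len(votes)


# Three detectors: two vote 'face', one votes 'non-face'.
decision = majority_vote([True, True, False])
```

With three detectors, a single detector's false positive is outvoted unless a second detector makes the same mistake, which is why more independent detectors would help more.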

Page 33

Output of our face detector from the MIT+CMU test set

Page 34

• An approach to object detection that minimizes computation time while achieving a high detection rate

• New algorithms, representations, and insights which are quite generic

• The detector is approximately 15 times faster than any previous approach

Page 35

• The test set includes faces under a very wide range of conditions, including illumination, scale, pose, and camera variation
