MODEL-BASED VISUAL TRACKING
The OpenTL Framework

GIORGIO PANIN

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose.
No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:
Panin, Giorgio, 1974–
Model-based visual tracking : the OpenTL framework / Giorgio Panin.
p. cm.
ISBN 978-0-470-87613-8 (cloth)
1. Computer vision–Mathematical models. 2. Automatic tracking–Mathematics. 3. Three-dimensional imaging–Mathematics. I. Title. II. Title: Open Tracking Library framework.
TA1634.P36 2011
006.3′7–dc22
2010033315

Printed in Singapore

oBook ISBN: 9780470943922
ePDF ISBN: 9780470943915
ePub ISBN: 9781118002131

10 9 8 7 6 5 4 3 2 1

CONTENTS

PREFACE

1 INTRODUCTION
  1.1 Overview of the Problem
    1.1.1 Models
    1.1.2 Visual Processing
    1.1.3 Tracking
  1.2 General Tracking System Prototype
  1.3 The Tracking Pipeline

2 MODEL REPRESENTATION
  2.1 Camera Model
    2.1.1 Internal Camera Model
    2.1.2 Nonlinear Distortion
    2.1.3 External Camera Parameters
    2.1.4 Uncalibrated Models
    2.1.5 Camera Calibration
  2.2 Object Model
    2.2.1 Shape Model and Pose Parameters
    2.2.2 Appearance Model
    2.2.3 Learning an Active Shape or Appearance Model
  2.3 Mapping Between Object and Sensor Spaces
    2.3.1 Forward Projection
    2.3.2 Back-Projection
  2.4 Object Dynamics
    2.4.1 Brownian Motion
    2.4.2 Constant Velocity
    2.4.3 Oscillatory Model
    2.4.4 State Updating Rules
    2.4.5 Learning AR Models

3 THE VISUAL MODALITY ABSTRACTION
  3.1 Preprocessing
  3.2 Sampling and Updating Reference Features
  3.3 Model Matching with the Image Data
    3.3.1 Pixel-Level Measurements
    3.3.2 Feature-Level Measurements
    3.3.3 Object-Level Measurements
    3.3.4 Handling Mutual Occlusions
    3.3.5 Multiresolution Processing for Improving Robustness
  3.4 Data Fusion Across Multiple Modalities and Cameras
    3.4.1 Multimodal Fusion
    3.4.2 Multicamera Fusion
    3.4.3 Static and Dynamic Measurement Fusion
    3.4.4 Building a Visual Processing Tree

4 EXAMPLES OF VISUAL MODALITIES
  4.1 Color Statistics
    4.1.1 Color Spaces
    4.1.2 Representing Color Distributions
    4.1.3 Model-Based Color Matching
    4.1.4 Kernel-Based Segmentation and Tracking
  4.2 Background Subtraction
  4.3 Blobs
    4.3.1 Shape Descriptors
    4.3.2 Blob Matching Using Variational Approaches
  4.4 Model Contours
    4.4.1 Intensity Edges
    4.4.2 Contour Lines
    4.4.3 Local Color Statistics
  4.5 Keypoints
    4.5.1 Wide-Baseline Matching
    4.5.2 Harris Corners
    4.5.3 Scale-Invariant Keypoints
    4.5.4 Matching Strategies for Invariant Keypoints
  4.6 Motion
    4.6.1 Motion History Images
    4.6.2 Optical Flow
  4.7 Templates
    4.7.1 Pose Estimation with AAM
    4.7.2 Pose Estimation with Mutual Information

5 RECURSIVE STATE-SPACE ESTIMATION
  5.1 Target-State Distribution
  5.2 MLE and MAP Estimation
    5.2.1 Least-Squares Estimation
    5.2.2 Robust Least-Squares Estimation
  5.3 Gaussian Filters
    5.3.1 Kalman and Information Filters
    5.3.2 Extended Kalman and Information Filters
    5.3.3 Unscented Kalman and Information Filters
  5.4 Monte Carlo Filters
    5.4.1 SIR Particle Filter
    5.4.2 Partitioned Sampling
    5.4.3 Annealed Particle Filter
    5.4.4 MCMC Particle Filter
  5.5 Grid Filters

6 EXAMPLES OF TARGET DETECTORS
  6.1 Blob Clustering
    6.1.1 Localization with Three-Dimensional Triangulation
  6.2 AdaBoost Classifiers
    6.2.1 AdaBoost Algorithm for Object Detection
    6.2.2 Example: Face Detection
  6.3 Geometric Hashing
  6.4 Monte Carlo Sampling
  6.5 Invariant Keypoints

7 BUILDING APPLICATIONS WITH OpenTL
  7.1 Functional Architecture of OpenTL
    7.1.1 Multithreading Capabilities
  7.2 Building a Tutorial Application with OpenTL
    7.2.1 Setting the Camera Input and Video Output
    7.2.2 Pose Representation and Model Projection
    7.2.3 Shape and Appearance Model
    7.2.4 Setting the Color-Based Likelihood
    7.2.5 Setting the Particle Filter and Tracking the Object
    7.2.6 Tracking Multiple Targets
    7.2.7 Multimodal Measurement Fusion
  7.3 Other Application Examples

APPENDIX A: POSE ESTIMATION
  A.1 Point Correspondences
    A.1.1 Geometric Error
    A.1.2 Algebraic Error
    A.1.3 2D-2D and 3D-3D Transforms
    A.1.4 DLT Approach for 3D-2D Projections
  A.2 Line Correspondences
    A.2.1 2D-2D Line Correspondences
  A.3 Point and Line Correspondences
  A.4 Computation of the Projective DLT Matrices

APPENDIX B: POSE REPRESENTATION
  B.1 Poses Without Rotation
    B.1.1 Pure Translation
    B.1.2 Translation and Uniform Scale
    B.1.3 Translation and Nonuniform Scale
  B.2 Parameterizing Rotations
  B.3 Poses with Rotation and Uniform Scale
    B.3.1 Similarity
    B.3.2 Rotation and Uniform Scale
    B.3.3 Euclidean (Rigid Body) Transform
    B.3.4 Pure Rotation
  B.4 Affinity
  [...]

... lectures on model-based visual tracking that I have given at the Chair since 2006. I therefore wish to express my deep sense of appreciation for the input and feedback of my students, some of whom later joined the Visual Tracking Group.

Giorgio Panin

CHAPTER 1
INTRODUCTION

Visual object tracking ...

... libraries for model-based visual tracking are available, and most existing software deals with more or less limited application domains, not easily allowing extensions or inclusion of different methodologies in a modular and scalable way. Therefore, a unifying, general-purpose, open framework is becoming a compelling issue for both users and researchers in the field. This challenging target constitutes the main motivation of the present work, where a twofold goal is pursued:

1. Formulating a common and nonredundant description vocabulary for multimodal, multicamera, and multitarget visual tracking schemes
2. Implementing an object-oriented...

Figure 1.1 Model-based object tracking. Left: object model; middle: visual features; right: estimated pose.

...
• Performing visual processing, to obtain measurements associated with objects in order to carry out detection or state updating procedures
• Tracking the objects through time using a prediction–measurement–update loop

[Figure: block diagram; labels recovered from the extraction: Sensors, Objects, Environment, Models, Detection/Recognition, Pre-processing, Features Sampling, Object Tracking, Visual...]

... deformation, and appearance parameters is going to be estimated during tracking
• Temporal dynamics: a probabilistic model of the temporal-state evolution, possibly taking into account mutual interactions in a multitarget scenario
• Camera model: available (intrinsic and extrinsic) camera parameters for space-to-image mapping

Model-Based Visual Tracking: The OpenTL Framework, First Edition. Giorgio Panin. © 2011...

PREFACE

Object tracking is a broad and important field in computer science, addressing the most different applications in the educational, entertainment, industrial, and manufacturing areas. Since the early days of computer vision, the state of the art of visual object tracking has evolved greatly, along with the available imaging devices and...
methodologies. Many of the low-level image processing and understanding algorithms involved in a visual tracking system can now be found in open-source vision libraries such as the Intel OpenCV [15], which provides a worldwide standard; and at the same time, powerful programmable graphics hardware makes it possible both to visualize and to perform computations with very complex object models in negligible time...

... up to model-based object detection and sequential localization, and defines, at the application level, what we call the tracking pipeline. Within this framework, extensive use of graphics hardware (GPU computing) as well as distributed processing allows real-time performances for complex models and sensory systems.

The book is organized as follows: In Chapter 1 we present our approach to the object-tracking problem in the most abstract terms. In particular, we define the three main issues involved: models, vision, and tracking, a structure that we follow in subsequent chapters. A generic tracking system flow diagram, the main tracking pipeline, is presented in Section 1.3.

The model layer is described in...

... scalable, and parallelizable way.

1.1 OVERVIEW OF THE PROBLEM

The lack of a complete and general-purpose architecture for model-based tracking can be attributed in part to the apparent problem complexity: An extreme variety of scenarios with interacting objects, as well as many heterogeneous visual modalities that can be defined, processed, and combined in virtually infinite ways [169], may discourage any attempt ...

... aspects of an object tracking task: models, vision, and tracking.

[Figure: block diagram; labels recovered from the extraction: Object Tracking, Pre-processing, Visual processing, Data fusion, Tracking, Target Update, Measurement, Detection/Recognition, Target Prediction, Models, Objects, Sensors, Environment, Features Sampling, Occlusion Handling, Data association.]
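
The prediction–measurement–update loop named in the excerpts above can be illustrated with a minimal sketch. It assumes a one-dimensional target state, Brownian-motion dynamics, and a scalar Kalman step as a stand-in estimator. The function names (`predict`, `update`, `track`) and the noise values are hypothetical, chosen only for illustration; this is not the OpenTL API and not the pipeline the book itself defines.

```python
# Minimal sketch (assumption: scalar state, Gaussian noise, Brownian dynamics).
# All names and default values here are illustrative, not taken from OpenTL.

def predict(mean, var, process_var):
    """Prediction: Brownian motion keeps the mean and inflates the variance."""
    return mean, var + process_var


def update(mean, var, z, meas_var):
    """Update: fuse the predicted state with measurement z (scalar Kalman step)."""
    gain = var / (var + meas_var)        # weight given to the measurement
    new_mean = mean + gain * (z - mean)  # correct the prediction by the residual
    new_var = (1.0 - gain) * var         # uncertainty shrinks after fusion
    return new_mean, new_var


def track(measurements, mean0=0.0, var0=1.0, process_var=0.01, meas_var=0.25):
    """Run the prediction-measurement-update loop over a measurement sequence."""
    mean, var = mean0, var0
    estimates = []
    for z in measurements:               # z plays the role of the measurement step
        mean, var = predict(mean, var, process_var)
        mean, var = update(mean, var, z, meas_var)
        estimates.append(mean)
    return estimates
```

Calling `track` on a list of noisy position readings returns one state estimate per frame. In a real system the measurement would come from the visual processing stage rather than from a precomputed list.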