Springer: Visual Analysis of Humans


Structure

  • Cover

  • Visual Analysis of Humans

  • ISBN 9780857299963

  • Foreword

  • Preface

  • Contents

    • Contributors

  • Part I: Detection and Tracking

    • Chapter 1: Is There Anybody Out There?

      • 1.1 Detection

        • 1.1.1 Data Acquisition

        • 1.1.2 Pixel-Based Detection

        • 1.1.3 Object-Based Detection

      • 1.2 Tracking

      • 1.3 Future Trends in Detection and Tracking

      • References

    • Chapter 2: Beyond the Static Camera: Issues and Trends in Active Vision

      • 2.1 Introduction

      • 2.2 Active Camera Configurations

        • 2.2.1 The Autonomous Camera Approach

        • 2.2.2 The Master/Slave Approach

        • 2.2.3 The Active Camera Network Approach

        • 2.2.4 Environmental Reasoning

      • 2.3 The Autonomous Camera: A Hands-on Experience

        • 2.3.1 Camera-World Model

        • 2.3.2 Estimation

        • 2.3.3 Control

        • 2.3.4 System Performance

          • 2.3.4.1 Simulated Data

          • 2.3.4.2 Live Cameras

        • 2.3.5 Closing Remarks

      • 2.4 Active Vision in Practice: A Case Study

        • 2.4.1 Practical Considerations

        • 2.4.2 HERMES Hardware Platform

        • 2.4.3 HERMES Software Platform

      • 2.5 Conclusions: Learning from the Past to Foresee the Future

      • References

    • Chapter 3: Figure-Ground Segmentation – Pixel-Based

      • 3.1 Introduction

      • 3.2 Challenges in Scene Modeling

        • Illumination changes:

        • Motion changes:

        • Structural changes:

      • 3.3 Statistical Scene Modeling

        • 3.3.1 Parametric Background Models

        • 3.3.2 Nonparametric Background Models

          • 3.3.2.1 KDE-Background Practice and Other Nonparametric Models

      • 3.4 Moving Shadow Suppression

      • 3.5 Tradeoffs in Background Maintenance

        • Background update rate:

        • Selective vs. blind update:

      • 3.6 Background Subtraction from a Moving Camera

      • 3.7 Conclusion and Further Reading

      • References

    • Chapter 4: Figure-Ground Segmentation – Object-Based

      • 4.1 Introduction

      • 4.2 Challenges and Outline

      • 4.3 Object Detection Approaches

        • 4.3.1 Sliding-Window Object Detection

        • 4.3.2 Holistic Detector Representations

        • 4.3.3 Part-Based Detectors

        • 4.3.4 Local Feature-Based Detectors

        • 4.3.5 Use for Tracking-by-Detection

      • 4.4 Figure-Ground Segmentation

        • 4.4.1 Bounding Box Priors

        • 4.4.2 Bottom-Up Segmentation Refinement

        • 4.4.3 Class-Specific Top-Down Segmentation

        • 4.4.4 Combinations

      • 4.5 Object Propagation

        • 4.5.1 Appearance-Based Tracking

        • 4.5.2 Online Classification

      • 4.6 Applications

        • 4.6.1 Tracking-by-Detection from a Moving Platform

        • 4.6.2 Tracking Using the Continuous Detector Confidence

        • 4.6.3 Tracking-Learning-Detection

        • 4.6.4 Articulated Multi-person Tracking

      • 4.7 Conclusion

      • References

    • Chapter 5: Face Detection

      • 5.1 Introduction

      • 5.2 Categorization of Existing Approaches

      • 5.3 Representative Detection Methods: AdaBoost and Bag-of-Words

        • 5.3.1 AdaBoost

          • 5.3.1.1 Integral Image Representation

          • 5.3.1.2 Feature Selection

          • 5.3.1.3 Focus of Attention

          • 5.3.1.4 Extension to Other Facial Variations

        • 5.3.2 Bag-of-Words

          • 5.3.2.1 Interest-Point Detection and Representation

          • 5.3.2.2 Dictionary Formation

          • 5.3.2.3 Image Representation and Classification

      • 5.4 Context

        • 5.4.1 Detecting Faces Using Semantic Context

          • 5.4.1.1 Design of Detectors

          • 5.4.1.2 Inference Through First-Order Logic [43]

          • 5.4.1.3 Probabilistic Inference Using Markov Logic Networks [9]

      • 5.5 Computational Efficiency

        • 5.5.1 Navigating the Search Space

        • 5.5.2 Efficient Image Representations

          • 5.5.2.1 Fast Contour-Based Face (Object) Localization

      • 5.6 Other Related Topics and Conclusions

        • 5.6.1 Online Learning

        • 5.6.2 Evaluations

      • References

    • Chapter 6: Wide Area Tracking in Single and Multiple Views

      • 6.1 Introduction

      • 6.2 Review of Multi-Target Tracking Approaches

        • 6.2.1 Kalman Filter Based Tracker

        • 6.2.2 Particle Filter Based Tracker

        • 6.2.3 Multi-Hypothesis Tracking (MHT)

        • 6.2.4 Joint Probabilistic Data Association Filters (JPDAF)

      • 6.3 Errors in Multi-Target Tracking

        • Detection of lost track:

        • Track switching:

        • 6.3.1 Solution Strategies

        • 6.3.2 Tracklet Affinity Modeling

          • Appearance model:

          • Motion model:

      • 6.4 Tracking in Camera Networks

      • 6.5 A Stochastic Graph Evolution Framework for Tracklet Association

      • 6.6 Identifying Transitions to Groups and Crowds

      • 6.7 Performance Analysis

        • 6.7.1 Single Camera Tracking Performance

        • 6.7.2 Camera Network Tracking

      • 6.8 Conclusions and Thoughts on Future Work

      • References

    • Chapter 7: Benchmark Datasets for Detection and Tracking

      • 7.1 Datasets

      • 7.2 Ground Truth

        • 7.2.1 Annotation Types

        • 7.2.2 Annotation Tools

        • 7.2.3 Ground Truth Quality

      • 7.3 Evaluation

        • 7.3.1 Evaluation Types and Metrics

          • Notation

        • 7.3.2 Tools Available for Evaluation

      • 7.4 Future Direction

        • 7.4.1 Web-Based Evaluation Tools

        • 7.4.2 Web-Based Evaluation Results

        • 7.4.3 Standardization of Metrics and Benchmark Datasets

        • 7.4.4 Further Reading

      • References

  • Part II: Pose Estimation

    • Chapter 8: Articulated Pose Estimation and Tracking: Introduction

      • 8.1 Introduction

      • 8.2 Challenges

      • 8.3 Models and Inference

        • 8.3.1 Generative Methods

        • 8.3.2 Discriminative Methods

        • 8.3.3 Part-Based Methods

        • 8.3.4 Geometric Methods

          • Reconstruction-based:

          • Factorization-based:

      • 8.4 Alternative Sensors

      • References

    • Chapter 9: Model-Based Pose Estimation

      • 9.1 Kinematic Parametrization

        • 9.1.1 Rotation Matrices

        • 9.1.2 Euler Angles

        • 9.1.3 Quaternions

        • 9.1.4 Axis-Angle

          • 9.1.4.1 The Exponential Formula

          • 9.1.4.2 Exponential Maps for Rigid Body Motions

          • 9.1.4.3 The Logarithm

          • 9.1.4.4 Adjoint Transformation

        • 9.1.5 Kinematic Chains

          • 9.1.5.1 The Articulated Jacobian

        • 9.1.6 Human Pose Parametrization

          • 9.1.6.1 The Pose Jacobian

      • 9.2 Model Creation

        • 9.2.1 Geometric Primitives

        • 9.2.2 Detailed Body Scans

        • 9.2.3 Detailed Shape from Images

      • 9.3 Optimization

        • 9.3.1 Local Optimization

          • 9.3.1.1 Correspondence-Based

            • Feature extraction:

            • Model image association:

            • Descent strategies:

            • Different error functions:

            • Combining features:

          • 9.3.1.2 Optical Flow-Based

          • 9.3.1.3 Region-Based

          • 9.3.1.4 Probabilistic Interpretation

        • 9.3.2 Particle-Based Optimization and Filtering

          • 9.3.2.1 Particle Filter

            • Likelihood functions:

          • 9.3.2.2 Annealed Particle Filter

          • 9.3.2.3 Tailored Optimization Methods

      • 9.4 Discussion

      • References

    • Chapter 10: Motion Models for People Tracking

      • 10.1 Introduction

        • 10.1.1 Human Pose Tracking

      • 10.2 Kinematic Joint Limits and Smooth Motion

      • 10.3 Linear Kinematic Models

        • 10.3.1 Motion PCA: Evolving Pose Subspace

      • 10.4 Nonlinear Kinematic Models

        • 10.4.1 Gaussian Process Latent Variable Model

        • 10.4.2 Gaussian Process Dynamical Model

        • 10.4.3 Constrained Latent Spaces and Other Variants

          • 10.4.3.1 Back-Constraints and Topological Constraints

          • 10.4.3.2 Multi-factor GPLVM

          • 10.4.3.3 Hierarchical GPLVM

        • 10.4.4 Switching Linear Dynamical Systems

        • 10.4.5 Conditional Restricted Boltzmann Machines

        • 10.4.6 Heterogeneity, Compositionality, and Exogenous Factors

      • 10.5 Newtonian (Physics-Based) Models

        • 10.5.1 Planar Models of Locomotion

        • 10.5.2 Discussion: 3D Full-Body Models

      • 10.6 Discussion

      • References

    • Chapter 11: Part-Based Models for Finding People and Estimating Their Pose

      • 11.1 Introduction

        • Contemporary work:

        • Star models:

        • Tree models:

        • Related approaches:

      • 11.2 Part Models

        • 11.2.1 Color Models

        • 11.2.2 Oriented Gradient Descriptors

      • 11.3 Structural Constraints

        • 11.3.1 Linearly Parameterized Spring Models

          • Appearance term:

          • Deformation term:

        • 11.3.2 Articulation

        • 11.3.3 Gaussian Tree Models

          • Spatial prior:

          • Feature likelihood:

          • Log-linear posterior:

        • 11.3.4 Inference

          • MAP estimation:

          • Computation:

          • Sampling:

          • Marginals:

      • 11.4 Non-tree Models

        • 11.4.1 Occlusion Constraints

        • 11.4.2 Appearance Constraints

          • Pairwise consistency:

          • Global consistency:

        • 11.4.3 Inference with Non-tree Models

          • Mixtures of trees:

          • Generating tree-based configurations:

          • Loopy belief propagation:

          • Continuous state-spaces:

      • 11.5 Learning

        • 11.5.1 Generative Models

        • 11.5.2 Conditional Random Fields

        • 11.5.3 Structured Max-margin Models

        • 11.5.4 Latent-Variable Structural Models

          • Coordinate descent:

      • 11.6 Applications

        • 11.6.1 Pedestrian Detection

        • 11.6.2 Pose Estimation

          • Appearance constraints:

          • Mixtures of parts:

        • 11.6.3 Tracking

          • Tracking by detection:

          • Tracking by model-building:

      • 11.7 Discussion and Open Questions

      • References

    • Chapter 12: Feature-Based Pose Estimation

      • 12.1 Introduction

        • General difficulties:

        • Monocular ambiguities:

        • Generative and discriminative methods:

        • Degree of training data realism:

        • 12.1.1 The Need for Structure Modeling

        • 12.1.2 The Need for Selectivity and Invariance in Image Descriptors

          • Multilevel Spatial Blocks (MSB)

          • HMAX

          • Hyperfeatures

          • Spatial pyramid

          • Vocabulary tree

      • 12.2 Modeling Complex Image to Pose Relations

        • Conditional mixture of experts:

        • Training the experts:

        • Training the gates:

      • 12.3 Manifolds: Supervised Spectral Latent Variable Models

        • 12.3.1 Conditional Latent Variable Model

          • Implicit latent geometric constraints:

          • Learning algorithm:

      • 12.4 Structural SVM: Joint Localization and State Estimation

        • Joint kernel for location and state estimation:

        • Output loss function:

        • Cutting plane algorithm:

        • 12.4.1 Software

      • 12.5 Challenges and Open Problems

      • References

    • Chapter 13: Benchmark Datasets for Pose Estimation and Tracking

      • 13.1 Introduction

      • 13.2 Datasets for Monocular 2D Human Pose Estimation

        • Full-body human pose estimation:

        • Upper-body human pose estimation:

      • 13.3 Datasets for 3D Human Pose Estimation and Tracking

      • 13.4 Attributes and Properties of Datasets

        • 13.4.1 Real versus Simulated Images

        • 13.4.2 Sources of Ground Truth

        • 13.4.3 Evaluation Measure

          • 13.4.3.1 Average Joint Position Error

          • 13.4.3.2 Average Joint Angle Error

          • 13.4.3.3 Pixel Overlap

          • 13.4.3.4 Percentage of Correctly Estimated Body Parts

          • 13.4.3.5 Perplexity

        • 13.4.4 Complexity of Motion and Appearance

      • 13.5 Case Study: Monocular 2D Human Pose Estimation

        • Appearance representation:

        • Part relationships and inference:

      • 13.6 Case Study: 3D Human Pose Estimation

      • 13.7 Datasets for the Future

      • References

  • Part III: Recognition

    • Chapter 14: On Human Action

      • 14.1 The Problem

      • 14.2 Action Learning and Imitation

      • 14.3 Framing the Problem

        • 14.3.1 Action Recognition as a Specific Labeling

        • 14.3.2 Action Recognition as Primitive Sequence Decomposition

        • 14.3.3 Action, Objects and Context

        • 14.3.4 Action Recognition in Computer Vision

      • 14.4 Facial Expressions for the Analysis of Human Behavior

      • References

    • Chapter 15: Modeling and Recognition of Complex Human Activities

      • 15.1 Introduction

        • 15.1.1 Overview of Activity Recognition Methods

        • 15.1.2 Abnormal Activity Recognition

      • 15.2 Feature Descriptors

        • 15.2.1 Local Features

          • 15.2.1.1 Spatio-temporal Interest Points

          • 15.2.1.2 Cuboidal Features

          • 15.2.1.3 Volumetric Features

        • 15.2.2 Global Features

          • 15.2.2.1 Motion Descriptors

          • 15.2.2.2 Histogram of Oriented Optical Flow (HOOF)

          • 15.2.2.3 Space-Time Shapes

      • 15.3 Recognition Strategies

        • 15.3.1 Hidden Markov Models

        • 15.3.2 Stochastic Context-Free Grammars

      • 15.4 Complex Activity Recognition

        • 15.4.1 Challenges in Complex Motion Analysis

      • 15.5 Some Recent Approaches in Complex Activity Recognition

        • 15.5.1 Spatio-temporal Relationship Match

        • 15.5.2 String of Feature Graphs

        • 15.5.3 Stochastic Integration of Motion and Image Features for Hierarchical Video Search

        • 15.5.4 Dynamic Modeling of Streaklines for Motion Pattern Analysis

      • 15.6 Conclusion

        • 15.6.1 Further Reading

      • References

    • Chapter 16: Action Recognition Using Topic Models

      • 16.1 Introduction

      • 16.2 Topic Models

        • 16.2.1 Inference on Topic Models

      • 16.3 Far-Field Action Recognition Based on Trajectories of Objects

        • 16.3.1 Single Camera View

        • 16.3.2 Multiple Camera Views

      • 16.4 Far-Field Action Recognition Based on Local Motions

      • 16.5 Near-Field Action Recognition Based on Interest Points

      • 16.6 Conclusion and Discussion

        • 16.6.1 Further Reading

      • References

    • Chapter 17: Learning Action Primitives

      • 17.1 Introduction

        • 17.1.1 Connections to Biological Models of Movement Primitives

      • 17.2 Representations and Learning of Movement Primitives

        • 17.2.1 Stochastic Approaches

          • 17.2.1.1 Generative Models

          • 17.2.1.2 Discriminative Models

        • 17.2.2 Dynamical Systems Approaches

        • 17.2.3 Measurement Systems

      • 17.3 Dimensionality Reduction

        • 17.3.1 Spectral Dimensionality Reduction

        • 17.3.2 Generative Dimensionality Reduction

        • 17.3.3 Approaches Specific to Action Primitives

      • 17.4 Segmentation

        • 17.4.1 Movement Primitives Known a priori

        • 17.4.2 Assumption About Segment Point Indicators

      • 17.5 Connections to Learning at Higher Levels of Abstraction

      • 17.6 Conclusions and Open Questions

      • References

    • Chapter 18: Contextual Action Recognition

      • 18.1 Introduction

        • 18.1.1 What is Context?

        • 18.1.2 What Are the Advantages of Context?

          • Context improves recognition

          • Context makes semi-supervised learning easier

      • 18.2 Context in Human Action Recognition

        • 18.2.1 Object Context in Human Action Recognition

        • 18.2.2 Scene Context in Human Action Recognition

        • 18.2.3 Object-Scene Context in Human Action Recognition

        • 18.2.4 Semantic Context in Human Action Recognition

      • 18.3 Example: A Contextual Object-Action Recognition Method

        • 18.3.1 Features for Classification

          • Object features

          • Action features

          • Correlation between object and action features

        • 18.3.2 Classification of Object-Action Data

          • Linear-chain CRF

          • Factorial CRF

        • 18.3.3 Object-Action Recognition Using CRFs

        • 18.3.4 Experiments

      • 18.4 Conclusions

        • 18.4.1 Further Reading

      • References

    • Chapter 19: Facial Expression Analysis

      • 19.1 Introduction

      • 19.2 Annotation of Facial Expression

      • 19.3 Databases

      • 19.4 Facial Feature Tracking, Registration and Feature Extraction

        • 19.4.1 Facial Feature Detection and Tracking

        • 19.4.2 Registration and Feature Extraction

          • Registration:

          • Geometric features:

          • Appearance features:

          • Other features:

      • 19.5 Supervised Learning

        • 19.5.1 Classifiers

        • 19.5.2 Selection of Positive and Negative Samples During Training

          • 19.5.2.1 Static Approach

          • 19.5.2.2 Dynamic Approach

      • 19.6 Unsupervised Learning

        • 19.6.1 Facial Event Discovery for One Subject

        • 19.6.2 Facial Event Discovery for Sets of Subjects

      • 19.7 Conclusion and Future Challenges

      • References

    • Chapter 20: Benchmarking Datasets for Human Activity Recognition

      • 20.1 Introduction

      • 20.2 Single View Activity Benchmarks with Cleaner Background

        • 20.2.1 The KTH and the Weizmann Dataset

        • 20.2.2 The University of Rochester Activity of Daily Living Dataset

        • 20.2.3 Other Datasets

      • 20.3 Single View Activity Benchmarks with Cluttered Background

        • 20.3.1 The CMU Soccer Dataset and Crowded Videos Dataset

        • 20.3.2 The University of Maryland Gesture Dataset

      • 20.4 Multi-view Benchmarks

        • 20.4.1 The University of Central Florida Sports Dataset

        • 20.4.2 The INRIA Multi-view Dataset

      • 20.5 Benchmarks with Real World Footages

        • 20.5.1 The University of Central Florida YouTube Dataset

        • 20.5.2 The Hollywood Dataset

        • 20.5.3 The Olympic Dataset

      • 20.6 Benchmarks with Multiple Activities

      • 20.7 Other Benchmarks

      • 20.8 Conclusions

        • 20.8.1 Further Readings

      • References

  • Part IV: Applications

    • Chapter 21: Applications for Visual Analysis of People

      • 21.1 Introduction

        • 21.1.1 Security

        • 21.1.2 Communication

        • 21.1.3 Entertainment

        • 21.1.4 Automotive

      • 21.2 Conclusion

    • Chapter 22: Image and Video-Based Biometrics

      • 22.1 Introduction

      • 22.2 Face Recognition Techniques

        • 22.2.1 Traditional Approaches

        • 22.2.2 Sparsity Inspired Face Recognition

        • 22.2.3 Face Recognition Across Illumination and Pose

        • 22.2.4 Video-Based Face Recognition

        • 22.2.5 Face Recognition Experiments

      • 22.3 Iris-Based Personal Authentication

        • 22.3.1 Components of an Iris Recognition System

        • 22.3.2 Publicly Available Datasets

        • 22.3.3 Iris Recognition from Videos

        • 22.3.4 Sparsity-Based Iris Recognition

      • 22.4 Gait-Based Recognition

        • 22.4.1 Results on the USF Database

      • 22.5 Conclusion

      • References

    • Chapter 23: Security and Surveillance

      • 23.1 Introduction

      • 23.2 Current Systems

      • 23.3 Emerging Techniques

        • 23.3.1 Intent Profiling

        • 23.3.2 Crowded Scene Analysis

        • 23.3.3 Cooperative Multi-camera Network Surveillance

        • 23.3.4 Context-Aware Activity Analysis

        • 23.3.5 Human in the Loop

      • 23.4 Conclusion

        • 23.4.1 Further Reading

      • References

    • Chapter 24: Predicting Pedestrian Trajectories

      • 24.1 Introduction

      • 24.2 LTA, a Pedestrian Motion Model

      • 24.3 Stochastic LTA

        • 24.3.1 Why not a Particle Filter Framework?

        • 24.3.2 Training

      • 24.4 Experiments

        • 24.4.1 Prediction

        • 24.4.2 Group Classification

        • 24.4.3 Tracking

      • 24.5 Conclusion

      • Appendix A: Derivation of the marginal probabilities

      • References

    • Chapter 25: Human-Computer Interaction

      • 25.1 Introduction

      • 25.2 Head and Eye-Gaze Tracking

        • 25.2.1 Head Pose Estimation

          • 25.2.1.1 Appearance-Based Techniques

          • 25.2.1.2 Regression-Based Techniques

          • 25.2.1.3 Model-Based Techniques

          • 25.2.1.4 Other Techniques

          • 25.2.1.5 Discussion

        • 25.2.2 Eye-Gaze Estimation

          • 25.2.2.1 Feature-Based Techniques

          • 25.2.2.2 Appearance-Based Techniques

          • 25.2.2.3 Natural Light Techniques

          • 25.2.2.4 Discussion

      • 25.3 Gesture Recognition

        • 25.3.1 Hand Gestures

        • 25.3.2 Finger Gestures

          • 25.3.2.1 Appearance-Based Techniques

          • 25.3.2.2 Model-Based Techniques

        • 25.3.3 Discussion

      • 25.4 Conclusion

        • 25.4.1 Further Reading

      • References

    • Chapter 26: Social Signal Processing: The Research Agenda

      • 26.1 Introduction

      • 26.2 Social Signals: Terminology, Definition, and Cognitive Modelling

        • 26.2.1 Terminology

        • 26.2.2 Working Definition of Social Signals

          • Social emotions:

          • Social evaluation:

          • Social attitudes:

      • 26.3 Machine Analysis of Social Signals

        • Social interactions:

        • Social emotions:

        • Social evaluations:

        • Social attitudes:

        • Social relations:

        • Fusion:

        • Fusion and context:

        • Technical aspects:

      • 26.4 Machine Synthesis of Social Signals

        • Social interactions:

        • Social emotions:

        • Social evaluations:

        • Social attitudes:

        • Social relations:

        • Continuity:

        • Consistency:

      • 26.5 Summary and Additional Issues

      • 26.6 Conclusion

      • References

    • Chapter 27: Sign Language Recognition

      • 27.1 Motivation

      • 27.2 Sign Linguistics

      • 27.3 Data Acquisition and Feature Extraction

        • 27.3.1 Manual Features

          • 27.3.1.1 Tracking Based

          • 27.3.1.2 Non-tracking Based

          • 27.3.1.3 Hand Shape

        • 27.3.2 Finger Spelling

        • 27.3.3 Non-manual Features

      • 27.4 Recognition

        • 27.4.1 Classification Methods

        • 27.4.2 Phoneme Level Representations

      • 27.5 Research Frontiers

        • 27.5.1 Continuous Sign Recognition

        • 27.5.2 Signer Independence

        • 27.5.3 Fusing Multi-modal Sign Data

        • 27.5.4 Using Linguistics

        • 27.5.5 Generalising to More Complex Corpora

      • 27.6 Conclusion

        • 27.6.1 Further Reading

      • References

    • Chapter 28: Sports TV Applications of Computer Vision

      • 28.1 Motivation

      • 28.2 Graphics and Analysis Systems that do not Rely on Computer Vision

        • 28.2.1 Simple Graphics Overlay

        • 28.2.2 Graphics Overlay on a Calibrated Camera Image

      • 28.3 The Evolution of Computer Vision Systems for Sport

        • 28.3.1 Analysis in 2D

        • 28.3.2 Analysis in 3D

      • 28.4 A Closer Look at Camera Calibration

        • 28.4.1 Tracking Camera Movement Using Pitch Lines

          • 28.4.1.1 Estimating the Position of the Camera Mount

          • 28.4.1.2 Initialisation of the Camera Pose

          • 28.4.1.3 Frame-to-Frame Tracking of the Camera Motion

        • 28.4.2 Camera Tracking Using Areas of Rich Texture

      • 28.5 Conclusion

        • 28.5.1 Camera Calibration

        • 28.5.2 Segmentation, Identification and Tracking

        • 28.5.3 Further Reading

      • References

    • Chapter 29: Multi-view 4D Reconstruction of Human Action for Entertainment Applications

      • 29.1 Introduction

      • 29.2 Applications

      • 29.3 Approaches

        • 29.3.1 4D Reconstruction

        • 29.3.2 Model-Based Tracking

      • 29.4 A 4D Reconstruction Pipeline

        • 29.4.1 Camera Calibration

        • 29.4.2 Segmentation

          • 29.4.2.1 Sport Broadcast

        • 29.4.3 Reconstruction

          • 29.4.3.1 Algorithms for Visual-Hull Computation

          • 29.4.3.2 Alternative Computation Strategies

        • 29.4.4 Rendering

      • 29.5 Conclusion

      • References

    • Chapter 30: Vision for Driver Assistance: Looking at People in a Vehicle

      • 30.1 Introduction and Motivation

      • 30.2 Overview of Selected Studies

      • 30.3 Looking at Driver Head, Face, and Facial Landmarks

        • 30.3.1 Monitoring and Prediction of Driver Fatigue

        • 30.3.2 Eye Localization, Tracking and Blink Pattern Recognition

        • 30.3.3 Tracking Driver Head Pose

      • 30.4 Looking at Driver Body, Hands, and Feet

        • 30.4.1 Looking at Hands

        • 30.4.2 Modeling and Prediction of Driver Foot Behavior

        • 30.4.3 Analyzing Driver Posture for Driver Assistance

      • 30.5 Open Issues for Future Research

      • 30.6 Conclusion

        • 30.6.1 Further Reading

      • References

  • Glossary

  • Index

Content (preview excerpts)

[...] sets have allowed for different methods to be directly comparable, since they now can train and test on the same data sets. Some conferences even introduce competitions on data sets defined just for that event. Chapter 7 will give an overview of such data sets and how [...]

2.1 Introduction

Many applications in the computer vision field benefit from high-resolution imagery. These include, but are not limited to, license-plate identification [4] and face recognition, where it has been observed [...]

[...] the pixels are from a human, a car or something else. To this end, filtering and blob analysis are normally required. Blob analysis can use shape cues to detect non-human-like objects, but the problem of shadows cast by humans is hard to solve, since the shapes of such blobs are naturally human-like. Different types of context-reasoning are therefore involved when trying to detect and delete shadows, for example [...]

[...] expensive, but with the introduction of, for example, GPU-based implementations this is less of a problem.

1.2 Tracking

Tracking is here defined as finding the temporal trajectory of an object through some state-space. The object would here often be the human, but it could also be different body parts, as will be the case in Part II. The variables spanning the state-space are very often the 3D location parameters [...]

[...] maximum resolution of the tracked object, whereas the second is minimizing the risk of losing this object. Therefore, zoom control can be thought of as a trade-off between the effective resolution per target and the desired coverage of the area of surveillance. With a finite number of fixed sensors, there is a fundamental limit on the total area that can be observed. Thus, maximizing both the area of coverage [...]

From the list of contributors:

[...] rosenhahn@tnt.uni-hannover.de
Amit K. Roy-Chowdhury, University of California, Riverside, 900 University Ave, Riverside, CA 92521, USA, amitrc@ee.ucr.edu
Marc Schroeder, DFKI, Saarbrücken, Germany
William R. Schwartz, Institute of Computing, University of Campinas, Campinas-SP 13084-971, Brazil, wschwartz@liv.ic.unicamp.br
Ricky J. Sethi, University of California, Los Angeles, 4532 Boelter Hall, CA 90095-1596, USA [...]
Jaishanker K. Pillai, Department of Electrical and Computer Engineering, Center for Automation Research, University of Maryland, College Park, MD 20742, USA, jsp@umiacs.umd.edu
Isabella Poggi, Dept. of Education, University Roma Tre, Rome, Italy
Gerard Pons-Moll, Leibniz University, Hanover, Germany, pons@tnt.uni-hannover.de
Deva Ramanan, Department of Computer Science, University of California, Irvine, USA [...]

Over the course of the last 10–20 years the field of computer vision has been preoccupied with the problem of looking at people. Hundreds, if not thousands, of papers have been published on the subject, spanning people and face detection, pose estimation, tracking and activity recognition. This research focus has been motivated by the numerous potential applications for visual analysis of people, from human–computer [...]
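The "1.2 Tracking" excerpt above defines tracking as estimating an object's trajectory through a state-space. As a minimal sketch of that idea (an illustration only, not code from the book), the following constant-velocity Kalman filter tracks 2D image position and velocity; the motion model, noise levels and toy detections are all assumptions chosen for this example.

```python
import numpy as np

# Illustrative sketch (not from the book): tracking as state estimation.
# State x = [px, py, vx, vy]: 2D image position and velocity of a person.

dt = 1.0  # assumed time step between frames
F = np.array([[1, 0, dt, 0],   # constant-velocity motion model
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],    # we only observe image position
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)           # process noise (assumed)
R = 4.0 * np.eye(2)            # measurement noise (assumed)

def predict(x, P):
    """Propagate state and covariance through the motion model."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with a detection z = [px, py]."""
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Toy run: a person moving right at ~2 px/frame with noisy detections.
rng = np.random.default_rng(0)
x, P = np.zeros(4), np.eye(4)
for t in range(1, 20):
    z = np.array([2.0 * t, 0.0]) + rng.normal(0, 2, size=2)
    x, P = predict(x, P)
    x, P = update(x, P, z)
print("estimated state:", x)  # position roughly [38, 0], velocity roughly [2, 0]
```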
[...] hardware or simple software. A good example of such an application is commercial motion capture equipment [4]. 3D sensing is another strategy that might play an important role in future acquisition systems. While different stereo solutions have been around for some time [14], a new type of compact 3D measurement device is emerging: time-of-flight cameras [2, 3]. They also provide 3D images of the scene, [...]

From the author block of Chapter 2:

[...] e-mail: malhaj@cvc.uab.es
C. Fernández, e-mail: perno@cvc.uab.es
I. Huerta, e-mail: ivan.huerta@cvc.uab.es
Z. Xiong · J. Gonzàlez · X. Roca, Computer Vision Center and Departament de Ciències de la Computació, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
Z. Xiong, e-mail: zhanwu@cvc.uab.es
J. Gonzàlez, e-mail: jordi.gonzalez@uab.cat
X. Roca, e-mail: xavier.roca@uab.cat

From the imprint page:

[...] 615, Pittsburgh, PA 15213, USA, lsigal@disneyresearch.com
ISBN 978-0-85729-996-3, e-ISBN 978-0-85729-997-0, DOI 10.1007/978-0-85729-997-0
Springer London Dordrecht Heidelberg New York
British Library [...]
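The blob-analysis excerpt earlier describes filtering foreground pixels with shape cues to separate human-like blobs from other objects. Below is a minimal sketch of such a filter, assuming OpenCV and a binary foreground mask from background subtraction; the function name and all thresholds are illustrative assumptions, not values from the book.

```python
import cv2
import numpy as np

def human_like_blobs(mask, min_area=500, min_aspect=1.2, max_aspect=4.0):
    """Illustrative shape-cue filter (assumed thresholds, not from the book).

    Keeps connected components of a binary foreground mask whose size and
    height/width ratio are plausible for an upright person. Note the caveat
    from the excerpt: shadows cast by humans often pass such shape tests,
    so shadow suppression needs additional context reasoning.
    """
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    boxes = []
    for i in range(1, n):  # label 0 is the background component
        x, y, w, h, area = stats[i]
        if area < min_area:
            continue  # too small: likely noise
        aspect = h / float(w)
        if min_aspect <= aspect <= max_aspect:
            boxes.append((x, y, w, h))  # candidate person blob
    return boxes

# Hypothetical usage with a background-subtraction mask:
# fg = cv2.createBackgroundSubtractorMOG2().apply(frame)
# mask = (fg > 127).astype(np.uint8)
# for (x, y, w, h) in human_like_blobs(mask):
#     cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```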
