Structure
0387890238
Handbook of Multimedia
for Digital Entertainment
and Arts
Preface
Part I
DIGITAL ENTERTAINMENT
TECHNOLOGIES
1 Personalized Movie Recommendation
Introduction
Background Theory
Recommender Systems
Collaborative Filtering
Data Collection -- Input Space
Neighbors Similarity Measurement
Neighbors Selection
Recommendations Generation
Content-based Filtering
Other Approaches
Comparing Recommendation Approaches
Hybrids
MoRe System Overview
Recommendation Algorithms
Pure Collaborative Filtering
Pure Content-Based Filtering
Hybrid Recommendation Methods
Experimental Evaluation
Conclusions and Future Research
2 Cross-category Recommendation for Multimedia Content
Introduction
Technological Overview
Overview
Multimedia Content Recommendation
Basic Technologies Involving CF
Basic Technologies Involving CBF
Key Elements of a Content Recommendation System Using CBF
Content Profiling
Manual Tagging
Automatic Tagging
1) Automatic Tagging from Textual Information
2) Automatic Tagging from Visual Information
3) Automatic Tagging from Audio Information
Context Learning
User Preference Learning
Matching
1) VSM
2) NB Classifier
3) Other Approaches
Typical Cases of Multimedia Content Recommendation System
1) Content-meta-based Search
2) Context-aware Search
3) User-preference-based Search
Cross-category Recommendation
Key Points of a Cross-category Recommendation
Category Common Metadata
Separate User Preference Generation for Each Category
Embodiment of Recommendation Engine: Voyager Engine (VE)
Overview
Explanation of Component
Key Methods to Realize Cross-category Recommendation
AME
ICF
RCF
Example of Practical Applications
Multimedia Content Recommendation
branco
SensMe
Cross-category Recommendation
VAIO Giga Pocket Digital
TV Kingdom Service
Difficulties
Summary and Future Prospects
3 Semantic-Based Framework for Integration and Personalization of Television Related Media
Introduction
Related Work
Application Scenario
TV-Anytime
TV-Anytime Phase I
TV-Anytime Phase II
Semantically Enriched Content Model
Personalized Home Media Center
Design of Personalized Home Media Center
User Modeling
Context
Events
Cold Start
Import of known user profiles
Classification of users in groups
Personalized Content Search
Personalized Presentations
Implementation
SenSee Server
iFanzy
Conclusions
4 Personalization on a Peer-to-Peer Television System
Introduction
Related Work
Recommendation
Distributed Recommendation
Learning User Interest
System Design
User Profiling from Zapping Behavior
BuddyCast Profile Exchange
Recommendation by Relevance Models
Item-based Generation Model
User-based Generation Model
Statistical Ranking Mechanisms
Personalized User Interfaces
Experiments and Results
Data Set
Observations of the Data Set
Learning the User Interest Threshold
Convergence Behavior of BuddyCast
Recommendation Performance
Conclusions
5 A Target Advertisement System Based on TV Viewer's Profile Reasoning
Introduction
Architecture of Proposed Target Advertisement System
Proposed Profile Reasoning Algorithm
Analysis of Features Depending on User Profiles
Feature Extraction
The First Stage Classifier
The Second Stage Classifier
The Third Stage Classifier
Target Advertisement Contents Selection Method
Target Advertisement Contents Selection Method
Experimental Results
Experimental Result of Profile Reasoning
The Implementation Result of the Prototype Target Advertisement System
Conclusion
6 Digital Video Quality Assessment Algorithms
Introduction
HVS -- Based Approaches
Digital Video Quality Metric
Scalable Wavelet-Based Distortion Metric
Structural and Information-Theoretic approaches
Structural Similarity Index
Video Visual Information Fidelity
Feature Based Approaches
Video Quality Metric
Motion Modeling Based Approaches
Speed-Weighted SSIM
Motion Based Video Integrity Evaluation
Performance Evaluation & Validation
Conclusions & Future Directions
7 Countermeasures for Time-Cheat Detection in Multiplayer Online Games
Introduction
Background on System Architectures
System Model
Modeling Game Time
Time Cheats
Look-Ahead Cheat
Fast Rate Cheat
Suppress-Correct Cheat
Cheating Prevention
Cheating Detection
Conclusions and Future Directions
8 Zoning Issues and Area of Interest Management in Massively Multiplayer Online Games
Introduction
Challenges and Requirements
MMOG Architecture -- An Overview
MMOG Classification
Communication Architecture
Virtual Space Decomposition - Zoning
Zone Definition
Multiple Zones and its Space
Area of Interest Management
Interest Management Models
Publisher-Subscriber Model
Space Model
Region Model
Implementation Intelligence
Message Aggregation
Message Compression
Dead Reckoning
Interest Management Algorithms
Proximity Algorithms
Comparison of Euclidean Distance and Hexagonal Tile Algorithms
Euclidean Distance Algorithm
Hexagonal Tile Algorithm
Visibility Algorithms
Comparison of Ray and Tile Visibility Approach
Ray visibility
Tile visibility
Reachability Algorithms
Comparison of Tile Distance and Tile Neighbor Algorithms
Tile distance
Tile neighbor
Zone Crossing in P2P MMOGs
Different Interest Management Models -- Research Perspectives
Conclusions and Future Directions
9 Cross-Modal Approach for Karaoke Artifacts Correction
Karaoke
Preprocessing: noise detection and removal.
Tempo handling.
Tune handling.
Pitch handling.
Detection of Highlighted Video Captions
Algorithm for Karaoke Adjustment
Results
Conclusion
10 Dealing Bandwidth to Mobile Clients Using Games
Introduction
Resource Allocation Taxonomies
Resource Allocation Using Game Theory
Dealing Bandwidth Using a Game
The Three Phase Bandwidth Dealing Game
k-calculation Phase
Main Game Phase
Round 0 - base bandwidth dealing (BBD):
Round 1 - dynamic bandwidth dealing (DBD):
Round 2 - remainder bandwidth dealing (RBD):
Streaming-Seat Reallocation Phase
Concluding Discussion
11 Hack-proof Synchronization Protocol for Multi-player Online Games
Introduction
Backgrounds
Dead-reckoning
Linear Extrapolation
Speed-hack
Hack-proof Synchronization Protocol
Countermeasure
Invulnerability
Handling Missing Packets
Modified Dead-reckoning Protocol
Invulnerability
Extension
Handling Missing Packets
Enhanced Invulnerable Protocol
Handling Missing Packets
Extensions
Proof of Invulnerability
Implementation
Network Overhead
Related Works
Conclusion
12 Collaborative Movie Annotation
Introduction
Collaborative Retrieval and Tagging
Collaborative Retrieval
Collaborative Tagging of Non-Video Media
Collaborative Tagging of Video Media
Summary
Experiment Design
Video Metadata Tools and Content
User Groups and Tasks
Experiment Results
Research Method: Grounded Theory
Movie Content Metadata Creation
Most Commonly Used Tags
Relationships between Tags
System Features
An Architecture for a Collaborative Movie Annotation System
Metadata Scheme
System Architecture
Resources
Annotation
Retrieval
Community Interaction and Profiling
Concluding Discussion
Part II DIGITAL AUDITORY MEDIA
13 Content Based Digital Music Management and Retrieval
Introduction
Music Visualization: Tension Visualization Approach
Noisy Level Calculation
Tempo Estimation
Music Summarization: Key Segment Extraction Approach
Description of Key Segment
Key-Segment Extraction
Music Similarity Measure: Chroma-Histogram Approach
Feature Extraction
Model Construction
Distance Measure
A Realized Music Archive Management System
Conclusion and Future Directions
14 Incentive Mechanisms for Mobile Music Distribution
Introduction
The Current Mobile Music Market
Communication Infrastructure
Pricing Strategy
Copyright Protection
A Multi-Channel Distribution Approach
Multi-Channel Mobile Distribution
An Incentive Mechanism
Evaluation of the Multi-Channel Distribution Strategy
Wireless Music store
Cellphone Network Operators
Customer Point of View
Selfish distribution.
Equal distribution.
Higher Reward.
Minimum Reward.
Proportional distribution.
Higher Reward.
Related Work
Conclusions
15 Pattern Discovery and Change Detection of Online Music Query Streams
Introduction
Problem Definition of Pattern Discovery of Music Query Streams
Mining of Frequent Temporal Patterns in Music Query Streams
Data Processing: Bit-sequence Representation
The Proposed Algorithm FTP-stream
Window Initialization Phase of FTP-stream Algorithm
Window Sliding Phase of FTP-stream Algorithm
Frequent Temporal Pattern Generation Phase of FTP-stream
Experimental Evaluation of Pattern Discovery of Music Query Streams
Change Detection of Online Music Query Streams
Problem Definition
Detecting Changes from User-centered Music Query Streams
The Proposed Summary Data Structure MSC-list
The Proposed MQS-change Algorithm
Connection Between FTP-stream and MQS-change
Experimental Results of MQS-change Algorithm
Conclusions
16 Music Search and Recommendation
Introduction
Acoustic Features for Music Modeling
Low-level Audio Features
Mel-Frequency Cepstral Coefficients
Audio Spectrum Envelope
Audio Spectrum Flatness
Linear Predictive Coding
Zero Crossing Rate
Audio Spectrum Centroid
Audio Spectrum Spread
Mid-level Audio Features
Rhythmic Mid-level Features
Harmonic Mid-level Features
High-level Music Features
Statistical Modeling and Similarity Measures
Dimension Reduction
Principal Component Analysis
Self-Organizing Maps
Linear Discriminant Analysis
Statistical Models of The Song
Distance Measures
“In the Mood” -- Towards Capturing Music Semantics
Classification Models
Classification Based on Gaussian Mixture Models
Classification Based on Support Vector Machines
Mood Semantics
Mood Models
Mood Classification
Music Recommendation
Visualizing Music for Navigation and Exploration
Visualization of Songs
Visualization of Music Archives
Navigation and Exploration in Music Archives
Summary and Open Issues
Applications
Business to Business Applications
Business to Consumer Applications
Future Directions and Challenges
17 Automated Music Video Generation Using Multi-level Feature-based Segmentation
Introduction
Related Work
System Overview
Video Segmentation and Analysis
Segmentation by Contour Shape Matching
Video Feature Analysis
Detecting Significant Regions
Music Segmentation and Analysis
Novelty Scoring
Music Feature Analysis
Matching Music and Video
Experimental Results
Conclusion
Part III DIGITAL VISUAL MEDIA
18 Real-Time Content Filtering for Live Broadcasts in TV Terminals
Introduction
Real-time Content Filtering System
Filtering System Structure
Real-Time Content Filtering Algorithm
Filtering System Analysis
Modeling of a Filtering System
Experiments
Applied Filtering Algorithm for Soccer Videos
Experimental Results with Soccer Videos
Discussion
Conclusion
19 Digital Theater: Dynamic Theatre Spaces
Introduction
Interactive Theater
Interactive Theater Architecture
Embodied mixed reality space and Live 3D actors
Hardware setup
Interactive Theatre system
Automated Performance by Digital Actors
Human/machine collaborative performance
The Pattern Game
One Word Story
The Association Engine
Experiencing a Performance
Completely Automated Performances
Story Discovery
Conclusions
20 Video Browsing on Handheld Devices
Introduction
A Short Review of Video Browsing Techniques for Larger Displays
Mobile Video Usage and Need for Browsing
Timeline-Based Mobile Video Browsing and Related Problems
Implementation
Flicking vs. Elastic Interfaces
Linear vs. Circular Interaction Patterns
One-handed Content-Based Mobile Video Browsing
Summary and Outlook
21 Projector-Camera Systems in Entertainment and Art
Introduction
Visualization with Projector-Camera Systems
Inverting the Light Transport
Geometric Image Correction
Photometric Image Correction
Defocus Compensation
Structured Light Scanning
Interaction with Projector-Camera Systems
Interaction with Spatial Projector
Physically Viewing Interaction
Near Distance Interaction
Far Distance Interaction
Interaction with Handheld Projectors
Image Stabilizing
Pointing Techniques
Selection and Manipulation
Multi-user Interaction
Environment Awareness
Interaction Design and Paradigm
Application Examples
Embedded Multimedia Presentations
Superimposing Museum Artifacts
Spatial Augmented Reality
Flexible Digital Video Composition
Interactive Attraction Installations
The Future of Projector-Camera Systems
22 Believable Characters
Introduction
Character Personality
Body Type Theories
Psychodynamic Theories
Traits Theories
Factor Theories
Johnstone's Fast Food Stanislavsky Model
Personality and Believable Characters
Nonverbal Behavior Theory and Models
Structural Approach
Descriptive Approach
Social and Communication
Gesture
Delsarte
Laban Movement Analysis
Effort Overview
Understanding the Subtle Meaning of Nonverbal Behaviors
Nonverbal Behavior and Adaptive Believable Character
Animation Techniques
Animation and Adaptive Believable Character
Conclusions and Open Problems
23 Computer Graphics Using Raytracing
Introduction
The Origins of Raytracing
Raytracing
The Raycasting Algorithm
Ray Intersection Tests
Performing Surface Shading
Generation of Secondary Rays
Controlling Scene Complexity
Image Quality Issues
Acceleration of Raytracing
Bounding Volumes
Space Partitioning Tree Structures
Hardware Accelerated Raytracing
Summary
24 The 3D Human Motion Control Through Refined Video Gesture Annotation
Introduction
Related Work
Proposed Approach
Human Motion Analysis & Comparison
Video Human Motion Feature Extraction
3D Human Motion Capture Data
Motion Feature Value Comparison between 3D Motion Capture and Video Human Motion
Controlling 3D Human Motion using Video Human Motion Annotation
Conclusion
Part IV DIGITAL ART
25 Information Technology and Art: Concepts and State of the Practice
Introduction
The Conceptual Framework
Who
Where
Why
What
Description of the Projects
Flyndre
Who
Where
Why
What
Sonic Onyx
Who
Why
Where
What
The Open Wall
Who
Why
Where
What
Chaotic Robots For Art (Fig.4)
Who
Where
Why
What
Interactive Bubble Robots For Art
Who
Where
Why
What
Discussion and Conclusion
26 Augmented Reality and Mobile Art
Introduction
Overview of AR
AR Mobile Art
Case Study: AR Mobile Art
Conclusion
27 The Creation Process in Digital Art
Introduction
Digital Art Fundamentals
Definitions
Creation Process
The Process
The Creative Design Space Architecture
Discussion
Conclusions and Future Work
28 Graphical User Interface in Art
Introduction
Strategies for the Re-contextualization of the GUI in Art Practice
The Visual and Conceptual Configuration of the GUI
The GUI as an Environment for Art Practice
Conclusion
29 Storytelling on the Web 2.0 as a New Means of Creating Arts
Introduction
Use Scenarios
Related Work
Community of Practice and Web 2.0
Knowledge Work and Web 2.0
Storytelling on Web 2.0
Existing Storytelling Platforms
YouTell: A Web 2.0 Service for Community Based Storytelling
Virtual Campfire
The Role Model
Web 2.0 for Storytelling: Tagging and Rating
Profile-based Story Searching
Expert Finding System
Web 2.0 for the Expert-finding Algorithm
Implementation of the YouTell Prototype
YouTell Evaluation
Prototype Testing
Profile Based Story Search
Expert Finding Algorithm
Summary
Part V CULTURE OF NEW MEDIA
30 A Study of Interactive Narrative from User's Perspective
Introduction
Previous Research
Interactive Narrative Architectures
Evaluating the User's Experience within Interactive Narrative
Façade
Method
Study Design
Participants
Procedure
Analysis
Theoretical Lenses for Discussing Participants' Experience
Summary of Participants' Statements
Results
Lens 1: System Constraints (Informed by Boundaries, Freedom, Goals, and Control)
Phase I: Initial Conceptions of IN Pertaining to System Constraints Lens
Phase II: Pre Play Conceptions of IN from the Façade Description Pertaining to System Constraints Lens
Phase IV: Façade Post-play Interview Pertaining to System Constraints Lens
Lens 2: Role Play
Phase I: Initial Conceptions of IN Pertaining to Role Play Lens
Preparation for Role Play
The Process of Role Playing
Phase II: Pre Play Conceptions of IN from the Façade Description Pertaining to Role Play Lens
Preparation for Role Play
The Process of Role Playing
Phase IV: Façade Post-play Interview Pertaining to Role Play Lens
Preparation for Role Play
The Process of Role Playing
Reflections on Interactive Narrative
Conclusion
31 SoundScapes/Artabilitation -- Evolution of a Hybrid Human Performance Concept, Method Apparatus Where Digital Interactive Media, The Arts, Entertainment are Combined
Introduction
Background
Painting for Life
Strategies of Use
All-inclusive Inquiry
System Actability, Usability, Usefulness and Affordability
Untraditional Therapeutic Practice
Transcending to and from Entertainment and the Arts
Underground Non-formal Learning
ArtAbilitation Workshops, Casa da Musica, Porto, Portugal
Visualizing classical music
Conclusions and Future Directions
32 Natural Interaction in Intelligent Spaces: Designing for Architecture and Entertainment
Introduction
Related Work
Smart Spaces
Perceptual Intelligence and Natural Interaction
Bayesian Networks for User Modeling and Interactive Narrative
Criteria for Intelligent Space Design
Perceptual Intelligence
Interpretive Intelligence
Narrative Intelligence
Intelligence Modeling
Applications
Perceptual Intelligence: Navigating the Internet City
Natural Interfaces: Motivation
City of news: an Internet City in 3D
2D Blob Tracking
Person Tracking and Shape Recovery
Gesture Recognition
Comments
Interpretive Intelligence: Modeling User Preferences in The Museum Space
User Modeling: Motivation
The Museum Wearable
Sensor-Driven Understanding of Visitors' Interests with Bayesian Networks
Model Description, Learning and Validation
Comments
Narrative Intelligence: Sto(ry)chastics
Narrative Intelligence: Motivation
Editing Stories for Different Visitor Types and Profiles
Comments
Discussion and Conclusions
33 Mass Personalization: Social and Interactive Applications Using Sound-Track Identification
Introduction
Personalizing Broadcast Content: Four Applications
Personalized Information Layers
Ad-hoc Peer Communities
Real-time Popularity Ratings
Video “Bookmarks”
Supporting Infrastructure
Client-Interface Setup
Audio-Database Server Setup
Social-Application Server Setup
Audio Fingerprinting
Hashing Descriptors
Within-Query Consistency
Post-Match Consistency Filtering
Evaluation of System Performance
Empirical Evaluation
“In-Living-Room” Experiments
Discussion
Index
Content
21 Projector-Camera Systems in Entertainment and Art
Physically Viewing Interaction
By projecting images directly onto everyday surfaces, a projector-camera system can create augmentation effects, such as virtually painting an object’s surface with a new color, a new texture, or even an animation. Users can interact directly with such projector-based augmentations. For example, they may observe the object from different sides while experiencing consistent occlusion effects and depth, or they can move nearer to or further from the object to see local details and global views. Thus, the intuitiveness of physical interaction and the advantages of digital presentation are combined.
This kind of physically interactive visualization is suitable for situations in which virtual content is mapped as a texture onto real object surfaces. View-dependent visual effects, such as highlights that simulate virtually shiny surfaces, require tracking of the users’ view. Multi-user views can also be supported by time-multiplexing the projection for multiple users, with each user wearing synchronized shutter glasses that select the individual view. This is only necessary for view-dependent augmentations, however. Furthermore, view tracking and stereoscopic presentation enable virtual objects to be displayed not only on the real surface, but also in front of or behind it. A general geometric framework that handles all these variants is described in [26].
The techniques described above only simulate the desired appearance of an augmented object that is supposed to remain fixed in space. To make the projected content truly user-interactive, more information than viewpoint changes is required. After turning an ordinary surface into a display, it is desirable to extend it into a user interface with an additional input channel. Cameras can be used for this sensing.
In contrast to other input technologies, such as embedded electronics for touch screens, or the tracked wands, styluses, and data gloves often used in virtual environments, vision-based sensing has the flexibility to support different types of input techniques without modifying the display surface or equipping the users with different devices for different tasks. Unlike interaction with special projection screens, such as electronically enabled multi-touch or rear-projected screens, vision-based interaction with front-projected interfaces must primarily deal with the projected illumination on the detected hand or object, as well as with cast shadows.
The following subsections introduce two typical interaction approaches for spatial projector-camera systems: near distance interaction and far distance interaction. The main focus is on vision-based interaction techniques, and basic interaction operations such as pointing, selecting, and manipulating are considered.
Near Distance Interaction
In near-distance situations, where the projection surface is within arm’s length of the user, finger touching or hand gestures are intuitive ways to select and manipulate the interface. Apart from this, the manipulation of physical objects can also be detected and used for triggering interaction events.
O. Bimber and X. Yang
Vision-based techniques may use a visible light or infrared light camera to capture the projected surface area. To detect finger touching on a projected surface, a calibration process similar to the geometric techniques presented in section “Geometric Image Correction” is needed to map corresponding pixels between projector and camera. Next, fingers, hands, and objects need to be classified as foreground in order to separate them from the projected surface background.
When interactions take place on a front-projected surface, the hand is illuminated by the displayed images, and thus the appearance of a moving hand changes quickly. This renders segmentation methods based on skin color or region growing useless. Conventional background subtraction methods are frequently unreliable as well, since the skin color of a hand may become buried in the projected light.
One possible solution to this problem is to expand the capacity of the background subtraction. Besides its application to an ideal projection screen, which assumes sufficient color difference from skin color as in [27], background subtraction can also take different background and foreground reflectance factors into account. When the background changes significantly, the segmentation may fail. An image update can be applied to keep the segmentation robust: an artificial background is generated from the known projector input image, with the geometric and color distortions between projector and camera corrected.
Another feasible solution is to detect the pixel area that changes between frames of the captured video to obtain a basic shape of the moving hand or object. Noise can then be removed using image morphology. Following this, a fingertip can be detected by convolving the extracted image with a fingertip-shaped template, as in [28].
To avoid the complex varying-illumination problem of visible light, an infrared camera can be used instead, together with an infrared light source that produces an invisible shadow of the finger on the flat projection surface, as shown in [29]. The shadow of the finger can then be detected by the infrared camera and used on its own to detect the finger region and the fingertip. To enable screen interaction by finger touching, the position of the finger, either touching the surface or hovering above it, can be further determined from the occlusion ratio of the finger shadow.
When the finger is touching the surface, its shadow is fully occluded by the finger itself; when the finger is hovering over the surface, its shadow is larger.
It is also possible to exclude the projected content from the captured video by interlacing the projected images and the captured camera frames using synchronized high-speed projectors and cameras, so that more general gesture recognition algorithms, such as those reviewed in [30], can be adopted. To obtain more robust detection results, specific vision hardware can also be utilized, such as real-time depth cameras based on the time-of-flight principle [31].
Far Distance Interaction
In situations where the projection surface is beyond the user’s arm length, laser pointer interaction is an intuitive way to select and manipulate projected interface components. Recently, laser pointer interaction has been used for interacting with large-scale projection displays or tiled displays from a distance [32].
To detect and track a laser dot on a projection surface in projector-camera systems, a calibrated camera covering the projection area is often used. The location and movement of the laser dot can be detected simply by applying an intensity threshold to the captured image, assuming that the laser dot is much brighter than the projection. Since the camera and the projector are both geometrically calibrated, the location of the laser dot in the camera image can be mapped to the corresponding pixels of the projected image. The “on” and “off” status of the laser pointer can be mapped to mouse click events for selecting particular operations. The virtual objects that are intersected by the laser dot, or by a corresponding laser ray, can be further calculated from the virtual scene geometry.
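The laser-dot pipeline described above (intensity threshold, camera-to-projector mapping, on/off status as click events) can be sketched roughly as follows. This is a minimal illustration and not the method of any cited system: the function names, the grayscale image stored as nested lists, the use of a 3x3 homography matrix `H` for the calibrated mapping, and the choice of "dot appears = press, dot disappears = release" are all assumptions made for the example.

```python
def find_laser_dot(image, threshold):
    """Return the (x, y) centroid of all pixels brighter than threshold.

    image is a grayscale frame given as a list of rows (hypothetical format).
    Returns None when no pixel exceeds the threshold (laser assumed "off").
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def camera_to_projector(point, H):
    """Map a camera pixel to a projector pixel with a 3x3 homography H.

    H is assumed to come from a prior geometric calibration.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def click_events(dot_visible_per_frame):
    """Turn a per-frame on/off sequence into (frame index, event) pairs."""
    events = []
    previous = False
    for index, visible in enumerate(dot_visible_per_frame):
        if visible and not previous:
            events.append((index, "press"))
        elif previous and not visible:
            events.append((index, "release"))
        previous = visible
    return events


if __name__ == "__main__":
    # Synthetic 6x4 frame: dim projection (50) with a two-pixel laser dot (250).
    frame = [[50] * 6 for _ in range(4)]
    frame[1][3] = 250
    frame[1][4] = 250
    dot = find_laser_dot(frame, threshold=200)
    # Illustrative calibration: scale by 2, then shift by (10, 20).
    H = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]
    print(dot, camera_to_projector(dot, H))
```

A fixed global threshold only works under the assumption stated in the text, namely that the laser dot is much brighter than the projected content; a real system would likely need a more robust detector.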
More events for laser pointer interaction can be triggered by temporal or spatial gestures, such as encircling, or simply by adding hardware to the laser pointer, such as buttons and embedded electronics for wireless communication. Multi-user laser pointer interaction can also be supported for large projection areas, provided that each user’s laser pointer is distinguishable. This can be achieved by time-multiplexing the lasers or by using different laser colors or patterns. User studies have been carried out to provide optimized design parameters for laser pointer interaction [33].
Although laser pointing is an intuitive technique, it suffers from issues such as hand jitter, inaccuracy, and slow interaction speed. To overcome the hand-jitter problem, which is compounded at greater distances, filtering-based smoothing techniques can be applied, though they may lead to a discrepancy between the pointed laser dot and the estimated location. Infrared laser pointers may solve this problem, but according to user study results, visible laser light is still found to be better for interaction.
Apart from laser pointing, other tools such as a tracked stylus or specially designed passive vision wands [34] tracked by a camera have proven to be flexible and efficient for interacting with large-scale projection displays over distances.
Gesture recognition provides a natural way to interact at greater distances without specific tools. It is mainly based on gesture pattern recognition, with or without hand model reconstruction. Evaluating body motion is also an intuitive approach to large-scale interaction, in which the body pose and motion are estimated and behavior patterns may be further detected. When gesture and body motion are the dominant modes of interaction with projector-camera systems, shadows and varying illumination conditions are the main challenges, though shadows can also be utilized for detecting gestures or body motion.
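The trade-off noted above for filtering-based smoothing of hand jitter, a steadier cursor at the cost of a discrepancy between the visible laser dot and the estimated location, can be illustrated with a simple exponential moving average. The filter choice, the function name `smooth`, and the parameter `alpha` are assumptions for this sketch; the text does not specify which filter is used in practice.

```python
def smooth(points, alpha):
    """Exponentially smooth a sequence of (x, y) laser-dot positions.

    alpha in (0, 1]: smaller values suppress jitter more strongly but make
    the estimate lag further behind the true dot position.
    """
    smoothed = []
    sx, sy = points[0]
    for x, y in points:
        sx = alpha * x + (1 - alpha) * sx
        sy = alpha * y + (1 - alpha) * sy
        smoothed.append((sx, sy))
    return smoothed


if __name__ == "__main__":
    # Jitter around a fixed target: the smoothed track spreads less.
    jitter = [(100, 100), (104, 98), (96, 102), (103, 97), (97, 103)]
    print(smooth(jitter, alpha=0.3))

    # Deliberate jump from x=0 to x=100: the estimate trails the real dot.
    jump = [(0, 0), (100, 0), (100, 0)]
    print(smooth(jump, alpha=0.5))
```

In the second example the dot has already been at x = 100 for two frames while the estimate is still at x = 75, which is the kind of discrepancy the text warns about.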
In gesture or body interaction, background subtraction is often used to detect the moving body from the difference between the current frame and a reference background image. The background reference image must be updated regularly to adapt to varying luminance conditions and geometry settings. More complex models have extended the concept of background subtraction beyond its literal meaning. A thorough review of background extraction methods is presented in [35].
Vision-based human action recognition approaches can generally be divided into four phases. The model initialization phase ensures that a system commences its operation with a correct interpretation of the current scene. The tracking phase segments and tracks the human bodies in each camera frame. The pose estimation phase estimates the pose of the users in one or more frames. The recognition phase can recognize the identity of individuals as well as the actions, activities, and behaviors performed by one or more users. Video-based human action detection techniques are reviewed in detail in [36].
Interaction with Handheld Projectors
Hand-held projectors can display images on surfaces anywhere and at any time while they are being moved by the user. This is especially useful for mobile projector-based augmentation, which superimposes digital information onto physical environments. Unlike other mobile displays, such as those provided by PDAs or mobile phones, hand-held projectors offer a consistent visual combination of real information gathered from physical surfaces with virtual information. This is possible without context switching between information space and real space, thus seamlessly blurring the virtual and the real world. They can be used, for instance, as interactive information flashlights [37] that display registered image content on the surface portions illuminated by the projector.
Although hand-held projectors provide great flexibility for ubiquitous computing and spontaneous interaction, there are fundamental issues to be addressed before fluid interaction between the user and the projector is possible. When a hand-held projector is used to display on various surfaces in a real environment, the projected image is dynamically modulated and distorted by the surfaces as the user moves. Even when the user stops moving the projector, the presented image still shakes because of the user’s unavoidable hand jitter. Thus, a basic requirement for hand-held projector interaction is to produce a stable projection.
Image Stabilizing
One often desired form of image stabilization is to produce a rectangular 2D image on a planar surface, independently of the projector’s actual pose and movement. In this case, the projected image must be continuously warped to keep the correct aspect ratio and to remain undistorted. The warping process is similar to the geometric correction techniques described earlier. The difference, however, is that the target viewing perspective usually points towards the projection surface along its normal direction, while the position of the hand-held projector may keep changing.
To find the geometric mapping between the projector and the target perspective, the projector’s six degrees of freedom may be obtained from an attached tracking device. A homography is adequate to represent this geometric mapping when the projection surface is planar. Instead of using the four detected vertices of the visible projection area to calculate the homography matrix, another practical technique is to identify laser spots displayed by laser pointers that are attached to the projector-camera system. The laser spots are brighter and therefore easier to detect.
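As a rough illustration of how a homography can be computed from four point correspondences (for example, four detected vertices or four laser spots, and the corners of the desired rectangle), the sketch below solves the standard 8x8 linear system with the last matrix entry fixed to 1. The function names, the sample coordinates, and the plain Gauss-Jordan solver are assumptions chosen to keep the example dependency-free; a production system would more likely rely on an established vision library.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping four src points onto four dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence, with H[2][2] fixed to 1.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(H, point):
    """Apply homography H to a 2D point (with perspective division)."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


if __name__ == "__main__":
    # Hypothetical distorted quadrilateral seen on the surface ...
    observed = [(10, 12), (210, 30), (190, 170), (20, 150)]
    # ... and the undistorted rectangle it should correspond to.
    rectangle = [(0, 0), (200, 0), (200, 150), (0, 150)]
    H = homography_from_points(observed, rectangle)
    for corner in observed:
        print(corner, "->", warp_point(H, corner))
```

For a hand-held projector, such a matrix would have to be re-estimated continuously as the pose changes, and the image pre-warped with it at each time step.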
In [38], hand jitter was compensated together with the geometry correction by continuously tracking the projector’s pose and warping the image at each time step. A camera attached to the projector detects visual markers on the projection surface, which are used for warping the projected image accordingly. In [42], a similar stabilization approach is described. Here, the projector pose relative to the display surface is recovered up to an unknown translation in the display plane.
Pointing Techniques
After the projected images have been stabilized, several techniques can be adopted to interact with the displayed content. Controlling a cursor by laser pointing (e.g., with a projector-attached laser pointer) is one possibility. In this case, common desktop mouse interaction techniques can be mapped directly to hand-held projectors. The projector’s center pixel ray can also be used instead of a laser pointer to control the mouse cursor. One of the biggest problems associated with these methods is the size reduction and cropping of the display area caused by the movement of the projector when controlling the cursor.
Using a secondary device, such as a tracked stylus or a separate laser pointer, can overcome these limitations; however, the user then needs both hands for interaction. Mounting a touch pad or another input device on the projector is also possible, but might not be as intuitive as pointing directly with the projector itself.
Selection and Manipulation
Based on the display and direct pointing abilities described above, mouse-like interaction can be emulated, such as selecting a menu item or performing a cut-and-paste operation by pointing the cursor within the projected area and pressing buttons mounted on the projector. However, in this scenario the hand-jitter problem known from laser pointer interaction also exists, making it difficult to position the cursor in specific and small areas.
The jitter problem is intensified when cursor pointing is combined with mouse button-pressing operations. Adopting specially designed interaction techniques, rather than emulating common desktop GUI methods, can alleviate this problem.

One proven and efficient interaction technique for hand-held projectors is the crossing-based widget technique [37]. A crossing-based widget is operated by moving the cursor across the widget in a specific direction (e.g., from outside to inside, or from top to bottom) while holding the mouse button. This technique avoids pointing the cursor and pressing a button at the same time. Crossing widgets can be used with hand-held projectors to support commonly used desktop GUI elements, such as menus and sliders. Crossing-based menu items are activated by crossing from one direction and deactivated by crossing from the opposite direction. All actions are executed by releasing the mouse button. Different colors can be used to indicate the crossing directions. Hierarchical menus can also be supported. Similarly, the crossing-based slider is activated by crossing the interface in one direction, deactivated by crossing it in the opposite direction, and adjusted according to the cursor movement parallel to the slider.

Another specially designed interaction technique is the zoom-and-pick widget proposed in [39]. It was designed to allow the simultaneous use of stable high-resolution visualization and pixel-accurate pointing with hand-held projectors. The widget is basically a square magnification area located around the current position of the pointing cursor. A circular dead zone is defined within this area. The center of the dead zone is treated as an interaction hot-spot. The widget remains static while the pointing cursor moves within the dead zone. To gain pixel-accurate pointing ability, a rim is defined around the dead zone.
Each crossing of the cursor from the dead zone into the rim triggers a single-pixel movement of the widget in the direction of the pointer movement. If the pointer moves beyond the dead zone and the rim, the widget is relocated so that the pointer lies in its dead zone again.

Multi-user Interaction

Hand-held projectors also pose new opportunities and challenges for multi-user interaction. In contrast to other multi-user devices such as tabletop displays, which are primarily used for sharing information with others, and to other mobile devices such as personal mobile phones, hand-held projectors are suitable for both shared and individual use because of their portability and personal usage. Multiple hand-held projectors combine the advantages of public and personal display systems.

The main issues associated with multi-user interaction with hand-held projectors concern design for ownership, privacy control, sharing, and so on. The name of the owner of a displayed object can be represented by specially designed label widgets that are placed on the object and operated using crossing-based operations. The overlap of two or more cursors can signify consent from multiple users to accomplish a collaborative interactive task, such as copying a file or blending two images between the users. Snapping and docking actions can be performed by multiple users in order to quickly view or modify connected information between multiple objects. Multiple displayed images from more than one user can be blended directly or semantically. By displaying high-resolution images when the user moves closer to the display surface, a focus-and-context experience can be achieved by providing refined local details. More details can be found in [40].

Environment Awareness

Due to their portability, hand-held projectors are mainly used spontaneously.
Therefore, it is desirable to enhance hand-held projectors with environment-awareness abilities. Geometric and photometric measurement, together with object recognition and tracking capabilities, would enable the projector to sense the environment and respond to it accordingly.

Geometric and photometric awareness can be implemented using, for example, structured light techniques, as described in section “Structured Light Scanning”. For object recognition and tracking, the use of a passive fiducial marker (e.g., supported by open source computer vision toolkits such as ARToolkit [41]) is a cheap solution. However, such a marker is not visually attractive, may disturb the appearance of the object, and may fail as a result of occlusion or low illumination. Unpowered passive RFID tags can be detected by a radio frequency reader without being visible. They represent another inexpensive solution for object identification. However, they do not support pose tracking. The combination of RFID tags with photo-sensors, called RFIG, has been developed in order to obtain both object identification and object position. The object position is detected by projecting Gray codes onto the photo-sensors. The Gray code sensed by each photo-sensor allows the projection of that sensor onto the projector image plane to be computed, and consequently enables projector registration. More details about RFIG can be found in [42].

Interaction Design and Paradigm

In the sections above, techniques for human interaction with different configurations of projector-camera systems were presented. This subsection introduces higher-level concepts and methods for interaction design, and interaction paradigms for such devices. Alternative configurations such as steerable projectors and moveable surfaces are also discussed briefly.
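As a brief aside on the RFIG localization step from the Environment Awareness discussion above, the idea of recovering a sensor’s position in the projector image from projected Gray codes can be sketched as follows. This is a one-dimensional illustration (projector columns only) under simplifying assumptions: readings are noise-free binary values, one bit plane is projected per frame with the most significant bit first, and all function names are hypothetical rather than part of the RFIG system described in [42].

```python
def gray_encode(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def gray_patterns(width):
    """Bit-plane patterns for `width` projector columns.

    patterns[k][x] is 1 if column x is lit in frame k (MSB first).
    """
    bits = max(1, (width - 1).bit_length())
    return [
        [(gray_encode(x) >> (bits - 1 - k)) & 1 for x in range(width)]
        for k in range(bits)
    ]


def decode_sensor(readings):
    """Recover the projector column from one photo-sensor's bit sequence."""
    g = 0
    for bit in readings:
        g = (g << 1) | bit
    return gray_decode(g)


# A sensor sitting under projector column 300 sees one bit per frame.
patterns = gray_patterns(1024)
column = decode_sensor([frame[300] for frame in patterns])
```

Repeating the same procedure with row patterns would give the second coordinate. Gray codes are a common choice for this kind of structured light because neighbouring columns differ in only one bit, which limits the error caused by a sensor lying on a stripe boundary.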
Projector-based systems for displaying virtual environments assume high-quality, continuous display areas with a large field of view, which often evoke feelings of immersion and presence and provide continuous interaction spaces. In contrast, spatial projector-camera systems that display on everyday surfaces may produce blended and warped images with average quality and a cropped field of view. The cropped view results from the constricted display area, discontinuous images on different depth levels, and surfaces with different modulation properties. Due to these discrepancies, it is not always possible to directly adopt interaction techniques from immersive virtual environments or from conventional augmented reality applications.

For example, moving a virtual object using the pointing-and-drag technique, which is often adopted in virtual environments, may not be the preferred method in a projector-based augmented environment, since the appearance of the virtual object may vary drastically as it is moved and displayed on discontinuous surfaces with different depths and material properties. Instead, grasp-and-drop techniques may be better suited to this situation, as discussed in [43].

Furthermore, the distance between the user and the display surface is important for designing and selecting interaction techniques. It was expected that pointing would be more suitable for manipulating distant objects, while touching would be more suitable for nearby objects. However, user studies of interaction with projector-camera systems intended to implement an augmented workspace [43] produced contradictory findings. Users were found to be unwilling to touch the physical surfaces, even at close range, after they had learned distance gestures such as pointing. Instead, users frequently continued to use the pointing method, even for surfaces located close to them.
The reason for this behavior may be twofold. First, users may prefer to use a consistent manipulation technique, such as pointing. Second, it seems that the appearance and materials of the surfaces affect the user’s willingness to interact with them [44].

Several interaction paradigms have been introduced with or for projector-camera systems. Tangible user interfaces were developed to manipulate projected content using physical tangible objects [45]. Vision-based implicit interaction techniques have also been applied to support subtle and persuasive display concepts derived from ubiquitous computing [46]. The peephole paradigm is discussed as a concept that describes the projected display as a peephole for the physical environment [47]. Varying the bubble-like free-form shape of the projected area according to the environment enables a new interface that moves beyond regular fixed display boundaries [48].

Besides hand-held projectors, which enable ubiquitous display, steerable projectors also bring new interaction concepts, such as everywhere displays. Such systems enable projection onto different surfaces in a room and turn these surfaces into interaction interfaces. The best way to control a steerable projector during the interaction, however, still needs to be determined. Body tracking can be combined with steerable projections to produce a paradigm called user-following display [49], where the user’s position and pose are tracked. Projection surfaces are then dynamically selected and modulated accordingly, based on a measured and maintained three-dimensional model of the surfaces in the room. Alternatively, laser pointers can be used and tracked by a pan/tilt/zoom camera to control and interact with a steerable projector unit [50]. Another issue for interaction with steerable projectors is the question of how to support dynamic interfaces that can change form and location on the fly.
A vision-based approach can solve this problem by decoupling the interface specification from its location in space and in the camera image [51].

Besides the projectors themselves, projection surfaces might also be moveable rather than static in the environment. They may be rigidly moveable flat screens, semi-rigidly foldable objects such as a fan or an umbrella, or deformable objects such as paper and cloth. Moveable projection surfaces can provide novel interfaces and enable unique interaction paradigms such as foldable displays or organic user interfaces [52]. Tracking the pose or deformation of such surfaces, however, is an issue that still needs to be addressed. Cheap hardware trackers have been used recently to support semi-rigid surfaces [53]. Vision-based deformation detection algorithms may be useful in the future for supporting deformable display surfaces.

Application Examples

The basic visualization and interaction techniques presented in the sections above enable a variety of new applications in different domains. In general, projector-camera systems can be applied to interactive or non-interactive visual presentations in situations where the use of projection screens is not possible or not desired. Several examples are outlined below.

Embedded Multimedia Presentations

Many historic sites, such as castles, caves, or churches, are open to the public. Flat panel displays or projection screens are frequently used for presenting visual information. These screens, however, are permanently installed features and unnecessarily cover a certain amount of space. They cannot be temporarily disassembled to give the visitors an authentic impression of the environment’s ambience when required. Being able to project undistorted images onto arbitrary existing surfaces offers a potential solution to this problem.
Projectors can display images that are much larger than the device itself. The images can be seamlessly embedded, and turned off at any time to provide an unconstrained experience. For these reasons, projector-camera systems and image correction techniques are applied in several professional domains, such as historic sites, theater, festivals, museums, public screen presentations, advertisement displays, theme parks, and many others.

Figure 2 illustrates two examples: a theater stage projection at the Karl-May Festival in Elspe (Germany), and an immersive panoramic projection onto the walls of the main tower of castle Osterburg in Weida (Germany). Both are used for displaying multimedia content which is alternately turned on and off during the main stage performance and the museum presentation, respectively. Other examples of professional applications can be found at www.vioso.com.

Superimposing Museum Artifacts

Projector-camera systems can also be used for superimposing museum artifacts with pictorial content. This helps to communicate information about the displayed objects more efficiently than secondary screens.

Fig. 2 Projection onto physical stage setting (top), and 360 degree surround projection onto natural stone walls in castle tower (bottom). Image courtesy: VIOSO GmbH, www.vioso.com

In this case, a precise registration of the projector-camera system is necessary not only to ensure adequate image correction (e.g., geometric, photometric, and focus), but also to display visual content that is geometrically registered to the corresponding parts of the object.
Figure 3 illustrates two examples of superimposing visual content, such as color, text and image labels, interactive visualizations of magnifications and underdrawings, and visual highlights, on replicas of a fossil (primal horse displayed by Senckenberg Museum Frankfurt, Germany) and of paintings (Michelangelo’s Creation of Adam, sanguine, and Pontormo’s Joseph and Jacob in Egypt, oil on wood) [22].

In addition to augmenting arbitrary image content, it is also possible to boost the contrast of low-contrast objects, such as paintings whose colors have faded after long exposure to sunlight. The principal techniques describing how this can be achieved are explained in [19].

Spatial Augmented Reality

Projector-camera systems can acquire not only the parameters that are necessary for image correction, but also higher-level information, such as the surrounding scene geometry. This, for instance, enables corrected projections of stereoscopic images [...]

... Computer Vision and Image Understanding, Vol. 104, No. 2–3, pp. 90–126, 2006.
37. X. Cao and R. Balakrishnan, “Interacting with dynamically defined information spaces using a handheld projector and a pen,” Proceedings of the 19th annual ACM symposium on User interface software and technology (UIST ’06), pp. 225–234, 2006.
38. P. Beardsley, J. Van Baar, R. Raskar, and C. Forlines, “Interaction using a handheld projector,” ...

... Department of Animation, Emily Carr University of Art and Design, Vancouver, BC, Canada, e-mail: lbishko@ecuad.ca
V. Zammitto, M. Nixon, and H. Wei, School of Interactive Arts and Technology, Simon Fraser University, Vancouver, BC, Canada, e-mail: vzammitt@sfu.ca; mna32@sfu.ca; huaxinw@sfu.ca
A.V. Vasiliakos, University of Peloponnese, Nauplion, Greece, e-mail: vasilako@ath.forthnet.gr
B. Furht (ed.), Handbook of Multimedia ...
... and Computer Graphics (TVCG), Vol. 10, No. 3, pp. 290–301, 2004.
13. M. S. Brown, P. Song, and T.-J. Cham, “Image Pre-Conditioning for Out-of-Focus Projector Blur,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. II, pp. 1956–1963, 2006.
14. Y. Oyamada and H. Saito, “Focal Pre-Correction of Projected Image for Deblurring Screen Image,” Proceedings of ...
... IEEE Computer Graphics and Applications, Vol. 25, No. 1, pp. 39–43, 2005.
39. C. Forlines, R. Balakrishnan, P. Beardsley, J. van Baar, and R. Raskar, “Zoom-and-pick: facilitating visual zooming and precision pointing with interactive handheld projectors,” Proceedings of the 18th annual ACM symposium on User interface software and technology (UIST ’05), pp. 73–82, 2005.
40. X. Cao, C. Forlines, and R. Balakrishnan, “Multi-user ...

... integrate characters, narrative, and drama as part of their design. One can see this pattern through the emergence of games like Assassin’s Creed (published by Ubisoft 2008), Hotel Dusk (published by Nintendo 2007), and Prince of Persia series (published by Ubisoft), which emphasized character and narrative as part of their design.
M.S. El-Nasr, School of Interactive Arts and Technology, Simon Fraser University, ...

33. B. Myers, R. Bhatnagar, J. Nichols, C. Peck, D. Kong, R. Miller, and A. Long, “Interacting at a distance: measuring the performance of laser pointers and other devices,” Proceedings of the SIGCHI conference on Human factors in computing systems (CHI ’02), pp. 33–40, 2002.
34. X. Cao and R. Balakrishnan, “Visionwand: interaction techniques for large displays using a passive wand ...
... simulations and interactive 3D environments for a wide variety of applications [4–11]. Several great examples are displayed in the projects developed by Institute of Creative Technologies at University of Southern California, where they utilize 3D environments with rich characters to teach cultural norms and foreign language, among other subjects. These applications provide a safe and comfortable environment for ...

... ego, and superego. The id is full of animal instincts and operates by prioritizing pleasure satisfaction, but is purely unconscious. The ego mediates between the id, the superego, and the external world by evaluating the consequences of actions. The superego is formed by the mandates that have been internalized, and the ideal image of oneself. The ego and superego have unconscious, preconscious, and conscious ...

... not fit into any of the three other categories. Similar to Kretschmer, Sheldon and Stevens [27] developed a quantitative classification of personality along three dimensions of body types: endomorphy (fatness), mesomorphy (muscularity), and ectomorphy (thinness), where each dimension was defined on a score from 1 to 7. For example, a person with 7 for endomorphy, 1 for mesomorphy, and 1 for ectomorphy represents ...

... visualization of an architectural lighting simulation (left), and a stereoscopically projected spatial augmented reality game (right). Door, window, illumination and the car are projected.

Flexible Digital Video Composition

Blue screens and chroma keying technology are essential for digital video composition. Professional studios apply tracking technology to record the camera path for perspective augmentations of ...

B. Furht (ed.), Handbook of Multimedia for Digital Entertainment and Arts, DOI 10.1007/978-0-387-89024-1.