Multimedia Image and Video Processing
Second Edition

Edited by
Ling Guan
Yifeng He
Sun-Yuan Kung

CRC Press is an imprint of the Taylor & Francis Group, an informa business
Boca Raton London New York

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2012 by Taylor & Francis Group, LLC

No claim to original U.S. Government works

Version Date: 20120215
International Standard Book Number-13: 978-1-4398-3087-1 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400.
CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Contents

List of Figures
Preface
Acknowledgments
Introduction
Editors
Contributors

Part I Fundamentals of Multimedia

1. Emerging Multimedia Standards
   Huifang Sun
2. Fundamental Methods in Image Processing
   April Khademi, Anastasios N. Venetsanopoulos, Alan R. Moody, and Sridhar Krishnan
3. Application-Specific Multimedia Architecture
   Tung-Chien Chen, Tzu-Der Chuang, and Liang-Gee Chen
4. Multimedia Information Mining
   Zhongfei (Mark) Zhang and Ruofei Zhang
5. Information Fusion for Multimodal Analysis and Recognition
   Yongjin Wang, Ling Guan, and Anastasios N. Venetsanopoulos
6. Multimedia-Based Affective Human–Computer Interaction
   Yisu Zhao, Marius D. Cordea, Emil M. Petriu, and Thomas E. Whalen

Part II Methodology, Techniques, and Applications: Coding of Video and Multimedia Content

7. Part Overview: Coding of Video and Multimedia Content
   Oscar Au and Bing Zeng
8. Distributed Video Coding
   Zixiang Xiong
9. Three-Dimensional Video Coding
   Anthony Vetro
10. AVS: An Application-Oriented Video Coding Standard
    Siwei Ma, Li Zhang, Debin Zhao, and Wen Gao

Part III Methodology, Techniques, and Applications: Multimedia Search, Retrieval, and Management

11. Multimedia Search and Management
    Linjun Yang, Xian-Sheng Hua, and Hong-Jiang Zhang
12. Video Modeling and Retrieval
    Zheng-Jun Zha, Jin Yuan, Yan-Tao Zheng, and Tat-Seng Chua
13. Image Retrieval
    Lei Zhang and Wei-Ying Ma
14. Digital Media Archival
    Chong-Wah Ngo and Song Tan

Part IV Methodology, Techniques, and Applications: Multimedia Security

15. Part Review on Multimedia Security
    Alex C. Kot, Huijuan Yang, and Hong Cao
16. Introduction to Biometry
    Carmelo Velardo, Jean-Luc Dugelay, Lionel Daniel, Antitza Dantcheva, Nesli Erdogmus, Neslihan Kose, Rui Min, and Xuran Zhao
17. Watermarking and Fingerprinting Techniques for Multimedia Protection
    Sridhar Krishnan, Xiaoli Li, Yaqing Niu, Ngok-Wah Ma, and Qin Zhang
18. Image and Video Copy Detection Using Content-Based Fingerprinting
    Mehrdad Fatourechi, Xudong Lv, Mani Malek Esmaeili, Z. Jane Wang, and Rabab K. Ward

Part V Methodology, Techniques, and Applications: Multimedia Communications and Networking

19. Emerging Technologies in Multimedia Communications and Networking: Challenges and Research Opportunities
    Chang Wen Chen
20. A Proxy-Based P2P Live Streaming Network: Design, Implementation, and Experiments
    Dongni Ren, S.-H. Gary Chan, and Bin Wei
21. Scalable Video Streaming over the IEEE 802.11e WLANs
    Chuan Heng Foh, Jianfei Cai, Yu Zhang, and Zefeng Ni
22. Resource Optimization for Distributed Video Communications
    Yifeng He and Ling Guan

Part VI Methodology, Techniques, and Applications: Architecture Design and Implementation for Multimedia Image and Video Processing

23. Algorithm/Architecture Coexploration
    Gwo Giun (Chris) Lee, He Yuan Lin, and Sun Yuan Kung
24. Dataflow-Based Design and Implementation of Image Processing Applications
    Chung-Ching Shen, William Plishker, and Shuvra S. Bhattacharyya
25. Application-Specific Instruction Set Processors for Video Processing
    Sung Dae Kim and Myung Hoon Sunwoo

Part VII Methodology, Techniques, and Applications: Multimedia Systems and Applications

26. Interactive Multimedia Technology in Learning: Integrating Multimodality, Embodiment, and Composition for Mixed-Reality Learning Environments
    David Birchfield, Harvey Thornburg, M. Colleen Megowan-Romanowicz, Sarah Hatton, Brandon Mechtley, Igor Dolgov, Winslow Burleson, and Gang Qian
27. Literature Survey on Recent Methods for 2D to 3D Video Conversion
    Raymond Phan, Richard Rzeszutek, and Dimitrios Androutsos
28. Haptic Interaction and Avatar Animation Rendering Centric Telepresence in Second Life
    A. S. M. Mahfujur Rahman, S. K. Alamgir Hossain, and A. El Saddik

Index

List of Figures

1.1 Typical MPEG-1 encoder structure.
1.2 (a) An example of an MPEG GOP of 9, N = 9, M = 3. (b) Transmission order of an MPEG GOP of 9 and (c) Display order of an MPEG GOP of 9.
1.3 Two zigzag scan methods for MPEG-2 video coding.
1.4 Block diagram of an H.264 encoder.
1.5 Encoding processing of JPEG-2000.
1.6 (a) MPEG-1 audio encoder. (b) MPEG-1 audio decoder.
1.7 Relations between tools of MPEG-7.
1.8 Illustration of MPEG-21 DIA.
2.1 Histogram example with L number of bins. (a) FLAIR MRI (brain). (b) PDF $p_G(g)$ of (a).
2.2 Example histograms with varying number of bins (bin widths). (a) 100 bins, (b) 30 bins, (c) 10 bins, (d) 5 bins.
2.3 Empirical histogram and KDA estimate of two random variables, $N(0, 1)$ and $N(5, 1)$. (a) Histogram. (b) KDA.
2.4 Types of kernels for KDA. (a) Box, (b) triangle, (c) Gaussian, and (d) Epanechnikov.
2.5 KDA of random sample ($N(0, 1) + N(5, 1)$) for box, triangle, and Epanechnikov kernels. (a) Box, (b) triangle, and (c) Epanechnikov.
2.6 Example image and its corresponding histogram with mean and variance indicated. (a) $g(x, y)$. (b) PDF $p_G(g)$ of (a).
2.7 HE techniques applied to mammogram lesions. (a) Original. (b) Histogram equalized.
2.8 The KDA of lesion “(e)” in Figure 2.7, before and after enhancement.
Note that after equalization, the histogram resembles a uniform PDF. (a) Before equalization. (b) After equalization.
2.9 Image segmentation based on global histogram thresholding. (a) Original. (b) $B(x, y) \ast g(x, y)$. (c) $(1 - B(x, y)) \ast g(x, y)$.
2.10 The result of a three-class Otsu segmentation on the image of Figure 2.6a. The left image is the segmentation result of all three classes (each class is assigned a unique intensity value). The images on the right are binary segmentations for each tissue class $B(x, y)$. (a) Otsu segmentation. (b) Background class. (c) Brain class. (d) Lesion class.
2.11 Otsu’s segmentation on retinal image showing several misclassified pixels. (a) Original. (b) PDF $p_G(g)$ of (a). (c) Otsu segmentation.
2.12 Example FLAIR with WML, gradient image, and fuzzy edge mapping functions. (a) $y(x_1, x_2)$. (b) $g(x_1, x_2) = \nabla y$. (c) $\rho_k$ and $p_G(g)$. (d) $\rho_k(x_1, x_2)$.
2.13 T1- and T2-weighted MR images (1 mm slice thickness) of the brain and corresponding histograms. Images are from BrainWeb database; see http://www.bic.mni.mcgill.ca/brainweb/. (a) T1-weighted MRI. (b) T2-weighted MRI. (c) Histogram of Figure 2.13a. (d) Histogram of Figure 2.13b.
2.14 T1- and T2-weighted MR images (1 mm slice thickness) with 9% noise and corresponding histograms. Images are from BrainWeb database; see http://www.bic.mni.mcgill.ca/brainweb/. (a) T1-weighted MRI with 9% noise. (b) T2-weighted MRI with 9% noise. (c) Histogram of Figure 2.14a. (d) Histogram of Figure 2.14b.
2.15 (Un)correlated noise sources and their 3D surface representation. (a) 2D Gaussian IID noise. (b) Surface representation of Figure 2.15a. (c) 2D colored noise. (d) Surface representation of Figure 2.15c.
2.16 Empirically found $M_2$ distribution and the observed $M_2^{\mathrm{obs}}$ for uncorrelated and correlated 2D data of Figure 2.15. (a) $p(M_2)$ and $M_2^{\mathrm{obs}}$ for Figure 2.15a. (b) $p(M_2)$ and $M_2^{\mathrm{obs}}$ for Figure 2.15c.
2.17 Correlated 2D variables generated from normally (N) and uniformly (U) distributed random variables. Parameters used to simulate the random distributions are shown in Table 2.1.
2.18 1D nonstationary data.
2.19 Grid for 2D-extension of RA test. (a), (b), and (c) show several examples of different spatial locations where the number of RAs is computed.
2.20 Empirically found distribution of $R$ and the observed $R^*$ for 2D stationary and nonstationary data. (a) IID stationary noise. (b) $p(R)$ and $R^*$ of (a). (c) Nonstationary noise. (d) $p(R)$ and $R^*$ of (c).
2.21 Nonstationary 2D variables generated from normally (N) and uniformly (U) distributed random variables. Parameters $(\mu, \sigma)$ and $(a, b)$ used to simulate the underlying distributions are shown in Table 2.1.
2.22 Scatterplot of gradient magnitude images of original image (x-axis) and reconstructed version (y-axis).
2.23 Bilaterally filtered examples. (a) Original. (b) Bilaterally filtered. (c) Original. (d) Bilaterally filtered.
2.24 Image reconstruction of example shown in Figure 2.23a. (a) $Y^{0.35}_{\mathrm{rec}}$, (b) $Y^{0.50}_{\mathrm{rec}}$, (c) $Y^{0.58}_{\mathrm{est}}$, and (d) $Y^{0.70}_{\mathrm{rec}}$.
2.25 Reconstruction example ($\tau^* = 0.51$ and $\tau^* = 0.53$, respectively). (a) $S(Y^{\tau}_{\mathrm{rec}}(x_1, x_2))$ $C(Y^{\tau}_{\mathrm{rec}}(x_1, x_2))$. (b) $Y^{0.51}_{\mathrm{rec}}$. (c) Hist($Y$). (d) Hist($Y^{0.51}_{\mathrm{rec}}$). (e) $S(Y^{\tau}_{\mathrm{rec}}(x_1, x_2))$ $C(Y^{\tau}_{\mathrm{rec}}(x_1, x_2))$. (f) $Y^{0.53}_{\mathrm{rec}}$. (g) Hist($Y$). (h) Hist($Y^{0.53}_{\mathrm{rec}}$).
2.26 Normalized differences in smoothness and sharpness, between the proposed method and the bilateral filter. (a) Smoothness. (b) Sharpness.
2.27 Fuzzy edge strength $\rho_k$ versus intensity $y$ for the image in Figure 2.23a. (a) $\rho_k$ vs. $y$, (b) $\mu_\rho(y)$, and (c) $\mu_\rho(x_1, x_2)$.
2.28 Original image $y(x_1, x_2)$, global edge profile $\mu_\rho(y)$ and global edge values mapped back to spatial domain $\mu_\rho(x_1, x_2)$. (a) $y(x_1, x_2)$, (b) $\mu_\rho(y)$, and (c) $\mu_\rho(x_1, x_2)$.
2.29 Modified transfer function $c(y)$ with original graylevel PDF $p_Y(y)$, and the resultant image, $c(x_1, x_2)$. (a) $c(y)$ and $p_Y(y)$ and (b) $c(x_1, x_2)$ of (b).
2.30 CE transfer function and contrast-enhanced image. (a) $y_{CE}(y)$ and $p_Y(y)$. (b) $y_{CE}(x_1, x_2)$ of (b).
2.31 Original, contrast-enhanced images and WML segmentation. (a–c) Original. (d–f) Enhanced. (g–i) Segmentation.
2.32 One level of DWT decomposition of retinal images. (a) Normal image decomposition; (b) decomposition of the retinal images with diabetic retinopathy. CE was performed in the higher frequency bands (HH, LH, HL) for visualization purposes.
2.33 Medical images exhibiting texture. (a) Normal small bowel, (b) small bowel lymphoma, (c) normal retinal image, (d) central retinal vein occlusion, (e) benign lesion, and (f) malignant lesion. CE was performed on (e) and (f) for visualization purposes.
3.1 A general architecture of multimedia applications system.
3.2 (a) The general architecture and (b) hardware design issues of the video/image processing engine.
3.3 Memory hierarchy: trade-offs and characteristics.
3.4 Conventional two-stage macroblock pipelining architecture.
3.5 Block diagram of the four-stage MB pipelining H.264/AVC encoding system.
3.6 The spatial relationship between the current macroblock and the searching range.
3.7 The procedure of ME in a video coding system for a sequence.
3.8 Block partition of H.264/AVC variable block size.
3.9 The hardware architecture of 1DInterYSW, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.10 The hardware architecture of 2DInterYH, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.11 The hardware architecture of 2DInterLC, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.12 The hardware architecture of 2DIntraVS, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.13 The hardware architecture of 2DIntraKP, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.14 The hardware architecture of 2DIntraHL, where $N = 4$, $P_h = 2$, and $P_v = 2$.
3.15 (a) The concept, (b) the hardware architecture, and (c) the detailed architecture of PE array with 1-D adder tree, of Propagate Partial SAD, where $N = 4$.

[...]

... advances in multimedia research and applications. The 28 chapters are classified into 7 parts. Part I focuses on Fundamentals of Multimedia, and Parts II through VII focus on Methodology, Techniques, and Applications. Part I includes Chapters 1 through 6. Chapter 1 provides an overview of multimedia standards including video coding, still image coding, audio coding, multimedia interface, and multimedia framework...

... including multimedia information mining, multimodal information fusion and interaction, multimedia security, multimedia systems, hardware for multimedia, multimedia coding, multimedia search, and multimedia communications. Each chapter of the book is contributed by prominent experts in the field. Therefore, it offers a very insightful treatment of the topic. This book includes an Introduction and 28 chapters...

Meanwhile, the infrastructures of image-sharing social networks make it easier than before for users to attach tags to images. This huge amount of user tags enables better understanding of the associated images and provides many research opportunities to boost image search and retrieval performance. On the other hand, the user tags somehow reflect the users’ intentions and subjectivities and therefore can be leveraged...

... vocabulary and is highly scalable and robust to outliers.

0.5.2 Multimedia Recommendation

Developing recommendation systems for community media has attracted much attention with the popularity of Web 2.0 applications, such as Flickr, YouTube, and Facebook. Users give their own comments and ratings on multimedia items, such as images, amateur videos, and movies. However, only a small portion of multimedia...
... recent work on video modeling and retrieval, including semantic concept detection, semantic video retrieval, and interactive video retrieval. Chapter 13 presents a variety of existing techniques for image retrieval, including visual feature extraction, relevance feedback, automatic image annotation, and large-scale visual indexing. Chapter 14 describes three basic components: content structuring and organization, ...

... advances on optimal resource allocation for video communications over P2P streaming systems, wireless ad hoc networks, and wireless visual sensor networks. Part VI focuses on architecture design and implementation for multimedia image and video processing. It includes Chapters 23 through 25. Chapter 23 presents the methodology for concurrent optimization of both algorithms and architectures. Chapter 24 introduces...

... significant advances in multimedia research and applications. A number of new technologies have been invented for various fundamental multimedia research problems. They are helping computing machines better perceive, organize, and retrieve multimedia content. With the rapid development of multimedia hardware and software, we can now easily create, access, and share a considerable amount of multimedia content, ...

... techniques and content-based image retrieval tools.

0.3.2 Ontological Inference

The explicit representation of domain knowledge in multimedia ontologies enables inference of new concepts from concepts and their relationships. It is due to this capability that more and more research work incorporates ontologies into all kinds of multimedia tasks, such as image annotation, retrieval, and video event...
... applications ranging from online multimedia search, Internet Protocol Television (IPTV), and mobile multimedia to social media. The proliferation of diverse multimedia applications has been the motivating force for the research and development of numerous paradigm-shifting technologies in multimedia processing. This book documents the most recent advances in multimedia research and applications. It is a comprehensive...

... representation formats and the associated compression techniques. Chapter 10 gives a detailed description of the Audio Video Coding Standard (AVS) developed by the China Audio Video Coding Standard Working Group. Part III focuses on multimedia search, retrieval, and management. It includes Chapters 11 through 14. Chapter 11 is a part overview that describes the research trends in the area of multimedia search and management.