ACKNOWLEDGEMENTS

Without the help of many people and parties, this thesis would not have been completed. First of all, I would like to express my heartfelt gratitude to my supervisor, Dr. Ko Chi Chung, for his valuable advice and guidance during the various phases of my study, and especially for his serious attitude towards research and his encouragement to take on challenges. Secondly, I thank NUS for giving me the opportunity to pursue further research and study in such a beautiful university. I miss all the lecturers and classmates with whom I spent a great time. Thirdly, I would like to thank the Institute for Infocomm Research for its support throughout my research. Without this support, the timely completion of the thesis would have been unimaginable. Last but not least, I deeply appreciate my family for their understanding. Thanks to my wife for her love, patience and encouragement throughout my Ph.D. study. Thanks to my lovely daughter, whose birth opened a new chapter of my life and gave me a new angle from which to look at this world. Thanks to my parents for taking good care of my daughter, and for their confidence in my capability, determination and passion to excel at all times.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS
TABLE OF CONTENTS
SUMMARY
NOMENCLATURE
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Objectives
1.3 Thesis Contributions
1.4 Organization of the Thesis

CHAPTER 2 H.264 AND LITERATURE SURVEY
2.1 H.264
2.2 H.264 Encoder
2.3 H.264 Decoder
2.4 Predictive Coding
2.4.1 Intra Coding
2.4.2 Inter Coding
2.5 Motion Estimation
2.6 Mode Decision
2.7 Literature
2.8 Summary

CHAPTER 3 FAST INTRA MODE DECISION FOR H.264
3.1 Overview of Intra Coding in H.264
3.2 Determining the Primary Edge Direction in the Image Block
3.2.1 Edge Map
3.2.2 Edge Direction Histogram
3.2.2.1 4 x 4 luma block edge direction histogram
3.2.2.2 Edge direction histogram for 16 x 16 luma and 8 x 8 chroma block
3.3 Mode Decision for Intra Prediction
3.3.1 4 x 4 Luma Block Prediction Modes
3.3.2 16 x 16 Luma Block Prediction Modes
3.3.3 8 x 8 Chroma Prediction Mode
3.3.4 Algorithm Complexity Analysis
3.4 Experimental Results
3.4.1 Experiments on IPPPP Sequences
3.4.2 Experiments on All Intra Frames Sequences
3.4.3 Experiments on IBBPBB Sequences
3.4.4 Comparison of Different Fast Intra Prediction Methods
3.5 Summary

CHAPTER 4 FAST INTER MODE DECISION FOR H.264
4.1 Overview of Inter Mode Decision
4.1.1 Inter Mode Decision in H.264/AVC
4.1.2 Observations and Motivation
4.2 Determination of Homogeneity and Stationarity
4.2.1 Homogeneous Regions Determination
4.2.2 Stationary Regions Detection
4.2.3 Overall Algorithm
4.3 Experimental Results
4.3.1 Experiments on IPPP Sequences
4.3.2 Experiments on IBBP Sequences
4.4 Summary

CHAPTER 5 FAST INTRA 4X4 MODE ELIMINATION APPROACHES FOR H.264
5.1 Overview of H.264 Intra Mode
5.2 Fast Intra 4 x 4 Mode Elimination
5.2.1 Lossless Approach
5.2.2 Lossy Approach
5.3 Experimental Results
5.4 Summary

CHAPTER 6 ADAPTIVE INTERPOLATION APPROACHES FOR H.264
6.1 Introduction
6.2 H.264 Interpolation Scheme
6.3 General Interpolation Approaches
6.4 New Approaches
6.4.1 Approach One
6.4.2 Approach Two
6.5 Experimental Results
6.6 Summary

CHAPTER 7 CONCLUSIONS AND FUTURE WORK
7.1 Conclusions
7.2 Future work
7.2.1 Fast SAD Method
7.2.2 Reordering Motion Estimation Steps for Different Block Sizes

REFERENCE

SUMMARY

The new international video coding standard H.264, also known as Advanced Video Coding (AVC), has been proposed by the Joint Video Team (JVT). Compared to previous video coding standards, H.264/AVC achieves a significantly better peak signal-to-noise ratio (PSNR) and visual quality at the same bit rate. This is because a number of new techniques are adopted, including directional prediction for intra coded blocks, variable block sizes for inter coded blocks, multi-reference frame motion estimation, an integer transform, an in-loop filter, context-based adaptive binary arithmetic coding (CABAC), and rate distortion optimization (RDO). Unfortunately, this good performance is obtained at the expense of very high computational complexity. The main aim of this thesis is therefore to develop fast algorithms that improve the encoding speed of H.264 without significant loss of visual quality.

First, a fast mode decision algorithm is presented for intra prediction in H.264 video coding. By making use of the edge direction histogram, the number of mode combinations for luma and chroma blocks in a macroblock (MB) that take part in the rate distortion optimization calculation is reduced from 592 to as low as 132. This greatly reduces the complexity and computational load of the encoder. Experimental results show that the fast algorithm has negligible loss of PSNR compared to the original H.264 scheme.

Secondly, a fast inter mode decision algorithm is proposed to decide the best mode in the inter coding of H.264.
It makes use of the spatial homogeneity and the temporal stationarity of the textures of video objects. Specifically, the homogeneity decision for a block is based on the edge information inside the block, and the difference between co-sited MBs is used to decide whether the MB is temporally stationary. Based on the homogeneity and stationarity of video objects, only a small number of inter modes are used in RDO. The experimental results show that the fast algorithm substantially reduces encoding time, with negligible PSNR loss.

Thirdly, two fast intra 4x4 mode elimination methods are proposed for H.264 coding. The lossless method checks the cost after each 4x4 block intra mode decision, and terminates if the cost is higher than the minimum cost of inter mode coding. The lossy method, by contrast, uses some low-cost preprocessing to make a prediction and terminates if the cost is higher than some fraction of this minimum cost. Experimental results show that the lossless method can reduce the encoding time without any sacrifice of visual quality. The lossy method can further reduce encoding time with negligible PSNR loss or bit rate increase.

Finally, this thesis presents two adaptive interpolation methods that can significantly reduce the number of interpolation operations in H.264 coding. By making use of a flag matrix data structure and performing interpolation on demand, the proposed methods are able to increase the encoder speed greatly without any PSNR loss or increase in bit rate.
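As a rough illustration of the early-termination idea behind the two intra 4x4 elimination methods described above, the following Python sketch accumulates per-block intra costs and stops once the running total exceeds a threshold derived from the best inter cost. It is not the thesis implementation: the function name, the `fraction` parameter, and the assumption that the compared cost is the running sum over the 4x4 blocks of one macroblock are all illustrative, and the low-cost preprocessing that the lossy method uses is not modelled.

```python
from typing import Sequence, Tuple


def intra4x4_early_exit(
    block_costs: Sequence[float],
    min_inter_cost: float,
    fraction: float = 1.0,
) -> Tuple[float, bool]:
    """Accumulate intra 4x4 block costs and stop once intra cannot win.

    block_costs    -- best-mode cost of each 4x4 luma block, in coding order
                      (hypothetical input; a real encoder computes these lazily)
    min_inter_cost -- lowest cost already found among the inter modes
    fraction       -- 1.0 models the lossless test; a value below 1.0 models
                      the lossy test (assumed interpretation of "some fraction")

    Returns (accumulated cost so far, True if the search stopped early).
    """
    threshold = fraction * min_inter_cost
    total = 0.0
    for cost in block_costs:
        total += cost
        # Once the running total exceeds the threshold, the remaining
        # 4x4 blocks are skipped and intra 4x4 is rejected for this MB.
        if total > threshold:
            return total, True
    return total, False


if __name__ == "__main__":
    costs = [10.0, 20.0, 30.0, 40.0]
    print(intra4x4_early_exit(costs, min_inter_cost=50.0))
    print(intra4x4_early_exit(costs, min_inter_cost=50.0, fraction=0.5))
```

With `fraction = 1.0` and non-negative block costs, the running total can only grow, so stopping at this point cannot change which mode is finally chosen. A smaller fraction stops earlier and may occasionally reject an intra 4x4 mode that would have won.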
NOMENCLATURE

ABT      adaptive block transform
AVC      advanced video coding
BCH      bit rate change
CABAC    context-based adaptive binary arithmetic coding
CAVLC    context-adaptive variable length coding
CIF      common intermediate format
DC       direct current (coefficient)
DCT      discrete cosine transform
FME      fast motion estimation
GOP      group of pictures
HDTV     high definition television
ISO      International Organization for Standardization
ITU      International Telecommunication Union
JVT      Joint Video Team
MB       macroblock
MPEG     Moving Picture Experts Group
MSE      mean square error
MV       motion vector
NAL      network abstraction layer
PSNR     peak signal-to-noise ratio
QCIF     quarter common intermediate format
QP       quantization parameter
VCEG     Video Coding Experts Group
RDO      rate distortion optimization
SAD      sum of absolute differences
ME/MC    motion estimation/motion compensation
PCH      PSNR change
SATD     sum of absolute transformed differences
SBD      sum of border difference
SSD      sum of squared differences
TS       time saving
FIR      finite impulse response

LIST OF FIGURES

Figure 2.1 H.264 encoder architecture
Figure 2.2 H.264 decoder architecture
Figure 2.3 Intra 4x4 prediction modes
Figure 2.4 Intra 16x16 prediction modes
Figure 2.5 Variable partition sizes employed in inter coding
Figure 2.6 Multi-frame motion estimation/motion compensation
Figure 3.1 An example of intra prediction
Figure 3.2 Examples of 4x4 edge patterns and their preferred intra prediction directions
Figure 3.3 Edge direction histogram of 4 x 4 blocks
Figure 3.4 Intra 4 x 4 and 16 x 16 prediction mode directions
Figure 3.5 Edge direction histogram of 16 x 16 luma and 8 x 8 chroma blocks
Figure 3.6 News, Ch_Psnr = -0.067 dB, Ch_Bits = 1.226%
Figure 3.7 Mobile, Ch_Psnr = -0.018 dB, Ch_Bits = 0.451%
Figure 3.8 Time saving at different intra periods
Figure 3.9 Time saving at different size of searching area
Figure 3.10 News, Ch_Psnr = -0.294 dB, Ch_Bits = 3.902%
Figure 3.11 Mobile, Ch_Psnr = -0.255 dB, Ch_Bits = 3.168%
Figure 3.12 News, Ch_Psnr = -0.156 dB, Ch_Bits = 3.106%
Figure 3.13 Mobile, Ch_Psnr = -0.013 dB, Ch_Bits = 0.379%
Figure 4.1 Different partitions in an MB
Figure 4.2 Segmentation of video objects in H.264 algorithm
Figure 4.3 RD curve for 'News' (IPPP)
Figure 4.4 RD curve for 'Mobile' (IPPP)
Figure 4.5 RD curve for 'News' (IBBP)
Figure 4.6 RD curve for 'Mobile' (IBBP)
Figure 5.1 Overall mode decision process
Figure 5.2 Border difference of 4 x 4 block
Figure 6.1 Half pixel interpolation
Figure 6.2 Quarter pixel interpolation
Figure 6.3 Match between current block and reference block
Figure 6.4 Active and Inactive macroblocks
Figure 6.5 Reference blocks in the reference frame
Figure 6.6 Interpolated frame memory reorganization
Figure 6.7 Flag matrix for a Quarter CIF frame
Figure 6.8 Flow chart of the second approach

LIST OF TABLES

Table 3.1 Number of candidate modes
Table 3.2 Results for IPPPP sequences
Table 3.3 Results for IIIII sequences
Table 3.4 Results for IBBPB sequences
Table 3.5 Comparison of different fast intra prediction methods
Table 4.1 Results for IPPP sequences
Table 4.2 Results for IBBP sequences
Table 4.3 Results for 'News' (IPPP)
Table 4.4 Results for 'Mobile' (IPPP)
Table 4.5 Results for 'News' (IBBP)
Table 4.6 Results for 'Mobile' (IBBP)
Table 5.1 Codec Performance (QP=28)
Table 5.2 Codec Performance (QP=32)
Table 5.3 Codec Performance (QP=36)
Table 5.4 Codec Performance (QP=40)
Table 6.1 Speed increase at QP = 16
Table 6.2 Speed increase at QP = 32