
Digital Signal Processing Handbook P55


Document information

Pages: 21
File size: 398.15 KB

Content

Osama Al-Shaykh, et al. “Video Sequence Compression.” 2000 CRC Press LLC. <http://www.engnetbase.com>.

Video Sequence Compression

Osama Al-Shaykh, University of California, Berkeley
Ralph Neff, University of California, Berkeley
David Taubman, Hewlett Packard
Avideh Zakhor, University of California, Berkeley

55.1 Introduction
55.2 Motion Compensated Video Coding
Motion Estimation and Compensation • Transformations • Discussion • Quantization • Coding of Quantized Symbols
55.3 Desirable Features
Scalability • Error Resilience
55.4 Standards
H.261 • MPEG-1 • MPEG-2 • H.263 • MPEG-4
Acknowledgment
References

The image and video processing literature is rich with video compression algorithms. This chapter overviews the basic blocks of most video compression systems, discusses some important features required by many applications, e.g., scalability and error resilience, and reviews the existing video compression standards such as H.261, H.263, MPEG-1, MPEG-2, and MPEG-4.

55.1 Introduction

Video sources produce data at very high bit rates. In many applications, the available bandwidth is usually very limited. For example, the bit rate produced by a 30 frame/s color common intermediate format (CIF) (352 × 288) video source is 73 Mbits/s. In order to transmit such a sequence over a 64 Kbits/s channel (e.g., ISDN line), we need to compress the video sequence by a factor of 1140. A simple approach is to subsample the sequence in time and space. For example, if we subsample both chroma components by 2 in each dimension, i.e., 4:2:0 format, and the whole sequence temporally by 4, the bit rate becomes 9.1 Mbits/s. However, to transmit the video over a 64 Kbits/s channel, it is necessary to compress the subsampled sequence by another factor of 143. To achieve such high compression ratios, we must tolerate some distortion in the subsampled frames.
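The figures quoted in the introduction can be checked with a few lines of arithmetic. The sketch below assumes 8 bits per sample and 1 Kbit = 1000 bits, neither of which is stated explicitly in the text; the function name and its parameters are illustrative only.

```python
def raw_bit_rate(width, height, fps, chroma_fraction, bits_per_sample=8):
    """Uncompressed bit rate in bits/s.

    chroma_fraction is the sample count of each chroma plane relative to
    the luma plane (1.0 for full-resolution color, 0.25 for 4:2:0).
    bits_per_sample=8 is an assumption, not a value given in the chapter.
    """
    samples_per_frame = width * height * (1 + 2 * chroma_fraction)
    return samples_per_frame * bits_per_sample * fps


CHANNEL = 64_000  # 64 Kbits/s channel, assuming 1 Kbit = 1000 bits

# Full-resolution color CIF at 30 frames/s.
full = raw_bit_rate(352, 288, 30, chroma_fraction=1.0)

# 4:2:0 chroma subsampling plus temporal subsampling by 4 (7.5 frames/s).
sub = raw_bit_rate(352, 288, 30 / 4, chroma_fraction=0.25)

print(f"full: {full / 1e6:.1f} Mbits/s, ratio to channel {full / CHANNEL:.0f}")
print(f"subsampled: {sub / 1e6:.1f} Mbits/s, ratio to channel {sub / CHANNEL:.0f}")
```

Under these assumptions the two rates come out at roughly 73 and 9.1 Mbits/s, and the ratios to the channel rate at roughly 1140 and 143, in line with the numbers above.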
Compression can be either lossless (reversible) or lossy (irreversible). A compression algorithm is lossless if the signal can be exactly reconstructed from the compressed information; otherwise it is lossy. The compression performance of any lossy algorithm is usually described in terms of its rate-distortion curve, which represents the potential trade-off between the bit rate and the distortion associated with the lossy representation. The primary goal of any lossy compression algorithm is to optimize the rate-distortion curve over some range of rates or levels of distortion. For video applications, rate is usually expressed in terms of bits per second. The distortion is usually expressed in terms of the peak-signal-to-noise ratio (PSNR) per frame or, in some cases, measures that try to quantify the subjective nature of the distortion.

© 1999 by CRC Press LLC

In addition to good compression performance, many other properties may be important or even critical to the applicability of a given compression algorithm. Such properties include robustness to errors in the compressed bit stream, low complexity encoders and decoders, low latency requirements, and scalability. Developing scalable video compression algorithms has attracted considerable attention in recent years. Generally speaking, scalability refers to the potential to effectively decompress subsets of the compressed bit stream in order to satisfy some practical constraint, e.g., display resolution, decoder computational complexity, and bit rate limitations.

The demand for compatible video encoders and decoders has resulted in the development of different video compression standards. The International Organization for Standardization (ISO) has developed MPEG-1 to store video on compact discs, MPEG-2 for digital television, and MPEG-4 for a wide range of applications including multimedia. The International Telecommunication Union (ITU) has developed H.261 for video conferencing and H.263 for video telephony. All existing video compression standards are hybrid systems.
That is, the compression is achieved in two main stages. The first stage, motion compensation and estimation, predicts each frame from its neighboring frames, compresses the prediction parameters, and produces the prediction error frame. The second stage codes the prediction error. All existing standards use a block-based discrete cosine transform (DCT) to code the residual error. In addition to the DCT, other, non-block-based coders, e.g., wavelets and matching pursuit, can be used.

In this chapter, we will provide an overview of hybrid video coding systems. In Section 55.2, we discuss the main parts of a hybrid video coder. This includes motion compensation, signal decompositions and transformations, quantization, and entropy coding. We compare various transformations such as DCT, subband, and matching pursuit. In Section 55.3, we discuss scalability and error resilience in video compression systems. We also describe a non-hybrid video coder that provides scalable bit-streams [28]. Finally, in Section 55.4, we review the key video compression standards: H.261, H.263, MPEG-1, MPEG-2, and MPEG-4.

55.2 Motion Compensated Video Coding

Virtually all video compression systems identify and reduce four basic types of video data redundancy: inter-frame (temporal) redundancy, interpixel redundancy, psychovisual redundancy, and coding redundancy. Figure 55.1 shows a typical diagram of a hybrid video compression system. First, the current frame is predicted from previously decoded frames by estimating the motion of blocks or objects, thus reducing the inter-frame redundancy. Afterwards, to reduce the interpixel redundancy, the residual error after frame prediction is transformed to another format or domain such that the energy of the new signal is concentrated in a few components and these components are as uncorrelated as possible. The transformed signal is then quantized according to the desired compression performance (subjective or objective).
The quantized transform coefficients are then mapped to codewords that reduce the coding redundancy. The rest of this section will discuss the blocks of the hybrid system in more detail.

55.2.1 Motion Estimation and Compensation

Neighboring frames in typical video sequences are highly correlated. This inter-frame (temporal) redundancy can be significantly reduced to produce a more compressible sequence by predicting each frame from its neighbors. Motion compensation is a nonlinear predictive technique in which the feedback loop contains both the inverse transformation and the inverse quantization blocks, as shown in Fig. 55.1.

FIGURE 55.1: Motion compensated coding of video.

Most motion compensation techniques divide the frame into regions, e.g., blocks. Each region is then predicted from the neighboring frames. The displacement of the block or region, $\mathbf{d}$, is not fixed and must be encoded as side information in the bit stream. In some cases, different prediction models are used to predict regions, e.g., affine transformations. These prediction parameters should also be encoded in the bit stream.

To minimize the amount of side information, which must be included in the bit stream, and to simplify the encoding process, motion estimation is usually block based. That is, every pixel $\mathbf{i}$ in a given rectangular block is assigned the same motion vector, $\mathbf{d}$. Block-based motion estimation is an integral part of all existing video compression standards.

55.2.2 Transformations

Most image and video compression schemes apply a transformation to the raw pixels or to the residual error resulting from motion compensation before quantizing and coding the resulting coefficients. The function of the transformation is to represent the signal in a few uncorrelated components.
The most common transformations are linear transformations, i.e., the multi-dimensional sequence of input pixel values, $f[\mathbf{i}]$, is represented in terms of the transform coefficients, $t[\mathbf{k}]$, via

$$f[\mathbf{i}] = \sum_{\mathbf{k}} t[\mathbf{k}]\, w_{\mathbf{k}}[\mathbf{i}] \tag{55.1}$$

for some $w_{\mathbf{k}}[\mathbf{i}]$. The input image is thus represented as a linear combination of basis vectors, $w_{\mathbf{k}}$. It is important to note that the basis vectors need not be orthogonal. They only need to form an over-complete set (matching pursuits), a complete set (DCT and some subband decompositions), or very close to complete (some subband decompositions). This is important since the coder should be able to code a variety of signals. The remainder of the section discusses and compares DCT, subband decompositions, and matching pursuits.

The DCT

There are two properties desirable in a unitary transform for image compression: the energy should be packed into a few transform coefficients, and the coefficients should be as uncorrelated as possible. The optimum transform under these two constraints is the Karhunen-Loève transform (KLT), where the eigenvectors of the covariance matrix of the image are the vectors of the transform [10]. Although the KLT is optimal under these two constraints, it is data-dependent and is expensive to compute. The discrete cosine transform (DCT) performs very close to the KLT, especially when the input is a first order Markov process [10].

The DCT is a block-based transform. That is, the signal is divided into blocks, which are independently transformed using orthonormal discrete cosines. The DCT coefficients of a one-dimensional signal, $f$, are computed via

$$t_{\mathrm{DCT}}[Nb+k] = \frac{1}{\sqrt{N}} \begin{cases} \displaystyle\sum_{i=0}^{N-1} f[Nb+i], & k = 0 \\[2ex] \displaystyle\sum_{i=0}^{N-1} \sqrt{2}\, f[Nb+i] \cos\frac{(2i+1)k\pi}{2N}, & 1 \le k < N \end{cases} \qquad \forall b \tag{55.2}$$

where $N$ is the size of the block and $b$ denotes the block number. The orthonormal basis vectors associated with the one-dimensional DCT transformation of Eq.
(55.2) are

$$w^{\mathrm{DCT}}_{k}[i] = \frac{1}{\sqrt{N}} \begin{cases} 1, & k = 0,\ 0 \le i < N \\[1ex] \sqrt{2} \cos\dfrac{(2i+1)k\pi}{2N}, & 1 \le k < N,\ 0 \le i < N \end{cases} \tag{55.3}$$

Figure 55.2(a) shows these basis vectors for $N = 8$.

FIGURE 55.2: DCT basis vectors (N = 8): (a) one-dimensional and (b) separable two-dimensional.

The one-dimensional DCT described above is usually separably extended to two dimensions for image compression applications. In this case, the two-dimensional basis vectors are formed by the tensor product of one-dimensional DCT basis vectors and are given by

$$w^{\mathrm{DCT}}_{\mathbf{k}}[\mathbf{i}] = w^{\mathrm{DCT}}_{k_1,k_2}[i_1,i_2] = w^{\mathrm{DCT}}_{k_1}[i_1] \cdot w^{\mathrm{DCT}}_{k_2}[i_2]; \qquad 0 \le k_1, k_2, i_1, i_2 < N$$

Figure 55.2(b) shows the two-dimensional basis vectors for $N = 8$.

The DCT is the most common transform in video compression. It is used in the JPEG still image compression standard and in all existing video compression standards. This is because it performs reasonably well at different bit rates. Moreover, there are fast algorithms and special hardware chips to compute the DCT efficiently.

The major objection to the DCT in image or video compression applications is that the non-overlapping blocks of basis vectors, $w_{\mathbf{k}}$, are responsible for distinctly “blocky” artifacts in the decompressed frames, especially at low bit rates. This is due to the quantization of the transform coefficients of a block independently of neighboring blocks. Overlapped DCT representation addresses this problem [15]; however, the common solution is to post-process the frame by smoothing the block boundaries [18, 22].

Due to bit rate restrictions, some blocks are only represented by one or a small number of coarsely quantized transform coefficients, hence the decompressed block will only consist of these basis vectors. This will cause artifacts commonly known as ringing and mosquito noise.

Figure 55.8(b) shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using a DCT hybrid video coder.
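The basis of Eq. (55.3), the block transform of Eq. (55.2), and the separable two-dimensional extension can be sketched in a few lines of NumPy. This is a direct matrix-based illustration rather than a fast algorithm, and the function names are hypothetical; it assumes the signal length is a multiple of N and that two-dimensional blocks are N × N.

```python
import numpy as np


def dct_basis(N):
    """N x N matrix whose row k is the basis vector w_k[i] of Eq. (55.3)."""
    i = np.arange(N)
    k = np.arange(N).reshape(-1, 1)
    W = np.sqrt(2.0 / N) * np.cos((2 * i + 1) * k * np.pi / (2 * N))
    W[0, :] = 1.0 / np.sqrt(N)  # the k = 0 row is constant
    return W


def block_dct(f, N):
    """Eq. (55.2): each length-N block of f is transformed independently."""
    blocks = np.asarray(f, dtype=float).reshape(-1, N)
    return (blocks @ dct_basis(N).T).reshape(-1)


def block_idct(t, N):
    """Inverse of block_dct, i.e. Eq. (55.1) applied block by block."""
    blocks = np.asarray(t, dtype=float).reshape(-1, N)
    return (blocks @ dct_basis(N)).reshape(-1)


def dct2(block):
    """Separable 2-D transform of a square block (rows, then columns)."""
    W = dct_basis(block.shape[0])
    return W @ block @ W.T


W = dct_basis(8)
signal = np.arange(16, dtype=float)
coeffs = block_dct(signal, 8)
flat = dct2(np.ones((8, 8)))  # a constant block has only a DC coefficient

print(np.allclose(W @ W.T, np.eye(8)))            # basis is orthonormal
print(np.allclose(block_idct(coeffs, 8), signal))  # transform is invertible
```

Because the basis is orthonormal, the inverse is just the transpose, which is why `block_idct` reuses the same matrix.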
The frame in Fig. 55.8(b), which is coded using H.263 [3], an ITU standard, provides a good illustration of the “blocking” artifacts.

Subband Decomposition

The basic idea of subband decomposition is to split the frequency spectrum of the image into (disjoint) subbands. This is efficient when the image spectrum is not flat and is concentrated in a few subbands, which is usually the case. Moreover, we can quantize the subbands differently according to their visual importance.

As for the DCT, we begin our discussion of subband decomposition by considering only a one-dimensional source sequence, $f[i]$. Figure 55.3 provides a general illustration of an N-band one-dimensional subband system. We refer to the subband decomposition itself as analysis and to the inverse transformation as synthesis.

FIGURE 55.3: 1D, N-band subband analysis and synthesis block diagrams. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The transformation coefficients of bands $1, 2, \ldots, N$ are denoted by the sequences $u_1[k], u_2[k], \ldots, u_N[k]$, respectively. For notational convenience and consistency with the DCT formulation above, we write $t_{\mathrm{SB}}[\cdot]$ for the sequence of all subband coefficients, arranged according to $t_{\mathrm{SB}}[(\beta-1)+Nk] = u_\beta[k]$, where $1 \le \beta \le N$ is the subband number. These coefficients are generated by filtering the input sequence with filters $H_1, \ldots, H_N$ and downsampling the filtered sequences by a factor of $N$, as depicted in Fig. 55.3. In subband synthesis, the coefficients for each band are upsampled, interpolated with the synthesis filters, $G_1, \ldots, G_N$, and the results summed to form a reconstructed sequence, $\tilde{f}[i]$, as depicted in Fig. 55.3.
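The analysis and synthesis steps just described can be tried out on the two-band filters whose taps are listed in the caption of Fig. 55.5. The sketch below is illustrative only and rests on several assumptions that the text does not spell out: analysis is taken to be convolution followed by keeping the even-indexed outputs, the signal is extended periodically at its boundaries, and its length is even. The function names are hypothetical.

```python
import numpy as np

S = np.sqrt(2.0)
# Filter taps as {index: value}, copied from the caption of Fig. 55.5.
H1 = {n: S * c for n, c in zip(range(-2, 3), (-1/8, 1/4, 3/4, 1/4, -1/8))}
H2 = {n: S * c for n, c in zip(range(-2, 1), (-1/4, 1/2, -1/4))}
G1 = {n: S * c for n, c in zip(range(-1, 2), (1/4, 1/2, 1/4))}
G2 = {n: S * c for n, c in zip(range(-1, 4), (-1/8, -1/4, 3/4, -1/4, -1/8))}


def analyze(f, h, N=2):
    """Circularly convolve f with h, then keep every N-th output sample."""
    L = len(f)
    return np.array([sum(c * f[(N * k - n) % L] for n, c in h.items())
                     for k in range(L // N)])


def synthesize(u, g, L, N=2):
    """Upsample u by N and filter with g: adds u[k] * g[i - N*k] at sample i."""
    out = np.zeros(L)
    for k, coeff in enumerate(u):
        for n, c in g.items():
            out[(N * k + n) % L] += coeff * c
    return out


rng = np.random.default_rng(0)
f = rng.standard_normal(16)

u1, u2 = analyze(f, H1), analyze(f, H2)       # low-pass and high-pass bands
f_rec = synthesize(u1, G1, len(f)) + synthesize(u2, G2, len(f))

print("max reconstruction error:", np.max(np.abs(f - f_rec)))
```

With these conventions the synthesis stage recovers the input up to floating-point rounding, while each band holds only half as many coefficients as the input has samples.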
If the reconstructed sequence, $\tilde{f}[i]$, and the source sequence, $f[i]$, are identical, then the subband system is referred to as perfect reconstruction (PR) and the corresponding basis set is a complete basis set. Although perfect reconstruction is a desirable property, near perfect reconstruction (NPR), for which subband synthesis is only approximately the inverse of subband analysis, is often sufficient in practice. This is because distortion introduced by quantization of the subband coefficients, $t_{\mathrm{SB}}[k]$, usually dwarfs that introduced by an imperfect synthesis system.

The filters, $H_1, \ldots, H_N$, are usually designed to have band-pass frequency responses, as indicated in Fig. 55.4, so that the coefficients $u_\beta[k]$ for each subband, $1 \le \beta \le N$, represent different spectral components of the source sequence.

FIGURE 55.4: Typical analysis filter magnitude responses. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The basis vectors for subband decomposition are the N-translates of the impulse responses, $g_1[i], \ldots, g_N[i]$, of synthesis filters $G_1, \ldots, G_N$. Specifically, denoting the $k$th basis vector associated with subband $\beta$ by $w^{\mathrm{SB}}_{Nk+\beta-1}$, we have

$$w^{\mathrm{SB}}_{Nk+\beta-1}[i] = g_\beta[i - Nk] \tag{55.4}$$

Figure 55.5 illustrates five of the basis vectors for a particularly simple, yet useful, two-band PR subband decomposition, with symmetric FIR analysis and synthesis impulse responses. As shown in Fig. 55.5 and in contrast with the DCT basis vectors, the subband basis vectors overlap.

As for the DCT, one-dimensional subband decompositions may be separably extended to higher dimensions. By this we mean that a one-dimensional subband decomposition is first applied along one dimension of an image or video sequence.
Any or all of the resulting subbands are then further decomposed into subbands along another dimension and so on. Figure 55.6 depicts a separable two-dimensional subband system. For video compression applications, the prediction error is sometimes decomposed into subbands of equal size.

Two-dimensional subband decompositions have the advantage that they do not suffer from the disturbing blocking artifacts exhibited by the DCT at high compression ratios. Instead, the most noticeable quantization-induced distortion tends to be “ringing” or “rippling” artifacts, which become most bothersome in the vicinity of image edges. Figures 55.11(c) and 55.8(c) clearly show this effect.

Figure 55.11 shows frame 210 of the Ping-pong sequence compressed using a scalable, three-dimensional subband coder [28] at 1.5 Mbits/s, 300 Kbits/s, and 60 Kbits/s. As the bit rate decreases, we notice loss of detail and introduction of more ringing noise. Figure 55.8(c) shows frame 250 of the Coast-guard sequence compressed at 112 Kbits/s using a zerotree scalable coder [16]. The edges of the trees and the boat are affected by ringing noise.

FIGURE 55.5: Subband basis vectors with $N = 2$, $h_1[-2 \ldots 2] = \sqrt{2} \cdot (-\tfrac{1}{8}, \tfrac{1}{4}, \tfrac{3}{4}, \tfrac{1}{4}, -\tfrac{1}{8})$, $h_2[-2 \ldots 0] = \sqrt{2} \cdot (-\tfrac{1}{4}, \tfrac{1}{2}, -\tfrac{1}{4})$, $g_1[-1 \ldots 1] = \sqrt{2} \cdot (\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4})$, and $g_2[-1 \ldots 3] = \sqrt{2} \cdot (-\tfrac{1}{8}, -\tfrac{1}{4}, \tfrac{3}{4}, -\tfrac{1}{4}, -\tfrac{1}{8})$. $h_i$ and $g_i$ are the impulse responses of the $H_i$ (analysis) and $G_i$ (synthesis) filters, respectively. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

Matching Pursuit

Representing a signal using an over-complete basis set implies that there is more than one representation for the signal.
For coding purposes, we are interested in representing the signal with the fewest basis vectors. This is an NP-complete problem [14]. Different approaches have been investigated to find or approximate the solution. Matching pursuits is a multistage algorithm, which in each stage finds the basis vector that minimizes the mean-squared-error [14].

Suppose we want to represent a signal $f[i]$ using basis vectors from an over-complete dictionary (basis set) $\mathcal{G}$. Individual dictionary vectors can be denoted as:

$$w_\gamma[i] \in \mathcal{G}. \tag{55.5}$$

Here $\gamma$ is an indexing parameter associated with a particular dictionary element. The decomposition begins by choosing $\gamma$ to maximize the absolute value of the following inner product:

$$t = \langle f[i], w_\gamma[i] \rangle, \tag{55.6}$$

where $t$ is the transform (expansion) coefficient. A residual signal is computed as:

$$R[i] = f[i] - t\, w_\gamma[i]. \tag{55.7}$$

This residual signal is then expanded in the same way as the original signal. The procedure continues iteratively until either a set number of expansion coefficients are generated or some energy threshold for the residual is reached. Each stage $k$ yields a dictionary structure specified by $\gamma_k$, an expansion coefficient $t[k]$, and a residual $R_k$, which is passed on to the next stage. After a total of $M$ stages, the signal can be approximated by a linear function of the dictionary elements:

$$\hat{f}[i] = \sum_{k=1}^{M} t[k]\, w_{\gamma_k}[i]. \tag{55.8}$$

FIGURE 55.6: Separable spatial subband pyramid. Two level analysis system configuration and subband passbands shown. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The above technique has useful signal representation properties.
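The iteration of Eqs. (55.6) to (55.8) can be written down compactly. The following is a minimal sketch, not the coder of [20]: it assumes every dictionary vector has unit norm (so that the inner product is the correct expansion coefficient), it stops after a fixed number of stages, and the small dictionary of impulses plus two extra vectors is a toy choice made only so that the set is over-complete.

```python
import numpy as np


def matching_pursuit(f, dictionary, num_stages):
    """Greedy expansion of f over the rows of `dictionary` (assumed unit norm).

    Returns the chosen indices gamma_k, the coefficients t[k], the
    approximation of Eq. (55.8), and the final residual.
    """
    residual = np.asarray(f, dtype=float).copy()
    approx = np.zeros_like(residual)
    atoms, coeffs = [], []
    for _ in range(num_stages):
        products = dictionary @ residual           # Eq. (55.6) for every gamma
        gamma = int(np.argmax(np.abs(products)))   # largest |inner product|
        t = products[gamma]
        residual = residual - t * dictionary[gamma]  # Eq. (55.7)
        approx = approx + t * dictionary[gamma]      # running sum of Eq. (55.8)
        atoms.append(gamma)
        coeffs.append(float(t))
    return atoms, coeffs, approx, residual


# Toy over-complete dictionary in R^8: 8 impulses plus 2 extra unit vectors.
n = 8
impulses = np.eye(n)
constant = np.ones((1, n)) / np.sqrt(n)
alternating = ((-1.0) ** np.arange(n)).reshape(1, n) / np.sqrt(n)
D = np.vstack([impulses, constant, alternating])

f = 2.0 * D[8] + 1.0 * D[3]  # a constant offset plus one impulse
atoms, coeffs, approx, residual = matching_pursuit(f, D, num_stages=4)

print("chosen dictionary indices:", atoms)
print("residual energy:", np.sum(residual ** 2))
```

With unit-norm dictionary vectors, each stage lowers the residual energy by the square of the chosen coefficient, so the residual energy never increases from one stage to the next.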
One such property is that the dictionary element chosen at each stage is the element that provides the greatest reduction in mean square error between the true signal $f[i]$ and the coded signal $\hat{f}[i]$. In this sense, the signal structures are coded in order of importance, which is desirable in situations where the bit budget is limited. For image and video coding applications, this means that the most visible features tend to be coded first. Weaker image features are coded later, if at all. It is even possible to control which types of image features are coded well by choosing dictionary functions to match the shape, scale, or frequency of the desired features.

An interesting feature of the matching pursuit technique is that it places very few restrictions on the dictionary set. The original Mallat and Zhang paper considers both Gabor and wave-packet function dictionaries, but such structure is not required by the algorithm itself [14]. Mallat and Zhang showed that if the dictionary set is at least complete, then $\hat{f}[i]$ will eventually converge to $f[i]$, though the rate of convergence is not guaranteed [14]. Convergence speed and thus coding efficiency are strongly related to the choice of dictionary set. However, true dictionary optimization can be difficult because there are so few restrictions. Any collection of arbitrarily sized and shaped functions can be used with matching pursuits, as long as completeness is satisfied.

Bergeaud and Mallat used the matching pursuit technique to represent and process images [1]. Neff and Zakhor have used the matching pursuit technique to code the motion prediction error signal [20]. Their coder divides each motion residual into blocks and measures the energy of each block. The center of the block with the largest energy value is adopted as an initial estimate for the inner product search. A dictionary of Gabor basis vectors, shown in Fig. 55.7, is then exhaustively matched to an S × S window around the initial estimate.
The exhaustive search can be thought of as follows. Each N × N dictionary structure is centered at each location in the search window, and the inner product between the structure and the corresponding N × N region of image data is computed. The largest inner product is then quantized. The location, basis vector index, and quantized inner product are then coded together.

Video sequences coded using matching pursuit do not suffer from either blocking or ringing artifacts, because the basis vectors are only coded when they are well-matched to the residual signal. As bit rate decreases, the distortion introduced by matching pursuit coding takes the form of a gradually increasing blurriness (or loss of detail). Since matching pursuits involves exhaustive search, it is more complex than DCT approaches, especially at high bit rates.

FIGURE 55.7: Separable two-dimensional 20 × 20 Gabor dictionary.

Figure 55.8(d) shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using the matching pursuit video coder described by Neff and Zakhor [20]. This frame does not suffer from the blocky artifacts, which affect the DCT coders as shown in Fig. 55.8(b). Moreover, it does not suffer from the ringing noise, which affects the subband coders as shown in Figs. 55.8(c) and 55.11(c).

55.2.3 Discussion

Figure 55.8 shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using DCT, subband, and matching pursuit coders. The DCT coded frame suffers from blocking artifacts. The subband coded frame suffers from ringing artifacts.

Figure 55.9 compares the PSNR performance of the matching pursuit coder [20] to a DCT (H.263) coder [3] and a zerotree subband coder [16] when coding the Coast-guard sequence at 112 Kbits/s. The matching pursuit coder [20] in this example has consistently higher PSNR than the H.263 [3] and the zerotree subband [16] coders.
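The chapter does not spell out how PSNR is computed. The sketch below assumes the common definition for 8-bit frames, $10 \log_{10}(255^2 / \mathrm{MSE})$, applied to one frame at a time; the peak value of 255 and the function name are assumptions, not details taken from the source.

```python
import numpy as np


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape.

    peak=255 assumes 8-bit samples; this is an assumption, not a value
    stated in the chapter.
    """
    ref = np.asarray(reference, dtype=float)
    dec = np.asarray(decoded, dtype=float)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)


# A QCIF-sized luminance frame and a copy with an error of one gray level
# at every pixel (MSE = 1).
frame = np.full((144, 176), 128.0)
off_by_one = frame + 1.0

print(f"{psnr(frame, off_by_one):.2f} dB")
```

An average PSNR for a sequence, as reported per sequence in comparisons of this kind, would then be the mean of the per-frame values, again under the assumption that this is the definition used.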
Table 55.1 shows the average luminance PSNRs for different sequences at different bit rates. In all examples mentioned in Table 55.1, the matching pursuit coder has higher average PSNR than the DCT coder. The subband coder has the lowest average PSNR.

TABLE 55.1 The Average Luminance PSNR of Different Sequences at Different Bit Rates When Coding Using a DCT Coder (H.263) [3], Zero-Tree Subband Coder (ZTS) [16], and Matching Pursuit Coder (MP) [20]

                           Bit rate   Frame rate   PSNR (dB)
Sequence          Format   (bits/s)   (frames/s)   DCT     ZTS     MP
Container-ship    QCIF     10 K        7.5         29.43   28.01   31.10
Hall-Monitor      QCIF     10 K        7.5         30.04   28.44   31.27
Mother-Daughter   QCIF     10 K        7.5         32.50   31.07   32.78
Container-ship    QCIF     24 K       10.0         32.77   30.44   34.26
Silent-Voice      QCIF     24 K       10.0         30.89   29.41   31.71
Mother-Daughter   QCIF     24 K       10.0         35.17   33.77   35.55
Coast-Guard       QCIF     48 K       10.0         29.00   27.65   29.82
News              CIF      48 K        7.5         30.95   29.97   31.96

[...]

... Image Processing, 3, 767–770, 1996.
[7] Committee Draft of Standard ISO11172, Coding of Moving Pictures and Associated Audio, ISO/MPEG 90/176, Dec. 1990.
[8] Gray, R., Vector quantization, IEEE Acoustics, Speech, and Signal Processing Magazine, 4–29, April 1984.
[9] Huffman, D., A method for the construction of minimal redundancy codes, Proc. IRE, 1098–1101, Sept. 1952.
[10] Jain, A.K., Fundamentals of Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[11] Jayant, N. and Noll, P., Digital Coding of Waveforms, Prentice-Hall, Englewood Cliffs, NJ, 1984.
[12] Linde, Y., Buzo, A. and Gray, R.M., An algorithm for vector quantizer design, IEEE Trans. ...
... Atlantic City, NJ, September 10-13, 1957), IT-28(2), 129–137, Mar. 1982.
[14] Mallat, S. and Zhang, Z., Matching pursuits with time-frequency dictionaries, IEEE Trans. Signal Processing, 41(12), 3397–3415, Dec. 1993.
[15] Malvar, H.S., Signal Processing with Lapped Transforms, Artech House, 1992.
[16] Martucci, S.A., Sodagar, I., Chiang, T. and Zhang, Y.-Q., A zerotree wavelet coder, IEEE Trans. Circuits and ...
... 1992.
[23] Ruf, M.J. and Modestino, J.W., Rate-distortion performance for joint source channel coding of images, Proc. IEEE Intl. Conf. on Image Processing, 2, 77–80, 1995.
[24] Shapiro, J.M., Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. Signal Processing, 41(12), 3445–3462, Dec. 1993.
[25] Sikora, T., The MPEG-4 video standard verification model, IEEE Trans. Circuits and Systems for ...
... 158–162, Feb.-April 1995.
[27] Tan, W., Chang, E. and Zakhor, A., Real time software implementation of scalable video codec, IEEE Intl. Conf. on Image Processing, 1, 17–20, 1996.
[28] Taubman, D. and Zakhor, A., Multirate 3-D subband coding of video, IEEE Trans. Image Processing, 3(5), 572–588, Sept. 1994.
[29] Taubman, D. and Zakhor, A., A common framework for rate and distortion based scaling of highly scalable ...
... application to motion compensated video coding, Proc. IEEE Intl. Conf. on Image Processing, 1, 725–729, Nov. 1994.
[31] Woods, J., Ed., Subband Image Coding, Kluwer Academic Publishers, 1991.
[32] Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, ...

[...]

... and vertical directions), 4:2:2 (half as many samples in the horizontal direction only), or 4:4:4 (full chrominance size) formats. MPEG-2 supports scalability by offering four tools: data partitioning, signal-to-noise-ratio (SNR) scalability, spatial scalability, and temporal scalability. Data partitioning can be used when two channels are available. The bit-stream is partitioned into two streams according ...

... be noticed on the DCT coded frame. Ringing artifacts can be noticed on the subband coded frame.

55.2.4 Quantization

Motion compensation and residual error decomposition reduce the redundancy in the video signal. However, to achieve low bit rates, we must tolerate some distortion in the video sequence. This is because we need to map the residual and motion information to a smaller collection of codewords to ...

... (b) 300 Kbits/s, and (c) 60 Kbits/s [28]. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.) Real time software only implementation of scalable video codec has also received a great ...

... also plotted as connected dots for reference. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.) In order to achieve such features when using variable length codes, the bit-stream is usually ...

Date posted: 23/10/2013, 16:15
