
Osama Al-Shaykh, et al. “Video Sequence Compression.” 2000 CRC Press LLC. <http://www.engnetbase.com>.

Video Sequence Compression

Osama Al-Shaykh, University of California, Berkeley
Ralph Neff, University of California, Berkeley
David Taubman, Hewlett Packard
Avideh Zakhor, University of California, Berkeley

55.1 Introduction
55.2 Motion Compensated Video Coding
     Motion Estimation and Compensation • Transformations • Discussion • Quantization • Coding of Quantized Symbols
55.3 Desirable Features
     Scalability • Error Resilience
55.4 Standards
     H.261 • MPEG-1 • MPEG-2 • H.263 • MPEG-4
Acknowledgment
References

The image and video processing literature is rich with video compression algorithms. This chapter overviews the basic blocks of most video compression systems, discusses some important features required by many applications, e.g., scalability and error resilience, and reviews the existing video compression standards such as H.261, H.263, MPEG-1, MPEG-2, and MPEG-4.

55.1 Introduction

Video sources produce data at very high bit rates. In many applications, the available bandwidth is usually very limited. For example, the bit rate produced by a 30 frame/s color common intermediate format (CIF) (352 × 288) video source is 73 Mbits/s. In order to transmit such a sequence over a 64 Kbits/s channel (e.g., ISDN line), we need to compress the video sequence by a factor of 1140. A simple approach is to subsample the sequence in time and space. For example, if we subsample both chroma components by 2 in each dimension, i.e., 4:2:0 format, and the whole sequence temporally by 4, the bit rate becomes 9.1 Mbits/s. However, to transmit the video over a 64 Kbits/s channel, it is necessary to compress the subsampled sequence by another factor of 143. To achieve such high compression ratios, we must tolerate some distortion in the subsampled frames.
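The bit-rate figures in the introduction can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `raw_bit_rate` and its parameters are hypothetical, and it assumes 8 bits per sample and 1 Kbit = 1000 bits, neither of which the chapter states explicitly (both are consistent with the quoted numbers).

```python
# Raw (uncompressed) bit rate of a colour video source.
# Assumptions (not stated in the chapter): 8 bits per sample, 1 Kbit = 1000 bits.

def raw_bit_rate(width, height, frames_per_second,
                 chroma_fraction=1.0, bits_per_sample=8):
    """Bits per second for one luma plane plus two chroma planes.

    chroma_fraction is the size of each chroma plane relative to the
    luma plane: 1.0 for full-resolution chroma, 0.25 for 4:2:0.
    """
    luma_samples = width * height
    chroma_samples = 2 * luma_samples * chroma_fraction
    return (luma_samples + chroma_samples) * bits_per_sample * frames_per_second


CHANNEL = 64_000  # 64 Kbits/s channel, e.g. an ISDN line

# CIF at 30 frames/s with full-resolution chroma.
full = raw_bit_rate(352, 288, 30)
# Same source in 4:2:0 format, temporally subsampled by 4 (7.5 frames/s).
reduced = raw_bit_rate(352, 288, 30 / 4, chroma_fraction=0.25)

print(f"full rate:    {full / 1e6:.1f} Mbits/s, ratio {full / CHANNEL:.0f}")
print(f"reduced rate: {reduced / 1e6:.1f} Mbits/s, ratio {reduced / CHANNEL:.0f}")
```

Under these assumptions the script reproduces the values quoted above: about 73 Mbits/s and a factor of 1140 for the full source, then about 9.1 Mbits/s and a further factor of 143 after subsampling.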
Compression can be either lossless (reversible) or lossy (irreversible). A compression algorithm is lossless if the signal can be reconstructed exactly from the compressed information; otherwise it is lossy. The compression performance of any lossy algorithm is usually described in terms of its rate-distortion curve, which represents the potential trade-off between the bit rate and the distortion associated with the lossy representation. The primary goal of any lossy compression algorithm is to optimize the rate-distortion curve over some range of rates or levels of distortion. For video applications, rate is usually expressed in terms of bits per second. The distortion is usually expressed in terms of the peak-signal-to-noise ratio (PSNR) per frame or, in some cases, measures that try to quantify the subjective nature of the distortion.

© 1999 by CRC Press LLC

In addition to good compression performance, many other properties may be important or even critical to the applicability of a given compression algorithm. Such properties include robustness to errors in the compressed bit stream, low complexity encoders and decoders, low latency requirements, and scalability. Developing scalable video compression algorithms has attracted considerable attention in recent years. Generally speaking, scalability refers to the potential to effectively decompress subsets of the compressed bit stream in order to satisfy some practical constraint, e.g., display resolution, decoder computational complexity, and bit rate limitations.

The demand for compatible video encoders and decoders has resulted in the development of different video compression standards. The International Organization for Standardization (ISO) has developed MPEG-1 to store video on compact discs, MPEG-2 for digital television, and MPEG-4 for a wide range of applications including multimedia. The International Telecommunication Union (ITU) has developed H.261 for video conferencing and H.263 for video telephony. All existing video compression standards are hybrid systems.
That is, the compression is achieved in two main stages. The first stage, motion estimation and compensation, predicts each frame from its neighboring frames, compresses the prediction parameters, and produces the prediction error frame. The second stage codes the prediction error. All existing standards use the block-based discrete cosine transform (DCT) to code the residual error. In addition to the DCT, other non-block-based coders, e.g., wavelets and matching pursuit, can be used.

In this chapter, we provide an overview of hybrid video coding systems. In Section 55.2, we discuss the main parts of a hybrid video coder. This includes motion compensation, signal decompositions and transformations, quantization, and entropy coding. We compare various transformations such as DCT, subband, and matching pursuit. In Section 55.3, we discuss scalability and error resilience in video compression systems. We also describe a non-hybrid video coder that provides scalable bit-streams [28]. Finally, in Section 55.4, we review the key video compression standards: H.261, H.263, MPEG-1, MPEG-2, and MPEG-4.

55.2 Motion Compensated Video Coding

Virtually all video compression systems identify and reduce four basic types of video data redundancy: inter-frame (temporal) redundancy, interpixel redundancy, psychovisual redundancy, and coding redundancy. Figure 55.1 shows a typical diagram of a hybrid video compression system. First, the current frame is predicted from previously decoded frames by estimating the motion of blocks or objects, thus reducing the inter-frame redundancy. Afterwards, to reduce the interpixel redundancy, the residual error after frame prediction is transformed to another format or domain such that the energy of the new signal is concentrated in a few components and these components are as uncorrelated as possible. The transformed signal is then quantized according to the desired compression performance (subjective or objective).
The quantized transform coefficients are then mapped to codewords that reduce the coding redundancy. The rest of this section discusses the blocks of the hybrid system in more detail.

55.2.1 Motion Estimation and Compensation

Neighboring frames in typical video sequences are highly correlated. This inter-frame (temporal) redundancy can be significantly reduced to produce a more compressible sequence by predicting each frame from its neighbors. Motion compensation is a nonlinear predictive technique in which the feedback loop contains both the inverse transformation and the inverse quantization blocks, as shown in Fig. 55.1.

FIGURE 55.1: Motion compensated coding of video.

Most motion compensation techniques divide the frame into regions, e.g., blocks. Each region is then predicted from the neighboring frames. The displacement of the block or region, d, is not fixed and must be encoded as side information in the bit stream. In some cases, different prediction models are used to predict regions, e.g., affine transformations. These prediction parameters should also be encoded in the bit stream.

To minimize the amount of side information that must be included in the bit stream, and to simplify the encoding process, motion estimation is usually block based. That is, every pixel $\vec{i}$ in a given rectangular block is assigned the same motion vector, d. Block-based motion estimation is an integral part of all existing video compression standards.

55.2.2 Transformations

Most image and video compression schemes apply a transformation to the raw pixels or to the residual error resulting from motion compensation before quantizing and coding the resulting coefficients. The function of the transformation is to represent the signal in a few uncorrelated components.
The most common transformations are linear transformations, i.e., the multi-dimensional sequence of input pixel values, $f[\vec{i}]$, is represented in terms of the transform coefficients, $t[\vec{k}]$, via

$$f[\vec{i}] = \sum_{\vec{k}} t[\vec{k}]\, w_{\vec{k}}[\vec{i}] \qquad (55.1)$$

for some $w_{\vec{k}}[\vec{i}]$. The input image is thus represented as a linear combination of basis vectors, $w_{\vec{k}}$. It is important to note that the basis vectors need not be orthogonal. They only need to form an over-complete set (matching pursuits), a complete set (DCT and some subband decompositions), or very close to complete (some subband decompositions). This is important since the coder should be able to code a variety of signals. The remainder of the section discusses and compares DCT, subband decompositions, and matching pursuits.

The DCT

There are two properties desirable in a unitary transform for image compression: the energy should be packed into a few transform coefficients, and the coefficients should be as uncorrelated as possible. The optimum transform under these two constraints is the Karhunen-Loève transform (KLT), where the eigenvectors of the covariance matrix of the image are the vectors of the transform [10]. Although the KLT is optimal under these two constraints, it is data-dependent and is expensive to compute. The discrete cosine transform (DCT) performs very close to the KLT, especially when the input is a first order Markov process [10].

The DCT is a block-based transform. That is, the signal is divided into blocks, which are independently transformed using orthonormal discrete cosines. The DCT coefficients of a one-dimensional signal, $f$, are computed via

$$t^{\mathrm{DCT}}[Nb + k] = \frac{1}{\sqrt{N}} \begin{cases} \displaystyle\sum_{i=0}^{N-1} f[Nb + i], & k = 0 \\ \displaystyle\sum_{i=0}^{N-1} \sqrt{2}\, f[Nb + i] \cos\frac{(2i+1)k\pi}{2N}, & 1 \le k < N \end{cases} \quad \forall b \qquad (55.2)$$

where $N$ is the size of the block and $b$ denotes the block number. The orthonormal basis vectors associated with the one-dimensional DCT transformation of Eq. (55.2) are

$$w_k^{\mathrm{DCT}}[i] = \frac{1}{\sqrt{N}} \begin{cases} 1, & k = 0,\ 0 \le i < N \\ \sqrt{2}\cos\dfrac{(2i+1)k\pi}{2N}, & 1 \le k < N,\ 0 \le i < N \end{cases} \qquad (55.3)$$

Figure 55.2(a) shows these basis vectors for $N = 8$.

FIGURE 55.2: DCT basis vectors (N = 8): (a) one-dimensional and (b) separable two-dimensional.

The one-dimensional DCT described above is usually separably extended to two dimensions for image compression applications. In this case, the two-dimensional basis vectors are formed by the tensor product of one-dimensional DCT basis vectors and are given by

$$w^{\mathrm{DCT}}_{\vec{k}}[\vec{i}] = w^{\mathrm{DCT}}_{k_1,k_2}[i_1,i_2] \triangleq w^{\mathrm{DCT}}_{k_1}[i_1] \cdot w^{\mathrm{DCT}}_{k_2}[i_2]; \quad 0 \le k_1, k_2, i_1, i_2 < N$$

Figure 55.2(b) shows the two-dimensional basis vectors for $N = 8$.

The DCT is the most common transform in video compression. It is used in the JPEG still image compression standard and in all existing video compression standards. This is because it performs reasonably well at different bit rates. Moreover, there are fast algorithms and special hardware chips to compute the DCT efficiently.

The major objection to the DCT in image or video compression applications is that the non-overlapping blocks of basis vectors, $w_{\vec{k}}$, are responsible for distinctly “blocky” artifacts in the decompressed frames, especially at low bit rates. This is due to the quantization of the transform coefficients of a block independent from neighboring blocks. Overlapped DCT representation addresses this problem [15]; however, the common solution is to post-process the frame by smoothing the block boundaries [18, 22].

Due to bit rate restrictions, some blocks are only represented by one or a small number of coarsely quantized transform coefficients, hence the decompressed block will only consist of these basis vectors. This will cause artifacts commonly known as ringing and mosquito noise.

Figure 55.8(b) shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using a DCT hybrid video coder (see footnote 1). This figure provides a good illustration of the “blocking” artifacts.
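The one-dimensional DCT of Eqs. (55.2) and (55.3) is small enough to write out directly. The following is a minimal sketch, not an optimized implementation: the function names `dct_basis`, `dct_block`, and `idct_block` are hypothetical, a single block (b = 0) is processed, and the transform is evaluated by brute-force inner products rather than a fast algorithm.

```python
import math


def dct_basis(N):
    """Return the N orthonormal basis vectors w_k[i] of Eq. (55.3)."""
    scale = 1.0 / math.sqrt(N)
    basis = []
    for k in range(N):
        if k == 0:
            # DC basis vector: constant 1/sqrt(N).
            basis.append([scale] * N)
        else:
            basis.append([scale * math.sqrt(2.0)
                          * math.cos((2 * i + 1) * k * math.pi / (2 * N))
                          for i in range(N)])
    return basis


def dct_block(block):
    """Forward transform of one block, Eq. (55.2): t[k] = <f, w_k>."""
    N = len(block)
    W = dct_basis(N)
    return [sum(W[k][i] * block[i] for i in range(N)) for k in range(N)]


def idct_block(coeffs):
    """Inverse transform, Eq. (55.1): f[i] = sum_k t[k] w_k[i]."""
    N = len(coeffs)
    W = dct_basis(N)
    return [sum(coeffs[k] * W[k][i] for k in range(N)) for i in range(N)]


if __name__ == "__main__":
    flat = [10.0] * 8                      # a constant block
    print([round(c, 6) for c in dct_block(flat)])
    ramp = [52, 55, 61, 66, 70, 61, 64, 73]
    print([round(v, 6) for v in idct_block(dct_block(ramp))])
```

Because the basis is orthonormal, the inverse is just the weighted sum of basis vectors, and a constant block puts all of its energy into the k = 0 coefficient. This is the energy-packing behavior the text describes, and it is also why a block reduced to one or two coarsely quantized coefficients decodes to a visibly smooth patch bounded by block edges.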
Subband Decomposition

The basic idea of subband decomposition is to split the frequency spectrum of the image into (disjoint) subbands. This is efficient when the image spectrum is not flat and is concentrated in a few subbands, which is usually the case. Moreover, we can quantize the subbands differently according to their visual importance.

As for the DCT, we begin our discussion of subband decomposition by considering only a one-dimensional source sequence, $f[i]$. Figure 55.3 provides a general illustration of an N-band one-dimensional subband system. We refer to the subband decomposition itself as analysis and to the inverse transformation as synthesis.

FIGURE 55.3: 1D, N-band subband analysis and synthesis block diagrams. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The transformation coefficients of bands $1, 2, \ldots, N$ are denoted by the sequences $u_1[k], u_2[k], \ldots, u_N[k]$, respectively. For notational convenience and consistency with the DCT formulation above, we write $t^{\mathrm{SB}}[\cdot]$ for the sequence of all subband coefficients, arranged according to $t^{\mathrm{SB}}[(\beta - 1) + Nk] = u_\beta[k]$, where $1 \le \beta \le N$ is the subband number. These coefficients are generated by filtering the input sequence with filters $H_1, \ldots, H_N$ and downsampling the filtered sequences by a factor of $N$, as depicted in Fig. 55.3. In subband synthesis, the coefficients for each band are upsampled, interpolated with the synthesis filters, $G_1, \ldots, G_N$, and the results summed to form a reconstructed sequence, $\tilde{f}[i]$, as depicted in Fig. 55.3.

Footnote 1: It is coded using H.263 [3], which is an ITU standard.

If the reconstructed sequence, $\tilde{f}[i]$, and the source sequence, $f[i]$, are identical, then the subband system is referred to as perfect reconstruction (PR) and the corresponding basis set is a complete basis set.
Although perfect reconstruction is a desirable property, near perfect reconstruction (NPR), for which subband synthesis is only approximately the inverse of subband analysis, is often sufficient in practice. This is because distortion introduced by quantization of the subband coefficients, $t^{\mathrm{SB}}[k]$, usually dwarfs that introduced by an imperfect synthesis system.

The filters, $H_1, \ldots, H_N$, are usually designed to have band-pass frequency responses, as indicated in Fig. 55.4, so that the coefficients $u_\beta[k]$ for each subband, $1 \le \beta \le N$, represent different spectral components of the source sequence.

FIGURE 55.4: Typical analysis filter magnitude responses. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The basis vectors for subband decomposition are the N-translates of the impulse responses, $g_1[i], \ldots, g_N[i]$, of synthesis filters $G_1, \ldots, G_N$. Specifically, denoting the kth basis vector associated with subband $\beta$ by $w^{\mathrm{SB}}_{Nk+\beta-1}$, we have

$$w^{\mathrm{SB}}_{Nk+\beta-1}[i] = g_\beta[i - Nk] \qquad (55.4)$$

Figure 55.5 illustrates five of the basis vectors for a particularly simple, yet useful, two-band PR subband decomposition, with symmetric FIR analysis and synthesis impulse responses. As shown in Fig. 55.5 and in contrast with the DCT basis vectors, the subband basis vectors overlap.

As for the DCT, one-dimensional subband decompositions may be separably extended to higher dimensions. By this we mean that a one-dimensional subband decomposition is first applied along one dimension of an image or video sequence. Any or all of the resulting subbands are then further decomposed into subbands along another dimension, and so on. Figure 55.6 depicts a separable two-dimensional subband system. For video compression applications, the prediction error is sometimes decomposed into subbands of equal size.
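The analysis and synthesis steps, and the perfect reconstruction property, can be tried out on a short sequence using the two-band filter taps listed in the caption of Figure 55.5. This is a sketch under stated assumptions rather than the authors' implementation: the helper names `analyze` and `synthesize` are hypothetical, the signal is extended periodically (so its length must be even), and the downsampling phase is chosen as u_β[k] = Σ_m h_β[m] f[2k − m], which is the phase for which these taps reconstruct exactly.

```python
import math

R2 = math.sqrt(2.0)

# Filter taps from the Figure 55.5 caption, stored as {index: value}.
H1 = {m: R2 * c for m, c in zip(range(-2, 3), (-1/8, 1/4, 3/4, 1/4, -1/8))}
H2 = {m: R2 * c for m, c in zip(range(-2, 1), (-1/4, 1/2, -1/4))}
G1 = {n: R2 * c for n, c in zip(range(-1, 2), (1/4, 1/2, 1/4))}
G2 = {n: R2 * c for n, c in zip(range(-1, 4), (-1/8, -1/4, 3/4, -1/4, -1/8))}


def analyze(f, h, n_bands=2):
    """Filter f with h and keep every n_bands-th output sample.

    Assumes periodic extension of f (len(f) divisible by n_bands).
    """
    L = len(f)
    return [sum(c * f[(n_bands * k - m) % L] for m, c in h.items())
            for k in range(L // n_bands)]


def synthesize(bands, filters, length, n_bands=2):
    """Sum of translated synthesis responses, as in Eq. (55.4):
    each coefficient u[k] scales the basis vector g[i - n_bands*k]."""
    out = [0.0] * length
    for u, g in zip(bands, filters):
        for k, coeff in enumerate(u):
            for n, c in g.items():
                out[(n_bands * k + n) % length] += coeff * c
    return out


if __name__ == "__main__":
    f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
    low, high = analyze(f, H1), analyze(f, H2)
    rebuilt = synthesize([low, high], [G1, G2], len(f))
    print(max(abs(a - b) for a, b in zip(f, rebuilt)))  # reconstruction error
```

The synthesis loop makes the overlap of the subband basis vectors explicit: neighboring coefficients write into shared output samples, unlike the DCT, where each block is reconstructed independently.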
Two-dimensional subband decompositions have the advantage that they do not suffer from the disturbing blocking artifacts exhibited by the DCT at high compression ratios. Instead, the most noticeable quantization-induced distortion tends to be ‘ringing’ or ‘rippling’ artifacts, which become most bothersome in the vicinity of image edges. Figures 55.11(c) and 55.8(c) clearly show this effect. Figure 55.11 shows frame 210 of the Ping-pong sequence compressed using a scalable, three-dimensional subband coder [28] at 1.5 Mbits/s, 300 Kbits/s, and 60 Kbits/s. As the bit rate decreases, we notice loss of detail and introduction of more ringing noise. Figure 55.8(c) shows frame 250 of the Coast-guard sequence compressed at 112 Kbits/s using a zerotree scalable coder [16]. The edges of the trees and the boat are affected by ringing noise.

FIGURE 55.5: Subband basis vectors with $N = 2$, $h_1[-2 \ldots 2] = \sqrt{2} \cdot (-\frac{1}{8}, \frac{1}{4}, \frac{3}{4}, \frac{1}{4}, -\frac{1}{8})$, $h_2[-2 \ldots 0] = \sqrt{2} \cdot (-\frac{1}{4}, \frac{1}{2}, -\frac{1}{4})$, $g_1[-1 \ldots 1] = \sqrt{2} \cdot (\frac{1}{4}, \frac{1}{2}, \frac{1}{4})$, and $g_2[-1 \ldots 3] = \sqrt{2} \cdot (-\frac{1}{8}, -\frac{1}{4}, \frac{3}{4}, -\frac{1}{4}, -\frac{1}{8})$. $h_i$ and $g_i$ are the impulse responses of the $H_i$ (analysis) and $G_i$ (synthesis) filters, respectively. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

Matching Pursuit

Representing a signal using an over-complete basis set implies that there is more than one representation for the signal. For coding purposes, we are interested in representing the signal with the fewest basis vectors. This is an NP-complete problem [14]. Different approaches have been investigated to find or approximate the solution. Matching pursuits is a multistage algorithm, which in each stage finds the basis vector that minimizes the mean-squared-error [14].
Suppose we want to represent a signal $f[i]$ using basis vectors from an over-complete dictionary (basis set) $G$. Individual dictionary vectors can be denoted as:

$$w_\gamma[i] \in G. \qquad (55.5)$$

Here $\gamma$ is an indexing parameter associated with a particular dictionary element. The decomposition begins by choosing $\gamma$ to maximize the absolute value of the following inner product:

$$t = \langle f[i], w_\gamma[i] \rangle, \qquad (55.6)$$

where $t$ is the transform (expansion) coefficient. A residual signal is computed as:

$$R[i] = f[i] - t\, w_\gamma[i]. \qquad (55.7)$$

This residual signal is then expanded in the same way as the original signal. The procedure continues iteratively until either a set number of expansion coefficients are generated or some energy threshold for the residual is reached. Each stage $k$ yields a dictionary structure specified by $\gamma_k$, an expansion coefficient $t[k]$, and a residual $R_k$, which is passed on to the next stage. After a total of $M$ stages, the signal can be approximated by a linear function of the dictionary elements:

$$\hat{f}[i] = \sum_{k=1}^{M} t[k]\, w_{\gamma_k}[i]. \qquad (55.8)$$

FIGURE 55.6: Separable spatial subband pyramid. Two level analysis system configuration and subband passbands shown. (Source: Taubman, D., Chang, E., and Zakhor, A., Directionality and scalability in subband image and video compression, in Image Technology: Advances in Image Processing, Multimedia, and Machine Vision, Jorge L.C. Sanz, Ed., Springer-Verlag, New York, 1996. With permission.)

The above technique has useful signal representation properties. For example, the dictionary element chosen at each stage is the element that provides the greatest reduction in mean square error between the true signal $f[i]$ and the coded signal $\hat{f}[i]$. In this sense, the signal structures are coded in order of importance, which is desirable in situations where the bit budget is limited. For image and video coding applications, this means that the most visible features tend to be coded first. Weaker image features are coded later, if at all.
It is even possible to control which types of image features are coded well by choosing dictionary functions to match the shape, scale, or frequency of the desired features.

An interesting feature of the matching pursuit technique is that it places very few restrictions on the dictionary set. The original Mallat and Zhang paper considers both Gabor and wave-packet function dictionaries, but such structure is not required by the algorithm itself [14]. Mallat and Zhang showed that if the dictionary set is at least complete, then $\hat{f}[i]$ will eventually converge to $f[i]$, though the rate of convergence is not guaranteed [14]. Convergence speed and thus coding efficiency are strongly related to the choice of dictionary set. However, true dictionary optimization can be difficult because there are so few restrictions. Any collection of arbitrarily sized and shaped functions can be used with matching pursuits, as long as completeness is satisfied.

Bergeaud and Mallat used the matching pursuit technique to represent and process images [1]. Neff and Zakhor have used the matching pursuit technique to code the motion prediction error signal [20]. Their coder divides each motion residual into blocks and measures the energy of each block. The center of the block with the largest energy value is adopted as an initial estimate for the inner product search. A dictionary of Gabor basis vectors, shown in Fig. 55.7, is then exhaustively matched to an S × S window around the initial estimate. The exhaustive search can be thought of as follows. Each N × N dictionary structure is centered at each location in the search window, and the inner product between the structure and the corresponding N × N region of image data is computed. The largest inner product is then quantized. The location, basis vector index, and quantized inner product are then coded together.
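The greedy expansion of Eqs. (55.6) through (55.8) can be sketched in a few lines. This is a toy illustration, not the Neff and Zakhor coder: the function names and the four-element dictionary are hypothetical, the dictionary vectors are assumed to have unit norm, the search is a plain exhaustive scan over the whole dictionary, and no quantization or position search is performed.

```python
import math


def inner(a, b):
    """Inner product <a, b> of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, stages):
    """Greedy expansion over a dictionary of unit-norm vectors.

    Returns the list of (gamma_k, t[k]) pairs and the final residual.
    """
    residual = list(signal)
    expansion = []
    for _ in range(stages):
        # Eq. (55.6): pick the element with the largest |<R, w_gamma>|.
        gamma = max(range(len(dictionary)),
                    key=lambda g: abs(inner(residual, dictionary[g])))
        t = inner(residual, dictionary[gamma])
        # Eq. (55.7): subtract the chosen component from the residual.
        residual = [r - t * w for r, w in zip(residual, dictionary[gamma])]
        expansion.append((gamma, t))
    return expansion, residual


def reconstruct(expansion, dictionary, length):
    """Eq. (55.8): weighted sum of the selected dictionary elements."""
    approx = [0.0] * length
    for gamma, t in expansion:
        for i, w in enumerate(dictionary[gamma]):
            approx[i] += t * w
    return approx


# Hypothetical over-complete dictionary for 3-sample signals:
# the standard basis plus one extra unit-norm "diagonal" element.
S = 1.0 / math.sqrt(2.0)
DICTIONARY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [S, S, 0.0]]

if __name__ == "__main__":
    expansion, residual = matching_pursuit([2.0, 2.0, 0.0], DICTIONARY, 1)
    print(expansion, inner(residual, residual))
```

For the signal (2, 2, 0) the extra diagonal element captures all of the energy in a single stage, whereas the standard basis alone would need two coefficients. That is the benefit an over-complete dictionary is meant to provide, at the cost of the exhaustive search in each stage.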
Video sequences coded using matching pursuit do not suffer from either blocking or ringing artifacts, because the basis vectors are only coded when they are well-matched to the residual signal. As bit rate decreases, the distortion introduced by matching pursuit coding takes the form of a gradually increasing blurriness (or loss of detail). Since matching pursuits involves exhaustive search, it is more complex than DCT approaches, especially at high bit rates.

FIGURE 55.7: Separable two-dimensional 20 × 20 Gabor dictionary.

Figure 55.8(d) shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using the matching pursuit video coder described by Neff and Zakhor [20]. This frame does not suffer from the blocky artifacts that affect the DCT coders, as shown in Fig. 55.8(b). Moreover, it does not suffer from the ringing noise that affects the subband coders, as shown in Figs. 55.8(c) and 55.11(c).

55.2.3 Discussion

Figure 55.8 shows frame 250 of the 15 frame/s CIF Coast-guard sequence coded at 112 Kbits/s using DCT, subband, and matching pursuit coders. The DCT coded frame suffers from blocking artifacts. The subband coded frame suffers from ringing artifacts.

Figure 55.9 compares the PSNR performance of the matching pursuit coder [20] to a DCT (H.263) coder [3] and a zerotree subband coder [16] when coding the Coast-guard sequence at 112 Kbits/s. The matching pursuit coder [20] in this example has consistently higher PSNR than the H.263 [3] and the zerotree subband [16] coders. Table 55.1 shows the average luminance PSNRs for different sequences at different bit rates. In all examples mentioned in Table 55.1, the matching pursuit coder has higher average PSNR than the DCT coder. The subband coder has the lowest average PSNR.
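The PSNR values compared in this discussion are computed per frame from the mean squared error between the original and decoded samples. The following minimal sketch uses the common definition PSNR = 10 log10(peak² / MSE); the function name `psnr` is hypothetical, and the peak value of 255 assumes 8-bit samples, which the chapter does not state explicitly.

```python
import math


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames.

    Frames are given as flat sequences of sample values of equal length.
    peak=255 assumes 8-bit samples (an assumption, not from the chapter).
    """
    if len(original) != len(decoded):
        raise ValueError("frames must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    frame = [10, 20, 30, 40]
    noisy = [11, 19, 31, 39]   # every sample off by one, so MSE = 1
    print(f"{psnr(frame, noisy):.2f} dB")
```

Averaging this per-frame value over the luminance component of every coded frame gives the kind of average luminance PSNR reported in Table 55.1.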
TABLE 55.1 The Average Luminance PSNR of Different Sequences at Different Bit Rates When Coding Using a DCT Coder (H.263) [3], Zero-Tree Subband Coder (ZTS) [16], and Matching Pursuit Coder (MP) [20]

                                   Rate                 PSNR (dB)
Sequence          Format     Bit      Frame      DCT      ZTS      MP
Container-ship    QCIF       10 K      7.5      29.43    28.01    31.10
Hall-Monitor      QCIF       10 K      7.5      30.04    28.44    31.27
Mother-Daughter   QCIF       10 K      7.5      32.50    31.07    32.78
Container-ship    QCIF       24 K     10.0      32.77    30.44    34.26
Silent-Voice      QCIF       24 K     10.0      30.89    29.41    31.71
Mother-Daughter   QCIF       24 K     10.0      35.17    33.77    35.55
Coast-Guard       QCIF       48 K     10.0      29.00    27.65    29.82
News              CIF        48 K      7.5      30.95    29.97    31.96

[...]

... existing standards. Sections 55.4.2, 55.4.3, and 55.4.5 outline the Motion Picture Experts Group (MPEG) standards for video compression. Sections 55.4.1 and 55.4.4 review the CCITT H.261 and H.263 standards for digital video communications. This section lists the standards according to their chronological order in order to provide an understanding of the progress of the video compression standardization ...

[...]

... namely scalability and error resilience.

55.3.1 Scalability

Developing scalable video compression algorithms has attracted considerable attention in recent years. Scalable compression refers to encoding a sequence in such a way that subsets of the encoded bit-stream correspond to compressed versions of the sequence at different rates and resolutions. Scalable compression is useful in today’s heterogeneous ...

[...]

... fully scalable video with fine granularity of bit rates. Temporal filtering, however, introduces significant overall latency, a critical parameter for interactive video compression applications. To reduce this effect, it is possible to use a 2-tap temporal filter, which results in one frame of delay. As a visual demonstration of the quality tradeoff inherent to rate-scalable video compression, Fig. 55.11 shows ...
[...]

... tradeoff between good compression performance and error resilience. In order to reduce the cost of error resilient codes, some approaches jointly optimize the source and channel codes [6, 23].

55.4 Standards

In this section we review the major video compression standards. Essentially, these schemes are based on the building blocks introduced in Section 55.2. All these standards use the DCT. Table 55.2 summarizes ...

[...]

... organized in a data structure design based on this observation.

FIGURE 55.10: (a) A common scan for an 8 × 8 block DCT. (b) A common scan for subband decompositions (zero-tree).

55.3 Desirable Features

Some video applications require the encoder to provide more than good compression performance. For example, it is desirable to have scalable video compression schemes so that different users with different bandwidth, ...

[...]

... is followed by the motion data, then followed by the block information.

55.4.2 MPEG-1

The first (MPEG) video compression standard [7], MPEG-1, is intended primarily for progressive video at 30 frames/s. The targeted bit rate is in the range 1.0 to 1.5 Mbits/s. MPEG-1 was designed to store video on compact discs. Such applications require MPEG-1 to support random access to the material ...

[...]

... the rectangular frame. The decoder uses the color information to detect the object in the decoded stream.

55.4.5 MPEG-4

The moving picture expert group is developing a video standard that targets a wide range of applications including Internet multimedia, interactive video games, video-conferencing, video-phones, multimedia storage, wireless multimedia, and broadcasting applications. Such a wide range ...
[...]

... subband coding of video, IEEE Trans. Image Processing, 3(5), 572–588, Sept. 1994.
[29] Taubman, D. and Zakhor, A., A common framework for rate and distortion based scaling of highly scalable compressed video, IEEE Trans. Circuits and Systems for Video Technology, 6(4), 329–354, Aug. 1996.
[30] Vetterli, M. and Kalker, T., Matching pursuit for compression and application to motion compensated video coding, Proc. ...

[...]

... 16 × 16, 8 × 8 Forward, backward Half pixel No Yes Yes Yes No Yes Yes Yes Yes Weighted uniform Motion Compensation Scalability Yes No No No

55.4.1 H.261

Recommendation H.261 of the CCITT Study Group XV was adopted in December 1990 [2] as a video compression standard to be used for video conferencing applications. The bit rates supported by H.261 are p × 64 Kbits/s, where p is in the range 1 to 30. H.261 ...

[...]

... not provide good compression performance, especially since the histogram of the transform coefficients has a significant peak around low frequency ...

FIGURE 55.12: Rate-distortion curves for PING-PONG sequence. Overall PSNR values for Y, U, and V components for the codec in [28] are plotted against the bit rate limit imposed on the rate-scalable bit stream prior to decompression.

MPEG-1 ...
