Estimation of Redundancy in Compressed Image and Video Data for Joint Source-Channel Decoding

Hang NGUYEN, Pierre DUHAMEL
Alcatel, Research and Innovation Department, Route de Nozay, F91460, FRANCE (hang.nguyen@alcatel.fr)
CNRS/LSS, Supelec, Plateau de Moulon, F92190, FRANCE (pierre.duhamel@lss.supelec.fr)

GLOBECOM 2003

Abstract: Joint source-channel decoding of Variable Length Codes for image and video streaming transmission over unreliable links, such as wireless networks, is a subject of increasing interest. In this article, we first provide an analysis of the available redundancy in compressed video data produced by some commonly used standardized codecs. Then, we show that a significant improvement in decoding performance can be achieved when additional image and video data properties are taken into consideration. Finally, performance results for improving video decoding over wireless channels are presented.

Index Terms: Source redundancy, variable length codes, compressed image and video decoding, joint source-channel decoding

I. INTRODUCTION

Multimedia streaming over wireless networks is nowadays a possibility, and can offer a medium to low video quality. However, wireless channels cause high bit error rates, and the residual number of erroneous bits can be significant in the received sequences, despite the use of Forward Error Correcting codes. On the other hand, today's source encoders are designed to compress data as much as possible, assuming error-free transmission. Hence, most video and image compression standards [1-3] make heavy use of entropy compression techniques, which are known to be very sensitive to errors. This sensitivity is mostly due to the fact that codewords have variable length. For variable length codes (VLC) such as Huffman codes [1], RVLC [2,4-6] and UVLC [3,5], codewords are of variable length and the prefix property is used for decoding.

Joint soft-input soft-output (SISO) source-channel decoding of VLC for image and video streaming transmission over unreliable links, such as wireless networks, is a subject of increasing interest [8-13]. This idea is based on the turbo-code iterative decoding process, where two SISO channel decoders exchange soft information based on each decoder's own redundancy bits and own channel code structure. This mechanism performs well because the two channel code structures are different and their redundancy bits are also different. The same idea is applied to joint source-channel decoding. Although the channel code and source code structures are different, the residual redundancy in the compressed data would be expected to be very small, because source compression removes most of the redundancy. This paper proposes to quantify the residual redundancy in VLC compressed data, together with a procedure able to make use of it.

Redundancy in source coding, as an equivalent of redundancy in the case of channel coding, is defined in part II. The image and video compression standard H.263 [1], in which a run-length entropy compression is followed by a VLC, is used as an illustration. Part III studies the case of a simple projection of the bit stream on the VLC structure in use: the correlation between bits inside one VLC codeword is exploited. However, these methods do not exploit the relationship between the VLC codewords of the same image block. Part IV of this paper therefore identifies this relationship and quantifies the associated redundancy. Finally, part V shows that using all the redundancy information improves source decoding when the signal has been transmitted over unreliable links.

II. CRITERIA FOR EVALUATING SOURCE REDUNDANCY

In a conventional channel coding of rate $k/n$, we have $(n - k)$ bits of redundancy for each sequence of $k$ bits. This also means that we encode a source codebook of cardinal number equal to $2^k$ in a space of cardinal number equal to $2^n$. The larger the encoded space is compared with the source space, the more redundancy there is and the better the protection is. So we can define an
equivalent redundancy rate as $2^k / 2^n$ and a number of "equivalent redundancy bits" as $(n - k)$.

A. The equivalent redundancy rate

In the following, $S$ is a binary sequence of length $N$. This sequence $S$ is a VLC codeword sequence. These VLC codewords are taken from a VLC codebook and correspond to a compressed image block. We can then define three different sets:

$E_N$ = {all the possible binary sequences of $N$ bits}
$D_N$ = {all the valid VLC codeword sequences of $N$ bits}
$P_N$ = {all the valid VLC codeword sequences of $N$ bits which could be a valid image block}

In the following, the cardinal number of a set is noted card(.). card($E_N$) is $2^N$. card($D_N$) is less than card($E_N$) because some binary sequences of $N$ bits are not sequences of VLC codewords from the codebook in use. Similarly, card($P_N$) is less than card($D_N$) because some VLC sequences of $D_N$ are not valid compressed image blocks. The larger these differences are, the better the correction capability is and the more redundancy there is. We can then define an "equivalent redundancy rate" as:

$$R = \frac{\text{number of valid sequences}}{\text{number of possible binary sequences}} \tag{1}$$

In the case of the three previous sets, two "equivalent redundancy rates" can be defined:

$$R_1 = \frac{\operatorname{card}(D_N)}{\operatorname{card}(E_N)}, \qquad R_2 = \frac{\operatorname{card}(P_N)}{\operatorname{card}(E_N)} \tag{2-3}$$

0-7803-7974-8/03/$17.00 © 2003 IEEE

B. The number of equivalent redundancy bits

In the channel coding case, the number of redundancy bits is:

$$N_{redundancy} = n - k = \log_2(2^n) - \log_2(2^k) \tag{4}$$

We can define a "number of equivalent redundancy bits" as:

$$N_{redundancy} = \log_2(\text{number of possible binary sequences}) - \log_2(\text{number of available sequences}) \tag{5}$$

For the cases taken as illustration, the "numbers of equivalent redundancy bits" are:

$$N_{redundancy,1} = \log_2(\operatorname{card}(E_N)) - \log_2(\operatorname{card}(D_N)) \tag{6}$$

$$N_{redundancy,2} = \log_2(\operatorname{card}(E_N)) - \log_2(\operatorname{card}(P_N)) \tag{7}$$

In the following parts, the quantities $R_1$ and $R_2$, $N_{redundancy,1}$ and $N_{redundancy,2}$ are computed. For that, parts III and IV give the theoretical formulas for computing the cardinal numbers of the two sets $D_N$ and $P_N$. Part V is then the numerical application of these formulas to the H.263 video compression standard.

III. REDUNDANCY FROM THE PROJECTION OF THE BITSTREAM OVER THE VLC CODEBOOK STRUCTURE

A. Definition of the problem

A sequence of $N$ bits representing one encoded image block is received. We want to know the number of all valid sequences of VLC codewords that have exactly $N$ bits. For that, we introduce a sequence $\{v_i\}_{i \ge 1}$ defined by:

$$v_i = \operatorname{card}(D_i), \; i \ge 1, \qquad D_i = \{\, S \in E_i \text{ such that } \exists K \ge 1 \text{ and } \exists \{V_j\}_{j \in [1,K]} \in C \text{ such that } S = V_1 \ldots V_j \ldots V_K \,\} \tag{8}$$

If there is no VLC sequence of length $i$, $D_i$ is an empty set and its cardinal number is $v_i = 0$. For computing $R_1$ and $N_{redundancy,1}$, the cardinal number $v_N = v_{i=N}$ of the set $D_N = D_{i=N}$ is to be calculated.

B. Evaluation

We have shown that the values of the terms of the sequence $\{v_i\}_{i \ge 1}$ are given by the following recursive formula:

$$
\begin{cases}
\forall n < l_{\min}: & v_n = 0 \\
n = l_{\min}: & v_{l_{\min}} = M_{l_{\min}} \\
\forall n \in [l_{\min}+1,\, l_{\max}]: & v_n = M_n + \sum_{i=1}^{n-1} M_{n-i}\, v_i \\
\forall n > l_{\max}: & v_n = \sum_{i=n-l_{\max}}^{n-1} M_{n-i}\, v_i
\end{cases}
\tag{9}
$$

$l_{\min}$ and $l_{\max}$ are respectively the lengths of the shortest and the longest codeword of the codebook $C$. $M_i$ is the number of codewords of length $i$ in the codebook $C$. The sequence $\{M_i\}_{i \ge 1}$ is given when the codebook is given. If there is no codeword of length $i$ in the codebook $C$, then $M_i = 0$. The recursive sequence (9) gives the number of valid VLC sequences of any binary length $N$.

Fig. 1: $v_n$ computation recursive algorithm for $l_{\max}$ = […]

IV. ADDITIONAL REDUNDANCY FROM CORRELATION BETWEEN VLC CODEWORDS OF THE SAME IMAGE BLOCK

A. Source redundancy in VLC sequence of image block

The basic idea here is to use the relationship between the VLC codewords of the same image block for decoding. In most image and video compression standards [1-3], each image is divided into several blocks. Then, the pixels of each block are processed by a 2-dimensional Discrete Cosine Transform (DCT). The number of DCT coefficients is equal to the number of pixels of the image block. These DCT coefficients are then encoded by a run-length VLC. More precisely, in the H.263 standard [1-2], each VLC codeword corresponds to a triplet (run, level, last): "run" represents the number of zeros until the next non-zero DCT coefficient, "level" represents the value of this non-zero DCT coefficient, and "last" indicates whether this DCT coefficient is the last non-zero value of the image block. So, each VLC codeword corresponds to ("run" + 1) DCT coefficients ("run" zero DCT coefficients and one non-zero coefficient). In the H.26L standard [3], codewords are either the couple (run, level) or the end-of-block indicator EOB.

Under these assumptions, the following properties are always true:

- The sum of the numbers of DCT coefficients corresponding to the VLC codewords of one compressed image block should be less than or equal to $N_{DCTcoef}$ (the total number of DCT coefficients of one image block) [1-3]. So, the "run" values of the VLC codewords of one compressed image block meet the following constraint:

$$\sum_{\text{VLC codewords of one image block}} (run_{vlc} + 1) \le N_{DCTcoef} \tag{10}$$

- Only the last codeword of the VLC codeword sequence corresponding to one compressed image block has the field "last" of the triplet (run, level, last) equal to one [1-2]. So this property can always be used to decode the compressed image or video data.

For the H.263 standard, both conditions are available. For the H.26L standard, the first condition is true, and the equivalent of the second condition is that the last VLC codeword should correspond to the end-of-block (EOB) indicator. Hence, we consider two cases separately: when only the first condition is used and when both conditions are used.

B. Only "run" constraint problem resolution

1) Definition of the problem

We introduce a sequence $\{p_{n,r}\}_{n \ge 1, r \ge 1}$ defined by $p_{n,r} = \operatorname{card}(P_{n,r})$, $n \ge 1$, $r \ge 1$, with:

$$
P_{n,r} = \left\{ S \in E_n \text{ such that: }
\begin{array}{l}
1.\; S = V_1 \ldots V_j \ldots V_K \text{ with } \forall j \in [1,K]: V_j \in C \\
2.\; R = \sum_{j=1}^{K} (run_j + 1) \le r
\end{array}
\right\}
\tag{11}
$$

If there is no VLC sequence of length $n$ and/or of "total run" smaller than $r$, then $P_{n,r}$ is an empty set and its cardinal number is $p_{n,r} = 0$. If we are able to compute the terms of the sequence $p_{n,r}$, then the ($n = N$, $r = N_{DCTcoef}$)-th term of this sequence, $p_{n=N,\, r=N_{DCTcoef}} = p_N = \operatorname{card}(P_N)$, is the result that we are looking for.

2) Evaluation

Two sequences $\{M_{n,r}\}_{n \ge 1, r \ge 1}$ and $\{\delta_{n,r}\}_{n \ge 1, r \ge 1}$ are introduced:

$$
\delta_{n,r} = \operatorname{card}(\Delta_{n,r}), \quad
\Delta_{n,r} = \left\{ V \in C \text{ such that: }
\begin{array}{l}
\text{length of } V \text{ is } n \text{ bits} \\
run_V + 1 = r
\end{array}
\right\}, \quad
M_{n,r} = \sum_{\rho=1}^{r} \delta_{n,\rho}
\tag{12}
$$

$\Delta_{n,r}$ can be empty, and in this case $\delta_{n,r} = 0$. The codebook $C$ is known, so the sequences $\{\delta_{n,r}\}_{n \ge 1, r \ge 1}$, and hence $\{M_{n,r}\}_{n \ge 1, r \ge 1}$, are known. It can be shown that the values of the terms of the sequence $\{p_{n,r}\}_{n \ge 1, r \ge 1}$ are given by the following recursive formula:

$$
\begin{cases}
\forall n < l_{\min},\ \forall r \ge R_{\min}: & p_{n,r} = 0 \\
n = l_{\min},\ \forall r \ge R_{\min}: & p_{l_{\min},r} = \sum_{\rho=1}^{r} \delta_{l_{\min},\rho} \\
\forall r < R_{\min},\ \forall n \ge 1: & p_{n,r} = 0 \\
r = R_{\min},\ \forall n \ge 1: & p_{n,r=R_{\min}} = \delta_{n,r=R_{\min}} \\
\forall n \in [l_{\min}+1,\, l_{\max}],\ \forall r > R_{\min}: & p_{n,r} = M_{n,r} + \sum_{i=1}^{n-1} \sum_{\rho=1}^{r-1} \delta_{n-i,\,r-\rho}\; p_{i,\rho} \\
\forall n > l_{\max},\ \forall r > R_{\min}: & p_{n,r} = \sum_{i=n-l_{\max}}^{n-1} \sum_{\rho=1}^{r-1} \delta_{n-i,\,r-\rho}\; p_{i,\rho}
\end{cases}
\tag{13}
$$

$R$ is the sum of all the run values of the variable length codewords in the sequence plus the number of VLC codewords. $R_{\min}$ and $R_{\max}$ are the minimum and maximum values of the field "run" over all the codewords of the codebook, plus one. In the double sums, the last codeword of the sequence has length $n - i$ and contributes exactly $r - \rho$ to $R$, while the preceding $i$ bits form a valid sequence contributing at most $\rho$. This recursive sequence gives the number of valid VLC sequences of any binary length $N$ respecting the "run" constraint for compressed image and video data [1-3].

Fig. 2: $p_{n,r}$ computation recursive algorithm

C. "Run" and "last" constraints problem resolution

Two sequences $\{\alpha_{n,r}\}_{n \ge 1, r \ge 1}$ and $\{\beta_{n,r}\}_{n \ge 1, r \ge 1}$ are defined by $\alpha_{n,r} = \operatorname{card}(A_{n,r})$ and $\beta_{n,r} = \operatorname{card}(B_{n,r})$, with:

$$
A_{n,r} = \left\{ V \in C \text{ such that: }
\begin{array}{l}
1.\; \text{length of } V \text{ is } n \text{ bits} \\
2.\; run_V + 1 = r \\
3.\; last_V = 0
\end{array}
\right\}, \qquad
B_{n,r} = \left\{ V \in C \text{ such that: }
\begin{array}{l}
1.\; \text{length of } V \text{ is } n \text{ bits} \\
2.\; run_V + 1 = r \\
3.\; last_V = 1
\end{array}
\right\}
\tag{14-15}
$$

$A_{n,r}$ ($B_{n,r}$) can be empty, and in this case $\alpha_{n,r} = 0$ ($\beta_{n,r} = 0$). The codebook $C$ is known, so the sequences $\{\alpha_{n,r}\}_{n \ge 1, r \ge 1}$ and $\{\beta_{n,r}\}_{n \ge 1, r \ge 1}$ are known, and $\{M'_{n,r}\}_{n \ge 1, r \ge 1}$ is also known. The sequence $\{M'_{n,r}\}$ is defined by:

$$M'_{n,r} = \sum_{\rho=1}^{r} \beta_{n,\rho} \tag{16}$$

In this section, the sequence $\{p_{n,r}\}_{n \ge 1, r \ge 1}$ is defined by $p_{n,r} = \operatorname{card}(P_{n,r})$, $n \ge 1$, $r \ge 1$, with:

$$
P_{n,r} = \left\{ S \in E_n \text{ such that: }
\begin{array}{l}
1.\; S = V_1 \ldots V_j \ldots V_K \text{ with } \forall j \in [1,K]: V_j \in C \\
2.\; R = \sum_{j=1}^{K} (run_j + 1) \le r \\
3.\; \forall j \in [1, K-1]: last_j = 0 \text{ and } last_K = 1
\end{array}
\right\}
\tag{17}
$$

If there is no VLC sequence of length $n$ and/or of "total run" smaller than $r$, then $P_{n,r}$ is an empty set and its cardinal number is $p_{n,r} = 0$. The aim here is to compute:

$$p_N = \operatorname{card}(P_N) = p_{n=N,\, r=N_{DCTcoef}} \tag{18}$$

With a demonstration similar to the previous cases, the values of the terms of the sequence $\{p_{n,r}\}_{n \ge 1, r \ge 1}$ are given by the recursive formula (19):

$$
\begin{cases}
\forall n < l_{\min},\ \forall r \ge R_{\min}: & p_{n,r} = 0 \\
n = l_{\min},\ \forall r \ge R_{\min}: & p_{l_{\min},r} = \sum_{\rho=1}^{r} \beta_{l_{\min},\rho} \\
\forall r < R_{\min},\ \forall n \ge 1: & p_{n,r} = 0 \\
r = R_{\min},\ \forall n \ge 1: & p_{n,r=R_{\min}} = \beta_{n,r=R_{\min}} \\
\forall n \in [l_{\min}+1,\, l_{\max}],\ \forall r > R_{\min}: & p_{n,r} = M'_{n,r} + \sum_{i=1}^{n-1} \sum_{\rho=1}^{r-1} \alpha_{n-i,\,r-\rho}\; p_{i,\rho} \\
\forall n > l_{\max},\ \forall r > R_{\min}: & p_{n,r} = \sum_{i=n-l_{\max}}^{n-1} \sum_{\rho=1}^{r-1} \alpha_{n-i,\,r-\rho}\; p_{i,\rho}
\end{cases}
\tag{19}
$$

This recursive sequence gives the number of valid VLC sequences of any binary length $N$ respecting both the "run" and the "last" constraints for compressed image and video data [1].

V. NUMERICAL RESULTS

From those equations, it is clear that a number of VLC sequences are not valid source sequences. This results in redundancy in the bitstream, which can be used for correcting some transmission errors. By doing so, we are implementing joint source-channel decoding. This section evaluates the redundancy found in an H.263 bitstream. The evaluation of $N_{redundancy,1}$ and $N_{redundancy,2}$ is based on the results of parts III and IV-C. The figures below depict the redundancy levels $R_1$, $R_2$ and $N_{redundancy,1}$, $N_{redundancy,2}$ as a function of the length in bits of the binary compressed image blocks, which are VLC codeword sequences.

A. Equivalent redundancy rates

Fig. 3: Equivalent redundancy rates

Fig. 3 plots $R_1$ and $R_2$. The plot of $R_1$ shows that the cardinal number of the set $E_N$ is much higher than the cardinal number of the set $D_N$. So the correlation between bits of the same VLC codeword exists and creates redundancy. This redundancy can be used as a correction capacity when decoding the received sequence. A typical length for an H.263 compressed image block is 60 bits, and the available redundancy rate $R_1$ is then about 0.2. This means that we have encoded a source codebook in a space whose cardinal number is five times larger.

The plot of $R_2$ shows that the cardinal number of the set $D_N$ is much higher than the cardinal number of the set $P_N$. So the correlation between VLC codewords of the same image block exists and consequently creates redundancy. Using both the correlation between VLC codewords of the same image block and the correlation between bits of the same VLC codeword for decoding should give better results than simple VLC decoding, or than decoding with only the correlation between bits of the same VLC codeword. For the previous typical value of 60 bits, the available redundancy rate $R_2$ is about 0.0005. This means that we have encoded a source codebook in a space whose cardinal number is 2000 times larger.

B. Number of equivalent redundancy bits

Fig. 4: Equivalent redundancy bits numbers in the image block

Fig. 4 shows that the number of equivalent redundancy bits $N_{redundancy,2}$ is much higher than $N_{redundancy,1}$. For the same typical value of 60 bits for one compressed image block, $N_{redundancy,2}$ is about 11 bits, while $N_{redundancy,1}$ is only about 2 bits. These values agree with the rates above, since $-\log_2(0.0005) \approx 11$ and $-\log_2(0.2) \approx 2.3$. These results show that significant residual source redundancy still exists in the compressed video data, and that much more residual redundancy can be exploited if the correlation between VLC codewords of the same image block is taken into account.

Clearly, this redundancy increases rapidly with the length of the binary compressed image block sequence. Long sequences correspond either to long VLC codewords or to several short VLC codewords. The first case can be explained by the basic property of VLCs: the longer a VLC codeword is, the smaller its occurrence probability, so there are fewer long sequences of the same length. In the second case, the more VLC codewords there are in the same image block, the more constraints there are between them, and hence the more correlation and redundancy there is.

C. Source redundancy for image and video decoding

We have proposed in [6,7] a VLC source decoder algorithm which can use the above-mentioned redundancy. Basically, this algorithm is reminiscent of a list-Viterbi decoder, keeping track of all survivors for all lengths smaller than the block size, and taking into account both the VLC structure projection and the intrinsic image properties, namely the "run" and "last" properties. The performance of this new VLC sequence decoder is compared with that of the conventional prefix-based VLC decoder and of the existing decoder using only the projection on the VLC structure [8-13]. The performance metric is the block error rate, defined as follows:

$$r = \frac{\text{number of blocks that are erroneously decoded}}{\text{number of transmitted blocks}} \tag{20}$$

This metric is more significant in terms of application, since a single bit in error can result in the loss of the whole image block. Our intent here is only to show how the improvement obtained at the decoder by using source redundancy translates into a lower number of image blocks in error. A very simple transmission chain with BPSK modulation over a Gaussian channel is simulated. A set of image blocks from three conventional video sequences, "Mother-daughter", "Foreman" and "Irene", is used for simulation. This set of image blocks is transmitted over a Gaussian (AWGN) channel, then decoded with the new VLC sequence decoder using the VLC
structure projection and the intrinsic image properties, namely the "run" and "last" properties, or with the conventional prefix-based VLC decoder, or with the existing decoder using only the projection on the VLC structure. Only the source encoder and decoder are used in the simulation chain. No channel coding is used.

Fig. 5: Comparison between the prefix-based, the only-VLC-structure-projection-based, and the proposed decoding methods

The conventional prefix-based VLC decoder is only a source decoder: it cannot correct transmission errors in the bit stream. The new VLC decoder is also a source decoder, but it also has channel decoding behavior: it can correct some of the transmission errors in the bit stream. In Fig. 5, a gain of 0.5 to […] dB in terms of "image block error rate" is obtained. An average gain of 0.5 dB is obtained compared with the existing decoder using only the projection on the VLC structure. Unfortunately, the obtained gain of 0.5 to […] dB is small compared with a channel code with a number of redundancy bits equal to $N_{redundancy,2}$. This is due to the fact that in a real channel code, other properties for good error protection, such as the minimal distance $d$ between any two codewords, are also optimized. Here, $d$ is not optimized and for most cases $d$ = […]. So a source code with a better $d$ should give better results.

VI. CONCLUSION

This paper demonstrates that significant residual source redundancy still exists in compressed image and video data. A tool is given for quantifying this redundancy in source coding as an equivalent of redundancy in the case of channel coding. In previous work, the redundancy related to the correlation between bits inside the same VLC codeword is exploited by the direct projection of the bit stream on the VLC code structure. This paper shows that improved decoding performance is achieved when the correlation between VLC codewords of the same image block is also exploited. Here, the residual source redundancy has been shown to contribute effectively towards error correction, like the redundancy of a channel code.

ACKNOWLEDGMENT

The authors would like to thank Jerome BROUET, Denis ROUFFET and Robert TINGAUD from Alcatel Research and Innovation Department for their help, supervision and comments.

REFERENCES

[1] ITU-T Recommendation H.263, 03-96.
[2] ITU-T Recommendation H.263 (H263+), 02-98.
[3] H26L Test model, ITU-T, Study Group 16, VCEG, June 2001.
[4] Takishima, Wada, Murakami, "Reversible Variable Length Codes", IEEE Trans. on Comm., Feb.-March-April 1995.
[5] Itoh, Cheung, "Universal Variable Length Code for DCT Coding", Proc. IEEE ICIP, 2000, pp. 940-943, vol. 1.
[6] H. Nguyen, P. Duhamel, "Robust Source Decoding of VLC Encoded Video Data taking into account Source Semantics", submitted to IEEE Trans. on Comm.
[7] H. Nguyen, P. Duhamel, "Method for decoding variable length codes and corresponding receiver", European Patent 03290826.1, April 2nd, 2003.
[8] Wen, Villasenor, "Reversible Variable Length Codes for Efficient and Robust Image and Video Coding", IEEE Conference Record of the Thirty-First Asilomar Signals, Systems & Computers, 1997.
[9] R. Bauer, J. Hagenauer, "On Variable Length Codes for Iterative Source-Channel Decoding", Proc. of IEEE Data Compression Conference, 2001, pp. 273-282.
[10] R. Bauer, J. Hagenauer, "Iterative Source-Channel Decoding based on a Trellis representation for Variable Length Codes", ISIT, Sorrento, Italy, June 2000.
[11] J. Wen, J. D. Villasenor, "Utilizing soft information in decoding variable length codes", Proc. IEEE Data Compression Conference, Utah, March 1999.
[12] S. Kaiser, M. Bystrom, "Soft decoding of variable length codes", IEEE International Conference on Communications, 2000.
[13] L. Guivarch, J. C. Carlach, P. Siohan, "Joint source-channel soft decoding of Huffman codes with Turbo-codes", Proceedings of IEEE Data Compression Conference, March 2000.
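The counting recursion of part III (Eq. 9) and the redundancy measures of part II can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the paper: the four-codeword prefix code `TOY_CODEBOOK` is a hypothetical stand-in for the H.263 table, and the function names are illustrative. The recursion is written with the convention $v_0 = 1$ (the empty sequence), which absorbs the separate $M_n$ term of Eq. (9); a brute-force enumeration is included only as a cross-check.

```python
from itertools import product
from math import log2

# Hypothetical toy prefix code, used only for illustration (not the H.263 table).
TOY_CODEBOOK = ["0", "10", "110", "111"]


def count_vlc_sequences(codebook, n_max):
    """Return [v_0, ..., v_n_max]: v_n = number of n-bit concatenations of codewords."""
    lengths = {}  # M_i of Eq. (9): number of codewords of length i
    for codeword in codebook:
        lengths[len(codeword)] = lengths.get(len(codeword), 0) + 1
    v = [0] * (n_max + 1)
    v[0] = 1  # empty sequence; plays the role of the M_n term in Eq. (9)
    for n in range(1, n_max + 1):
        # The last codeword has some length; the n - length bits before it
        # must themselves form a valid sequence.
        v[n] = sum(m * v[n - length] for length, m in lengths.items() if length <= n)
    return v


def brute_force_count(codebook, n):
    """Cross-check: enumerate all n-bit strings and parse them (prefix code assumed)."""
    words = set(codebook)
    max_len = max(len(w) for w in words)
    total = 0
    for bits in product("01", repeat=n):
        s = "".join(bits)
        pos = 0
        while pos < n:
            for length in range(1, max_len + 1):
                if pos + length <= n and s[pos:pos + length] in words:
                    pos += length
                    break
            else:
                break  # no codeword matches here: not a valid VLC sequence
        total += pos == n
    return total


def equivalent_redundancy(v_n, n):
    """Return (R_1, N_redundancy_1) of Eqs. (2) and (6) for card(D_n) = v_n > 0."""
    rate = v_n / 2 ** n
    bits = n - log2(v_n)
    return rate, bits


if __name__ == "__main__":
    v = count_vlc_sequences(TOY_CODEBOOK, 12)
    for n in (4, 8, 12):
        print(n, v[n], equivalent_redundancy(v[n], n))
```

With a real codebook, only `TOY_CODEBOOK` would change; the cost of the recursion grows linearly with the block length, whereas the brute-force check grows as $2^n$ and is only usable for very short blocks.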
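The sets counted in part IV (Eqs. 11 and 17) can also be illustrated in Python. This is a sketch under stated assumptions, not the authors' recursion: it uses an equivalent dynamic program over the exact value of $\sum (run + 1)$ and sums over all values up to the bound at the end, instead of iterating Eqs. (13) and (19) directly. The codebook `TOY_TRIPLETS`, which maps each codeword to a hypothetical (run, last) pair and ignores "level", and the function names are made up for the example.

```python
from itertools import product

# Hypothetical toy prefix code: codeword -> (run, last). "level" is ignored here.
TOY_TRIPLETS = {
    "00": (0, 0), "01": (0, 1),
    "100": (1, 0), "101": (1, 1),
    "110": (2, 0), "111": (2, 1),
}


def count_constrained(codebook, n_bits, r_max, use_last):
    """Count n_bits-long codeword sequences with sum(run + 1) <= r_max.

    If use_last is True, the final codeword must have last == 1 and all
    earlier codewords must have last == 0 (the "last" constraint).
    """
    # body[n][t]: number of n-bit sequences whose sum(run + 1) is exactly t,
    # built from every codeword (run only) or from last == 0 codewords (run and last).
    body = [[0] * (r_max + 1) for _ in range(n_bits + 1)]
    body[0][0] = 1  # empty sequence
    for n in range(1, n_bits + 1):
        for codeword, (run, last) in codebook.items():
            if use_last and last == 1:
                continue
            length, step = len(codeword), run + 1
            if length > n:
                continue
            for t in range(step, r_max + 1):
                body[n][t] += body[n - length][t - step]
    if not use_last:
        return sum(body[n_bits])
    total = 0
    for codeword, (run, last) in codebook.items():
        length, step = len(codeword), run + 1
        if last != 1 or length > n_bits or step > r_max:
            continue
        # Close the sequence with this last == 1 codeword.
        total += sum(body[n_bits - length][: r_max - step + 1])
    return total


def brute_force(codebook, n_bits, r_max, use_last):
    """Cross-check by enumerating and parsing every n_bits-long binary string."""
    max_len = max(len(w) for w in codebook)
    count = 0
    for bits in product("01", repeat=n_bits):
        s, pos, total, lasts = "".join(bits), 0, 0, []
        while pos < n_bits:
            for length in range(1, max_len + 1):
                word = s[pos:pos + length]
                if len(word) == length and word in codebook:
                    run, last = codebook[word]
                    total += run + 1
                    lasts.append(last)
                    pos += length
                    break
            else:
                break  # unparsable remainder
        ok = pos == n_bits and total <= r_max
        if use_last:
            ok = ok and bool(lasts) and lasts[-1] == 1 and not any(lasts[:-1])
        count += ok
    return count
```

Given a count `c = count_constrained(codebook, N, N_DCTcoef, True)` with `c > 0`, the quantities of Eqs. (3) and (7) follow as `c / 2**N` and `N - log2(c)`.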