Compression Codes
Slide 1: Introduction to Data Compression

Data compression seeks to reduce the number of bits used to store or transmit information.

Slide 2: Lecture 1. Source Coding and Statistical Modeling
Alexander Kolesnikov

Outline: Entropy; Example; How to get probabilities; Shannon-Fano code; Huffman code; Prefix codes; Context modeling

Slide 3: Entropy

Set of symbols (alphabet): S = {s_1, s_2, ..., s_N}, where N is the number of symbols in the alphabet.
Probability distribution of the symbols: P = {p_1, p_2, ..., p_N}.
According to Shannon, the entropy H of an information source S is defined as:

    H = -\sum_{i=1}^{N} p_i \cdot \log_2(p_i)

Slide 4: Entropy

The amount of information in symbol s_i, in other words the number of bits needed to code it (its code length), is:

    H(s_i) = -\log_2(p_i)

The average number of bits for the source S is then:

    H = -\sum_{i=1}^{N} p_i \cdot \log_2(p_i)

Slide 5: Entropy of a binary source (N = 2)

S = {0, 1} with p_0 = p and p_1 = 1 - p:

    H = -(p \cdot \log_2(p) + (1 - p) \cdot \log_2(1 - p))

H = 1 bit for p_0 = p_1 = 0.5.

[Figure: H as a function of p on (0, 1), with its maximum of 1 bit at p = 0.5]

Slide 6: Entropy of a uniform distribution (p_i = 1/N)

    H = -\sum_{i=1}^{N} (1/N) \cdot \log_2(1/N) = \log_2(N)

Examples:
N = 2: p_i = 0.5; H = log_2(2) = 1 bit.
N = 256: p_i = 1/256; H = log_2(256) = 8 bits.

[Figure: uniform histogram p_i = 1/N over the symbols s_1, s_2, ..., s_N]

(A short computational sketch of these entropy formulas follows the slides.)

Slide 7: How to get the probability distribution?

1) Static modeling:
   a) The same code table is applied to all input data.
   b) One-pass method (encoding only).
   c) No side information needed.
2) Semi-adaptive modeling:
   a) Two-pass method: (1) analysis, (2) encoding.
   b) Side information needed (model, code table).
3) Adaptive (dynamic) modeling:
   a) One-pass method: analysis and encoding together.
   b) The model is updated during encoding and decoding.
   c) No side information needed.

Slide 8: Static vs. dynamic: an example

S = {a, b, c}; data: a, a, b, a, a, c, a, a, b, a.

1) Static model: p_i = 1/3;
   H = -log_2(1/3) = 1.58 bits.
2) Semi-adaptive model: p_a = 7/10, p_b = 2/10, p_c = 1/10;
   H = -(0.7 log_2 0.7 + 0.2 log_2 0.2 + 0.1 log_2 0.1) = 1.16 bits.

Slide 9: Adaptive method: an example

S = {a, b, c}; data: a, a, b, a, a, c, a, a, b, a.
Every count starts at 1. Before each symbol is coded, its probability is its current count divided by the total count; its count is incremented afterwards.

Step          1     2     3     4     5     6     7     8     9     10
Symbol        a     a     b     a     a     c     a     a     b     a
count(a)      1     2     3     3     4     5     5     6     7     7
count(b)      1     1     1     2     2     2     2     2     2     3
count(c)      1     1     1     1     1     1     2     2     2     2
p_i           1/3   2/4   1/5   3/6   4/7   1/8   5/9   6/10  2/11  7/12
              0.33  0.50  0.20  0.50  0.57  0.13  0.56  0.60  0.18  0.58
-log_2(p_i)   1.58  1.00  2.32  1.00  0.81  3.00  0.85  0.74  2.46  0.78

H = (1/10)(1.58 + 1.00 + 2.32 + 1.00 + 0.81 + 3.00 + 0.85 + 0.74 + 2.46 + 0.78) = 1.45 bits/char

Comparison: 1.16 < 1.45 < 1.58, i.e. semi-adaptive < adaptive < static.

(The adaptive-model sketch after the slides reproduces this average.)

Slide 10: Shannon-Fano code: a top-down approach

1) Sort the symbols according to their probabilities: p_1 ≤ p_2 ≤ ... ≤ p_N.
2) Recursively divide the symbols into two parts, each with approximately the same total count (probability), appending 0 to the codewords of one part and 1 to those of the other.

[Worked example: Shannon-Fano codes and bitstream for symbols A..E; the resulting code table appears on slide 16]

(A construction sketch follows the slides.)

Slide 16: Shannon-Fano code: decoding

Code table:
A - 00
B - 01
C - 10
D - 110
E - 111

[Figure: binary code tree with leaves A, B, C, D, E]

Average code length: 87/39 = 2.23 bits per symbol.

Slide 19: Huffman code: decoding

Code table:
A - 0
B - 100
C - 101
D - 110
E - 111

[Figure: binary code tree with leaves A, B, C, D, E]

[Worked decoding example omitted]

(A generic prefix-code decoder and a Huffman construction sketch follow the slides.)
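To make the entropy formulas on slides 3 through 6 concrete, here is a minimal Python sketch; the helper name `entropy` is mine, not from the slides.

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) in bits; zero-probability
    symbols contribute nothing and are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                  # binary source, slide 5: 1.0 bit
print(entropy([1 / 256] * 256))             # uniform, slide 6: 8.0 bits
print(round(entropy([0.7, 0.2, 0.1]), 2))   # semi-adaptive model, slide 8: 1.16 bits
```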
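Slide 9's adaptive model can be reproduced directly. The sketch below assumes the counting scheme implied by the table: every count starts at 1, the coded symbol's probability is read off before coding, and its count is incremented afterwards. The function name `adaptive_cost` is mine.

```python
import math

def adaptive_cost(data, alphabet):
    """Average ideal code length (bits/symbol) under the adaptive model
    of slide 9: p(symbol) = count(symbol) / total count, updated as we go."""
    counts = {s: 1 for s in alphabet}
    total_bits = 0.0
    for s in data:
        p = counts[s] / sum(counts.values())
        total_bits += -math.log2(p)   # ideal code length for this occurrence
        counts[s] += 1
    return total_bits / len(data)

print(round(adaptive_cost("aabaacaaba", "abc"), 2))   # 1.45 bits/char, as on slide 9
```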
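A sketch of the top-down Shannon-Fano construction from slide 10. The split rule used here (minimize the imbalance between the two halves' total counts) is one common variant, and the symbol counts are hypothetical: they are chosen only so that the output matches the code table and the 87/39 average shown on slide 16.

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, count) pairs, sorted by count.
    Recursively split into two parts of roughly equal total count,
    prefixing one part's codewords with 0 and the other's with 1."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(count for _, count in symbols)
    best_diff, split, acc = None, 1, 0
    for i in range(1, len(symbols)):
        acc += symbols[i - 1][1]
        diff = abs(2 * acc - total)          # imbalance if we cut before index i
        if best_diff is None or diff < best_diff:
            best_diff, split = diff, i
    codes = {}
    for sym, code in shannon_fano(symbols[:split]).items():
        codes[sym] = "0" + code
    for sym, code in shannon_fano(symbols[split:]).items():
        codes[sym] = "1" + code
    return codes

counts = [("A", 15), ("B", 8), ("C", 7), ("D", 5), ("E", 4)]   # hypothetical, total 39
codes = shannon_fano(counts)
print(codes)        # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
bits = sum(n * len(codes[s]) for s, n in counts)
print(bits, round(bits / 39, 2))   # 87 bits over 39 symbols: 2.23 bits/symbol
```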
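Slides 16 and 19 decode by walking the binary code tree bit by bit. Because both codes are prefix-free (no codeword is a prefix of another), looking up the accumulated bits in an inverted code table is equivalent. A minimal sketch; the `decode` helper and the test bitstream are mine.

```python
def decode(bits, code_table):
    """Decode a prefix-code bitstream; the first codeword match is unambiguous."""
    inverse = {code: sym for sym, code in code_table.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("bitstream ended in the middle of a codeword")
    return "".join(out)

huffman_table = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}  # slide 19
print(decode("1000101110111", huffman_table))   # BACDE
```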
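The extract shows only Huffman decoding, so for completeness here is a standard bottom-up Huffman construction with a heap; this is a generic sketch, not taken from the slides. With the same hypothetical counts as in the Shannon-Fano sketch it also averages 87/39, about 2.23 bits per symbol, although tie-breaking among equal weights can produce different, equally optimal codeword lengths (one such tree gives the 1-bit code for the most frequent symbol seen in the slide 19 table).

```python
import heapq
import itertools

def huffman_codes(counts):
    """Repeatedly merge the two least frequent subtrees (bottom-up);
    the running counter breaks ties so the heap never compares dicts."""
    tiebreak = itertools.count()
    heap = [(c, next(tiebreak), {s: ""}) for s, c in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, next(tiebreak), merged))
    return heap[0][2]

counts = {"A": 15, "B": 8, "C": 7, "D": 5, "E": 4}   # hypothetical counts again
codes = huffman_codes(counts)
avg = sum(counts[s] * len(code) for s, code in codes.items()) / sum(counts.values())
print(codes)
print(round(avg, 2))   # 2.23 bits/symbol
```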