
Lecture 2 ArithCode


DOCUMENT INFORMATION

Basic information

Pages: 14
File size: 196.5 KB

Content

Compression coding: Lecture 2 ArithCode

Data Compression
Lecture 2: Arithmetic Code
Alexander Kolesnikov

Arithmetic code

• Alphabet extension (blocking symbols) can lead to coding efficiency.
• How about treating the entire sequence as one symbol? This is not practical with Huffman coding.
• Arithmetic coding allows you to do precisely this.
• Basic idea: map data sequences to sub-intervals of [0, 1) with lengths equal to the probability of the corresponding sequence.

1) Huffman coder: $H \le R \le H + 1$ bit/(symbol, pel)
2) Arithmetic code: $H \le R \le H + 1$ bit/message (!)

Arithmetic code: History

Rissanen [1976]: arithmetic code
Pasco [1976]: arithmetic code

Arithmetic code: Algorithm (1)

0) Start by defining the current interval as [0, 1).
1) REPEAT for each symbol s in the input stream:
   a) Divide the current interval [L, H) into subintervals whose sizes are proportional to the symbols' probabilities.
   b) Select the subinterval [L, H) for the symbol s and define it as the new current interval.
2) When the entire input stream has been processed, the output should be any number V that uniquely identifies the current interval [L, H).

Arithmetic code: Algorithm (2)

[Figure: subdivision of the current interval; not recoverable from the source]

Arithmetic code: Algorithm (3)

Probabilities: $p_1, p_2, \ldots, p_N$
Cumulants: $C_1 = 0$; $C_2 = C_1 + p_1 = p_1$; $C_3 = C_2 + p_2 = p_1 + p_2$; etc.
           $C_N = p_1 + p_2 + \ldots + p_{N-1}$; $C_{N+1} = 1$

Symbol $s_i$ owns the subinterval $[C(s_i), C(s_{i+1}))$, where $C(s_i) = C_i$.

0) Current interval [L, H) = [0.0, 1.0).
1) REPEAT for each symbol $s_i$ in the input stream (both new bounds are computed from the old interval):
      r ← H − L;
      H ← L + r * C(s_{i+1});
      L ← L + r * C(s_i);
2) UNTIL the entire input stream has been processed.

The output code V is any number that uniquely identifies the current interval [L, H).

Example 1: Statistics

Message: 'SWISS_MISS'

Char   Freq   Prob          [C(s_i), C(s_{i+1}))
S      5      5/10 = 0.5    [0.5, 1.0)
W      1      1/10 = 0.1    [0.4, 0.5)
I      2      2/10 = 0.2    [0.2, 0.4)
M      1      1/10 = 0.1    [0.1, 0.2)
_      1      1/10 = 0.1    [0.0, 0.1)

Example 1: Encoding

[Figure: step-by-step narrowing of the interval for 'SWISS_MISS', using the symbol intervals from the statistics table]

Example 1: Decoding

V ∈ [0.71753375, 0.71753500)

[Figure: step-by-step recovery of the symbols from V, using the same symbol intervals]

Example 1: Compression?
V ∈ [0.71753375, 0.71753500)

• How many bits do we need to encode a number V in the final interval [L, H)?

[Figure: binary subdivision of [0, 1) by codewords of 2, 3 and 4 bits (00 … 11, 000 … 111, 0000 … 1111)]

• m = 4 bits: 16 = 2^4 intervals of size ∆ = 1/16
• The number of bits m needed to represent a value in an interval of size ∆: $m = -\log_2 \Delta$ bits

Example 1: Compression (1)

V ∈ [L, H) = [0.71753375, 0.71753500)

• Interval size (range) r:

  $r = \prod_{i=1}^{n} p_i$

  r = 0.5*0.1*0.2*0.5*0.5*0.1*0.1*0.2*0.5*0.5 = 0.00000125

• The number of bits needed to represent a value in the interval [L, H) = [L, L+r) of size r:

  $m = \lceil -\log_2 r \rceil = \left\lceil -\sum_{i=1}^{n} \log_2 p_i \right\rceil = \lceil \text{Entropy} \rceil$

  m = ⌈−log2(r)⌉ = ⌈−log2(0.00000125)⌉ = ⌈19.6⌉ = 20 bits

  Here "Entropy" is the information content of the whole message, in bits.

Example 1: Compression (2)

• Entropy = 1.96 bits/char
• Arithmetic coder:
  a) Codeword V: L ≤ V < H
     V = (0.71753407…)10 = (0.10110111101100000101)2, 20 bits
     0.71753375 < 0.71753407… < 0.71753500
  b) Code length m: m = ⌈−log2(r)⌉ = ⌈−log2(0.00000125)⌉ = ⌈19.6⌉ = 20 bits
  c) Bitrate: R = 20 bits/10 chars = 2.0 bits/char
• Huffman coder: (1+3+2+1+1+4+4+2+1+1)/10 = 20/10 = 2.0 bits/char

Properties of arithmetic code

• In practice, for images, arithmetic coding gives a 15-30% improvement in compression ratio over a simple Huffman coder.
• The complexity of arithmetic coding is, however, 50-300% higher.

Exercise

BE_A_BBE ...
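The interval-narrowing loop from Algorithm (3) and the code-length rule m = ⌈−log2 r⌉ can be checked against Example 1 with a short script. This is a minimal sketch, not the lecture's implementation. It assumes exact rational arithmetic through Python's `fractions.Fraction` instead of the fixed-precision integer arithmetic a practical coder would use, and the names `INTERVALS`, `encode`, `shortest_codeword` and `decode` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from math import ceil, log2

# Symbol intervals [C(s_i), C(s_{i+1})) taken from the Example 1 statistics.
INTERVALS = {
    "S": (Fraction(5, 10), Fraction(10, 10)),
    "W": (Fraction(4, 10), Fraction(5, 10)),
    "I": (Fraction(2, 10), Fraction(4, 10)),
    "M": (Fraction(1, 10), Fraction(2, 10)),
    "_": (Fraction(0, 10), Fraction(1, 10)),
}


def encode(message):
    """Narrow [0, 1) once per symbol and return the final interval [L, H)."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low              # r = H - L, taken from the old interval
        c_lo, c_hi = INTERVALS[symbol]
        high = low + width * c_hi       # H <- L + r * C(s_{i+1})
        low = low + width * c_lo        # L <- L + r * C(s_i)
    return low, high


def shortest_codeword(low, high):
    """Return (m, bits): an m-bit binary fraction inside [L, H), m = ceil(-log2 r)."""
    # log2 works on a float here, so an interval width that is an exact
    # power of two could be affected by rounding; fine for this example.
    m = ceil(-log2(high - low))
    k = ceil(low * 2**m)                # smallest multiple of 2^-m that is >= L
    assert Fraction(k, 2**m) < high     # holds because r >= 2^-m
    return m, format(k, f"0{m}b")


def decode(value, length):
    """Recover `length` symbols from a number `value` inside the final interval."""
    symbols = []
    for _ in range(length):
        for symbol, (c_lo, c_hi) in INTERVALS.items():
            if c_lo <= value < c_hi:
                symbols.append(symbol)
                # Rescale so the chosen subinterval becomes [0, 1) again.
                value = (value - c_lo) / (c_hi - c_lo)
                break
    return "".join(symbols)


low, high = encode("SWISS_MISS")
m, bits = shortest_codeword(low, high)
print(float(low), float(high))          # 0.71753375 0.717535
print(m, bits)                          # 20 10110111101100000101
print(decode(Fraction(int(bits, 2), 2**m), 10))  # SWISS_MISS
```

The decoder is told the message length (10) explicitly. That is one common convention and an assumption of this sketch; the slides do not say how the end of the message is signalled.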

Posted: 26/10/2012, 11:57
