FUNDAMENTAL CONCEPTS FOR MEMORYLESS SOURCES
7.4 DISCRETE MEMORYLESS SOURCES: TRELLIS CODES
For block codes we have demonstrated a duality between source and channel coding in which the source encoder performs in a manner similar to the channel decoder and the source decoder performs like a channel encoder (see Fig. 7.6). This duality also holds for trellis codes. We now proceed to show that trellis codes can be used for source coding, where the source encoder performs operations that are essentially those of the maximum likelihood trellis decoding algorithm of channel coding, while the trellis source decoder is essentially a trellis channel encoder. In particular, we show that it is possible to use trellis codes, which are general forms of convolutional codes, to achieve the rate distortion limit (D, R(D)) of a discrete memoryless source. Furthermore, the same algorithm which attains the channel coding bound with convolutional channel codes (Viterbi [1967a]) also attains the source coding bound with trellis (generalized convolutional) source codes. In this context, however, the term "maximum likelihood" does not apply.
We again consider a discrete memoryless source with alphabet $\mathcal{U} = \{a_1, a_2, \ldots, a_A\}$ and nonzero letter probabilities $Q(a_1), Q(a_2), \ldots, Q(a_A)$. The user alphabet is denoted by $\mathcal{V} = \{b_1, b_2, \ldots, b_B\}$, and there is a nonnegative bounded distortion measure $d(u, v)$ which satisfies

$$0 \le d(u, v) \le d_0 \tag{7.4.1}$$

for all $u \in \mathcal{U}$, $v \in \mathcal{V}$ and some $d_0 < \infty$.
Trellis codes are generalized convolutional codes generated by the same shift register encoder as convolutional codes, but with arbitrary delayless nonlinear operations replacing the linear combinatorial logic of the latter. Whether fixed or time-varying, they can most conveniently be described and analyzed by means of the familiar trellis diagram of Chap. 4. Figure 7.8 shows a trellis source decoder and Fig. 7.9 shows the corresponding trellis diagram for the binary-trellis code with $K - 1$ delay elements and a delayless transformation. Following the same convention as for channel convolutional codes, we will refer to $K$ as the constraint length of the trellis code. We assume for the present a binary-trellis code with $n$ destination symbols per branch, resulting in a code rate $r = 1/n$ bits per source symbol. This means that, for each binary input, the trellis source decoder emits $n$ symbols from $\mathcal{V}$, and a sequence of binary input symbols defines a path in the trellis diagram. We can easily generalize to nonbinary trellis source decoders later.

Figure 7.8 Trellis source decoder.
412 SOURCE CODING FOR DIGITAL COMMUNICATION
Figure 7.9 Trellis diagram ($2^{K-1}$ states).
Here, each branch of the diagram is labelled with the corresponding $n$-dimensional destination vector in $\mathcal{V}_n$, and the states (contents of the source decoder's first $K - 1$ delay registers) are denoted by the vertical position in the diagram, also shown at the left of the trellis diagram. The trellis is assumed to be initiated and terminated in the zero state, and no encoding or decoding is performed during the final merging in what we will call the "tail" of the trellis. There are $2^{K-1}$ states, and we assume that the trellis source coding operates continuously for many source symbols so that the effects of the tail can be ignored. We let the total code length be $L$ branches, while the tail requires $K - 1$ further branches.
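The decoder just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the lookup-table representation of the delayless transformation, and the randomly drawn branch labels are all hypothetical choices, and the code is fixed (not time-varying).

```python
import itertools
import random

def make_random_trellis_code(K, n, alphabet, rng):
    """Assign an n-symbol label to each (state, input bit) pair.

    The state is the tuple of the K-1 most recent input bits (newest first),
    so there are 2**(K-1) states with two branches leaving each one.
    The table plays the role of the arbitrary delayless transformation.
    """
    table = {}
    for state in itertools.product((0, 1), repeat=K - 1):
        for bit in (0, 1):
            table[(state, bit)] = tuple(rng.choice(alphabet) for _ in range(n))
    return table

def trellis_decode(bits, table, K):
    """Start in the all-zeros state and emit n user symbols per input bit."""
    state = (0,) * (K - 1)
    out = []
    for bit in bits:
        out.extend(table[(state, bit)])
        state = ((bit,) + state)[:K - 1]  # shift the new bit into the register
    return out
```

With constraint length K = 3 the table has 2^(K-1) = 4 states and 8 branches, and each input bit produces n output symbols, which matches the rate r = 1/n bits per source symbol.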
The source encoder searches for that path in the trellis whose destination (user) sequence $\mathbf{v}$ most closely resembles (in the sense of minimum distortion) the source sequence $\mathbf{u}$. Once the source encoder picks a path, it sends binary symbols $\mathbf{x}$ through the channel (again taken to be noiseless), which drive the trellis source decoder through the desired sequence of states, yielding the desired path $\mathbf{v}$ as the trellis source decoder output. Figure 7.10 shows the block diagram for the trellis source coding system.
We assume that the trellis source coding system operates continuously for a long time between initial fan-out and final merging. This means we assume that $L \gg K$ and that the effects of the tail can be ignored. In particular, we will ignore the last $K - 1$ branches where all paths merge to the zero state. Hence, the total code length is taken to be $L$ branches and we have a total source sequence length of $N_L = nL$. There are many possible paths or trellis codewords of length $N_L$, one of which must be chosen to represent the source sequence. For a given source sequence $\mathbf{u}$ and any trellis codeword $\mathbf{v}$, we have the distortion measure

$$d_{N_L}(\mathbf{u}, \mathbf{v}) = \frac{1}{N_L} \sum_{t=1}^{N_L} d(u_t, v_t) \tag{7.4.2}$$
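The source encoder chooses the path corresponding to the trellis source decoder output $\mathbf{v}$ that minimizes $d_{N_L}(\mathbf{u}, \mathbf{v})$. Defining, for each index $i \in \{0, 1, 2, \ldots, L - 1\}$, the subsequences of length $n$

$$\mathbf{u}_i = (u_{in+1}, u_{in+2}, \ldots, u_{in+n}), \qquad \mathbf{v}_i = (v_{in+1}, v_{in+2}, \ldots, v_{in+n}) \tag{7.4.3}$$

and branch distortion measures

$$d_n(\mathbf{u}_i, \mathbf{v}_i) = \frac{1}{n} \sum_{l=1}^{n} d(u_{in+l}, v_{in+l}) \tag{7.4.4}$$

we can rewrite $d_{N_L}(\mathbf{u}, \mathbf{v})$ as

$$d_{N_L}(\mathbf{u}, \mathbf{v}) = \frac{1}{L} \sum_{i=0}^{L-1} d_n(\mathbf{u}_i, \mathbf{v}_i) \tag{7.4.5}$$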
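The equivalence between the per-letter average (7.4.2) and the per-branch average (7.4.5) can be checked numerically. The sketch below assumes Hamming distortion purely for illustration (the text allows any bounded nonnegative measure), and the function names are hypothetical.

```python
def hamming(a, b):
    """Per-letter distortion d(u, v); Hamming distortion is an assumption here."""
    return 0 if a == b else 1

def sequence_distortion(u, v, d=hamming):
    """d_{N_L}(u, v): average per-letter distortion over the whole sequence."""
    return sum(d(a, b) for a, b in zip(u, v)) / len(u)

def branch_distortion(u_i, v_i, d=hamming):
    """d_n(u_i, v_i): average per-letter distortion over one branch of n letters."""
    return sum(d(a, b) for a, b in zip(u_i, v_i)) / len(u_i)

def distortion_by_branches(u, v, n, d=hamming):
    """Average of the L branch distortions; len(u) must be a multiple of n."""
    L = len(u) // n
    total = sum(
        branch_distortion(u[i * n:(i + 1) * n], v[i * n:(i + 1) * n], d)
        for i in range(L)
    )
    return total / L
```

Because each branch contributes its own term to the sum, a path search can accumulate distortion branch by branch, which is what makes a dynamic-programming search possible.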
In this form it is clear that the source encoder selects a path in the trellis which consists of a sequence of $L$ connected branches, where each branch adds an amount of distortion that is independent of the distortion values of other branches in the path. For a given source sequence $\mathbf{u}$, the source encoder's search for the path in the trellis that minimizes $d_{N_L}(\mathbf{u}, \mathbf{v})$ is equivalent to the channel decoding search problem where the Viterbi algorithm was used to find the path, or convolutional codeword, that minimizes the negative of the log-likelihood function. Hence the source encoder for trellis codes can be realized with the Viterbi algorithm.

Figure 7.10 Trellis source coding system.
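A minimal sketch of such a Viterbi source encoder follows. It is an illustration under assumptions, not the book's implementation: the trellis code is a fixed random lookup table, the distortion is Hamming, the tail is ignored (the path may end in any state), and all names are hypothetical.

```python
import itertools
import random

def hamming(a, b):
    """Assumed per-letter distortion measure."""
    return 0 if a == b else 1

def random_trellis_code(K, n, alphabet, rng):
    """Random n-symbol label for every (state, input bit) pair; state = last K-1 bits."""
    states = itertools.product((0, 1), repeat=K - 1)
    return {(s, b): tuple(rng.choice(alphabet) for _ in range(n))
            for s in states for b in (0, 1)}

def viterbi_source_encode(u, table, K, n, d=hamming):
    """Return (bits, total_distortion) of the minimum-distortion trellis path.

    total_distortion is the unnormalized sum of letter distortions, i.e.
    N_L * d_{N_L}(u, v); minimizing one minimizes the other.
    """
    L = len(u) // n
    zero = (0,) * (K - 1)
    survivors = {zero: (0.0, [])}  # state -> (cost so far, input bits so far)
    for i in range(L):
        u_i = u[i * n:(i + 1) * n]
        new = {}
        for state, (cost, bits) in survivors.items():
            for bit in (0, 1):
                label = table[(state, bit)]
                c = cost + sum(d(a, b) for a, b in zip(u_i, label))
                nxt = ((bit,) + state)[:K - 1]
                # keep only the best path entering each state
                if nxt not in new or c < new[nxt][0]:
                    new[nxt] = (c, bits + [bit])
        survivors = new
    cost, bits = min(survivors.values(), key=lambda t: t[0])
    return bits, cost
```

The returned bits are what would be sent over the noiseless channel; driving the decoder with them reproduces the chosen path. Work per branch grows with the number of states, 2^(K-1), rather than with the number of paths, 2^L.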
For the given source and distortion measure, we have shown in Sec. 7.2 that the rate distortion function R(D) is given by (7.2.53). Regardless of what type of source coding system we consider, the converse source coding theorem (Theorem 7.2.3) has shown that it is impossible to achieve average distortion of D or less with a system using rate less than R(D). This converse theorem applies to trellis source coding as well as to block source coding (Sec. 7.2). We have also shown that, in the limit of large block lengths, block source coding systems can achieve average distortion D with rate R(D) nats per source symbol, thus justifying R(D) as the rate distortion function. In this section, we will show that, in the limit of large constraint length $K$, trellis codes can also achieve the rate distortion limit.
We again appeal to an ensemble coding argument where we consider an ensemble of binary trellis source codes of constraint length $K$ and bit rate $r = 1/n$. The ensemble and the corresponding distribution are so chosen that each branch of the trellis diagram has associated with it a user or representation sequence consisting of symbols with common probability distribution $\{P(v): v \in \mathcal{V}\}$ with independence among all symbols. Now for any given source sequence $\mathbf{u}$ and any given trellis code, we denote the minimum distortion path sequence as $\mathbf{v}(\mathbf{u})$. Thus by definition, we have the bound $d_{N_L}(\mathbf{u}, \mathbf{v}(\mathbf{u})) \le d_{N_L}(\mathbf{u}, \mathbf{v})$ for any other path sequence $\mathbf{v}$ belonging to the trellis code. We now choose $\mathbf{v} = \mathbf{v}^*$ as follows:
1. For a given trellis code and the given source sequence $\mathbf{u}$, replace the representation sequence of the all-zeros state path by the sequence $\mathbf{v}$ randomly selected according to the conditional probability

$$P_{N_L}(\mathbf{v} \mid \mathbf{u}) = \prod_{t=1}^{N_L} P(v_t \mid u_t) \tag{7.4.6}$$

This results in a new trellis code which differs from the original trellis code only in the branch values of the all-zeros state path. We call this modified trellis code a forbidden code, since in general we are not allowed to select parts of a trellis code after observing the source output sequence $\mathbf{u}$. Note that the original code and the corresponding forbidden code differ only in the forbidden code path corresponding to the all-zeros state path.
2. Given a source sequence $\mathbf{u}$, for the above forbidden code, let $\mathbf{v}^{**}$ be the minimum distortion path sequence. That is, let $\mathbf{v}^{**}$ correspond to the forbidden trellis code output sequence which represents $\mathbf{u}$ with minimum distortion.
3. $\mathbf{v}^{**}$ defines a path through the forbidden trellis diagram. Now choose $\mathbf{v}^*$ as the corresponding path sequence in the original trellis diagram. Hence $\mathbf{v}^{**}$ and $\mathbf{v}^*$ are the same except for the subsequences on branches of the all-zeros state path.
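The random selection in step 1 draws each letter of the replacement sequence independently, conditioned on the source letter in the same position. A minimal sketch, assuming the conditional distribution is supplied as a nested dictionary `P[u][v]` (a hypothetical representation chosen for illustration):

```python
import random

def sample_forbidden_path(u, P, rng):
    """Draw v_t independently from P(. | u_t) for every source letter u_t.

    P is a dict of dicts: P[u_t][v] is the conditional probability of user
    letter v given source letter u_t (each inner dict should sum to 1).
    """
    v = []
    for u_t in u:
        letters, weights = zip(*P[u_t].items())
        v.append(rng.choices(letters, weights=weights, k=1)[0])
    return v
```

The resulting sequence depends on the observed source output, which is exactly why a code modified in this way could not be used by a real encoder and decoder pair; it serves only as a device for the bounds that follow.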
Note that $\mathbf{v}^*$ is a trellis code sequence in the originally selected trellis code, and we introduced the forbidden trellis code only as a means of selecting this trellis code sequence. We never use the forbidden trellis code in the actual encoding of source sequences and require it only to derive the following bounds. Since $\mathbf{v}^*$ is a path sequence in the trellis code, we have from the definition of $\mathbf{v}(\mathbf{u})$

$$d_{N_L}(\mathbf{u}, \mathbf{v}(\mathbf{u})) \le d_{N_L}(\mathbf{u}, \mathbf{v}^*) \tag{7.4.7}$$

We now derive a bound on $\overline{d_{N_L}(\mathbf{u}, \mathbf{v}(\mathbf{u}))}$, where $\overline{(\;\cdot\;)}$ denotes an average over all source sequences and trellis codes in the ensemble. We do this by bounding $\overline{d_{N_L}(\mathbf{u}, \mathbf{v}^*)}$.
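Lemma 7.4.1

$$\overline{d_{N_L}(\mathbf{u}, \mathbf{v}^*)} \le D(\mathbf{P}) + \frac{d_0}{L} \sum_{j=0}^{L-1} \sum_{k=1}^{L-j-1} k P_{jk} \tag{7.4.8}$$

where

$$P_{jk} = \Pr\left\{ \begin{array}{l} \mathbf{v}^* \text{ merges with the all-zeros state} \\ \text{path at node } j \text{ and remains merged} \\ \text{for exactly } k \text{ branches} \end{array} \right\} \tag{7.4.9}$$

and

$$D(\mathbf{P}) = \sum_{u} \sum_{v} Q(u) P(v \mid u)\, d(u, v) \tag{7.4.10}$$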
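PROOF For a given source sequence $\mathbf{u}$ and for $\mathbf{v}^*$ as selected above, let $\mathcal{J} = \{i: \mathbf{v}_i^* \text{ is a branch output vector of the all-zeros state path}\}$. Then

$$L\, d_{N_L}(\mathbf{u}, \mathbf{v}^*) = \sum_{i \in \mathcal{J}} d_n(\mathbf{u}_i, \mathbf{v}_i^*) + \sum_{i \notin \mathcal{J}} d_n(\mathbf{u}_i, \mathbf{v}_i^*) \tag{7.4.11}$$

For $i \in \mathcal{J}$ we use the bound $d_n(\mathbf{u}_i, \mathbf{v}_i^*) \le d_0$, while for $i \notin \mathcal{J}$ we have $d_n(\mathbf{u}_i, \mathbf{v}_i^*) = d_n(\mathbf{u}_i, \mathbf{v}_i^{**})$. Hence

$$L\, d_{N_L}(\mathbf{u}, \mathbf{v}^*) \le \sum_{i \notin \mathcal{J}} d_n(\mathbf{u}_i, \mathbf{v}_i^{**}) + \sum_{i \in \mathcal{J}} d_0 \le \sum_{i=0}^{L-1} d_n(\mathbf{u}_i, \mathbf{v}_{0i}) + \sum_{i \in \mathcal{J}} d_0 \tag{7.4.12}$$

where $\mathbf{v}_{0i}$ is the $i$th branch output vector of the all-zeros path of the forbidden code. This last inequality follows from the fact that, by the definition of $\mathbf{v}^{**}$, we have

$$d_{N_L}(\mathbf{u}, \mathbf{v}^{**}) \le d_{N_L}(\mathbf{u}, \mathbf{v}_0) \tag{7.4.13}$$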
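in the forbidden trellis code, where $\mathbf{v}_0$ is the all-zeros state output sequence. From (7.4.7) and (7.4.12), we obtain the inequality

$$d_{N_L}(\mathbf{u}, \mathbf{v}(\mathbf{u})) \le \frac{1}{L} \sum_{i=0}^{L-1} d_n(\mathbf{u}_i, \mathbf{v}_{0i}) + \frac{1}{L} \sum_{i \in \mathcal{J}} d_0 \tag{7.4.14}$$

When we average (7.4.14) over all source sequences and over the trellis code ensemble, the first term becomes $D(\mathbf{P})$. Using the definition of $P_{jk}$ given in (7.4.9), we employ the union-of-events bound on the second term to get the desired result.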
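One way to spell out the last averaging step is sketched here. The symbols $M$, $E_{jk}$, and the indicator $\mathbb{1}\{\cdot\}$ are introduced only for this illustration and are not the text's notation. Let $M$ be the number of branches on which $\mathbf{v}^*$ coincides with the all-zeros state path, and let $E_{jk}$ be the event in (7.4.9). Every such branch lies in a merged span that begins at some node $j$ and lasts exactly $k$ branches, so

$$M \le \sum_{j=0}^{L-1} \sum_{k=1}^{L-j-1} k\, \mathbb{1}\{E_{jk}\}$$

and taking the average over source sequences and codes gives

$$\frac{d_0}{L}\, \overline{M} \le \frac{d_0}{L} \sum_{j=0}^{L-1} \sum_{k=1}^{L-j-1} k\, P_{jk}$$

which is the second term on the right side of (7.4.8).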
There remains only the evaluation of a tight bound for $P_{jk}$. This is computed over the ensemble of forbidden trellis codes which consist of the normal codes with the branch vectors of the all-zeros state path selected according to (7.4.6) for each source sequence $\mathbf{u}$. Note that when $\mathbf{v}^*$ merges with the all-zeros state path at node $j$ and remains merged for exactly $k$ branches, in the corresponding forbidden code $\mathbf{v}^{**}$ also is merged with the all-zeros state for the same span. Hence $P_{jk}$ is also the probability that, in the forbidden trellis codes, $\mathbf{v}^{**}$ (the minimum distortion path) merges with the all-zeros state path at node $j$ and remains merged for exactly $k$ branches.
Let $\mathbf{x}^{**}$ be the binary input sequence to the forbidden trellis decoder that yields the minimum distortion codeword $\mathbf{v}^{**}$. If $\mathbf{v}^{**}$ merges with the all-zeros state for exactly $k$ branches starting with the $j$th node, the binary sequence $\mathbf{x}^{**}$ has the form

$$\cdots\; a_1\, a_2 \cdots a_{K-1}\; 1\; 0\; 0 \cdots 0\; 1\; b_1\, b_2 \cdots b_{K-1}\; \cdots \tag{7.4.15}$$

Here node $j - K$ falls immediately after $a_{K-1}$, node $j$ at the start of the merged span, node $j + k$ immediately before the second "1", and node $j + k + K$ immediately after $b_{K-1}$.

At node $j - K$, we take the forbidden trellis decoder to be in state $\mathbf{a} = (a_1, a_2, \ldots, a_{K-1})$, and at node $j + k + K$ to be in state $\mathbf{b} = (b_1, b_2, \ldots, b_{K-1})$. The "1" immediately following node $j - K$ is required, for otherwise merging could not start exactly at node $j$. Similarly a "1" must follow node $j + k$, for otherwise the merged span would be longer than exactly $k$ as assumed. The merged span is shown in Fig. 7.11. Note that states $\mathbf{a}$ and $\mathbf{b}$ are unrestricted, and either or both may possibly be the all-zeros state.
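The structure of this input pattern can be checked by simulating the shift register. In the sketch below the function names and the example values of K, k, j, a, and b are hypothetical, and the state is stored oldest bit first so that feeding b_1, ..., b_{K-1} leaves the register in state b.

```python
def step(state, bit):
    """Shift one input bit into a register holding K-1 bits (oldest first)."""
    return (state + (bit,))[1:]

def merged_span_input(K, k, b):
    """The k + 2K input bits between node j-K and node j+k+K:
    a '1', then K-1+k zeros, then a '1', then the bits of state b."""
    return [1] + [0] * (K - 1 + k) + [1] + list(b)

def zero_state_nodes(a, bits, first_node):
    """Return (nodes at which the path is in the all-zeros state, final state).

    The path is in state a at first_node; node numbers increase by one per bit.
    """
    state = tuple(a)
    zero = (0,) * len(state)
    nodes = []
    for node, bit in enumerate(bits, start=first_node + 1):
        state = step(state, bit)
        if state == zero:
            nodes.append(node)
    return nodes, state

# Example with hypothetical values K = 4, k = 3, j = 10.
K, k, j = 4, 3, 10
a, b = (1, 0, 1), (0, 1, 1)
bits = merged_span_input(K, k, b)
nodes, final_state = zero_state_nodes(a, bits, first_node=j - K)
```

In this example the path sits in the all-zeros state at nodes j through j + k and nowhere else in the span, and it arrives in state b at node j + k + K.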
Now for the moment let us assume that states $\mathbf{a}$ and $\mathbf{b}$ are fixed. That is, the trellis path corresponding to the minimum-distortion forbidden trellis decoder output, $\mathbf{v}^{**}$, is assumed to have passed into state $\mathbf{a}$ at node $j - K$ and state $\mathbf{b}$ at node $j + k + K$. Then we seek the probability that the subpath with decoder input sequence

$$\mathbf{a}\; 1\; 0 \cdots 0\; 1\; \mathbf{b} \tag{7.4.16}$$

is the minimum distortion path (subsequence of $\mathbf{x}^{**}$) from state $\mathbf{a}$ to state $\mathbf{b}$ in the forbidden trellis code. Any other path from $\mathbf{a}$ to $\mathbf{b}$ has an input of the general form

$$\mathbf{a}\; x_{j-K}\; \mathbf{x}\; x_{j+k}\; \mathbf{b} \tag{7.4.17}$$
Figure 7.11 Merger with the all-zeros state path.
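where $\mathbf{x} = (x_{j-K+1}, \ldots, x_{j+k-1})$. Since the probability that path $\mathbf{a}\,1\,\mathbf{0}\,1\,\mathbf{b}$ is the minimum distortion path among all paths of the general form $\mathbf{a}\,x_{j-K}\,\mathbf{x}\,x_{j+k}\,\mathbf{b}$ is upper-bounded by the probability that path $\mathbf{a}\,1\,\mathbf{0}\,1\,\mathbf{b}$ is the minimum distortion path among all paths of the restricted form $\mathbf{a}\,1\,\mathbf{x}\,1\,\mathbf{b}$, we now consider only paths of this restricted form. Let $\mathbf{v}(\mathbf{x})$ be the forbidden trellis decoder output for the $(k + 2K)$ branches going from state $\mathbf{a}$ to state $\mathbf{b}$ corresponding to the input $\mathbf{a}\,1\,\mathbf{x}\,1\,\mathbf{b}$. Then for random source subsequences of length $n(k + 2K)$, denoted $\mathbf{u}$, and for the ensemble of forbidden trellis codes, we seek to bound $P_{jk}$ by first bounding the probability

$$P_{jk}(\mathbf{a}, \mathbf{b}) = \Pr\left\{ d(\mathbf{u}, \mathbf{v}(\mathbf{0})) \le \min_{\mathbf{x} \ne \mathbf{0}} d(\mathbf{u}, \mathbf{v}(\mathbf{x})) \;\Big|\; \mathbf{a}, \mathbf{b} \right\}$$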
By restricting our attention to subpaths from state $\mathbf{a}$ to state $\mathbf{b}$ of a forbidden code, we have formulated the problem as a block source coding problem. Our bound on $P_{jk}$ will be developed in a way analogous to the block coding bound of Sec. 7.2.