Up to this point we have examined only discrete-time sources with source alphabets which are sets of real numbers. Many common information sources with outputs such as voice waveforms and pictures can be modeled as discrete-time real-valued sources only if the source has been sampled in an appropriate manner. In this section we take the approach of modeling all such more general sources as discrete-time sources with abstract alphabets. For continuous-time sources such as voice, for example, we consider sources that emit a continuous-time waveform each unit of time. Thus each unit of time the discrete-time model for a voice source emits an element belonging to the more abstract alphabet of continuous-time functions. Picture sources or television can similarly be modeled as a discrete-time source with the source alphabet consisting of pictures. Hence, by allowing the source alphabets to lie in more general spaces, we can model more general classes of sources.
The corresponding source coding problem for general sources modeled in this manner can be formulated conceptually in the same way as for those with real source alphabets. Defining appropriate probability measures on the abstract source and representation alphabets and defining a distortion measure between elements in these alphabets, Berger [1971] has formulated the problem in this more abstract setting. The resulting rate distortion functions are defined in terms of mutual information between source and representation alphabets in the same manner as those given earlier for stationary ergodic sources with real alphabets. The main difference lies in the more general probability measures required for the abstract alphabets.
We do not attempt to prove coding theorems for discrete-time stationary ergodic sources with abstract alphabets. Indeed, we will not even define the corresponding rate distortion function. Besides requiring some measure-theoretic definitions, generally these rate distortion functions are difficult to evaluate and are known exactly only for some special cases. In this section, we present only a few of the known cases for which the rate distortion function can be evaluated by reducing the source outputs to a countable collection of independent random variables, and where the distortion measure can be defined in terms of these representative random variables.
Before proceeding with various examples we point out that, although we can derive rate distortion functions for sources with abstract alphabets, achieving the limiting distortions implied by these functions requires coding with codewords whose components are elements from the abstract representation alphabet. In practice this is usually too difficult to accomplish. The rate distortion function does, however, set theoretical limits on performance and often motivates the design of more practical source encoding (data compression) schemes. The Gaussian source with squared-error distortion which is presented here represents the worst case for the commonly used squared-error criterion. This and the subsequent examples are often used as standards of comparison for practical data compression schemes.
Continuous-time Gaussian process, squared-error distortion. Consider a source that emits the zero-mean random process of T seconds duration, $\{u(t): 0 \le t \le T\}$. As we stated above, our approach is to model this source as a stationary ergodic discrete-time source with source alphabet consisting of time waveforms of duration T. Assume the energy of the output samples to be finite and choose the source alphabet to be

$$\mathcal{U} = \left\{u(t): \int_0^T u^2(t)\,dt < \infty\right\} \tag{8.4.9}$$
and the representation alphabet to be

$$\mathcal{V} = \left\{v(t): \int_0^T v^2(t)\,dt < \infty\right\} \tag{8.4.10}$$
That is, our abstract alphabets are $\mathcal{U} = \mathcal{V} = L_2(T)$, the space of square-integrable functions over the interval $0 \le t \le T$, and the distortion measure

$$d_T: \mathcal{U} \times \mathcal{V} \to [0, \infty) \tag{8.4.11}$$

satisfies a bounded second moment condition. The rate distortion function is defined as a limit of average mutual information defined on abstract spaces $\mathcal{U}^N$ and $\mathcal{V}^N$. For stationary ergodic discrete-time sources with these alphabets, there are coding theorems which establish that the rate distortion function does in fact represent the minimum possible rate to achieve the given distortion.
Modeling sources which generate continuous-time random processes as discrete-time sources is somewhat artificial since we do not assume continuity of the random process between successive source outputs (see Berger [1971]). Rather, we usually have a single continuous random process of long duration which we wish to encode efficiently. Still, in our discrete-time model, by letting the signal duration T get large, we can usually reduce the source to a memoryless vector source with outputs of duration T. This is analogous to the arguments in the heuristic proof of the coding theorem for stationary ergodic sources given in Sec. 8.2. When we assume the discrete-time source is memoryless, then the rate distortion function depends only on the single output probability measure, namely on the space $\mathcal{U} \times \mathcal{V}$ and the distortion $d_T: \mathcal{U} \times \mathcal{V} \to [0, \infty)$. We denote this rate distortion function as $R_T(D)$.
504 SOURCE CODING FOR DIGITAL COMMUNICATION

Even with the memoryless assumption, the rate distortion function $R_T(D)$ is difficult to evaluate. The key to its evaluation is the reduction of the problem from one involving continuous-time random processes to one involving a countable number of random variables. A natural step is to represent the output and representation waveforms⁷ in terms of an orthonormal basis $\{f_k(t)\}$ for $L_2(T)$ such that

$$u(t) = \sum_{k=1}^{\infty} u^{(k)} f_k(t) \qquad 0 \le t \le T \tag{8.4.12}$$

and

$$v(t) = \sum_{k=1}^{\infty} v^{(k)} f_k(t) \qquad 0 \le t \le T \tag{8.4.13}$$
for any $u \in \mathcal{U}$ and $v \in \mathcal{V}$. If now the distortion measure $d_T: \mathcal{U} \times \mathcal{V} \to [0, \infty)$ can be expressed in terms of the coefficients $\{u^{(k)}\}$ and $\{v^{(k)}\}$, then $R_T(D)$ is the rate distortion function of a memoryless source with a real vector output. Earlier in Sec. 8.1, we examined such rate distortion functions for the sum distortion measure

$$d(\mathbf{u}, \mathbf{v}) = \sum_{k=1}^{\infty} d^{(k)}\left(u^{(k)}, v^{(k)}\right) \tag{8.4.14}$$
All known evaluations of $R_T(D)$ involve reduction to not only a memoryless vector source with a sum or maximum distortion measure, but to one having uncorrelated vector components. This can be easily accomplished by choosing the basis $\{f_k\}$ to be the Karhunen-Loeve expansion of the source output process. That is, choose the $f_k(t)$ to be the orthonormal eigenfunctions of the integral equation

$$\int_0^T \phi(t, s) f(s)\,ds = \lambda f(t) \qquad 0 \le t \le T \tag{8.4.15}$$

where $\phi(t, s) = E\{u(t)u(s)\}$ is assumed to be both positive definite and absolutely integrable over the rectangle⁸ $0 \le s, t \le T$. For each normalized eigenfunction $f_k(t)$, the corresponding constant $\lambda_k$ is an eigenvalue of $\phi(t, s)$. This choice of orthonormal basis yields the representation⁹

$$u(t) = \sum_{k=1}^{\infty} u^{(k)} f_k(t) \tag{8.4.16}$$

where

$$E\{u^{(k)} u^{(j)}\} = \lambda_k \delta_{kj} \qquad \text{for } k, j = 1, 2, \ldots \tag{8.4.17}$$
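The eigenvalue problem (8.4.15) rarely has a closed-form solution, so a numerical illustration may help. The sketch below is only an assumption-laden example, not the text's method: it discretizes the integral equation with the midpoint rule and uses power iteration to estimate the largest eigenvalue. The function name `top_kl_eigenpair` and the test kernel $\phi(t, s) = \min(t, s)$ (the Wiener process, whose largest eigenvalue on $[0, T]$ is $4T^2/\pi^2$) are chosen here purely for illustration.

```python
import math

def top_kl_eigenpair(kernel, T, n=100, iters=30):
    """Estimate the largest eigenvalue/eigenfunction of
    integral_0^T kernel(t, s) f(s) ds = lam * f(t)
    using a midpoint-rule discretization and power iteration."""
    h = T / n
    grid = [(i + 0.5) * h for i in range(n)]
    # Discretized integral operator: K[i][j] ~ kernel(t_i, s_j) * ds
    K = [[kernel(t, s) * h for s in grid] for t in grid]

    def apply(f):
        return [sum(row[j] * f[j] for j in range(n)) for row in K]

    f = [1.0] * n
    for _ in range(iters):
        g = apply(f)
        # Normalize so that integral_0^T f(t)^2 dt = 1 (orthonormal convention)
        norm = math.sqrt(sum(x * x for x in g) * h)
        f = [x / norm for x in g]
    # Rayleigh quotient <f, K f> with the same inner product
    lam = sum(a * b for a, b in zip(f, apply(f))) * h
    return lam, grid, f

if __name__ == "__main__":
    # Hypothetical test kernel: Wiener process autocorrelation min(t, s)
    lam, _, _ = top_kl_eigenpair(lambda t, s: min(t, s), T=1.0)
    print(lam, 4.0 / math.pi ** 2)
```

For the assumed kernel with T = 1 the estimate should land close to $4/\pi^2 \approx 0.405$. Further eigenpairs would need deflation or a full symmetric eigensolver, which this sketch does not attempt.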
⁷ For source output $\{u(t): 0 \le t \le T\}$, this representation holds in the mean square sense uniformly in $t \in [0, T]$.
⁸ This is a sufficient condition for the eigenfunctions $\{f_k\}$ to be complete in $L_2(T)$. However, completeness is not necessary, for we can, without loss of generality, restrict our spaces to the space spanned by the eigenfunctions.
⁹ Without loss of generality, we can assume $\lambda_1 \ge \lambda_2 \ge \cdots$. If $\{u^{(k)}\}$ are mutually independent, this representation holds with probability one for each $t \in [0, T]$.

The choice of distortion measure is not always clear in practice. Yet, even though we are concerned with encoding a random process, there is no reason why we cannot choose distortion measures that depend directly on the expansion coefficients of the random process with respect to some orthonormal basis. Indeed, practical data compression schemes essentially use this type of distortion measure. The squared-error distortion measure lends itself naturally to such a choice, for while $d_T: \mathcal{U} \times \mathcal{V} \to [0, \infty)$ is given by

$$d_T(u, v) = \frac{1}{T}\int_0^T [u(t) - v(t)]^2\,dt \tag{8.4.18}$$

it may also be expressed in terms of the Karhunen-Loeve expansion coefficients

$$d_T(u, v) = \frac{1}{T}\sum_{k=1}^{\infty}\left(u^{(k)} - v^{(k)}\right)^2 \tag{8.4.19}$$
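The equivalence of (8.4.18) and (8.4.19) is just Parseval's relation and holds for any orthonormal basis, not only the Karhunen-Loeve one. As a hedged numerical check, the sketch below uses an assumed sine basis $f_k(t) = \sqrt{2/T}\,\sin(k\pi t/T)$ (chosen here for convenience, not taken from the text) and compares the time-domain and coefficient-domain distortions for two waveforms with finitely many nonzero coefficients. All function names are hypothetical.

```python
import math

def basis(k, t, T):
    """Assumed orthonormal basis on [0, T]: sqrt(2/T) * sin(k*pi*t/T), k = 1, 2, ..."""
    return math.sqrt(2.0 / T) * math.sin(k * math.pi * t / T)

def synthesize(coeffs, t, T):
    """Waveform value at time t built from its expansion coefficients."""
    return sum(c * basis(k, t, T) for k, c in enumerate(coeffs, start=1))

def distortion_time(u_coeffs, v_coeffs, T, n=2000):
    """(1/T) * integral_0^T [u(t) - v(t)]^2 dt, midpoint rule, as in (8.4.18)."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        diff = synthesize(u_coeffs, t, T) - synthesize(v_coeffs, t, T)
        total += diff * diff * h
    return total / T

def distortion_coeff(u_coeffs, v_coeffs, T):
    """(1/T) * sum_k (u_k - v_k)^2, as in (8.4.19)."""
    return sum((a - b) ** 2 for a, b in zip(u_coeffs, v_coeffs)) / T

if __name__ == "__main__":
    u, v, T = [1.0, 2.0, 0.0], [0.0, 1.0, 1.0], 2.0
    print(distortion_time(u, v, T), distortion_coeff(u, v, T))
```

Both numbers agree up to rounding, which is why an encoder may work entirely on the coefficient sequences.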
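The rate distortion function $R_T(D)$ is thus the rate distortion function of a memoryless vector source with uncorrelated components and a sum distortion measure. It follows from Lemma 7.7.3 and Theorem 8.1.1 that $R_T(D)$ is bounded by the corresponding rate distortion function for the Gaussian source. Thus from (8.2.65) and (8.2.66), we have

$$R_T(D) \le \frac{1}{T}\sum_{k=1}^{\infty}\max\left(0, \tfrac{1}{2}\ln\frac{\lambda_k}{\theta}\right) \tag{8.4.20}$$

where $\theta$ satisfies

$$D = \frac{1}{T}\sum_{k=1}^{\infty}\min(\theta, \lambda_k) \tag{8.4.21}$$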
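The pair (8.4.20) and (8.4.21) is a parametric ("reverse water-filling") description: the level $\theta$ is fixed by the distortion, and the rate follows from it. A minimal sketch, assuming a finite list of eigenvalues (a truncation of the countable set) and solving (8.4.21) by bisection, is given below. The function names are hypothetical and the rate is in nats.

```python
import math

def water_level(eigenvalues, D, T=1.0, tol=1e-12):
    """Solve D = (1/T) * sum_k min(theta, lambda_k) for theta by bisection."""
    total = sum(eigenvalues) / T
    if not 0.0 < D <= total:
        raise ValueError("D must lie in (0, sum(lambda_k)/T]")
    lo, hi = 0.0, max(eigenvalues)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The left side of (8.4.21) is nondecreasing in theta
        if sum(min(mid, lam) for lam in eigenvalues) / T < D:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gaussian_rate(eigenvalues, D, T=1.0):
    """Right side of (8.4.20): (1/T) * sum_k max(0, 0.5*ln(lambda_k/theta))."""
    theta = water_level(eigenvalues, D, T)
    return sum(max(0.0, 0.5 * math.log(lam / theta)) for lam in eigenvalues) / T

if __name__ == "__main__":
    print(gaussian_rate([4.0, 1.0], D=1.0))  # theta = 0.5, both components coded
    print(gaussian_rate([4.0, 1.0], D=3.0))  # theta = 2.0, only the first coded
```

With eigenvalues 4 and 1 and T = 1, the first call corresponds to $\theta = 1/2$ and rate $\ln 4$, and the second to $\theta = 2$ and rate $\tfrac{1}{2}\ln 2$; components with $\lambda_k \le \theta$ receive no rate at all.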
Here (8.4.20) becomes an equality if and only if the continuous-time random process is Gaussian. Further, if we now let $T \to \infty$ and we assume the source output process is stationary with spectral density

$$\Phi(\omega) = \int_{-\infty}^{\infty}\phi(\tau)e^{-i\omega\tau}\,d\tau \tag{8.4.22}$$

where $\phi(\tau) = E\{u(t)u(t + \tau)\}$, then based on a continuous-time version of the Toeplitz distribution theorem (see Berger [1971], theorem 4.5.4)¹⁰ we have

$$\lim_{T\to\infty} R_T(D) \le \frac{1}{4\pi}\int_{-\infty}^{\infty}\max\left[0, \ln\frac{\Phi(\omega)}{\theta}\right]d\omega \tag{8.4.23}$$

where $\theta$ satisfies

$$D = \frac{1}{2\pi}\int_{-\infty}^{\infty}\min[\theta, \Phi(\omega)]\,d\omega \tag{8.4.24}$$

with equality if and only if the source output process is Gaussian.
¹⁰ This requires finite second moment, $\phi(0) < \infty$, and finite essential supremum of $\Phi(\omega)$.
Again we see that for the squared-error distortion measure the Gaussian source statistics yield the largest rate distortion function among all stationary processes with the same spectral density $\Phi(\omega)$. The Gaussian source rate distortion function

$$R^g(D) = \frac{1}{4\pi}\int_{-\infty}^{\infty}\max\left[0, \ln\frac{\Phi(\omega)}{\theta}\right]d\omega \tag{8.4.25}$$

where $\theta$ satisfies (8.4.24) often serves as a basis for comparing various practical data compression schemes.
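For a given spectral density the two integrals (8.4.24) and (8.4.25) can be evaluated numerically for each value of the parameter $\theta$, tracing out the curve $R^g(D)$ point by point. The following sketch assumes the density is negligible outside a finite band $[-\omega_{\max}, \omega_{\max}]$ and uses a plain midpoint rule. The function name and its arguments are illustrative assumptions, not part of the text.

```python
import math

def spectral_rate_distortion(Phi, theta, omega_max, n=4000):
    """Return (D, R) for water level theta.

    D = (1/2pi) * integral min[theta, Phi(w)] dw        (8.4.24)
    R = (1/4pi) * integral max[0, ln(Phi(w)/theta)] dw   (8.4.25)
    Both integrals are truncated to [-omega_max, omega_max] (an assumption).
    """
    h = 2.0 * omega_max / n
    D = 0.0
    R = 0.0
    for i in range(n):
        w = -omega_max + (i + 0.5) * h
        p = Phi(w)
        D += min(theta, p) * h
        if p > theta:  # only frequencies above the water level contribute rate
            R += math.log(p / theta) * h
    return D / (2.0 * math.pi), R / (4.0 * math.pi)

if __name__ == "__main__":
    # Hypothetical first-order spectrum 2a/(a^2 + w^2) with a = 1
    lorentz = lambda w: 2.0 / (1.0 + w * w)
    for theta in (1.0, 0.1, 0.01):
        print(theta, spectral_rate_distortion(lorentz, theta, omega_max=200.0))
```

Sweeping $\theta$ downward moves along the curve toward smaller distortion and larger rate. The rate returned is in nats per second.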
Example (Band-limited Gaussian source) An ideal band-limited Gaussian source with constant spectral density

$$\Phi(\omega) = \begin{cases} \dfrac{\sigma^2}{2W} & |\omega| \le 2\pi W \\ 0 & |\omega| > 2\pi W \end{cases} \tag{8.4.26}$$

yields the rate distortion function

$$R^g(D) = W\ln\frac{\sigma^2}{D} \qquad 0 \le D \le \sigma^2 \tag{8.4.27}$$

This is Shannon's [1948] classical formula. It is easy to see that this is also the rate distortion function for any stationary Gaussian source of average power $\sigma^2$ whose spectral density is flat over any set of radian frequencies of total measure $4\pi W$.
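As a worked check of this example, the formula follows directly from (8.4.24) and (8.4.25). The derivation below assumes the in-band level of the spectral density is $\sigma^2/(2W)$, so that the total power is $\sigma^2$, and that $\theta$ does not exceed that level.

```latex
\begin{align*}
D &= \frac{1}{2\pi}\int_{-2\pi W}^{2\pi W}\theta\,d\omega = 2W\theta
  \quad\Longrightarrow\quad \theta = \frac{D}{2W},\\[4pt]
R^g(D) &= \frac{1}{4\pi}\int_{-2\pi W}^{2\pi W}
  \ln\frac{\sigma^2/(2W)}{D/(2W)}\,d\omega
  = W\ln\frac{\sigma^2}{D}, \qquad 0 < D \le \sigma^2 .
\end{align*}
```

The condition $\theta \le \sigma^2/(2W)$ is exactly $D \le \sigma^2$. For larger $D$ the rate is zero.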
Gaussian images, squared-error distortion. Information sources that produce pictures (two-dimensional images) may be modeled as discrete-time sources with outputs that are two-dimensional random fields represented by

$$\left\{u(x, y): |x| \le \frac{L}{2},\ |y| \le \frac{L}{2}\right\} \tag{8.4.28}$$
Images are usually described by the nonnegative image intensity function $\{i(x, y): |x| \le L/2,\ |y| \le L/2\}$. We assume that the source output is $u(x, y) = \ln i(x, y)$, which is modeled here as a zero-mean Gaussian random field. In addition, if $u(x, y)$ and $v(x, y)$ are any two two-dimensional functions, we define the distortion measure to be

$$d_L(u, v) = \frac{1}{L^2}\int_{-L/2}^{L/2}\int_{-L/2}^{L/2}[u(x, y) - v(x, y)]^2\,dx\,dy \tag{8.4.29}$$
The fact that we encode $u(x, y) = \ln i(x, y)$, the log of the intensity function, with a mean square criterion may appear somewhat artificial. There is, however, evidence (see Campbell and Robson [1968] and Van Ness and Bouman [1965]) that an observer's ability to determine the difference between two field intensities corresponds to the difference between corresponding transformed fields of the logarithm of the intensities.
Thus, for sources that produce two-dimensional images, we model our source as a discrete-time source that outputs a zero-mean Gaussian random field. The abstract source and representation alphabets are assumed to be

$$\mathcal{U} = \mathcal{V} = \left\{u(x, y): \int_{-L/2}^{L/2}\int_{-L/2}^{L/2}u^2(x, y)\,dx\,dy < \infty\right\} \tag{8.4.30}$$
and we choose the squared-error distortion measure given by (8.4.29). If we assume the discrete-time source is stationary and ergodic, then a rate distortion function can be defined which represents the smallest rate achievable for a given average distortion. First assume that the discrete-time source is memoryless. This means that successive output images of the source are independent and the rate distortion function $R_L(D)$ depends only on the probability measures on $\mathcal{U} \times \mathcal{V}$ and the single output distortion measure given in (8.4.29).
For the memoryless case, evaluation of $R_L(D)$ is the natural generalization of the continuous-time problem given above. We begin by defining the autocorrelation function of the zero-mean Gaussian random field as

$$\phi(x, y; x', y') = E\{u(x, y)u(x', y')\} \tag{8.4.31}$$
To be able to evaluate $R_L(D)$, we again require a representation of source outputs in terms of a countable number of independent random variables, and again we attempt to express our distortion measure in terms of these random variables. With the squared-error distortion measure, any orthonormal expansion of the source output random field will suffice. To have independent components, however, we need the Karhunen-Loeve expansion. We express outputs as

$$u(x, y) = \sum_{k=1}^{\infty}u^{(k)}f_k(x, y) \qquad |x| \le \frac{L}{2},\ |y| \le \frac{L}{2} \tag{8.4.32}$$

where

$$u^{(k)} = \int_{-L/2}^{L/2}\int_{-L/2}^{L/2}u(x, y)f_k(x, y)\,dx\,dy \tag{8.4.33}$$

and $\{f_k(x, y)\}$ are orthonormal functions (eigenfunctions) that are solutions to the integral equation

$$\lambda f(x, y) = \int_{-L/2}^{L/2}\int_{-L/2}^{L/2}\phi(x, y; x', y')f(x', y')\,dx'\,dy' \tag{8.4.34}$$

For each eigenfunction $f_k(x, y)$, the corresponding eigenvalue $\lambda_k$ is nonnegative and satisfies the condition¹¹

$$E\{u^{(k)}u^{(j)}\} = \lambda_k\delta_{kj} \qquad \text{for } k, j = 1, 2, \ldots \tag{8.4.35}$$
¹¹ Again we assume $\lambda_1 \ge \lambda_2 \ge \cdots$. This representation holds with probability one for every $x, y \in [-L/2, L/2]$.
As for the one-dimensional case, we assume that the autocorrelation $\phi(x, y; x', y')$ satisfies the conditions necessary to insure that the eigenfunctions $\{f_k\}$ span the alphabet space $\mathcal{U} = \mathcal{V}$. Thus for any two functions in $\mathcal{U} = \mathcal{V}$, we have

$$u(x, y) = \sum_{k=1}^{\infty}u^{(k)}f_k(x, y) \tag{8.4.36}$$

$$v(x, y) = \sum_{k=1}^{\infty}v^{(k)}f_k(x, y) \tag{8.4.37}$$

and the distortion measure becomes

$$d_L(u, v) = \frac{1}{L^2}\sum_{k=1}^{\infty}\left(u^{(k)} - v^{(k)}\right)^2 \tag{8.4.38}$$
For this sum distortion measure, $R_L(D)$ is now expressed in terms of a memoryless vector source with output $\mathbf{u} = \{u^{(1)}, u^{(2)}, \ldots\}$ whose components are independent Gaussian random variables, with the variance of $u^{(k)}$ given by $\lambda_k$, for each k. The rate distortion function of the random field normalized to unit area is thus (see Sec. 8.2)

$$R_L(D) = \frac{1}{L^2}\sum_{k=1}^{\infty}\max\left(0, \tfrac{1}{2}\ln\frac{\lambda_k}{\theta}\right) \tag{8.4.39}$$

where $\theta$ satisfies

$$D = \frac{1}{L^2}\sum_{k=1}^{\infty}\min(\theta, \lambda_k) \tag{8.4.40}$$

Here $R_L(D)$ represents the minimum rate in nats per unit area required to encode the source with average distortion D or less.
Since eigenvalues are difficult to evaluate, $R_L(D)$ given in this form is not very useful. We now take the limit as L goes to infinity. Defining

$$R^g(D) = \lim_{L\to\infty}R_L(D) \tag{8.4.41}$$

we observe that $R^g(D)$ represents the minimum rate over all choices of L and thus the minimum achievable rate per unit area. In addition, since for most images L is large compared to correlation distances, letting L approach infinity is a good approximation. To evaluate this limit we must now restrict our attention to homogeneous random fields where we have

$$\phi(x, y; x', y') = \phi(x - x', y - y') \tag{8.4.42}$$

This is the two-dimensional stationarity condition and allows us to define a two-dimensional spectral density function,

$$\Phi(w_x, w_y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\phi(r_x, r_y)e^{-i(w_x r_x + w_y r_y)}\,dr_x\,dr_y \tag{8.4.43}$$
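Sakrison [1969] has derived a two-dimensional version of the Toeplitz distribution theorem which allows us to evaluate the asymptotic distribution of the eigenvalues of (8.4.34). This theorem shows that for any continuous function $G(\lambda)$

$$\lim_{L\to\infty}\frac{1}{L^2}\sum_{k=1}^{\infty}G(\lambda_k) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}G[\Phi(w_x, w_y)]\,dw_x\,dw_y \tag{8.4.44}$$

Applying this theorem to (8.4.39) and (8.4.40) yields

$$R^g(D) = \lim_{L\to\infty}R_L(D) = \frac{1}{8\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\max\left[0, \ln\frac{\Phi(w_x, w_y)}{\theta}\right]dw_x\,dw_y \tag{8.4.45}$$

where $\theta$ satisfies

$$D = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\min[\theta, \Phi(w_x, w_y)]\,dw_x\,dw_y \tag{8.4.46}$$

As with our one-dimensional case, $R^g(D)$ is an upper bound to all other rate distortion functions of non-Gaussian memoryless sources with the same spectral density $\Phi(w_x, w_y)$, and thus serves as a basis for comparison for various image compression schemes.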
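Example (Isotropic field) An isotropic field has a correlation function which depends only on the total distance between two points in the two-dimensional space. That is,

$$\phi(r_x, r_y) = \phi(r) \qquad r = \sqrt{r_x^2 + r_y^2} \tag{8.4.47}$$

By defining $r$, $\theta_r$, $w$, and $\theta_w$ as polar coordinates where

$$r_x = r\cos\theta_r \qquad r_y = r\sin\theta_r \tag{8.4.48}$$

and

$$w_x = w\cos\theta_w \qquad w_y = w\sin\theta_w \tag{8.4.49}$$

we obtain

$$\Phi(w_x, w_y) = \int_0^{\infty}\int_0^{2\pi}\phi(r)e^{-irw\cos(\theta_r - \theta_w)}\,r\,d\theta_r\,dr = 2\pi\int_0^{\infty}\phi(r)J_0(wr)\,r\,dr \tag{8.4.50}$$

where $J_0(\cdot)$ is the zeroth order Bessel function of the first kind. Since there is no $\theta_w$ dependence

$$\Phi(w) = 2\pi\int_0^{\infty}r\phi(r)J_0(wr)\,dr \tag{8.4.51}$$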
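where

$$w = \sqrt{w_x^2 + w_y^2} \tag{8.4.52}$$

$\Phi(w)$ and $\phi(r)$ are related by the Hankel transform of zero order.

For television images, a reasonably satisfactory power spectral density is

$$\Phi(w) = \frac{2\pi w_0}{(w_0^2 + w^2)^{3/2}} \tag{8.4.53}$$

resulting in

$$\phi(r) = e^{-r/d_c} \tag{8.4.54}$$

where $d_c = 1/w_0$ is the coherence distance of the field (Sakrison and Algazi [1971]).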
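The zero-order Hankel relation (8.4.51) can be checked numerically for the exponential correlation $\phi(r) = e^{-r/d_c}$. The sketch below is an illustration under stated assumptions: it computes $J_0$ from its integral representation, truncates the radial integral at a finite `r_max`, and compares the result with the closed form $2\pi w_0/(w_0^2 + w^2)^{3/2}$, $w_0 = 1/d_c$, which is the standard Hankel transform of a decaying exponential. All function names are hypothetical.

```python
import math

def bessel_j0(x, m=128):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule with m points."""
    h = math.pi / m
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(m)) / m

def hankel_spectrum(phi, w, r_max, n=2000):
    """Phi(w) = 2*pi * integral_0^r_max r * phi(r) * J0(w*r) dr (truncated, midpoint rule)."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r * phi(r) * bessel_j0(w * r) * h
    return 2.0 * math.pi * total

def closed_form(w, w0):
    """Assumed closed form for phi(r) = exp(-w0 * r)."""
    return 2.0 * math.pi * w0 / (w0 * w0 + w * w) ** 1.5

if __name__ == "__main__":
    d_c = 1.0  # coherence distance, arbitrary units (illustrative value)
    phi = lambda r: math.exp(-r / d_c)
    for w in (0.0, 0.5, 1.0, 2.0):
        print(w, hankel_spectrum(phi, w, r_max=30.0), closed_form(w, 1.0 / d_c))
```

The two columns should agree to several decimal places. The number of points `m` in the Bessel integral must comfortably exceed the largest argument `w * r_max` for the quadrature to stay accurate.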
For many sources successive images are often highly correlated so that the above memoryless assumption is unrealistic. We now find an upper bound to the rate distortion function of a discrete-time stationary ergodic source that emits the two-dimensional homogeneous Gaussian random field described above. Let the nth output be denoted

$$\left\{u_n(x, y): |x| \le \frac{L}{2},\ |y| \le \frac{L}{2}\right\} \tag{8.4.55}$$

Again use the usual Karhunen-Loeve expansion

$$u_n(x, y) = \sum_{k=1}^{\infty}u_n^{(k)}f_k(x, y) \tag{8.4.56}$$
where $\{f_k(\cdot, \cdot)\}$ and $\{\lambda_k\}$ are eigenfunctions and eigenvalues which satisfy the integral equation of (8.4.34). By the assumed stationarity of the discrete-time source with memory, the autocorrelation of the random field $\phi(x, y; x', y')$ is independent of the output time index n, and hence eigenfunctions and eigenvalues are the same for each output of the discrete-time stationary ergodic source. We now have a source that outputs a vector $\mathbf{u}_n = (u_n^{(1)}, u_n^{(2)}, \ldots)$ at the nth time. The rate distortion function of the discrete-time stationary ergodic source is given by

$$R_L(D) = \lim_{N\to\infty}R_{L,N}(D) \tag{8.4.57}$$

where $R_{L,N}(D)$ is the Nth-order rate distortion function [i.e., which uses only the first N terms in the expansion (8.4.56)]. We can upper-bound $R_L(D)$ by the rate required with any particular encoding scheme that achieves average distortion D.
Consider the following scheme:
1. Encode each Karhunen-Loeve expansion coefficient independently of other coefficients.¹² That is, regard the kth coefficient sequence $\{u_1^{(k)}, u_2^{(k)}, \ldots\}$ as the output of a zero-mean Gaussian subsource and encode it with respect to a squared-error distortion measure with average distortion $D^{(k)}$.
2. Choose the distortions $D^{(1)}, D^{(2)}, \ldots$ so as to achieve an overall average distortion D.

¹² This amounts to partitioning the source into its spatial spectral components and treating successive (in time) samples of a given component as a subsource which is to be encoded independent of all other component subsources.
The required rate for the above scheme, which we now proceed to evaluate, will certainly upper-bound $R_L(D)$. Let us define correlation functions for each subsource,

$$\phi^{(k)}(\tau) = E\{u_n^{(k)}u_{n+\tau}^{(k)}\} \tag{8.4.58}$$

and corresponding spectral density functions

$$\psi^{(k)}(w) = \sum_{\tau=-\infty}^{\infty}\phi^{(k)}(\tau)e^{-iw\tau} \tag{8.4.59}$$
Consider encoding the sequence $\{u_1^{(k)}, u_2^{(k)}, \ldots\}$ with respect to the squared-error distortion measure. From (8.2.69) and (8.2.70), we see that for distortion $D^{(k)}$ the required rate is

$$R^{(k)}(D^{(k)}) = \frac{1}{4\pi}\int_{-\pi}^{\pi}\max\left[0, \ln\frac{\psi^{(k)}(w)}{\theta}\right]dw \tag{8.4.60}$$

where $\theta$ satisfies

$$D^{(k)} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\min[\theta, \psi^{(k)}(w)]\,dw \tag{8.4.61}$$

Here $R^{(k)}(D^{(k)})$ is in nats per output of the subsource.
Recall that the total single-output distortion measure is

$$d_L(u, v) = \frac{1}{L^2}\sum_{k=1}^{\infty}\left(u^{(k)} - v^{(k)}\right)^2 \tag{8.4.62}$$
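Hence, choosing $\{D^{(k)}\}$ such that

$$D = \frac{1}{L^2}\sum_{k=1}^{\infty}D^{(k)} \tag{8.4.63}$$

will achieve average distortion D. The total rate per unit area is given by

$$\frac{1}{L^2}\sum_{k=1}^{\infty}R^{(k)}(D^{(k)}) \tag{8.4.64}$$

Thus we have

$$R_L(D) \le \frac{1}{L^2}\sum_{k=1}^{\infty}\frac{1}{4\pi}\int_{-\pi}^{\pi}\max\left[0, \ln\frac{\psi^{(k)}(w)}{\theta}\right]dw \tag{8.4.65}$$

where now we choose $\theta$ to satisfy

$$D = \frac{1}{L^2}\sum_{k=1}^{\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi}\min[\theta, \psi^{(k)}(w)]\,dw \tag{8.4.66}$$

We consider next a special case for which this upper bound is tight.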
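Example (Separation of correlation) Suppose the time and spatial correlation of source outputs separate as follows:

$$E\{u_n(x, y)u_{n+\tau}(x', y')\} = \varphi(\tau)\,\phi(x - x', y - y') \tag{8.4.67}$$

where $\varphi(0) = 1$.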
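Recall that any two Karhunen-Loeve expansion coefficients $u_n^{(k)}$ and $u_{n+\tau}^{(j)}$ are given by

$$u_n^{(k)} = \int_{-L/2}^{L/2}\int_{-L/2}^{L/2}u_n(x, y)f_k(x, y)\,dx\,dy$$

$$u_{n+\tau}^{(j)} = \int_{-L/2}^{L/2}\int_{-L/2}^{L/2}u_{n+\tau}(x, y)f_j(x, y)\,dx\,dy \tag{8.4.68}$$

Thus we have correlation

$$\begin{aligned}
E\{u_n^{(k)}u_{n+\tau}^{(j)}\} &= \int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}E\{u_n(x, y)u_{n+\tau}(x', y')\}f_k(x, y)f_j(x', y')\,dx\,dy\,dx'\,dy' \\
&= \int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}\!\!\int_{-L/2}^{L/2}\varphi(\tau)\phi(x - x', y - y')f_k(x, y)f_j(x', y')\,dx\,dy\,dx'\,dy' \\
&= \varphi(\tau)\lambda_k\int_{-L/2}^{L/2}\int_{-L/2}^{L/2}f_k(x, y)f_j(x, y)\,dx\,dy \\
&= \lambda_k\varphi(\tau)\delta_{kj}
\end{aligned} \tag{8.4.69}$$

Hence

$$\phi^{(k)}(\tau) = \lambda_k\varphi(\tau) \tag{8.4.70}$$

and for any $k \ne j$

$$E\{u_n^{(k)}u_{n+\tau}^{(j)}\} = 0 \qquad \text{all } \tau \tag{8.4.71}$$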
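Since we have Gaussian statistics, the uncorrelated random variables are independent random variables and the different Karhunen-Loeve expansion coefficient sequences can be regarded as independent subsources. Lemma 8.1.1 shows that the upper bound given in (8.4.65) is in fact exact, and we have for this case

$$R_L(D) = \frac{1}{L^2}\sum_{k=1}^{\infty}\frac{1}{4\pi}\int_{-\pi}^{\pi}\max\left[0, \ln\frac{\lambda_k\psi(w)}{\theta}\right]dw \tag{8.4.72}$$

where $\psi(w) = \sum_{\tau=-\infty}^{\infty}\varphi(\tau)e^{-iw\tau}$, so that $\psi^{(k)}(w) = \lambda_k\psi(w)$ by (8.4.59) and (8.4.70), and where $\theta$ is chosen to satisfy

$$D = \frac{1}{L^2}\sum_{k=1}^{\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi}\min[\theta, \lambda_k\psi(w)]\,dw \tag{8.4.73}$$

Using (8.4.44) in taking the limit as $L \to \infty$, we have the limiting rate distortion function given by

$$R(D) = \lim_{L\to\infty}R_L(D) = \frac{1}{16\pi^3}\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\max\left[0, \ln\frac{\Phi(w_x, w_y)\psi(w)}{\theta}\right]dw_x\,dw_y\,dw \tag{8.4.74}$$

where $\theta$ satisfies

$$D = \frac{1}{8\pi^3}\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\min[\theta, \Phi(w_x, w_y)\psi(w)]\,dw_x\,dw_y\,dw \tag{8.4.75}$$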
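To make the separable case concrete, the finite-L expressions (8.4.72) and (8.4.73) can be evaluated for a truncated list of spatial eigenvalues and a given temporal spectral density. The sketch below is only an illustration: the first-order temporal correlation $\rho^{|\tau|}$ (which satisfies the normalization of unit correlation at lag zero), the eigenvalue list, and all function names are assumptions introduced here, not taken from the text.

```python
import math

def ar1_spectrum(w, rho):
    """Temporal spectral density for the assumed correlation rho**|tau|:
    sum_tau rho^|tau| e^{-i w tau} = (1 - rho^2) / (1 - 2 rho cos w + rho^2)."""
    return (1.0 - rho * rho) / (1.0 - 2.0 * rho * math.cos(w) + rho * rho)

def separable_rate_distortion(eigenvalues, psi, theta, L=1.0, n=2000):
    """Return (D, R) from (8.4.73) and (8.4.72) for a common water level theta.

    eigenvalues: finite list of spatial eigenvalues lambda_k (a truncation)
    psi:         temporal spectral density on [-pi, pi]
    """
    h = 2.0 * math.pi / n
    D = 0.0
    R = 0.0
    for lam in eigenvalues:
        for i in range(n):
            w = -math.pi + (i + 0.5) * h
            s = lam * psi(w)  # spectral density of the k-th subsource
            D += min(theta, s) * h / (2.0 * math.pi)
            if s > theta:
                R += math.log(s / theta) * h / (4.0 * math.pi)
    return D / L ** 2, R / L ** 2

if __name__ == "__main__":
    psi = lambda w: ar1_spectrum(w, rho=0.5)
    print(separable_rate_distortion([4.0, 1.0], psi, theta=0.1))
```

In the small-distortion region, where $\theta$ lies below every $\lambda_k\psi(w)$, each subsource contributes distortion $\theta$ and rate $\tfrac{1}{2}\ln[\lambda_k(1 - \rho^2)/\theta]$ for this assumed spectrum, so temporal correlation lowers the rate of every spatial component by the same amount.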