Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman
Copyright © 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
9 Transmission of Digital Imagery
JEFFREY W. PERCIVAL
University of Wisconsin, Madison, Wisconsin
VITTORIO CASTELLI
IBM T.J. Watson Research Center, Yorktown Heights, New York
9.1 INTRODUCTION
The transmission of digital imagery can impose significant demands on the
resources of clients, servers, and the networks that connect them. With the
explosive growth of the Internet, the need for mechanisms that support rapid
browsing of on-line imagery has become critical.
This is particularly evident in scientific imagery. The increase in availability of
publicly accessible scientific data archives on the Internet has changed the typical
scientist’s expectations about access to data. In the past, scientific procedures
produced proprietary (PI-owned) data sets that, if exchanged at all, were usually
exchanged through magnetic tape. Now, the approach is to generate large on-
line data archives using standardized data formats and allow direct access by
researchers. For example, NASA’s Planetary Data System is a network of archive
sites serving data from a number of solar system missions. The Earth-Observing
System will use eight Distributed Active Archive Centers to provide access to
the very large volume of data to be acquired.
Although transmission bandwidth has been increasing with faster modems
for personal networking, upgrades to the Internet, and the introduction of new
networks such as Internet 2 and the Next Generation Internet, there are several
reasons why bandwidth for image transmission will remain a limited resource
for the foreseeable future.
Perhaps the most striking of these factors is the rapid increase in the resolution of
digital imagery and hence the size of images to be transmitted. A case in point is the
rapid increase in both pixel count and intensity resolution provided by the solid-
state detectors in modern astronomical systems. For example, upgrading from a
640 × 480 8-bit detector to a 2,048 × 2,048 16-bit detector represents a data growth
of about a factor of 30. Now, even 2,048 × 2,048 detectors seem small; new astronomical
detectors are using 2,048 × 4,096 devices in mosaics to build units as large as
8,192 × 8,192 pixels. This is a factor of 400 larger than the previously mentioned
8-bit image. The Sloan Digital Sky Survey will use thirty 2,048 × 2,048 detectors
simultaneously
and will produce a 40 terabyte data set during its lifetime.
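These growth factors are easy to verify with back-of-the-envelope arithmetic
(a quick check in Python, using the detector formats quoted above):

    # Raw image sizes, in bits, for the detector formats quoted above.
    small  = 640 * 480 * 8        # 8-bit 640 x 480 detector
    ccd    = 2048 * 2048 * 16     # 16-bit 2,048 x 2,048 detector
    mosaic = 8192 * 8192 * 16     # 16-bit 8,192 x 8,192 mosaic

    print(ccd / small)            # ~27, i.e., about a factor of 30
    print(mosaic / small)         # ~437, i.e., about a factor of 400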
This phenomenon is not limited to astronomy. For example, medical imaging
detectors are reaching the spatial and intensity resolution of photographic film
and are replacing it. Similarly, Earth-observing satellites with high-resolution
sensors managed by private companies produce images of 12,000 × 12,000 pixels,
which are sold commercially for civilian use.
In addition to image volume (both image size and number of images), other
factors are competing for transmission bandwidth, including the continued growth
in demand for access and the competition for bandwidth from other applications
such as telephony and videoconferencing. Therefore, it does not seem unreasonable
to expect that the transmission of digital imagery will continue to be a challenge
requiring careful thought and a deft allocation of resources.
The transmission of digital imagery is usually handled through the exchange of
raw, uncompressed files, losslessly compressed files, or files compressed with some
degree of lossiness chosen in advance at the server. The files are first transmitted and
then some visualization program is invoked on the received file. Another type of
transmission, growing in popularity with archives of large digital images, is called
progressive transmission. When an image is progressively transmitted from server
to client, it can be displayed by the client as the data arrive, instead of having to wait
until the transmission is complete. This allows browsing in an archive even over
connections for which the transmission time of a single image may be prohibitive.
This chapter discusses each of these transmission schemes and its effect on the
allocation of resources among server, network, and client.
9.2 BULK TRANSMISSION OF RAW DATA
This is the simplest case to consider. Error-free transmission of raw digital images
is easily done using the file transfer protocol (FTP) on any Internet-style (TCP/IP)
connection. Image compression can be used to reduce the transmission time by
decreasing the total number of bytes to be transmitted.
When used to decrease the volume of transmitted data, the compression usually
needs to be lossless, as further data analysis is often performed on the received
data sets. A lossless compression is exactly reversible, in that the exact value
of each pixel can be recovered by reversing the compression. Many compres-
sion algorithms are lossy, that is, the original pixel values cannot be exactly
recovered from the compressed data. The Joint Photographic Experts Group (JPEG) [1]
and the Graphics Interchange Format (GIF) (see Chapter 8) are examples of such
algorithms. Lossy compression is universally used for photographic images trans-
mitted across the World Wide Web. Interestingly, it is becoming increasingly
important in transmitting scientific data, especially in applications in which the
images are manually interpreted, rather than processed, and where the bandwidth
between server and client is limited.
Compression exploits redundancy in an image, which can be large for certain
kinds of graphics such as line drawings, vector graphics, and computer-generated
images. Lossless compression of raw digital imagery is far less efficient because
digital images from solid-state detectors contain electronic noise, temperature-
dependent dark counts, fixed-pattern noise, and other artifacts. These effects
reduce the redundancy, for example, by disrupting long runs of pixels that would
otherwise have the same value in the absence of noise. A rule of thumb is that
lossless compression can reduce the size of a digital image by a factor of 2 or 3.
The cost of transmitting compressed images has three components: the cost of
compression, the cost of decompression, and the cost of transmission. The latter
decreases with the effectiveness of the compression algorithm: the bigger the
achieved compression ratio, the smaller the transmission cost. However, the first
two costs increase with the achieved compression ratio: when comparing two
image-compression algorithms, one usually finds that the one that compresses
better is more complex¹. Additionally, the computational costs of compressing
and decompressing are quite often similar (although asymmetric schemes exist
where compression is much more expensive than decompression). In an image
database, compression is usually performed once (when the image is ingested into
the database) and therefore its cost is divided over all the transmissions of the
image. Hence, the actual trade-off is between bandwidth and decompression cost,
which depends on the client characteristics. Therefore, the compression algorithm
should be selected with client capabilities in mind.
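To make the trade-off concrete, consider a minimal client-latency model (a
sketch in Python; the codec figures are invented placeholders, not measurements
of any real algorithm):

    def client_latency(compressed_bytes, link_bps, decompress_seconds):
        """Client-perceived delay: transfer time plus decompression time."""
        return compressed_bytes * 8 / link_bps + decompress_seconds

    # Hypothetical codecs over a 1 Mbit/s link: the stronger compressor
    # sends fewer bytes but is slower to decode on a weak client.
    print(client_latency(4_000_000, 1_000_000, 2.0))    # simpler codec: 34 s
    print(client_latency(3_000_000, 1_000_000, 15.0))   # stronger codec: 39 s

Whether the better compression wins thus depends on the client's decoding
speed as well as on the available bandwidth.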
A number of high-performance, lossless image-compression algorithms exist.
Most use wavelet transforms of one kind or another, although simply using a
wavelet basis is no guarantee that an algorithm is exactly reversible. A well-tested,
fast, exactly reversible, wavelet-based compression program is HCOMPRESS [2].
Source code is available at www.stsci.edu.
Finally, it is easily forgotten in this highly networked world that sometimes
more primitive methods of bulk transmission of large data sets still hold some
sway. For example, the effective bandwidth of a shoebox full of Exabyte tapes
sent by overnight express easily exceeds 100 megabits per second for 24 hours.
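The arithmetic behind this claim is easy to check (a sketch assuming, for
illustration, a shoebox of fifty 20-GB tapes; actual Exabyte tape capacities
varied by drive generation):

    tapes, gb_per_tape = 50, 20
    bits = tapes * gb_per_tape * 1e9 * 8     # total payload in bits
    seconds = 24 * 3600                      # overnight-delivery window
    print(bits / seconds / 1e6, "Mbit/s")    # ~92.6 Mbit/s sustained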
9.3 PROGRESSIVE TRANSMISSION
Progressive transmission is a scheme in which the image is transmitted from
server to client in such a way that the client can display the image as it arrives,
instead of waiting until all the data have been received.
Progressive transmission fills an important niche between the extremes of
transmitting either raw images in their entirety or irreversibly reduced graphic
products.
¹ This assertion needs to be carefully qualified: to compress well in practice, an algorithm must be
tailored to the characteristics of actual images. For example, a simple algorithm that treats each pixel
as an independent random variable does not perform as well on actual images as a complex method
that accounts for dependencies between neighboring pixels.
Progressive transmission allows users to browse an archive of large digital
images, perhaps searching for particular features. It also has numerous
applications in scientific fields. Meteorologists often scan large image archives
looking for certain types or percentages of cloud cover. The ability to receive,
examine, and reject an image while receiving only the first 1 percent of its data
is very attractive. Astronomers are experimenting with progressive techniques
for remote observing. They want to check an image for target position, filter
selection, and telescope focus before beginning a long exposure, but the end-to-
end network bandwidth between remote mountain tops and an urban department
of astronomy can be very low. Doing a quality check on a 2,048 × 2,048 pixel,
16-bit image over a dial-up transmission control protocol (TCP) connection seems
daunting, but it is easily done with progressive image-transmission techniques.
Some progressive transmission situations are forced on a system. The Galileo
probe to Jupiter was originally equipped with a 134,400 bit-per-second transmission
system, which would have allowed an image to be transmitted in about a minute. The
high-gain antenna failed to deploy, however, leaving a Jupiter-to-Earth bandwidth
of only about 10 bits per second. Ten days per image was too much! Ground system
engineers devised a makeshift image browsing system using spatially subsampled
images (called jail bars, typically one or two image rows spaced every 20 rows) to
select images for future transmission. Sending only every twentieth row improves
the transmission time, but the obvious risk of missing smaller-scale structure in
the images is severe. Figure 9.1 shows the discovery image of Dactyl, the small
moon orbiting the asteroid Ida. Had Dactyl been a little smaller, this form of image
browsing might have prevented its discovery.
Ideally, progressive transmission should have the following properties.
• It should present a rough approximation of the original image very quickly. It
should improve the approximation rapidly at first and eventually reconstruct
the original image.
Figure 9.1. Discovery image of Dactyl, a small moon orbiting the asteroid Ida. Had the
moon been a little smaller, it could have been missed in the transmitted data.
• It should capture features at all spatial and intensity scales early in the
transmission, that is, broad, faint features should be captured in the early
stages of transmission as easily as bright, localized features.
• It should support interactive transmission, in which the client can use the
first approximations to select “regions of interest,” which are then scheduled
for transmission at a priority higher than that of the original image. By
“bootstrapping” into a particular region of an image based on an early view,
the client is effectively boosting bandwidth by discarding unneeded bits.
• No bits should be sent twice. As resolution improves from coarse to fine,
even with multiple overlapping regions of interest having been requested,
the server must not squander bandwidth by sending the client information
that it already has.
• It should allow interruption or cancellation by the client, as is likely to
occur while browsing images.
• It should be well behaved numerically, approximately preserving the image
statistics (e.g., the pixel-intensity histogram) throughout the transmission.
This allows numerical analysis of a partially transmitted image.
Progressive image transmission is not really about compression. Rather, it is
better viewed as a scheduling problem, in which one wants to know which bits
to send first and which bits can wait until later. Progressive transmission uses the
same algorithms as compression, simply because if an algorithm can tell a
compressor which bits to throw away, it can also be used to identify which
bits are important.
9.3.1 Theoretical Considerations
In this section, we show that progressive transmission need not require much
more bandwidth than nonprogressive schemes.
To precisely formulate the problem, we compare a simple nonprogressive
scheme to a simple progressive scheme using Figure 9.2. Let the nonprogressive
scheme be ideal, in the sense that in order to send an image with distortion (e.g.,
mean-square error, MSE, or Hamming distance) no larger than D_N, it needs to
send R_N bits per pixel, and that the point (R_N, D_N) lies on the rate-distortion
curve [3] defined in Chapter 8, Section 8.3. No other scheme can send fewer bits
and produce an image of the same quality.
The progressive scheme has two stages. In the first, it produces an image
having distortion no larger than D_1 by sending R_1 bits per pixel, and in the
second it improves the quality of the image to D_2 = D_N by sending a further
R_2 bits per pixel. Therefore, both schemes produce an image having the same
quality.
Our wish list for the progressive scheme contains two items. First, we would
like to produce the best possible image during the initial transmission, that is,
we wish the point (R_1, D_1) to lie on the rate-distortion curve, namely
R_1 = R(D_1) (constraint 1). Second, we wish to transmit overall the same
number of bits R_N as the nonprogressive scheme, that is, we wish
R_1 + R_2 = R_N = R(D_N) = R(D_2) (constraint 2).

Figure 9.2. Ideal behavior of a successive refinement system: the points describing the
first and second stages lie on the rate-distortion curve R(D).
In this section it is shown that it is not always possible to satisfy both
constraints, and that recent results show that constraints 1 and 2 can be relaxed
to R_1 < R(D_1) + 1/2 and R_1 + R_2 ≤ R(D_2) + 1/2, respectively.
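Written side by side (restating the constraints above in the same notation):

    Ideal:    R_1 = R(D_1)           and   R_1 + R_2 = R(D_2)
    Relaxed:  R_1 < R(D_1) + 1/2     and   R_1 + R_2 ≤ R(D_2) + 1/2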
The first results are due to Equitz and Cover [4], who showed that the answer
depends on the source (namely, on the statistical characteristics of the image),
and that although, in general, the two-stage scheme requires a higher rate than the
one-stage approach, there exist necessary and sufficient conditions under which
R_1 = R(D_1) and R_1 + R_2 = R(D_2) (the sources for which the two equalities
hold for every D_1 and D_2 are called successively refinable²). An interesting
question is whether there are indeed sources that do not satisfy Equitz and
Cover's conditions.
² More specifically, a sufficient condition for a source to be successively refinable is that the original
data I, the better approximation I_2, and the coarse approximation I_1 form a Markov chain in this
order, that is, if I_1 is conditionally independent of I given I_2. In simpler terms, the conditional
independence condition means that if we are given the finer approximation of the original image,
our uncertainty about the coarser approximation is the same regardless of whether we are given the
original image.
Unfortunately, sources that are not successively refinable do exist:
an example of such a source over a discrete alphabet is described in Ref. [3],
whereas Ref. [5] contains an example of a continuous source that is not succes-
sively refinable. This result is somewhat problematic: it seems to indicate that
the rate-distortion curve can be used to measure the performance of a progressive
transmission scheme only for certain types of sources.
The question of which rates are achievable was addressed by Rimoldi [6], who
refined the result of Equitz and Cover: by relaxing the conditions R_1 = R(D_1)
and R_1 + R_2 = R(D_2), the author provided conditions under which pairs of
rates R_1 and R_2 can be achieved, given fixed distortions D_1 and D_2.
Interestingly, in Ref. [6], D_1 and D_2 need not be obtained with the same
distortion measure. This is practically relevant: progressive transmission
methods in which the early stages produce high-quality approximations of small
portions of the image and poor-quality renditions of the rest are discussed later.
(For example, in telemedicine, a radiographic image could be transmitted with a
method that quickly provides a high-quality image of the area of interest and a
blurry version of the rest of the image [7], which can be improved in later
stages. Here, the first distortion measure could concentrate on the region of
interest, whereas subsequent measures could take into account the image as a
whole.)
Although Rimoldi's regions can be used to evaluate the performance of a
progressive transmission scheme, a more recent result [8] provides a simpler
answer. Lastras and Berger showed that, for any source producing independent
and identically distributed samples³ under squared-error distortion, and for any
fixed set of m distortion values D_1 > D_2 > ··· > D_m, there exists an m-step
progressive transmission code that operates within 1/2 bit of the rate-distortion
curve at all of its steps (Fig. 9.3). This is a powerful result, which
essentially states that the rate-distortion curve can indeed be used to evaluate a
progressive transmission scheme: an algorithm that at some step achieves a rate
that is not within 1/2 bit of the rate-distortion curve is by no means optimal and
can be improved upon.
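Stated compactly (a restatement of the cited result, writing R_1 + ··· + R_k for
the total rate after the first k of the m steps):

    R_1 + ··· + R_k ≤ R(D_k) + 1/2   for every k = 1, ..., m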
The theory of successive refinements plays the same role for progressive trans-
mission as rate-distortion theory plays for lossy compression.
In particular, it confirms that it is possible, in principle, to construct progressive
transmission schemes that achieve transmission rates comparable to those of
nonprogressive methods.
This theory also provides fundamental limits to what is achievable and guide-
lines to evaluate how well a specific algorithm performs under given assumptions.
Such guidelines are sometimes very general and hence difficult to specialize to
actual algorithms. However, the field is relatively young and very active; some of
the more recent results can be applied to specific categories of progressive trans-
mission schemes, such as the bounds provided in Ref. [9] for multiresolution
coding.
³ Lastras's thesis also provides directions on how to extend the result to stationary ergodic sources.
Figure 9.3. Attainable behavior of a progressive transmission scheme: each stage is
described by a point that lies within 1/2 bit of the rate-distortion curve.
Note, finally, that there is a limitation to the applicability of the theory of
successive refinements to image transmission: a good distortion measure that is
well matched to the human visual system is not known. The implications are
discussed in Section 9.3.3.
9.3.2 Taxonomies of Progressive Transmission Algorithms
Over the years, a large number of progressive transmission schemes have
appeared in the literature. Numerous progressive transmission methods have been
developed starting from a compression scheme, selecting parts of the compressed
data for transmission at each stage, and devising algorithms to reconstruct images
from the information available at the receiver after each transmission stage. The
following taxonomy, proposed by Tsou [10] and widely used in the field, is
well suited to characterize this class of algorithms because it focuses on the
characteristics of the compression scheme. Tsou’s taxonomy divides progressive
transmission approaches into three classes:
Spatial-Domain Techniques. Algorithms belonging to this class compress
images without transforming them. A simple example consists of dividing the
image into bit planes, compressing the bit planes separately, and scheduling the
transmission to send the compressed bit planes in order of significance. Dividing
an 8-bit image I into bit planes consists of creating eight images with pixel
values equal to 0 or 1: the first image contains the most significant bit of the
pixels of I, the last image contains the least significant bit, and the intermediate
images contain the intermediate bits.
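A minimal sketch of bit-plane decomposition and progressive reconstruction (in
Python with NumPy; the function names are illustrative, not drawn from any
particular system):

    import numpy as np

    def to_bit_planes(img):
        """Split an 8-bit image into eight binary planes, most significant first."""
        return [(img >> b) & 1 for b in range(7, -1, -1)]

    def from_bit_planes(planes):
        """Rebuild an image from the planes received so far; bits that
        have not yet arrived are treated as zero."""
        img = np.zeros_like(planes[0], dtype=np.uint8)
        for i, p in enumerate(planes):
            img |= p.astype(np.uint8) << (7 - i)
        return img

    img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
    planes = to_bit_planes(img)              # transmitted in this order
    preview = from_bit_planes(planes[:3])    # client's view after 3 planes

After three planes, the client sees the image quantized to 8 intensity levels;
each additional plane doubles the precision.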
A more interesting example is progressive vector quantization (VQ): the image
I is first vector-quantized (Chapter 8, Section 8.6.2) at low rate, namely, with
a small codebook, which produces an approximation I_1. The difference I − I_1
between the original image and the first coarse approximation is further quantized,
possibly with a different codebook, to produce I_2, and the process is repeated.
The encoded images I_1, I_2, ... are transmitted in order and progressively recon-
structed at the receiver.
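The structure of such a multistage scheme can be sketched in a few lines
(Python; a uniform scalar quantizer stands in for the trained VQ codebooks,
since only the residual bookkeeping matters here):

    import numpy as np

    def quantize(x, step):
        # Uniform quantizer standing in for a codebook lookup.
        return np.round(x / step) * step

    img = np.random.randint(0, 256, (4, 4)).astype(float)

    i1 = quantize(img, 64.0)              # stage 1: coarse "codebook"
    i2 = i1 + quantize(img - i1, 16.0)    # stage 2: quantized residual

    # Each stage shrinks the worst-case error; i1 is sent first, then
    # the encoded residual that upgrades it to i2.
    print(np.abs(img - i1).max(), np.abs(img - i2).max())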
Transform-Domain Techniques. Algorithms belonging to this category trans-
form the image to a specific space (such as the frequency domain), and compress
the transform. Examples of this category include progressive JPEG and are
discussed in Section 9.4.1.
Pyramid-Structured Techniques. This category contains methods that rely on
a multiresolution pyramid, which is a sequence of approximations of the orig-
inal image I at progressively coarser resolution and larger scale (i.e., having
smaller size). The coarsest approximation is losslessly compressed and trans-
mitted; subsequent steps consist of transmitting only the information necessary
to reconstruct the next finer approximation from the received data. Schemes
derived from compression algorithms that rely on subband coding or on the
wavelet transform (Chapter 8, Sections 8.5.3 and 8.8.2) belong to this category,
and in this sense, the current category overlaps the transform-domain techniques.
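A pyramid of this kind is straightforward to build. The sketch below (Python
with NumPy, with plain 2 × 2 block averaging standing in for the filtering a
real codec would use) lists the levels in the order a progressive scheme would
send them:

    import numpy as np

    def pyramid(img, levels):
        """Successively coarser approximations by 2 x 2 block averaging."""
        out = [img]
        for _ in range(levels - 1):
            a = out[-1]
            out.append((a[0::2, 0::2] + a[0::2, 1::2] +
                        a[1::2, 0::2] + a[1::2, 1::2]) / 4.0)
        return out[::-1]          # coarsest first: the transmission order

    for level in pyramid(np.arange(64.0).reshape(8, 8), 3):
        print(level.shape)        # (2, 2), then (4, 4), then (8, 8)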
Numerous progressive transmission algorithms developed in recent years are
not well categorized by Tsou’s taxonomy.
Chee [11] recently proposed a different classification system, which
uses Tsou’s taxonomy as a secondary categorization. Chee’s taxonomy
specifically addresses the transmission mechanism, rather than the compression
characteristics, and contains four classes:
Multistage Residual Methods. This class contains algorithms that progressively
reduce the distortion of the reconstructed image. Chee assigns to this class only
methods that operate on the full-resolution image at each stage: multiresolution-
based algorithms are assigned to the next category. This category includes multi-
stage VQ [12] and the transform-coding method proposed in Ref. [13].
Our discussion of successive refinements in Section 9.3.1 is directly relevant
to this class of methods.
Hierarchical Methods. These algorithms analyze the images at different scales
to process them in a hierarchical fashion. Chee divides this class into nonresidual
coders, residual multiscale coders, and filter-bank coders.
• Nonresidual coders perform a multiscale decomposition of the image and
include quadtree-coders [14,15], binary-tree coders [16], spatial pyramid
coders [17,18], and subsampling pyramids [19].
• Residual coders differ from nonresidual coders in that they compute and
encode the residual image at each level of the decomposition (the difference
between the original image and what is received at that stage). The well-
known Laplacian pyramid can be used to construct a hierarchical residual
coder [20]; a sketch of this construction appears after this list. A theoretical
analysis of this category of coders can be found in Ref. [21].
• Filter-bank coders include wavelet-based coders and subband coders.
Wavelet-based coders send the lowest resolution version of the image
first and successively transmit the subbands required to produce the
approximation at the immediately higher resolution. A similar approach is
used in subband coders [22]. The theoretical analysis of Ref. [9] is directly
relevant to this group of methods.
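As promised above, here is a minimal sketch of a Laplacian-pyramid residual
decomposition (Python with NumPy; 2 × 2 averaging and nearest-neighbor
expansion stand in for the smoothing filters of Ref. [20]):

    import numpy as np

    def down(a):    # 2 x 2 block averaging
        return (a[0::2, 0::2] + a[0::2, 1::2] +
                a[1::2, 0::2] + a[1::2, 1::2]) / 4.0

    def up(a):      # nearest-neighbor expansion back to the finer grid
        return np.kron(a, np.ones((2, 2)))

    def laplacian_pyramid(img, levels):
        """Coarse base image plus one residual per level."""
        residuals, cur = [], img
        for _ in range(levels - 1):
            coarse = down(cur)
            residuals.append(cur - up(coarse))   # what the coder encodes
            cur = coarse
        return cur, residuals[::-1]              # base first, then residuals

    img = np.random.rand(8, 8)
    base, residuals = laplacian_pyramid(img, 3)
    rec = base
    for r in residuals:                          # progressive reconstruction
        rec = up(rec) + r
    print(np.allclose(rec, img))                 # True: exactly invertible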
Successive Approximation Methods. This class contains methods that progres-
sively refine the precision (e.g., the number of bits) of the reconstructed approx-
imations. Methods that transmit bit planes belong to this category: at each stage
the precision of the reconstructed image increases by 1 bit. Chee assigns to this
category bit-plane methods, tree-structured vector quantizers, full-search quan-
tizers with intermediate codebooks [23], the embedded zerotree wavelet coder
(Chapter 8, Section 8.8.2), and the successive approximation mode of the JPEG
standard (Section 9.4.1.1).
Note that these methods, and in particular, transform-domain methods, are not
guaranteed to monotonically improve the fidelity of the reconstructed image at
each stage, if the fidelity is measured with a single-letter distortion measure (i.e.,
a measure that averages the distortions of individual pixels) [1].
Methods Based on Transmission Sequences. In this category, Chee groups
methods that use a classifier to divide the data into portions, prioritize the order
in which different portions are transmitted, and include a protocol for specifying
transmission order to the receiver. The prioritization process can aim at different
goals, such as reducing the MSE or improving the visual appearance of the
reconstructed image at each step.
In Ref. [11] the author assigns to this class the spectral selection method of
the JPEG standard (Section 9.4.1.1), Efstratiadi’s Filter Bank Coder [24], and
several block-based spatial domain coders [25,26].
9.3.3 Comparing Progressive Transmission Schemes
Although it is possible to compare different progressive transmission
schemes [11], no general guidelines exist. However, the following broad
statements can be made: