Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman
Copyright © 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
6 Storage Architectures for Digital Imagery
HARRICK M. VIN
University of Texas at Austin, Austin, Texas
PRASHANT SHENOY
University of Massachusetts, Amherst, Massachusetts
Rapid advances in computing and communication technologies coupled with the
dramatic growth of the Internet have led to the emergence of a wide variety
of multimedia applications, such as distance education, on-line virtual worlds,
immersive telepresence, and scientific visualization. These applications differ
from conventional distributed applications in at least two ways. First, they involve
storage, transmission, and processing of heterogeneous data types (such as
text, imagery, audio, and video) that differ significantly in their characteristics
(e.g., size, data rate, and real-time requirements). Second, unlike
conventional best-effort applications, these applications impose diverse performance
requirements (for instance, with respect to timeliness) on the networks
and operating systems. Unfortunately, existing networks and operating systems do
not differentiate between data types, offering a single class of best-effort service
to all applications. Hence, to support emerging multimedia applications, existing
networks and operating systems need to be extended along several dimensions.
In this chapter, issues involved in designing storage servers that can support
such a diversity of applications and data types are discussed. First, the specific
issues that arise in designing a storage server for digital imagery are described,
and then the architectural choices for designing storage servers that efficiently
manage the storage and retrieval of multiple data types are discussed. Note that
as it is difficult, if not impossible, to foresee requirements imposed by future
applications and data types, a storage server that supports multiple data types
and applications will need to facilitate easy integration of new application classes
and data types. This dictates that the storage server architecture be extensible,
allowing it to be easily tailored to meet new requirements.
The rest of the chapter is organized as follows. We begin by examining techniques
for placement of digital imagery on a single disk, a disk array, and a hierarchical
storage architecture. We then examine fault-tolerance techniques employed
by servers to guarantee high availability of image data. Next, we discuss retrieval
techniques employed by storage servers to efficiently access images and image
sequences. We then discuss caching and batching techniques employed by such servers
to maximize the number of clients supported. Finally, we examine architectural
issues in incorporating all of these techniques into a general-purpose file system.
6.1 STORAGE MANAGEMENT
6.1.1 Single Disk Placement
A storage server divides images into blocks while storing them on disks. In order
to explore the viability of various placement models for storing these blocks on
magnetic disks, some of the fundamental characteristics of these disks are briefly
reviewed. Generally, magnetic disks consist of a collection of platters, each of
which is composed of a number of circular recording tracks (Fig. 6.1). Platters
spin at a constant rate. Moreover, the amount of data recorded on tracks may
increase from the innermost track to the outermost track (e.g., in the case of
zoned disks). The storage space of each track is divided into several disk blocks,
each consisting of a sequence of physically contiguous sectors. Each platter is
associated with a read or write head that is attached to a common actuator. A
cylinder is a stack of tracks at one actuator position.
In such an environment, the access time of a disk block consists of three
components: seek time, rotational latency, and data-transfer time. Seek time is
the time required to position the disk head on the track containing the desired
data; it is a function of the initial start-up cost to accelerate the disk head and
the number of tracks that are traversed. Rotational latency, on the other hand, is
the time for the desired data to rotate under the head before it can be read or
written; it is a function of the angular distance between the current position of
the disk head and the location of the desired data, as well as the rate at which
platters spin. Once the disk head is positioned at the desired disk block, the time
to retrieve its contents is referred to as the data-transfer time; it is a function of
the disk block size and data-transfer rate of the disk.
[Figure 6.1 labels: platter, track, actuator, head, direction of rotation.]
Figure 6.1. Architectural model of a conventional magnetic disk.
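The three access-time components can be combined into a simple cost model. The following is a minimal sketch, not a model taken from this chapter: the linear seek model and every default value (start-up cost, per-track cost, rotation speed, transfer rate) are assumptions chosen only for illustration.

```python
def seek_time_ms(tracks_traversed, startup_ms=1.0, per_track_ms=0.002):
    """Seek time: a start-up cost plus a cost per track traversed (assumed linear)."""
    if tracks_traversed == 0:
        return 0.0
    return startup_ms + per_track_ms * tracks_traversed


def rotational_latency_ms(angle_fraction, rpm=7200):
    """Time for the desired data to rotate under the head.

    angle_fraction is the angular distance to the data, expressed as a
    fraction of one full revolution (0.0 to 1.0).
    """
    ms_per_revolution = 60_000.0 / rpm
    return angle_fraction * ms_per_revolution


def transfer_time_ms(block_bytes, rate_bytes_per_s=10_000_000):
    """Data-transfer time: block size divided by the disk transfer rate."""
    return 1000.0 * block_bytes / rate_bytes_per_s


def access_time_ms(tracks_traversed, angle_fraction, block_bytes):
    """Total access time of one disk block under the assumptions above."""
    return (seek_time_ms(tracks_traversed)
            + rotational_latency_ms(angle_fraction)
            + transfer_time_ms(block_bytes))
```

With contiguous placement, only the first block of an image pays the seek and rotational terms; the remaining blocks cost only their transfer time.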
The placement of data blocks on disks is generally governed by a contiguous,
random, or constrained placement policy. A contiguous placement policy requires
that all blocks belonging to an image be placed together on the disk. This ensures
that once the disk head is positioned at the beginning of an image, all its blocks
can be retrieved without incurring any seek or rotational latency. Unfortunately,
the contiguous placement policy results in disk fragmentation in environments
with frequent image creations and deletions. Hence, contiguous placement is well
suited for read-only systems (such as compact discs, CLVs, and so on), but is
less desirable for dynamic, read-write storage systems.
Storage servers for read-write systems have traditionally employed random
placement of blocks belonging to an image on disk [1,2]. This placement scheme
does not impose any restrictions on the relative placement on the disks of blocks
belonging to a single image. This approach eliminates disk fragmentation, albeit
at the expense of incurring high seek time and rotational latency overhead while
accessing an image.
Clearly, the contiguous and random placement models represent two ends of a
spectrum; whereas the former does not permit any separation between successive
blocks of an image on disk, the latter does not impose any constraints at all.
The constrained (or clustered) placement policy is a generalization of these
extremes; it requires the blocks to be clustered together such that the maximum
seek time and rotational latency incurred while accessing the image do not
exceed a predefined threshold.
For the random and the constrained placement policies, the overall disk
throughput depends on the total seek time and rotational latency incurred per
byte accessed. Hence, to maximize the disk throughput, image servers use as
large a block size as possible.
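The effect of block size on throughput can be illustrated with a short calculation. This sketch assumes that every block access pays one fixed positioning overhead (seek plus rotational latency); the 12 ms overhead and 10 MB/s transfer rate are hypothetical values, not measurements from this chapter.

```python
def effective_throughput_mb_s(block_kb, overhead_ms=12.0, rate_mb_s=10.0):
    """Effective throughput when each block costs one seek plus rotation.

    overhead_ms is the assumed seek time plus rotational latency per block.
    """
    block_mb = block_kb / 1024.0
    transfer_s = block_mb / rate_mb_s
    return block_mb / (overhead_ms / 1000.0 + transfer_s)


if __name__ == "__main__":
    for block_kb in (4, 16, 64, 256, 1024):
        rate = effective_throughput_mb_s(block_kb)
        print(f"{block_kb:5d} KB blocks -> {rate:.2f} MB/s")
```

As the block size grows, the fixed overhead is amortized over more bytes and the effective throughput approaches, but never reaches, the raw transfer rate.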
6.1.2 Multidisk Placement
Because of the large sizes of images and image sequences (i.e., video streams),
most image and video storage servers utilize disk arrays. Disk arrays achieve
high performance by servicing multiple input-output requests concurrently and
by using several disks to service a single request in parallel. The performance of
a disk array, however, is critically dependent on the distribution of the workload
(i.e., the number of blocks to be retrieved from the array) among the disks. The
higher the imbalance in the workload distribution, the lower the throughput of
the disk array.
To effectively utilize a disk array, a storage server interleaves the storage
of each image or image sequence among the disks in the array. The unit of
data interleaving, referred to as a stripe unit, denotes the maximum amount of
logically contiguous data that is stored on a single disk [82,3]. Successive stripe
units of an object are placed on disks, using a round-robin or random allocation
algorithm.
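Round-robin allocation of stripe units can be sketched as a simple index mapping. The function names and the start_disk parameter below are illustrative assumptions; the chapter does not prescribe a particular mapping.

```python
def locate_stripe_unit(unit_index, num_disks, start_disk=0):
    """Map the k-th stripe unit of an object to (disk, slot on that disk)."""
    disk = (start_disk + unit_index) % num_disks
    slot = unit_index // num_disks
    return disk, slot


def units_per_disk(num_units, num_disks, start_disk=0):
    """Count how many stripe units of one object land on each disk."""
    counts = [0] * num_disks
    for k in range(num_units):
        disk, _ = locate_stripe_unit(k, num_disks, start_disk)
        counts[disk] += 1
    return counts
```

For a single object, the per-disk counts differ by at most one, which is why round-robin striping spreads sequential retrieval evenly across the array.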
Conventional file systems select stripe unit sizes that minimize the average
response time while maximizing throughput. In contrast, to decrease the
frequency of playback discontinuities, image and video servers select a stripe unit
size that minimizes the variance in response time while maximizing throughput.
Although small stripe units result in a uniform load distribution among disks
in the array (and thereby decrease the variance in response times), they also
increase the overhead of disk seeks and rotational latencies (and thereby decrease
throughput). Large stripe units, on the other hand, increase the array throughput
at the expense of increased load imbalance and variance in response times. To
maximize the number of clients that can be serviced simultaneously, the server
should select a stripe unit size that balances these trade-offs. Table 6.1 illustrates
typical block or stripe unit sizes employed to store different types of data.
The degree of striping — the number of disks over which an image or an
image sequence is striped — is dependent on the number of disks in the array. In
relatively small disk arrays, striping image sequences across all disks in the array
(i.e., wide-striping) yields a balanced load and maximizes throughput. For large
disk arrays, however, to maximize the throughput, the server may need to stripe
image sequences across subsets of disks in the array and replicate their storage
to achieve load balancing. The amount of replication for each image sequence
depends on the popularity of the image sequence and the total storage-space
constraints.
6.1.2.1 From Images to Multiresolution Imagery. The placement technique
becomes more challenging if the imagery is encoded using a multiresolution
encoding algorithm. In general, multiresolution imagery consists of multiple
layers. Although all layers need to be retrieved to display the imagery at the highest
resolution, only a subset of the layers needs to be retrieved for lower-resolution
displays. To efficiently support the retrieval of such images at different resolutions,
the placement algorithm needs to ensure that the server accesses only as much
data as needed and no more. To ensure this property, the placement algorithm
should store multiresolution images such that: (1) each layer is independently
accessible, and (2) the seek and rotational latency incurred while accessing any subset of
the layers is minimized. The former requirement can be met by storing
layers in separate disk blocks, and the latter by storing these
disk blocks adjacent to one another on disk. Observe that this placement policy is general and
can be used to interleave any multiresolution image or video stream on the array.
Figure 6.2 illustrates this placement policy.

Table 6.1. Typical Block or Stripe Unit Size for Different Data Types
Data Type          Storage Requirement    Block or Stripe Unit Size
Text               2–4 KB                 0.5–4 KB
GIF Image          64 KB                  4–8 KB
Satellite Image    60 MB                  16 KB
MPEG Video         1 GB                   64–256 KB

[Figure 6.2 labels: Layer 1 block, Layer 2 block, ..., Layer n block; data retrieved for lowest resolution display; data retrieved for highest resolution display.]
Figure 6.2. Contiguous placement of different layers of a multiresolution image. Storing
data from different resolutions in separate disk blocks enables the server to retrieve each
resolution independently of others; storing these blocks contiguously enables the server
to reduce disk seek overheads while accessing multiple layers simultaneously.
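The layer placement described above can be sketched as follows. This is an illustration under the assumption that a display at a given resolution needs the first k layers; the function names and the (start, length) extent representation are hypothetical.

```python
def layout_layers(layer_sizes_in_blocks, first_block=0):
    """Place each layer in its own run of blocks, with the runs adjacent on disk.

    Returns a list of (start_block, num_blocks) extents, one per layer.
    """
    extents = []
    next_block = first_block
    for size in layer_sizes_in_blocks:
        extents.append((next_block, size))
        next_block += size
    return extents


def blocks_for_resolution(extents, num_layers):
    """Contiguous block range needed to display the first num_layers layers.

    Returns (start_block, num_blocks); assumes 1 <= num_layers <= len(extents).
    """
    start = extents[0][0]
    last_start, last_size = extents[num_layers - 1]
    return start, last_start + last_size - start
```

Because layers never share a block, a low-resolution request reads no data from higher layers, and because the extents are adjacent, any prefix of layers is read with a single positioning operation.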
6.1.2.2 From Images to Video Streams. Consider a disk array–based video
server. If the video streams are compressed using a variable bit rate (VBR)
compression algorithm, then the sizes of frames (or images) will vary. Hence, if
the server stores these video streams on disks using fixed-size stripe units, each
stripe unit will contain a variable number of frames. On the other hand, if each
stripe unit contains a fixed number of frames (and hence data for a fixed playback
duration), then the stripe units will have variable sizes. Thus, depending on the
striping policy, retrieving a fixed number of frames will require the server to
access a fixed number of variable-size blocks or a variable number of fixed-size
blocks [4,5,6].
Because of the periodic nature of video playback, most video servers service
clients by proceeding in terms of periodic rounds. During each round, the
server retrieves a fixed number of video frames (or images) for each client.
To ensure continuous playback, the number of frames accessed for each client
during a round must be sufficient to meet its playback requirements. In such
an architecture, a server that employs variable-size stripe units (or fixed-time
stripe units) accesses a fixed number of stripe units during each round. This
uniformity of access, when coupled with the sequential and periodic nature
of video retrieval, enables the server to balance load across the disks in the
array. This efficiency, however, comes at the expense of increased complexity
of storage-space management.
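The two striping policies for VBR video can be contrasted with a short sketch. The frame sizes in the usage note are made up, and the convention of counting a frame in the unit that holds its last byte is an assumption made only for this illustration.

```python
def fixed_time_units(frame_sizes, frames_per_unit):
    """Fixed number of frames per stripe unit; the unit sizes vary."""
    return [sum(frame_sizes[i:i + frames_per_unit])
            for i in range(0, len(frame_sizes), frames_per_unit)]


def fixed_size_units(frame_sizes, unit_bytes):
    """Fixed-size stripe units; the number of frames per unit varies.

    Frames (assumed to have positive size) are packed back to back and may
    span unit boundaries. Each frame is counted in the unit holding its
    last byte.
    """
    total = sum(frame_sizes)
    num_units = -(-total // unit_bytes)  # ceiling division
    counts = [0] * num_units
    end = 0
    for size in frame_sizes:
        end += size
        counts[(end - 1) // unit_bytes] += 1
    return counts
```

For example, with frame sizes [10, 30, 20, 40, 10, 10], two frames per unit give units of varying size, whereas 40-byte units complete a varying number of frames each.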
The placement policy that utilizes fixed-size stripe units, on the other hand,
simplifies storage-space management but results in higher load imbalance across
the disks. In such servers, load across disks within a server may become unbalanced,
at least transiently, because of the arrival pattern of requests. To smooth
out this load imbalance, servers employ dynamic load-balancing techniques. If
multiple replicas of the requested video stream are stored on the array, then the
server can attempt to balance load across disks by servicing the request from the
least-loaded disk containing a replica. Further, the server can exploit the sequentiality
of video retrieval to prefetch data for the streams and thereby smooth out variation
in the load imposed by individual video streams.
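Replica-based load balancing reduces to choosing the least-loaded disk among those that hold a copy. The sketch below assumes that load is tracked as a simple per-disk counter; real servers may use a richer load metric.

```python
def pick_replica(replica_disks, disk_load):
    """Return the least-loaded disk among those holding a replica."""
    return min(replica_disks, key=lambda disk: disk_load[disk])


def service_request(replica_disks, disk_load, cost=1):
    """Route one request to the chosen disk and account for its load."""
    disk = pick_replica(replica_disks, disk_load)
    disk_load[disk] += cost
    return disk
```

Ties are broken in favor of the disk listed first in replica_disks, which is a property of Python's min rather than a policy suggested by the chapter.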
6.1.3 Utilizing Storage Hierarchies
The preceding discussion has focused on fixed disks as the storage medium for
image and video servers. This is primarily because disks provide high throughput
and low latency relative to other storage media such as tape libraries, optical
juke boxes, and so on. The start-up latency for devices such as tape libraries is
substantial as it requires mechanical loading of the appropriate tape into a reader
station. The advantage, however, is that they offer very high storage capacities
(Table 6.2).
In order to construct a cost-effective image and video storage system that
provides adequate throughput, it is logical to use a hierarchy of storage
devices [7,8,9,10]. There are several possible strategies for managing this storage
hierarchy, with different techniques for placement, replacement, and so on. In
one scenario, a relatively small set of frequently requested images and videos
are placed on disks and the large set of less frequently requested data objects are
stored in optical juke boxes or tape libraries. In this storage hierarchy there are
several alternatives for managing the disk system. The most common architecture
is the one in which disks are used as a staging area (cache) for the tertiary
storage devices and entire image and video files are moved from the tertiary
storage to the disk. It is then possible to apply traditional cache-management
techniques to manage the content of the disk array.
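As one example of a traditional cache-management technique, the disk staging area can be managed with least-recently-used (LRU) replacement of whole files. LRU is chosen here only for illustration; the chapter does not name a specific policy, and the class name and the fetch_from_tertiary callable are hypothetical.

```python
from collections import OrderedDict


class DiskStagingCache:
    """Whole-file LRU cache on disk in front of tertiary storage."""

    def __init__(self, capacity_bytes, fetch_from_tertiary):
        self.capacity = capacity_bytes
        # Hypothetical callable: file name -> file size in bytes.
        self.fetch = fetch_from_tertiary
        self.files = OrderedDict()  # name -> size, least recently used first
        self.used = 0

    def access(self, name):
        """Return True on a disk hit, False if the file came from tertiary storage."""
        if name in self.files:
            self.files.move_to_end(name)
            return True
        size = self.fetch(name)
        if size > self.capacity:
            return False  # too large to stage; not cached in this sketch
        while self.files and self.used + size > self.capacity:
            _, evicted_size = self.files.popitem(last=False)
            self.used -= evicted_size
        self.files[name] = size
        self.used += size
        return False
```

Staging entire files matches the architecture described above, in which complete image and video files are moved from tertiary storage to disk.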
For very large-scale servers, it is also possible to use an array of juke boxes
or tape readers [10]. In such a system, images and video objects may need
to be striped across these tertiary storage devices [11]. Although striping can
improve I/O throughput by reading from multiple tape drives in parallel, it can
also increase contention for drives (because each request accesses all drives).
Studies have shown that such systems must carefully balance these trade-offs by
choosing an appropriate degree of striping for a given workload [11,12].
Table 6.2. Tertiary Storage Devices
                 Disks                  Tapes
                 Magnetic   Optical     Low-End    High-End
Capacity         40 GB      200 GB      500 GB     10 TB
Mount Time       0 sec      20 sec      60 sec     90 sec
Transfer Rate    10 MB/s    300 KB/s    100 KB/s   1,000 KB/s
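The figures in Table 6.2 give a rough sense of retrieval times across the hierarchy. The sketch below uses the mount times and transfer rates from the table and assumes decimal units (1 MB = 1,000 KB); it ignores seek time within a tape and any queueing for drives.

```python
DEVICES = {
    # name: (mount time in seconds, transfer rate in KB/s), from Table 6.2
    "magnetic disk": (0, 10_000),
    "optical disk": (20, 300),
    "low-end tape": (60, 100),
    "high-end tape": (90, 1_000),
}


def retrieval_time_s(device, size_kb):
    """Mount time plus transfer time for an object of size_kb kilobytes."""
    mount_s, rate_kb_s = DEVICES[device]
    return mount_s + size_kb / rate_kb_s
```

Under these assumptions, a 60 MB satellite image (60,000 KB) takes about 6 seconds to read from a magnetic disk, 150 seconds from a high-end tape, 220 seconds from an optical disk, and 660 seconds from a low-end tape.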
6.2 FAULT TOLERANCE
Most image and video servers are based on large disk arrays, and hence the ability
to tolerate disk failures is central to the design of such servers. The design of
fault-tolerant storage systems has been a topic of much research and development
over the past decade [13,14]. In most of these systems, fault tolerance is
achieved either by disk mirroring [15] or parity encoding [16,17]. Disk arrays
that employ these techniques are referred to as redundant arrays of independent
disks (RAID). RAID arrays that employ disk mirroring achieve fault tolerance by
duplicating data on separate disks (and thereby incur a 100 percent storage-space
overhead). Parity encoding, on the other hand, reduces the overhead considerably
by employing error-correcting codes. For instance, in a RAID level 5 disk
array consisting of D disks, parity computed over data stored across (D − 1)
disks is stored on another disk (e.g., the left-symmetric parity assignment shown
in Figure 6.3a) [18,19,17]. In such architectures, if one of the disks fails, the
data on the failed disk is recovered using the data and parity blocks stored on
the surviving disks. That is, each user access to a block on the failed disk causes
one request to be sent to each of the surviving disks. Thus, if the system is load-balanced
prior to disk failure, the surviving disks would observe at least twice
as many requests in the presence of a failure [20].
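Parity-based recovery can be sketched with bytewise XOR, which is the parity function used in the caption of Figure 6.3. The function names and the list-of-blocks representation of a stripe are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, failed_disk):
    """Rebuild the block on failed_disk from the blocks on all surviving disks.

    stripe holds one block per disk: the data blocks plus the parity block.
    """
    survivors = [block for disk, block in enumerate(stripe) if disk != failed_disk]
    return xor_blocks(survivors)
```

Note that reconstruct reads one block from every surviving disk, which is the source of the extra load described above.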
The declustered parity disk array organization [21,22,23] addresses this
problem by trading some of the array’s capacity for improved performance
in the presence of disk failures. Specifically, it requires that each parity block
protect some smaller number of data blocks [e.g., (G − 1)]. By appropriately
distributing the parity information across all the D disks in the array, such a policy
ensures that each surviving disk would see an on-the-fly reconstruction load
increase of (G − 1)/(D − 1) instead of (D − 1)/(D − 1) = 100% [Fig. 6.3b].
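The trade-off made by declustered parity can be worked through numerically. The load formula is the one given above; the capacity formula (one parity block per group of G blocks) is a standard property of parity groups rather than a figure stated in this chapter.

```python
def reconstruction_load_increase(parity_group_size, num_disks):
    """Fractional load increase on each surviving disk: (G - 1) / (D - 1)."""
    return (parity_group_size - 1) / (num_disks - 1)


def parity_capacity_fraction(parity_group_size):
    """Fraction of array capacity holding parity: one block in every G."""
    return 1 / parity_group_size
```

For D = 5, reducing G from 5 to 4 lowers the reconstruction load increase from 100 percent to 75 percent, while the share of capacity devoted to parity rises from 20 percent to 25 percent.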
[Figure 6.3 panels: block-to-disk layouts over Disk1 to Disk5. (a) Left-symmetric data organization in a RAID level 5 disk array with G = D = 5. (b) Declustered parity organization with G = 4 and D = 5.]
Figure 6.3. Different techniques for storing parity blocks in a RAID-5 architecture.
(a) depicts the left-symmetric parity organization, in which the parity group size is the same
as the number of disks in the array; (b) depicts the declustered parity organization, in
which the parity group size is smaller than the number of disks. $M_{i.j}$ and $P_i$ denote data
and parity blocks, respectively, and $P_i = M_{i.0} \oplus M_{i.1} \oplus \cdots \oplus M_{i.(G-2)}$.
In general, with such parity-based recovery techniques, increase in the load
on the surviving disks in the event of a disk failure results in deadline violations
in the playback of video streams. To prevent such a scenario with conventional
fault-tolerance techniques, servers must operate at low levels of disk utilization
during the fault-free state. Image and video servers can reduce this overhead by
exploiting the characteristics of imagery. There are two general techniques that
such servers may use.
• A video server can exploit the sequentiality of video access to reduce the
overhead of on-line recovery in a disk array. Specifically, by computing
parity information over a sequence of blocks belonging to the same video
stream, the server can ensure that video data retrieved for recovering a block
stored on the failed disk would be requested by the client in the near future.
By buffering such blocks and then servicing the requests for their access
from the buffer, this method minimizes the overhead of the on-line failure
recovery process.
• Because human perception is tolerant of minor distortions in images, a server
can exploit the inherent redundancies in images to approximately reconstruct
lost image data, instead of using error-correcting codes to perfectly recover
image data stored on the failed disk. In such a server, each image is partitioned
into subimages; if the subimages are stored on different disks,
then a single disk failure will result in the loss of fractions of several images.
In the simplest case, if the subimages are created in the pixel domain (i.e.,
prior to compression) such that none of the immediate neighbors of a pixel
in the image belongs to the same subimage, then even in the presence of
a single disk failure, all the neighbors of the lost pixels will be available.
In this case, the high degree of correlation between neighboring pixels will
make it possible to reconstruct a reasonable approximation of the original
image. Moreover, no additional information will have to be retrieved from
any of the surviving disks for recovery.
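The pixel-domain partitioning in the second technique can be sketched as follows. The parity-based assignment to four subimages and the neighbor-averaging reconstruction are illustrative assumptions; the chapter requires only that no pixel share a subimage with its immediate neighbors.

```python
def subimage_index(row, col):
    """Assign each pixel to one of 4 subimages by row and column parity."""
    return 2 * (row % 2) + (col % 2)


def reconstruct_lost(image, lost_subimage):
    """Approximate each pixel of the lost subimage by averaging its neighbors.

    image is a list of rows of numbers with at least two pixels. Values at
    lost positions are never read, because every neighbor of a lost pixel
    belongs to a different subimage.
    """
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if subimage_index(r, c) != lost_subimage:
                continue
            neighbors = [image[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
            result[r][c] = sum(neighbors) / len(neighbors)
    return result
```

Because any neighbor differs from a pixel in row parity, column parity, or both, all eight neighbors of a lost pixel survive a single disk failure, and recovery needs no extra reads from the surviving disks.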
Although conceptually elegant, such precompression image partitioning techniques
significantly reduce the correlation between the pixels assigned to the same
subimage and hence adversely affect image-compression efficiency [24,25]. The
resultant increase in the bit rate requirement may impose higher load on each
disk in the array even during the fault-free state, thereby reducing the number
of video streams that can be simultaneously retrieved from the server. A number
of postcompression partitioning algorithms that address this limitation have been
proposed [26,27]. The concepts in postcompression partitioning are illustrated by
describing one such algorithm, namely, the loss-resilient JPEG (LRJ) algorithm,
where JPEG stands for Joint Photographic Experts Group.
6.2.1 Loss-Resilient JPEG (LRJ) Algorithm
As human perception is less sensitive to high-frequency components of the spectral
energy in an image, most compression algorithms transform images into the
frequency domain so as to separate low- and high-frequency components. For
instance, the JPEG compression standard fragments image data into a sequence of
8 × 8 pixel blocks and transforms them into the frequency domain using the discrete
cosine transform (DCT). The DCT decorrelates each pixel block into an 8 × 8 array
of coefficients such that most of the spectral energy is packed into the fewest
number of low-frequency coefficients. Whereas the lowest-frequency coefficient
(referred to as the DC coefficient) captures the average brightness of the spatial
block, the remaining set of 63 coefficients (referred to as the AC coefficients)
captures the details within the 8 × 8 pixel block. The DC coefficients of successive
blocks are difference-encoded independently of the AC coefficients. Within each
block, the AC coefficients are quantized to remove high-frequency components,
scanned in a zigzag manner to obtain an approximate ordering from lowest to
highest frequency, and finally run-length and entropy-encoded. Figure 6.4 depicts
the main steps involved in the JPEG compression algorithm [28].
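The difference encoding of DC coefficients can be sketched in a few lines. This is a simplified illustration that covers only the differencing step, d = DC(B(i, j)) − DC(B(i, j−1)); it omits the entropy coding of the differences, and it assumes a predictor of zero for the first block.

```python
def encode_dc(dc_values):
    """Difference-encode DC coefficients of successive blocks."""
    previous = 0  # assumed predictor for the first block
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def decode_dc(diffs):
    """Invert encode_dc by accumulating the differences."""
    previous = 0
    values = []
    for d in diffs:
        previous += d
        values.append(previous)
    return values
```

Because each value depends on its predecessor, losing one difference corrupts every later DC coefficient in the chain, which helps explain why loss-resilient schemes treat DC coefficients with particular care.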
The loss-resilient JPEG (LRJ) algorithm is an enhancement of the JPEG
compression algorithm and is motivated by the following two observations:
• Because the DC coefficients capture the average brightness of each 8 × 8
pixel block and because the average brightness of pixels changes gradually
across most images, the DC coefficients of neighboring 8 × 8 pixel blocks
are correlated. Consequently, the value of the DC coefficient of a block can be
reasonably approximated by extrapolating from the DC coefficients of the
neighboring blocks.
• Owing to the very nature of the DCT, the AC coefficients generated for
each 8 × 8 block are uncorrelated. Moreover, because the DCT packs most
of the spectral energy into a few low-frequency coefficients, quantizing
the set of AC coefficients (by using a user-defined normalization array)
yields many zeroes, especially at higher frequencies. Consequently, recovering
a block by simply substituting a zero for each of the lost AC coefficients
is generally sufficient to obtain a reasonable approximation of the original
image (at least as long as the number of lost coefficients is small and they are
scattered throughout the block).
[Figure 6.4 pipeline: image data; discrete cosine transform; quantization; differential encoding of DC coefficients, with d = DC(B(i, j)) − DC(B(i, j−1)) for successive blocks B(i, j−1) and B(i, j); zig-zag reordering of AC coefficients; run-length and Huffman encoding; compressed image data.]
Figure 6.4. JPEG compression algorithm.
Thus, even when parts of a compressed image have been lost, a reasonable
recovery is possible if: (1) the image in the frequency domain is partitioned
into a set of N subimages such that none of the DC coefficients in the eight-neighborhood
of a block belongs to the same subimage, and (2) the AC coefficients
of a block are scattered among multiple subimages. To ensure that none of the
blocks contained in a subimage are in the eight-neighborhood of each other,
N should be at least 4 [27]. To scatter the AC coefficients of a block among
multiple subimages, the LRJ compression algorithm employs a scrambling technique,
which, when given a set of N blocks of AC coefficients, creates a new
set of N blocks such that the AC coefficients from each of the input blocks
are equally distributed among all of the output blocks (Fig. 6.5). Once all the
blocks within the image have been processed, each of the N subimages can be
independently encoded.
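The two conditions above can be sketched as a block-to-subimage assignment plus a coefficient permutation. The rotation used in scramble is a hypothetical mapping chosen because it is easy to invert; it is not claimed to be the permutation used by LRJ or drawn in Figure 6.5.

```python
def dc_subimage(block_row, block_col):
    """Assign a block to one of N = 4 subimages so no 8-neighbors share one."""
    return 2 * (block_row % 2) + (block_col % 2)


def scramble(blocks):
    """Spread the AC coefficients of N input blocks evenly over N output blocks.

    Output block k takes coefficient i from input block (k + i) mod N.
    This rotation is an illustrative choice, not the mapping of Figure 6.5.
    """
    n = len(blocks)
    length = len(blocks[0])
    return [[blocks[(k + i) % n][i] for i in range(length)] for k in range(n)]


def unscramble(scrambled, lost=None):
    """Invert scramble(); coefficients from a lost output block become zero."""
    n = len(scrambled)
    length = len(scrambled[0])
    blocks = [[0] * length for _ in range(n)]
    for k in range(n):
        if k == lost:
            continue
        for i in range(length):
            blocks[(k + i) % n][i] = scrambled[k][i]
    return blocks
```

If one output block is lost, each original block is missing only about one in N of its AC coefficients, scattered across frequencies, and zero substitution fills the gaps as described above.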
[Figure 6.5 panel: Scrambling AC coefficients. Four blocks A, B, C, and D, each holding coefficients indexed 0 to 15, are shown before and after scrambling.]
Figure 6.5. Scrambling AC coefficients. Here $A_0$, $B_0$, $C_0$, and $D_0$ denote DC coefficients,
and $\forall i \in [1, 15]$: $A_i$, $B_i$, $C_i$, and $D_i$ represent AC coefficients.