Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman
Copyright © 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
1 Digital Imagery: Fundamentals
VITTORIO CASTELLI
IBM T.J. Watson Research Center, Yorktown Heights, New York
LAWRENCE D. BERGMAN
IBM T.J. Watson Research Center, Hawthorne, New York
1.1 DIGITAL IMAGERY
Digital images have a predominant position among multimedia data types. Unlike
video and audio, which are mostly used by the entertainment and news industry,
images are central to a wide variety of fields ranging from art history to medicine,
including astronomy, oil exploration, and weather forecasting. Digital imagery
plays a valuable role in numerous human activities, such as law enforcement,
agriculture and forestry management, earth science, urban planning, as well as
sports, newscasting, and entertainment.
This chapter provides an overview of the topics covered in this book. We first
describe several applications of digital imagery, some of which are covered in
Chapters 2 to 5. The main technological factors that support the management and
exchange of digital imagery, namely, acquisition, storage (Chapter 6), database
management (Chapter 7), compression (Chapter 8), and transmission (Chapter 9)
are then discussed.
Finally, a section is devoted to content-based retrieval, a large class of
techniques specifically designed for retrieving images and video. Chapters 10 to
17 cover these topics in detail.
1.2 APPLICATIONS OF DIGITAL IMAGES
Applications of digital imagery are continually developing. This section reviews
some of the major ones and briefly discusses the enabling economic and
technological factors.
1.2.1 Visible Imagery
Photographic images are increasingly being acquired, stored, and transmitted in
digital format. Their applications range from personal use to media and adver-
tising, education, art, and even research in the humanities.
In the consumer market, digital cameras are slowly replacing traditional film-
based cameras. The characteristics of devices for acquiring, displaying, and
printing digital images are improving while their prices are decreasing. The reso-
lution and color fidelity of digital cameras and desktop scanners are improving.
Advancements in storage technology make it possible to store a large number
of pictures in digital cameras before uploading them to a personal computer.
Inexpensive color printers can produce good quality reproductions of digital
photographs. Digital images are also easy to share and disseminate: they can
be posted on personal web sites or sent via e-mail to distant friends and relatives
at no cost.
The advertising and media industries maintain large collections of images
and need systems to store, archive, browse, and search them by content.
Museums and art galleries are increasingly relying on digital libraries to orga-
nize and promote their collections [1], and advertise special exhibits. These digital
libraries, accessible via the Internet, provide an excellent source of material for
the education sector and, in particular, for K-12.
A novel and extremely important application of digital libraries is to organize
collections of rare and fragile documents. These documents are usually kept in
highly controlled environments, characterized by low light and precise temper-
ature and humidity levels. Only scholars can gain access to such documents,
usually for a very limited time to minimize the risk of damage. Technology is
changing this scenario: complex professional scanners have been developed that
have very good color fidelity and depth (42-bit color or more) as well as high
resolution. These scanners can capture the finest details, even those invisible
without the aid of a magnifying lens, without risk to the original documents. The
resulting digital images can be safely distributed to a wide audience across the
Internet, allowing scholars to study otherwise inaccessible documents.
1.2.2 Remotely Sensed Images
One of the earliest application areas of digital imagery was remote sensing.
Numerous satellites continuously monitor the surface of the earth. The majority
of them measure the reflectance of the surface of the earth or atmospheric layers.
Others measure thermal emission in the far-infrared and near-microwave portion
of the spectrum, while yet others use synthetic-aperture radars and measure both
reflectance and travel time (hence elevation). Some instruments acquire measure-
ments in a single portion of the spectrum; others simultaneously acquire images
in several spectral bands; finally, some radiometers acquire measurements in tens
or hundreds of narrow spectral bands. Geostationary satellites on a high equa-
torial orbit are well suited to acquire low-resolution images of large portions of
the earth’s surface (where each pixel corresponds to tens of square miles), and
are typically used for weather prediction. Nongeostationary satellites are usually
on a polar orbit — their position relative to the ground depends both on their
orbital motion and on the rotation of the earth. Lower-orbiting satellites typi-
cally acquire higher-resolution images but require more revolutions to cover the
entire surface of the earth. Satellites used for environmental monitoring usually
produce low-resolution images, where each pixel corresponds to surface areas on
the order of square kilometers. Other commercial satellites have higher resolu-
tion. The Landsat TM instrument has a resolution of about 30 m on the ground,
and the latest generation of commercial instruments have resolutions of 1 to 3 m.
Satellites for military applications have even higher resolution.
The sheer volume of satellite-produced imagery, on the order of hundreds of
gigabytes a day, makes acquisition, preparation, storage, indexing, retrieval, and
distribution of the data very difficult.
The diverse community of users often combine remotely sensed images with
different types of data, including geographic or demographic information, ground-
station observations, and photographs taken from planes. The resulting need for
data fusion and interoperability poses further challenges to database and appli-
cation developers.
1.2.3 Medical Images
Images are used in medicine for both diagnostic and educational purposes. Radi-
ology departments produce the vast majority of medical images, while anatomic
photographs and histological microphotographs account for a small fraction of
the overall data volume.
Radiological images capture physical properties of the body, such as opacity
to X rays (radiographies and CT scans), reflectance to ultrasounds, concentration
of water or other chemicals (MRI), and distribution of elements within organs
(PET). Medical images can be used to investigate anatomic features (e.g., broken
bones or tumors) and physiological functions (e.g., imbalances in the activity of
specific organs).
The availability of high-quality, high-resolution displays, of sensors with better
characteristics (sensitivity, quantum efficiency, etc.) than photographic film, of
fast interconnection networks, and of inexpensive secondary and tertiary storage
is driving radiology departments toward entirely digital, filmless environments,
wherein image databases play a central role.
The main challenges faced by medical image databases are integration with
the hospital work flow and interoperability.
1.2.4 Geologic Images
Oil companies are among the main producers and consumers of digital imagery.
Oil exploration often starts with seismic surveys, in which large geologic forma-
tions are imaged by generating sound waves and measuring how they are reflected
at the interface between different strata. Seismic surveys produce data that is
processed into two- or three-dimensional imagery.
Data are also routinely acquired during drilling operations. Physical properties
of the rock surrounding the well bore are measured with
special-purpose imaging tools, either during drilling or afterwards. Some instru-
ments measure aggregate properties of the surrounding rock and produce a single
measurement every sampling interval; others have arrays of sensors that take
localized measurements along the circumference of the well bore. The former
measures are usually displayed as one-dimensional signals and the latter measures
are displayed as (long and thin) images.
Sections of rock (core) are also selectively removed from the bottom of the
well, prepared, and photographed for further analysis. Visible-light or infrared-
light microphotographs of core sections are often used to assess structural prop-
erties of the rock, and a scanning electron microscope is occasionally used to
produce images at even higher magnification.
Image databases designed for the oil industry face the challenges of large data
volumes, a wide diversity of data formats, and the need to combine data from
multiple sources (data fusion) for the purpose of analysis.
1.2.5 Biometric Identification
Images are widely used for personal-identification purposes. In particular,
fingerprints have long been used in law enforcement and are becoming increasingly
popular for access control to secure information and for identity checks during
firearm sales. Some technologies, such as face recognition, are still in the research
domain, while others, such as retinal scan matching, have very specialized appli-
cations and are not widespread.
Fingerprinting has traditionally been a labor-intensive manual task performed
by highly skilled workers. However, the same technological factors that have
enabled the development of digital libraries, and the availability of inkless
fingerprint scanners, have made it possible to create digital fingerprint archives [2,3].
Fingerprint verification (to determine if two fingerprints are from the same
finger), identification (retrieving archived fingerprints that match the one given),
and classification (assigning a fingerprint to a predefined class) all rely on the
positions of distinctive characteristics of the fingerprint called minutiae. Typical
minutiae are the points where a ridge bifurcates or where a ridge terminates.
Fingerprint collections are searched by matching the presence and relative
positions of minutiae. Facial matching procedures operate similarly — extracting
essential features and then matching them against pre-extracted features from
images in the database.
The main challenges in constructing biometric databases are the reliable extrac-
tion of minutiae and the matching strategy. Matching, in particular, is difficult: it
must rely on rotation- and translation-invariant algorithms, it must be robust to
missing and spurious data (the extraction algorithm might fail to identify relevant
features or might extract nonexistent features, especially if the image quality is
poor), and it must account for distortions due to lighting and positioning. The
development of efficient indexing methods that satisfy these requirements is still
an open research problem.
1.2.6 Astronomical Imagery
Astronomers acquire data in all regions of the electromagnetic spectrum.
Although the atmosphere blocks most high-energy waves (UV, X-rays, and γ-rays),
as well as large portions of the infrared and lower-frequency waves, it
has large transparency windows that allowed the development of visible-light
and radio wave astronomy. High-energy astronomy is possible using instruments
mounted on high-altitude planes or orbiting satellites. Radio telescopes acquire
signals in the long-wave, short-wave and microwave ranges, and are used to
produce two-dimensional maps, often displayed as images.
Traditionally, astronomy has heavily relied on plate-based photography for
infrared, visual, and ultraviolet studies (in astronomy, glass plates are used instead
of photographic film). Long exposures (of up to tens of hours) make it possible
to capture objects that are one to two orders of magnitude too dim to be detected
by the human eye through the same instrument.
Photographic plates are not the ideal detectors — they are expensive and
fragile, often have small defects that can hide important information, are not
very sensitive to light, lose sensitivity during exposure, and their reproduction
for distribution is labor-intensive. Their main benefits are large size and high
resolution.
Starting from the mid-1980s, sensors that acquire images in digital format
have become more and more widely used. In particular, charge-coupled devices
(CCDs) have found widespread application in astronomy. High-resolution sensor
arrays with responses that go beyond the visible spectrum are now commonly
available. These instruments are extremely sensitive — when coupled with photo-
multipliers, they can almost detect the arrival of individual photons. Images that
used to require hours of exposure can now be produced in minutes or less.
Additionally, techniques exist to digitally reduce the inherent electrical noise of
the sensor, further enhancing the quality of the image. Since the images are
produced directly in digital format, a photograph is often acquired by collecting
a sequence of short-exposure snapshots and combining them digitally. Image-
processing techniques exist to compensate for atmospheric turbulence and for
inaccuracies in telescope movement. Solid-state devices are also the detectors of
choice for orbiting telescopes.
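
The digital combination of short-exposure snapshots mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible scheme, a per-pixel average of frames that are already aligned; the function names (`stack_frames`, `noisy_frame`, `rms_error`) and the synthetic scene are hypothetical and are not taken from any astronomical processing pipeline.

```python
import random


def stack_frames(frames):
    """Average equally sized, already-aligned frames pixel by pixel."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    count = len(frames)
    return [[sum(frame[r][c] for frame in frames) / count
             for c in range(width)]
            for r in range(height)]


def noisy_frame(scene, sigma, rng):
    """Simulate one short exposure: the true scene plus Gaussian sensor noise."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in scene]


def rms_error(image, scene):
    """Root-mean-square difference between an image and the true scene."""
    squared = [(image[r][c] - scene[r][c]) ** 2
               for r in range(len(scene))
               for c in range(len(scene[0]))]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical 16 x 16 scene: flat background with one brighter "star".
rng = random.Random(0)
scene = [[10.0] * 16 for _ in range(16)]
scene[8][8] = 50.0

frames = [noisy_frame(scene, 5.0, rng) for _ in range(64)]
single_error = rms_error(frames[0], scene)
stacked_error = rms_error(stack_frames(frames), scene)
print(f"single frame RMS error: {single_error:.2f}")
print(f"stacked (64 frames) RMS error: {stacked_error:.2f}")
```

For independent noise, averaging N frames reduces the noise standard deviation by roughly the square root of N, which is why the stacked error in this simulation is much smaller than the single-frame error. Real pipelines must also align the frames and reject outliers, which this sketch omits.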
Digital libraries that organize the wealth of astronomical information are
growing continuously and are increasingly providing support for communities
beyond professional astronomers and astrophysicists, including school systems
and amateur astronomers.
1.2.7 Document Management
Digital imagery plays an increasingly important role in traditional office manage-
ment. Although we are far from the “paperless office” that many have envisioned,
more and more information is being stored digitally, much of it in the form of
imagery.
Perhaps the best case in point is archiving of cancelled checks. This
information in the past was stored on microfilm — a medium that was difficult
to manage. Moving this information to digital storage has resulted in enhanced
ease of access and reduced storage volume. The savings are even more dramatic
when digital imagery is used to replace paper records.
Although many documents can be very efficiently compressed by using optical
character recognition (OCR) to convert them into text, for many applications the
format of the original document is best retained using two-dimensional images.
Challenges in document management include automated scanning technology,
automated indexing for rapid retrieval, and managing the imagery within the
context of traditional database management systems.
1.2.8 Catalogs
Perhaps the most evident role of digital imagery to end consumers is on-line
catalogs. Rapidly replacing traditional print catalogs, on-line catalogs often run to
hundreds of pages with large volumes of color imagery. These applications range
from simple display of small thumbnails to advanced facilities that allow high-
resolution pan and zoom, and even manipulation of three-dimensional models.
The requirements for digital catalogs include support for multiresolution
storage and transmission, rapid retrieval, and integration with traditional database
facilities.
1.3 TECHNOLOGICAL FACTORS
1.3.1 Acquisition and Compression
In the past decade, high-quality digital image acquisition systems have become
increasingly available. They range from devices for personal use, such as
inexpensive high-definition color scanners and cameras, to professional scanners
with extremely precise color calibration (used to image art and rare document
collections), to complex medical instruments such as digital mammography and
digital radiology sensor arrays.
Compression is an essential technique for the management of large image
collections. Compression techniques belong to one of two classes. Lossless
compression algorithms ensure that the original image is exactly reconstructed
from the compressed data. Lossy algorithms only allow reconstruction of an
approximation of the original data. Lossless schemes usually reduce the data
volume by a factor of 2 or 3 at best, whereas lossy schemes can reduce the
storage requirement by a factor of 10 without introducing visually appreciable
distortions.
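
To make these compression factors concrete, the following sketch works through the arithmetic for a single image. The image dimensions and bit depth are assumptions chosen for illustration (a hypothetical 3,000 × 2,000 pixel, 24-bit color photograph); only the factors of 2, 3, and 10 come from the text above.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size of an image in bytes."""
    return width * height * bits_per_pixel // 8


def compressed_size_bytes(raw_bytes, ratio):
    """Size after compression by the given factor (e.g., 2 means half the size)."""
    return raw_bytes / ratio


# Hypothetical image: 3,000 x 2,000 pixels, 24 bits per pixel.
raw = raw_size_bytes(3000, 2000, 24)          # 18,000,000 bytes
lossless_2x = compressed_size_bytes(raw, 2)   # 9,000,000 bytes
lossless_3x = compressed_size_bytes(raw, 3)   # 6,000,000 bytes
lossy_10x = compressed_size_bytes(raw, 10)    # 1,800,000 bytes

for label, size in [("raw", raw), ("lossless 2:1", lossless_2x),
                    ("lossless 3:1", lossless_3x), ("lossy 10:1", lossy_10x)]:
    print(f"{label:>13}: {size / 1e6:.1f} MB")
```

Under these assumptions, a collection of one million such images would occupy about 18 TB uncompressed, 6 to 9 TB with lossless compression, and under 2 TB with lossy compression at a factor of 10.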
In addition to its role in reducing the storage requirements for digital archives
(and the associated enhancements in availability), compression plays an important
role in transmission of imagery. In particular, progressive transmission techniques
enable browsing of large image collections that would otherwise be inaccessible.
Numerous standards for representing, compressing, storing, and manipulating
digital imagery have been developed and are widely employed by the producers
of computer software and hardware. The existence of these standards significantly
simplifies a variety of tasks related to digital imagery management.
1.3.2 Storage and Database Support
Recent advances in computer hardware have made storage of large collections of
digital imagery both convenient and inexpensive. The capacity of moderately
priced hard drives has increased 10 times over the past four years. Redun-
dant arrays of inexpensive disks (RAIDs) provide fast, fault-tolerant, and high-
capacity storage solutions at low cost. Optical and magneto-optical disks offer
high capacity and random access capability and are well suited for large robotic
tertiary storage systems. The cost of write-once CD-ROMs is below 1 dollar
per gigabyte when they are bought in bulk, making archival storage available at
unprecedented levels.
Modern technology also provides the fault-tolerance required in many appli-
cation areas. High-capacity media with long shelf life allow storage of medical
images for periods of several years, as mandated by law. In oil exploration,
images acquired over a long period of time can help in modeling the evolution
of a reservoir and in determining the best exploitation policies. Loss-resilient
disk-placement techniques and compression algorithms can produce high-quality
approximations of the original images even in the presence of hardware faults
(e.g., while a faulty disk is being replaced and its contents are being recovered
from a backup tape). These technologies are essential in some applications, such
as medicine, where the user must be allowed to retrieve a large part of an image
even during system faults.
Numerous software advances have enabled guarantees of high-quality service
to users accessing large image repositories. At the operating system level, soft-
ware policies exist to optimally place images on disk and multiresolution images
across multiple disks in order to minimize response time. Strategies to distribute
imagery across hierarchical storage management systems exist, and, together
with caching and batching policies, allow the efficient management of very large
collections.
At the middleware level, database technology advancements have been made
to support image repositories. Object-relational databases can store image files
as binary large objects (BLOBs) and allow application developers to define new
data types (user-defined data types — UDTs) and methods to manipulate them
(user-defined functions — UDFs).
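
The idea of storing images as BLOBs and querying them through user-defined functions can be sketched with Python's built-in `sqlite3` module. This is only a stand-in: SQLite is not an object-relational system and has no user-defined data types, but it does support BLOB columns and application-defined SQL functions. The table layout, the tiny 4-pixel "images," and the function `mean_intensity` are all hypothetical.

```python
import sqlite3


def mean_intensity(blob):
    """User-defined function: mean gray level of 8-bit pixels stored in a BLOB."""
    if not blob:
        return None
    return sum(blob) / len(blob)


conn = sqlite3.connect(":memory:")
# Register the Python function so that it can be called from SQL.
conn.create_function("mean_intensity", 1, mean_intensity)

conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT, pixels BLOB)"
)
conn.executemany(
    "INSERT INTO images (name, pixels) VALUES (?, ?)",
    [
        ("dark", bytes([10, 20, 30, 40])),          # mean gray level 25
        ("bright", bytes([200, 210, 220, 230])),    # mean gray level 215
    ],
)

# The query filters on a property computed from the image data itself.
bright_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM images WHERE mean_intensity(pixels) > 128 ORDER BY name"
    )
]
print(bright_names)
```

The point of the design is that the predicate is evaluated inside the database engine, so the application does not have to fetch every image in order to filter on its content.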
1.3.3 Transmission and Display
The explosion of the Internet has facilitated the distribution and sharing of digital
imagery and has fostered new fields, such as teleconsultation in medicine, which
were unthinkable just two decades ago. Numerous techniques have been devel-
oped to transmit digital images. Progressive transmission methods are particularly
interesting. They allow the receiver to display a lossy version of the image,
possibly at low resolution, shortly after the transmission begins, with improve-
ments in quality as more data is received. Because of these techniques, the user
can decide whether the image is of interest and terminate the transmission if
necessary, without having to wait for the entire data file.
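
The coarse-to-fine behavior described above can be sketched with a simple resolution pyramid. This is an illustrative assumption, not the method of any particular standard: each level is obtained by averaging 2 × 2 blocks, and the levels are sent from coarsest to finest. A real progressive codec would send only the refinement needed to go from one level to the next, whereas this sketch resends each level in full. The names `downsample` and `progressive_versions` are hypothetical, and the image dimensions are assumed to be divisible by 2 raised to (levels − 1).

```python
def downsample(image):
    """Halve each dimension by averaging non-overlapping 2 x 2 blocks."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
             for c in range(width)]
            for r in range(height)]


def progressive_versions(image, levels):
    """Yield versions of the image from coarsest to finest resolution."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    yield from reversed(pyramid)


# Hypothetical 8 x 8 gray-scale image.
image = [[r * 8 + c for c in range(8)] for r in range(8)]

# The receiver inspects each version as it arrives and may stop early.
received = []
for version in progressive_versions(image, levels=3):
    received.append((len(version), len(version[0])))
    user_still_interested = len(received) < 2   # stand-in for a user decision
    if not user_still_interested:
        break

print(received)  # resolutions received before the user stopped the transfer
```

Because the sender is a generator, breaking out of the loop means the finest level is never produced, which mirrors terminating the transmission before the entire data file arrives.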
Advances in display technology are providing affordable solutions for
specialized fields such as medicine, where extremely high-resolution (2,000 ×
2,000 pixels or more), high-contrast, large dynamic range (12 bits or more for
gray scale images), very limited geometric distortion, and small footprint displays
are necessary.
1.4 INDEXING LARGE COLLECTIONS OF DIGITAL IMAGES
A large body of research has been devoted to developing mechanisms for effi-
ciently retrieving images from a collection. By nature, images contain unstruc-
tured information, which makes the search task extremely difficult. A typical
query on a financial database would ask for all the checks written on a specific
account and cleared during the last statement period; a typical query on a database
of photographic images could request all images containing a red Ferrari F50.
The former operation is rather simple: all the transaction records in the database
contain the desired information (date, type, account, amount) and the database
management system is built to efficiently execute it. The latter operation is a
formidable task, unless all the images in the repository have been manually
annotated.
The unstructured information contained within images is difficult to capture
automatically. Techniques that seek to index this unstructured visual information
are grouped under the collective name of content-based retrieval.
1.4.1 Content-Based Retrieval of Images
In content-based retrieval [4], the user describes the desired content in terms of
visual features, and the system retrieves images that best match the description.
Content-based retrieval is therefore a type of retrieval by similarity.
Image content can be defined at different levels of abstraction [5–7]. At the
lowest level, an image is a collection of pixels. Pixel-level content is rarely
used in retrieval tasks. It is, however, important in very specific applications,
for example, in identifying ground control points used to georegister remotely
sensed images or anatomic details used for coregistering medical images from
different modalities.
The raw data can be processed to produce numeric descriptors capturing
specific visual characteristics called features. The most important features for
image databases are color, texture, and shape. Features can be extracted from
entire images describing global visual characteristics or from portions of images
describing local characteristics. In general, a feature-level representation of an
image requires significantly less space than the image itself.
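
As an illustration of a numeric descriptor, the sketch below computes a global color feature: a normalized joint RGB histogram. The choice of a histogram and of 4 bins per channel is an assumption made for this example; the chapter does not prescribe a particular color descriptor, and the function name `color_histogram` is hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of an iterable of (r, g, b) values in 0..255."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    histogram = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        # Quantize each channel, clamping so that 255 falls in the last bin.
        rq, gq, bq = (min(v // step, top) for v in (r, g, b))
        histogram[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
        count += 1
    if count == 0:
        raise ValueError("the image has no pixels")
    return [value / count for value in histogram]


# Hypothetical 100 x 100 image: two thirds pure red, one third pure blue.
pixels = [(255, 0, 0)] * 6667 + [(0, 0, 255)] * 3333
feature = color_histogram(pixels)

print(len(pixels) * 3, "bytes of raw pixel data")   # 30000
print(len(feature), "histogram components")         # 64
```

The descriptor has 64 components regardless of image size, which illustrates why a feature-level representation is far more compact than the image itself. Applying the same function to the pixels of a subregion would yield a local rather than a global feature.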
The next abstraction level describes the semantics. A semantic-level charac-
terization of photographic images is an extremely complex task, and this field is
characterized by countless open problems. However, in numerous scientific disci-
plines, semantic content can be inferred from the lower abstraction levels. For
example, it is possible to deduce the type of land cover using spectral information
from satellite images [8]; similarly, color, texture, and ancillary one-dimensional
measurements can be used to determine the composition of geologic strata imaged
for oil exploration purposes.
At the highest level, images are often accompanied by metadata. Metadata
may contain information that cannot be obtained directly from the image, as
well as an actual description of the image content. For example, medical images
(Chapter 4) stored for clinical purposes are accompanied by a radiological report,
detailing their relevant characteristics.
It is often natural to describe image content in terms of objects, which can
be defined at one or more abstraction levels. It is often very difficult to identify
objects in photographic images; however, scientific data is often more amenable
to automatic object extraction. For example, a bone section in an MRI scan is a
very dark connected area (feature level), a forest in a remotely sensed image is a
connected area covered by evergreen or deciduous trees (semantic level), and so on.
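
The feature-level definition of an object as a "very dark connected area" can be sketched as thresholding followed by connected-component extraction. The threshold value, the use of 4-connectivity, and the function name `dark_regions` are assumptions for illustration; practical medical segmentation is considerably more involved.

```python
from collections import deque


def dark_regions(image, threshold):
    """Return 4-connected regions (lists of (row, col)) whose values are below threshold."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r0 in range(height):
        for c0 in range(width):
            if seen[r0][c0] or image[r0][c0] >= threshold:
                continue
            # Breadth-first flood fill starting from an unvisited dark pixel.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            region = []
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr][nc]
                            and image[nr][nc] < threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(region)
    return regions


# Hypothetical gray-scale scan: bright background with two dark areas.
scan = [
    [200, 200, 200, 200, 200],
    [200,  10,  12, 200, 200],
    [200,  11, 200, 200,  15],
    [200, 200, 200, 200,  14],
]
regions = dark_regions(scan, threshold=50)
print([len(region) for region in regions])
```

Each returned region is a candidate object; descriptors such as area or shape could then be computed per region and stored for retrieval.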
To support content-based retrieval at any of the abstraction levels, appropriate
quantities that describe the characteristics of interest must be defined, algorithms
to extract these quantities from images must be devised, similarity measures
to support the retrieval must be selected, and indexing techniques to efficiently
search large collections of data must be adopted.
1.4.2 Multidimensional Indexing
Content-based retrieval relies heavily on similarity search over high-dimensional
spaces. More specifically, image features such as color or texture are repre-
sented using multidimensional descriptors that may have tens or hundreds of
components.
The search task can be made significantly more efficient by relying on multidimensional
indexing structures. There is a large variety of multidimensional
indexing methods, which differ in the type of queries they support and the dimen-
sionality of the space where they are advantageous.
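
For reference, the baseline that multidimensional indexing structures aim to improve upon is an exhaustive scan of all descriptors. The sketch below shows a k-nearest-neighbor query by linear scan; the Euclidean distance, the three-component descriptors, and the image identifiers are assumptions chosen for brevity, whereas real descriptors may have tens or hundreds of components.

```python
import heapq
import math


def k_nearest(query, database, k):
    """Return the k (distance, image_id) pairs closest to the query vector.

    Every descriptor in the database is compared with the query, so the cost
    grows linearly with both the collection size and the dimensionality.
    """
    scored = ((math.dist(query, vector), image_id)
              for image_id, vector in database.items())
    return heapq.nsmallest(k, scored)


# Hypothetical feature database: image identifier -> descriptor.
database = {
    "sunset": (0.9, 0.1, 0.0),
    "forest": (0.1, 0.8, 0.1),
    "ocean": (0.0, 0.2, 0.8),
    "dusk": (0.8, 0.1, 0.1),
}

result = k_nearest((1.0, 0.0, 0.0), database, k=2)
print([image_id for _, image_id in result])
```

An indexing structure attempts to answer the same query while examining only a small fraction of the stored descriptors.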
Most existing database management systems (DBMS) do not support multi-
dimensional indexes, and those that do support them usually offer a very limited
selection of such methods. Active research is being conducted on how to provide
mechanisms in the DBMS that would allow users to incorporate the multidimen-
sional indexing structures of choice into the search engine.
1.5 OVERVIEW OF THE BOOK
The chapters that follow are divided into three parts. The first part analyzes
different application areas for digital imagery. Each chapter analyzes the
characteristics of the data and their use, describes the requirements for image
databases, and outlines current research issues. Chapter 2 describes several
applications of visible imagery, including art collections and trademarks.
Chapter 3 is devoted to databases of remotely sensed images. Chapter 4 analyzes
the different types of medical images, discusses standardization efforts, and
describes the structure of work flow–integrated medical image databases.
Chapter 5 describes the different types of data used in oil exploration, the
corresponding acquisition and processing procedures, and their individual and
combined uses. This chapter also describes data and metadata formats, as well
as emerging standards for interoperability between acquisition, transmission,
storage, and retrieval systems.
The second part of the book discusses the major enabling technologies for
image repositories. Chapter 6 describes storage architectures for managing multi-
media collections. Chapter 7 discusses support for image and multimedia types in
database management systems. Chapter 8 is an overview of image compression,
and Chapter 9 describes the transmission of digital imagery.
The third part of the book is devoted to organizing, searching, and retrieving
images. Chapter 10 introduces the concept of content-based search. Chapters 11,
12 and 13 discuss how to represent image content using low-level features (color,
texture and shape, respectively) and how to perform feature-based similarity
search. Feature-based search is very expensive if the image repository is large, and
the use of multidimensional indexing structures and appropriate search strategies
significantly improves the response time. Chapter 14 discusses multidimensional
indexing structures and their application to multimedia database indexing, and
Chapter 15 describes retrieval strategies that efficiently prune the search space.
Chapter 16 discusses how to advantageously use properties of compression algo-
rithms to index images. Chapter 17 covers image retrieval using semantic content.
REFERENCES
1. F. Mintzer. Developing digital libraries of cultural content for Internet access. IEEE
Commun. Mag. 37(1), 72–78 (1999).
2. A.K. Jain, H. Lin, and R. Bolle. On-line fingerprint verification. IEEE Trans. Pattern
Anal. Machine Intell. 19(4), 302–314 (1997).
3. A.K. Jain, H. Lin, S. Pankanti, and R. Bolle. An identity-authentication system using
fingerprints. Proc. IEEE 85(9), 1365–1388 (1997).
4. W. Niblack et al., The QBIC Project: querying images by content using color, texture,
and shape, Proc. SPIE, Storage and Retrieval for Image and Video Databases, 1908,
173–187 (1993).
5. L.D. Bergman, V. Castelli, C.-S. Li, and J.R. Smith. SPIRE, a digital library for
scientific information, Special Issue of IJODL, “in the tradition of Alexandrian
Scholars” 3(1), 85–99 (2000).
6. C.-S. Li, P.S. Yu, and V. Castelli. MALM, a framework for mining sequence databases
at multiple abstraction levels, Proceedings of the 7th International conference on
Information and Knowledge Management CIKM’98, Bethesda, Md. USA, 3–7 1998
pp. 267–272.
7. V. Castelli et al., Progressive search and retrieval in large image archives, IBM J. Res.
Dev. 42(2), 253–268 (1998).
8. V. Castelli, C.-S. Li, J.J. Turek, and I. Kontoyiannis, Progressive classification in the
compressed domain for large EOS satellite databases. Proc. IEEE ICASSP’96 4,
2201–2204 (1996).