DATABASE SYSTEMS (part 17)

Chapter 19 Database Recovery Techniques

Review Questions

19.1. Discuss the different types of transaction failures. What is meant by catastrophic failure?
19.2. Discuss the actions taken by the read_item and write_item operations on a database.
19.3. (Review from Chapter 17) What is the system log used for? What are the typical kinds of entries in a system log? What are checkpoints, and why are they important? What are transaction commit points, and why are they important?
19.4. How are buffering and caching techniques used by the recovery subsystem?
19.5. What are the before image (BFIM) and after image (AFIM) of a data item? What is the difference between in-place updating and shadowing, with respect to their handling of BFIM and AFIM?
19.6. What are UNDO-type and REDO-type log entries?
19.7. Describe the write-ahead logging protocol.
19.8. Identify three typical lists of transactions that are maintained by the recovery subsystem.
19.9. What is meant by transaction rollback? What is meant by cascading rollback? Why do practical recovery methods use protocols that do not permit cascading rollback? Which recovery techniques do not require any rollback?
19.10. Discuss the UNDO and REDO operations and the recovery techniques that use each.
19.11. Discuss the deferred update technique of recovery. What are the advantages and disadvantages of this technique? Why is it called the NO-UNDO/REDO method?
19.12. How can recovery handle transaction operations that do not affect the database, such as the printing of reports by a transaction?
19.13. Discuss the immediate update recovery technique in both single-user and multiuser environments. What are the advantages and disadvantages of immediate update?
19.14. What is the difference between the UNDO/REDO and the UNDO/NO-REDO algorithms for recovery with immediate update? Develop the outline for an UNDO/NO-REDO algorithm.
19.15. Describe the shadow paging recovery technique. Under what circumstances does it not require a log?
19.16. Describe the three phases of the ARIES recovery method.
19.17. What are log sequence numbers (LSNs) in ARIES? How are they used? What information do the Dirty Page Table and Transaction Table contain? Describe how fuzzy checkpointing is used in ARIES.
19.18. What do the terms steal/no-steal and force/no-force mean with regard to buffer management for transaction processing?
19.19. Describe the two-phase commit protocol for multidatabase transactions.
19.20. Discuss how recovery from catastrophic failures is handled.

Exercises

19.21. Suppose that the system crashes before the [read_item, T3, A] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?
19.22. Suppose that the system crashes before the [write_item, T2, D, 25, 26] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?
19.23. Figure 19.7 shows the log corresponding to a particular schedule at the point of a system crash for four transactions T1, T2, T3, and T4. Suppose that we use the immediate update protocol with checkpointing. Describe the recovery process from the system crash. Specify which transactions are rolled back, which operations in the log are redone and which (if any) are undone, and whether any cascading rollback takes place.
19.24. Suppose that we use the deferred update protocol for the example in Figure 19.7. Show how the log would be different in the case of deferred update by removing the unnecessary log entries; then describe the recovery process, using your modified log. Assume that only REDO operations are applied, and specify which operations in the log are redone and which are ignored.
19.25. How does checkpointing in ARIES differ from checkpointing as described in Section 19.1.4?
19.26. How are log sequence numbers used by ARIES to reduce the amount of REDO work needed for recovery?
Illustrate with an example using the information shown in Figure 19.6. You can make your own assumptions as to when a page is written to disk.

[start_transaction, T1]
[read_item, T1, A]
[read_item, T1, D]
[write_item, T1, D, 20, 25]
[commit, T1]
[checkpoint]
[start_transaction, T2]
[read_item, T2, B]
[write_item, T2, B, 12, 18]
[start_transaction, T4]
[read_item, T4, D]
[write_item, T4, D, 25, 15]
[start_transaction, T3]
[write_item, T3, C, 30, 40]
[read_item, T4, A]
[write_item, T4, A, 30, 20]
[commit, T4]
[read_item, T2, D]
[write_item, T2, D, 15, 25]  <-- system crash

FIGURE 19.7 An example schedule and its corresponding log.

19.27. What implications would a no-steal/force buffer management policy have on checkpointing and recovery?

Choose the correct answer for each of the following multiple-choice questions:

19.28. Incremental logging with deferred updates implies that the recovery system must necessarily
    a. store the old value of the updated item in the log.
    b. store the new value of the updated item in the log.
    c. store both the old and new value of the updated item in the log.
    d. store only the Begin Transaction and Commit Transaction records in the log.
19.29. The write-ahead logging (WAL) protocol simply means that
    a. the writing of a data item should be done ahead of any logging operation.
    b. the log record for an operation should be written before the actual data is written.
    c. all log records should be written before a new transaction begins execution.
    d. the log never needs to be written to disk.
19.30. In case of transaction failure under a deferred update incremental logging scheme, which of the following will be needed:
    a. an undo operation.
    b. a redo operation.
    c. an undo and redo operation.
    d. none of the above.
19.31. For incremental logging with immediate updates, a log record for a transaction would contain:
    a. a transaction name, data item name, old value of item, new value of item.
    b. a transaction name, data item name, old value of item.
    c. a transaction name, data item name, new value of item.
    d. a transaction name and a data item name.
19.32. For correct behavior during recovery, undo and redo operations must be
    a. commutative.
    b. associative.
    c. idempotent.
    d. distributive.
19.33. When a failure occurs, the log is consulted and each operation is either undone or redone. This is a problem because
    a. searching the entire log is time consuming.
    b. many redos are unnecessary.
    c. both (a) and (b).
    d. none of the above.
19.34. When using a log-based recovery scheme, it might improve performance as well as provide a recovery mechanism by
    a. writing the log records to disk when each transaction commits.
    b. writing the appropriate log records to disk during the transaction's execution.
    c. waiting to write the log records until multiple transactions commit and writing them as a batch.
    d. never writing the log records to disk.
19.35. There is a possibility of a cascading rollback when
    a. a transaction writes items that have been written only by a committed transaction.
    b. a transaction writes an item that was previously written by an uncommitted transaction.
    c. a transaction reads an item that was previously written by an uncommitted transaction.
    d. both (b) and (c).
19.36. To cope with media (disk) failures, it is necessary
    a. for the DBMS to only execute transactions in a single-user environment.
    b. to keep a redundant copy of the database.
    c. to never abort a transaction.
    d. all of the above.
19.37. If the shadowing approach is used for flushing a data item back to disk, then
    a. the item is written to disk only after the transaction commits.
    b. the item is written to a different location on disk.
    c. the item is written to disk before the transaction commits.
    d. the item is written to the same disk location from which it was read.
Selected Bibliography

The books by Bernstein et al. (1987) and Papadimitriou (1986) are devoted to the theory and principles of concurrency control and recovery. The book by Gray and Reuter (1993) is an encyclopedic work on concurrency control, recovery, and other transaction-processing issues. Verhofstad (1978) presents a tutorial and survey of recovery techniques in database systems. Categorizing algorithms based on their UNDO/REDO characteristics is discussed in Haerder and Reuter (1983) and in Bernstein et al. (1983). Gray (1978) discusses recovery, along with other system aspects of implementing operating systems for databases. The shadow paging technique is discussed in Lorie (1977), Verhofstad (1978), and Reuter (1980). Gray et al. (1981) discuss the recovery mechanism in SYSTEM R. Lockeman and Knutsen (1968), Davies (1972), and Bjork (1973) are early papers that discuss recovery. Chandy et al. (1975) discuss transaction rollback. Lilien and Bhargava (1985) discuss the concept of integrity block and its use to improve the efficiency of recovery.

Recovery using write-ahead logging is analyzed in Jhingran and Khedkar (1992) and is used in the ARIES system (Mohan et al. 1992a). More recent work on recovery includes compensating transactions (Korth et al. 1990) and main memory database recovery (Kumar 1991). The ARIES recovery algorithms (Mohan et al. 1992) have been quite successful in practice. Franklin et al. (1992) discuss recovery in the EXODUS system. Two recent books by Kumar and Hsu (1998) and Kumar and Son (1998) discuss recovery in detail and contain descriptions of recovery methods used in a number of existing relational database products.

OBJECT AND OBJECT-RELATIONAL DATABASES

Chapter 20 Concepts for Object Databases

In this chapter and the next, we discuss object-oriented data models and database systems.¹
Traditional data models and systems, such as relational, network, and hierarchical, have been quite successful in developing the database technology required for many traditional business database applications. However, they have certain shortcomings when more complex database applications must be designed and implemented, for example, databases for engineering design and manufacturing (CAD/CAM and CIM²), scientific experiments, telecommunications, geographic information systems, and multimedia.³ These newer applications have requirements and characteristics that differ from those of traditional business applications, such as more complex structures for objects, longer-duration transactions, new data types for storing images or large textual items, and the need to define nonstandard application-specific operations. Object-oriented databases were proposed to meet the needs of these more complex applications. The object-oriented approach offers the flexibility to handle some of these requirements without being limited by the data types and query languages available in traditional database systems. A key feature of object-oriented databases is the power they give the designer to specify both the structure of complex objects and the operations that can be applied to these objects.

1. These databases are often referred to as Object Databases, and the systems are referred to as Object Database Management Systems (ODBMS). However, because this chapter discusses many general object-oriented concepts, we will use the term object-oriented instead of just object.
2. Computer-Aided Design/Computer-Aided Manufacturing and Computer-Integrated Manufacturing.
3. Multimedia databases must store various types of multimedia objects, such as video, audio, images, graphics, and documents (see Chapter 24).
Another reason for the creation of object-oriented databases is the increasing use of object-oriented programming languages in developing software applications. Databases are now becoming fundamental components in many software systems, and traditional databases were difficult to use with object-oriented software applications that are developed in an object-oriented programming language such as C++, SMALLTALK, or JAVA. Object-oriented databases are designed so they can be directly, or seamlessly, integrated with software that is developed using object-oriented programming languages.

The need for additional data modeling features has also been recognized by relational DBMS vendors, and newer versions of relational systems are incorporating many of the features that were proposed for object-oriented databases. This has led to systems that are characterized as object-relational or extended relational DBMSs (see Chapter 22). The latest version of the SQL standard for relational DBMSs includes some of these features.

Although many experimental prototypes and commercial object-oriented database systems have been created, they have not found widespread use because of the popularity of relational and object-relational systems. The experimental prototypes included the ORION system developed at MCC,⁴ OPENOODB at Texas Instruments, the IRIS system at Hewlett-Packard laboratories, the ODE system at AT&T Bell Labs,⁵ and the ENCORE/ObServer project at Brown University. Commercially available systems included GEMSTONE/OPAL of GemStone Systems, ONTOS of Ontos, Objectivity of Objectivity Inc., Versant of Versant Object Technology, ObjectStore of Object Design, ARDENT of ARDENT Software,⁶ and POET of POET Software. These represent only a partial list of the experimental prototypes and commercial object-oriented database systems that were created.

As commercial object-oriented DBMSs became available, the need for a standard model and language was recognized.
Because the formal procedure for approval of standards normally takes a number of years, a consortium of object-oriented DBMS vendors and users, called ODMG,⁷ proposed a standard that is known as the ODMG-93 standard, which has since been revised. We will describe some features of the ODMG standard in Chapter 21.

Object-oriented databases have adopted many of the concepts that were developed originally for object-oriented programming languages.⁸ In Section 20.1, we examine the origins of the object-oriented approach and discuss how it applies to database systems. Then, in Sections 20.2 through 20.6, we describe the key concepts utilized in many object-oriented database systems. Section 20.2 discusses object identity, object structure, and type constructors. Section 20.3 presents the concepts of encapsulation of operations and definition of methods as part of class declarations, and also discusses the mechanisms for storing objects in a database by making them persistent. Section 20.4 describes type and class hierarchies and inheritance in object-oriented databases, and Section 20.5 provides an overview of the issues that arise when complex objects need to be represented and stored. Section 20.6 discusses additional concepts, including polymorphism, operator overloading, dynamic binding, multiple and selective inheritance, and versioning and configuration of objects.

This chapter presents the general concepts of object-oriented databases, whereas Chapter 21 will present the ODMG standard. The reader may skip Sections 20.5 and 20.6 of this chapter if a less detailed introduction to the topic is desired.

4. Microelectronics and Computer Technology Corporation, Austin, Texas.
5. Now called Lucent Technologies.
6. Formerly O2 of O2 Technology.
7. Object Database Management Group.
8. Similar concepts were also developed in the fields of semantic data modeling and knowledge representation.
20.1 OVERVIEW OF OBJECT-ORIENTED CONCEPTS

This section gives a quick overview of the history and main concepts of object-oriented databases, or OODBs for short. The OODB concepts are then explained in more detail in Sections 20.2 through 20.6. The term object-oriented, abbreviated OO or O-O, has its origins in OO programming languages, or OOPLs. Today OO concepts are applied in the areas of databases, software engineering, knowledge bases, artificial intelligence, and computer systems in general. OOPLs have their roots in the SIMULA language, which was proposed in the late 1960s. In SIMULA, the concept of a class groups together the internal data structure of an object in a class declaration. Subsequently, researchers proposed the concept of abstract data type, which hides the internal data structures and specifies all possible external operations that can be applied to an object, leading to the concept of encapsulation. The programming language SMALLTALK, developed at Xerox PARC⁹ in the 1970s, was one of the first languages to explicitly incorporate additional OO concepts, such as message passing and inheritance. It is known as a pure OO programming language, meaning that it was explicitly designed to be object-oriented. This contrasts with hybrid OO programming languages, which incorporate OO concepts into an already existing language. An example of the latter is C++, which incorporates OO concepts into the popular C programming language.

An object typically has two components: state (value) and behavior (operations). Hence, it is somewhat similar to a program variable in a programming language, except that it will typically have a complex data structure as well as specific operations defined by the programmer.¹⁰ Objects in an OOPL exist only during program execution and are hence called transient objects.
An OO database can extend the existence of objects so that they are stored permanently, and hence the objects persist beyond program termination and can be retrieved later and shared by other programs. In other words, OO databases store persistent objects permanently on secondary storage, and allow the sharing of these objects among multiple programs and applications. This requires the incorporation of other well-known features of database management systems, such as indexing mechanisms, concurrency control, and recovery. An OO database system interfaces with one or more OO programming languages to provide persistent and shared object capabilities.

9. Palo Alto Research Center, Palo Alto, California.
10. Objects have many other characteristics, as we discuss in the rest of this chapter.

One goal of OO databases is to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon. Hence, OO databases provide a unique system-generated object identifier (OID) for each object. We can compare this with the relational model, where each relation must have a primary key attribute whose value identifies each tuple uniquely. In the relational model, if the value of the primary key is changed, the tuple will have a new identity, even though it may still represent the same real-world object. Alternatively, a real-world object may have different names for key attributes in different relations, making it difficult to ascertain that the keys represent the same object (for example, the object identifier may be represented as EMP_ID in one relation and as SSN in another).

Another feature of OO databases is that objects may have an object structure of arbitrary complexity in order to contain all of the necessary information that describes the object.
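The contrast drawn above between a system-generated OID and a value-based primary key can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: ObjectStore, KeyedTable, and all of their method names are hypothetical and do not correspond to any real OODBMS or relational API.

```python
import itertools


class ObjectStore:
    """Toy object store (hypothetical): identity is a system-generated OID."""

    def __init__(self):
        self._next_oid = itertools.count(1)
        self._objects = {}

    def create(self, **state):
        oid = next(self._next_oid)      # assigned once by the system, never changed
        self._objects[oid] = dict(state)
        return oid

    def get(self, oid):
        return self._objects[oid]


class KeyedTable:
    """Toy relation (hypothetical): a tuple is identified only by its key value."""

    def __init__(self, key):
        self._key = key
        self._rows = {}

    def insert(self, **row):
        self._rows[row[self._key]] = dict(row)

    def update_key(self, old, new):
        row = self._rows.pop(old)
        row[self._key] = new
        self._rows[new] = row

    def find(self, key_value):
        return self._rows.get(key_value)


# Object store: the SSN attribute changes, but the OID still reaches the object.
store = ObjectStore()
emp_oid = store.create(ssn="123-45-6789", name="Smith")
store.get(emp_oid)["ssn"] = "987-65-4321"
same_object = store.get(emp_oid)["name"]

# Keyed table: after the key value changes, the old key identifies nothing.
table = KeyedTable(key="ssn")
table.insert(ssn="123-45-6789", name="Smith")
remembered_key = "123-45-6789"          # e.g. a reference held by another tuple
table.update_key("123-45-6789", "987-65-4321")
dangling = table.find(remembered_key)   # None
```

In this sketch the employee remains reachable through emp_oid after its SSN is edited, whereas the reference that remembered the old key value finds nothing once the key is updated.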
In traditional database systems, by contrast, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation.

The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object. Hence, an instance variable is similar to the concept of an attribute in the relational model, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users. Instance variables may also be of arbitrarily complex data types. Object-oriented systems allow definition of the operations or functions (behavior) that can be applied to objects of a particular type. In fact, some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects. This rigid approach has been relaxed in most OO data models for several reasons. First, the database user often needs to know the attribute names so they can specify selection conditions on the attributes to retrieve specific objects. Second, complete encapsulation implies that any simple retrieval requires a predefined operation, thus making ad hoc queries difficult to specify on the fly.

To encourage encapsulation, an operation is defined in two parts. The first part, called the signature or interface of the operation, specifies the operation name and arguments (or parameters). The second part, called the method or body, specifies the implementation of the operation. Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation.
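The split between signature and method, and invocation by message passing, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the Account interface, its two implementations, and the send helper are hypothetical names chosen for the example, not part of any OO data model or standard.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Signature (interface): operation names and parameters only."""

    @abstractmethod
    def deposit(self, amount):
        ...

    @abstractmethod
    def balance(self):
        ...


class ListBackedAccount(Account):
    """One possible method (body): keep every deposit in a list."""

    def __init__(self):
        self._deposits = []             # internal state, hidden from callers

    def deposit(self, amount):
        self._deposits.append(amount)

    def balance(self):
        return sum(self._deposits)


class TotalBackedAccount(Account):
    """A different body for the same signature: keep only a running total."""

    def __init__(self):
        self._total = 0

    def deposit(self, amount):
        self._total += amount

    def balance(self):
        return self._total


def send(obj, operation, *args):
    """Pass a message (operation name plus parameters); the object runs its own method."""
    return getattr(obj, operation)(*args)


def client(account):
    # The client knows only the signature, never the internal structure.
    send(account, "deposit", 40)
    send(account, "deposit", 2)
    return send(account, "balance")


results = [client(ListBackedAccount()), client(TotalBackedAccount())]
```

The client code is identical for both classes; only the hidden internal structure and the method bodies differ.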
This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations. Hence, encapsulation provides a form of data and operation independence (see Chapter 2).

Another key concept in OO systems is that of type and class hierarchies and inheritance. This permits specification of new types or classes that inherit much of their structure and/or operations from previously defined types or classes. Hence, specification of object types can proceed systematically. This makes it easier to develop the data types of a system incrementally, and to reuse existing type definitions when creating new types of objects.

One problem in early OO database systems involved representing relationships among objects. The insistence on complete encapsulation in early OO data models led to the argument that relationships should not be explicitly represented, but should instead be described by defining appropriate methods that locate related objects. However, this approach does not work very well for complex databases with many relationships, because it is useful to identify these relationships and make them visible to users. The ODMG standard has recognized this need and it explicitly represents binary relationships via a pair of inverse references, that is, by placing the OIDs of related objects within the objects themselves, and maintaining referential integrity, as we shall describe in Chapter 21.

Some OO systems provide capabilities for dealing with multiple versions of the same object, a feature that is essential in design and engineering applications. For example, an old version of an object that represents a tested and verified design should be retained until the new version is tested and verified.
A new version of a complex object may include only a few new versions of its component objects, whereas other components remain unchanged. In addition to permitting versioning, OO databases should also allow for schema evolution, which occurs when type declarations are changed or when new types or relationships are created. These two features are not specific to OODBs and should ideally be included in all types of DBMSs.¹¹

Another OO concept is operator overloading, which refers to an operation's ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator polymorphism. For example, an operation to calculate the area of a geometric object may differ in its method (implementation), depending on whether the object is of type triangle, circle, or rectangle. This may require the use of late binding of the operation name to the appropriate method at run-time, when the type of object to which the operation is applied becomes known.

This section provided an overview of the main concepts of OO databases. In Sections 20.2 through 20.6, we discuss these concepts in more detail.

20.2 OBJECT IDENTITY, OBJECT STRUCTURE, AND TYPE CONSTRUCTORS

In this section we first discuss the concept of object identity, and then we present the typical structuring operations for defining the structure of the state of an object. These structuring operations are often called type constructors. They define basic data-structuring operations that can be combined to form complex object structures.

11. Several schema evolution operations, such as ALTER TABLE, are already defined in the relational SQL standard (see Section 8.3).

[...]
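Returning to the operator overloading example from Section 20.1, the area operation over triangles, circles, and rectangles can be sketched in Python. This is a hedged illustration only: the class and attribute names are assumptions made for the example, and Python's ordinary dynamic dispatch stands in for the late binding described in the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Triangle:
    base: float
    height: float

    def area(self):
        return 0.5 * self.base * self.height


@dataclass
class Circle:
    radius: float

    def area(self):
        return math.pi * self.radius ** 2


@dataclass
class Rectangle:
    width: float
    height: float

    def area(self):
        return self.width * self.height


# One operation name, three implementations. Which method runs is decided
# at run time, once the type of each object in the list is known.
shapes = [Triangle(4.0, 3.0), Circle(1.0), Rectangle(2.0, 5.0)]
areas = [shape.area() for shape in shapes]
```

The loop never inspects the type of a shape; the binding of the name area to a concrete method happens at each call.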
... and systems. It is also related to the concepts of abstract data types and information hiding in programming languages. In traditional database models and systems, this concept was not applied, since it is customary to make the structure of database objects visible to users and external programs. In these traditional models, a number of standard database ...

... collections for the same class definition, if desired.

21. Some systems, such as POET, automatically create the extent for a class.

20.4 TYPE AND CLASS HIERARCHIES AND INHERITANCE

Another main characteristic of OO database systems is that they allow type hierarchies and inheritance. Type hierarchies in databases usually imply a constraint on the extents corresponding ...

... standard, which differs from the model discussed here.

20.4.1 Type Hierarchies and Inheritance

In most database applications, there are numerous objects of the same type or class. Hence, OO databases must provide a capability for classifying objects based on their type, as do other database systems. But in OO databases, a further requirement is that the system permit the definition of new types based on other ...

20.7 SUMMARY

In this chapter we discussed the concepts of the object-oriented approach to database systems, which was proposed to meet the needs of complex database applications and to add database functionality to object-oriented programming languages such as C++. We first discussed the main concepts used in OO databases, which include the following:

• Object identity: Objects have unique identities ...
Object-oriented database concepts are an amalgam of concepts from OO programming languages and from database systems and conceptual data models. A number of textbooks describe OO programming languages, for example, Stroustrup (1986) and Pohl (1991) for C++, and Goldberg (1989) for SMALLTALK. Recent books by Cattell (1994) and Lausen and Vossen (1997) describe OO database concepts. There is a vast bibliography on OO databases, ...

... illustrated in Figure 20.4. All such names given to objects must be unique within a particular database. Hence, the named persistent objects are used as entry points to the database through which users and applications can start their database access. Obviously, it is not practical to give names to all objects in a large database that includes thousands of objects, so most objects are made persistent by using ...

... the end of Chapter 21.

Chapter 21 Object Database Standards, Languages, and Design

As we discussed at the beginning of Chapter 8, having a standard for a particular type of database system is very important, because it provides support for portability of database applications. Portability is generally defined as the capability to execute a particular application program on different systems with minimal modifications ...

... object database instead of object-oriented database (as in the previous chapter), since this is now more commonly accepted terminology.

A second potential advantage of having and adhering to standards is that it helps in achieving interoperability, which generally refers to the ability of an application to access multiple distinct systems ...
... object-oriented databases by many researchers. Following the description of the ODMG model, we will describe a technique for object database conceptual design in Section 21.5. We will discuss how object-oriented databases differ from relational databases and show how to map a conceptual database design in the EER model to the ODL statements of the ODMG model. The reader may skip Sections 21.3 through 21.7 if ...

Indeed, some OO systems do not permit multiple inheritance at all. Selective inheritance occurs when a subtype inherits only some of the functions of a supertype. Other functions are not inherited. In this case, an EXCEPT clause may be used to list the functions in a supertype that are not to be inherited by the subtype. The mechanism of selective inheritance is not typically provided in OO database systems, ...
