Chapter 22 Object-Relational and Extended-Relational Systems

22.5 IMPLEMENTATION AND RELATED ISSUES FOR EXTENDED TYPE SYSTEMS

There are various implementation issues regarding the support of an extended type system with associated functions (operations). We briefly summarize them here.[13]

• The ORDBMS must dynamically link a user-defined function into its address space only when it is required. As we saw in the case of the two ORDBMSs, numerous functions are required to operate on two- or three-dimensional spatial data, images, text, and so on. With static linking of all function libraries, the DBMS address space may increase by an order of magnitude. Dynamic linking is available in the two ORDBMSs that we studied.
• Client-server issues deal with the placement and activation of functions. If the server needs to perform a function, it is best to do so in the DBMS address space rather than remotely, because remote execution incurs a large amount of overhead. If the function demands computation that is too intensive, or if the server is attending to a very large number of clients, the server may ship the function to a separate client machine. For security reasons, it is better to run functions at the client using the user ID of the client. In the future, functions are likely to be written in interpreted languages such as Java.
• It should be possible to run queries inside functions. A function must operate the same way whether it is used from an application through the application program interface (API) or invoked by the DBMS as part of executing an SQL statement in which the function is embedded. Systems should support a nesting of these "callbacks."
• Because of the variety of data types and associated operators in an ORDBMS, efficient storage and access of the data is important. For spatial or multidimensional data, new storage structures such as R-trees, quad trees, or grid files may be used.
The ORDBMS must allow new types to be defined with new access structures. Dealing with large text strings or binary files also opens up a number of storage and search options. It should be possible to explore such new options by defining new data types within the ORDBMS.

Other Issues Concerning Object-Relational Systems. In the above discussion of Informix Universal Server and Oracle 8, we have concentrated on how an ORDBMS extends the relational model. We discussed the features and facilities it provides to operate on relational data stored as tables as if it were an object database. There are other obvious problems to consider in the context of an ORDBMS:

• Object-relational database design: We described a procedure for designing object schemas in Section 21.5. Object-relational design is more complicated because we have to consider not only the underlying design considerations of application semantics and dependencies in the relational data model (which we discussed in Chapters 10 and 11) but also the object-oriented nature of the extended features that we have just discussed.
• Query processing and optimization: Extending SQL with functions and rules compounds this problem beyond the query optimization overview that we discuss for the relational model in Chapter 15.
• Interaction of rules with transactions: Rule processing as implied in SQL covers more than just the update-update rules (see Section 24.1), which are implemented in RDBMSs as triggers. Moreover, RDBMSs currently implement only immediate execution of triggers. A deferred execution of triggers involves additional processing.

[13] This discussion is derived largely from Stonebraker and Moore (1996).

22.6 THE NESTED RELATIONAL MODEL

To complete this discussion, we summarize in this section an approach that proposes the use of nested tables, also known as nonnormal form relations.
No commercial DBMS has chosen to implement this concept in its original form. The nested relational model removes the restriction of first normal form (1NF, see Chapter 11) from the basic relational model, and thus is also known as the Non-1NF or Non-First Normal Form (NFNF) or NF² relational model. In the basic relational model (also called the flat relational model), attributes are required to be single-valued and to have atomic domains. The nested relational model allows composite and multivalued attributes, thus leading to complex tuples with a hierarchical structure. This is useful for representing objects that are naturally hierarchically structured. In Figure 22.1, part (a) shows a nested relation schema DEPT based on part of the COMPANY database, and part (b) gives an example of a Non-1NF tuple in DEPT.

To define the DEPT schema as a nested structure, we can write the following:

    dept = (dno, dname, manager, employees, projects, locations)
    employees = (ename, dependents)
    projects = (pname, ploc)
    locations = (dloc)
    dependents = (dname, age)

First, all attributes of the DEPT relation are defined. Next, any nested attributes of DEPT (namely, EMPLOYEES, PROJECTS, and LOCATIONS) are themselves defined. Next, any second-level nested attributes, such as DEPENDENTS of EMPLOYEES, are defined, and so on. All attribute names must be distinct in the nested relation definition. Notice that a nested attribute is typically a multivalued composite attribute, thus leading to a "nested relation" within each tuple. For example, the value of the PROJECTS attribute within each DEPT tuple is a relation with two attributes (PNAME, PLOC). In the DEPT tuple of Figure 22.1b, the PROJECTS attribute contains three tuples as its value. Other nested attributes may be multivalued simple attributes, such as LOCATIONS of DEPT.
It is also possible to have a nested attribute that is single-valued and composite, although most nested relational models treat such an attribute as though it were multivalued.

[FIGURE 22.1 Illustrating a nested relation. (a) DEPT schema. (b) Example of a Non-1NF tuple of DEPT. (c) Tree representation of DEPT schema.]

When a nested relational database schema is defined, it consists of a number of external relation schemas; these define the top level of the individual nested relations. In addition, nested attributes are called internal relation schemas, since they define relational structures that are nested inside another relation. In our example, DEPT is the only external relation. All the others (EMPLOYEES, PROJECTS, LOCATIONS, and DEPENDENTS) are internal relations. Finally, simple attributes appear at the leaf level and are not nested.

We can represent each relation schema by means of a tree structure, as shown in Figure 22.1c, where the root is an external relation schema, the leaves are simple attributes, and the internal nodes are internal relation schemas. Notice the similarity between this representation and a hierarchical schema (see Appendix E) and XML (see Chapter 26).

It is important to be aware that the three first-level nested relations in DEPT represent independent information. Hence, EMPLOYEES represents the employees working for the department, PROJECTS represents the projects controlled by the department, and LOCATIONS represents the various department locations.
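As a rough illustration of the terminology above, the following Python sketch represents the DEPT schema as a tree and one DEPT tuple as nested data. The dictionary layout, the helper names internal_relations and leaf_paths, and the sample values (loosely modeled on Figure 22.1b, with the pairing of values assumed) are choices made for this example only; they are not part of the nested relational model itself.

```python
# Sketch only: a nested relation schema as a tree. A value of None marks a
# simple (leaf) attribute; a dict marks an internal (nested) relation schema.
DEPT_SCHEMA = {
    "dno": None,
    "dname": None,
    "manager": None,
    "employees": {
        "ename": None,
        "dependents": {"dname": None, "age": None},
    },
    "projects": {"pname": None, "ploc": None},
    "locations": {"dloc": None},
}


def internal_relations(schema):
    """Return the names of all internal relation schemas, depth-first."""
    names = []
    for attr, sub in schema.items():
        if sub is not None:
            names.append(attr)
            names.extend(internal_relations(sub))
    return names


def leaf_paths(schema, prefix=""):
    """Return dotted paths to every simple attribute at the leaf level."""
    paths = []
    for attr, sub in schema.items():
        path = prefix + attr
        if sub is None:
            paths.append(path)
        else:
            paths.extend(leaf_paths(sub, path + "."))
    return paths


# One Non-1NF tuple of DEPT. The values are illustrative assumptions.
dept_tuple = {
    "dno": 4,
    "dname": "Administration",
    "manager": "Wallace",
    "employees": [
        {"ename": "Zelaya", "dependents": [{"dname": "Thomas", "age": 8},
                                           {"dname": "Jennifer", "age": 6}]},
        {"ename": "Wallace", "dependents": [{"dname": "Jack", "age": 18},
                                            {"dname": "Robert", "age": 15},
                                            {"dname": "Mary", "age": 10}]},
        {"ename": "Jabbar", "dependents": []},
    ],
    "projects": [
        {"pname": "New benefits", "ploc": "Stafford"},
        {"pname": "computerization", "ploc": "Stafford"},
        {"pname": "PhoneSystem", "ploc": "Greenway"},
    ],
    "locations": [{"dloc": "Stafford"}, {"dloc": "Greenway"}],
}

print(internal_relations(DEPT_SCHEMA))
# ['employees', 'dependents', 'projects', 'locations']
print(leaf_paths(DEPT_SCHEMA))
```

In this representation each multivalued nested attribute is a list of dictionaries, so the value of "projects" is itself a small relation with three tuples, matching the description of the PROJECTS attribute.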
The relationship between EMPLOYEES and PROJECTS is not represented in the schema; this is an M:N relationship, which is difficult to represent in a hierarchical structure.

Extensions to the relational algebra and to the relational calculus, as well as to SQL, have been proposed for nested relations. The interested reader is referred to the selected bibliography at the end of this chapter for details. Here, we illustrate two operations, NEST and UNNEST, that can be used to augment standard relational algebra operations for converting between nested and flat relations. Consider the flat EMP_PROJ relation of Figure 11.4, and suppose that we project it over the attributes SSN, PNUMBER, HOURS, ENAME as follows:

    $$\mathrm{EMP\_PROJ\_FLAT} \leftarrow \pi_{\mathrm{SSN},\ \mathrm{ENAME},\ \mathrm{PNUMBER},\ \mathrm{HOURS}}(\mathrm{EMP\_PROJ})$$

To create a nested version of this relation, where one tuple exists for each employee and the (PNUMBER, HOURS) are nested, we use the NEST operation as follows:

    $$\mathrm{EMP\_PROJ\_NESTED} \leftarrow \mathrm{NEST}_{\mathrm{PROJS} = (\mathrm{PNUMBER},\ \mathrm{HOURS})}(\mathrm{EMP\_PROJ\_FLAT})$$

The effect of this operation is to create an internal nested relation PROJS = (PNUMBER, HOURS) within the external relation EMP_PROJ_NESTED. Hence, NEST groups together the tuples with the same value for the attributes that are not specified in the NEST operation; these are the SSN and ENAME attributes in our example. For each such group, which represents one employee in our example, a single nested tuple is created with an internal nested relation PROJS = (PNUMBER, HOURS). Hence, the EMP_PROJ_NESTED relation looks like the EMP_PROJ relation shown in Figure 11.9a and b.

Notice the similarity between nesting and grouping for aggregate functions. In the former, each group of tuples becomes a single nested tuple; in the latter, each group becomes a single summary tuple after an aggregate function is applied to the group. The UNNEST operation is the inverse of NEST.
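The grouping behavior of NEST and the flattening behavior of UNNEST can be sketched in a few lines of Python. This is a minimal illustration that assumes a relation is held as a list of dictionaries; the function names nest and unnest, the lowercase attribute names, and the sample rows are hypothetical and are not taken from Figure 11.4.

```python
def nest(rows, nested_name, nested_attrs):
    """Group rows on every attribute NOT in nested_attrs and collect the
    nested_attrs values of each group into one internal relation."""
    groups = {}  # grouping key -> list of inner tuples (insertion-ordered)
    for row in rows:
        # Sorting by attribute name makes the key independent of dict order.
        key = tuple(sorted((a, v) for a, v in row.items()
                           if a not in nested_attrs))
        inner = {a: row[a] for a in nested_attrs}
        groups.setdefault(key, []).append(inner)
    return [{**dict(key), nested_name: inner} for key, inner in groups.items()]


def unnest(rows, nested_name):
    """Flatten the internal relation nested_name back into its components."""
    flat = []
    for row in rows:
        outer = {a: v for a, v in row.items() if a != nested_name}
        for inner in row[nested_name]:
            flat.append({**outer, **inner})
    return flat


# Hypothetical flat relation with attributes (ssn, ename, pnumber, hours).
emp_proj_flat = [
    {"ssn": "123456789", "ename": "Smith", "pnumber": 1, "hours": 32.5},
    {"ssn": "123456789", "ename": "Smith", "pnumber": 2, "hours": 7.5},
    {"ssn": "453453453", "ename": "English", "pnumber": 1, "hours": 20.0},
    {"ssn": "453453453", "ename": "English", "pnumber": 2, "hours": 20.0},
]

emp_proj_nested = nest(emp_proj_flat, "projs", ("pnumber", "hours"))
for t in emp_proj_nested:
    print(t["ssn"], t["ename"], t["projs"])

# Round trip: flattening the nested relation recovers the flat tuples.
print(unnest(emp_proj_nested, "projs") == emp_proj_flat)
```

One design point worth noting: in this sketch a nested tuple whose internal relation is empty contributes no tuples to the result of unnest, so the round trip is only guaranteed when every group has at least one inner tuple, as is always the case for the output of nest.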
We can reconvert EMP_PROJ_NESTED to EMP_PROJ_FLAT as follows:

    $$\mathrm{EMP\_PROJ\_FLAT} \leftarrow \mathrm{UNNEST}_{\mathrm{PROJS} = (\mathrm{PNUMBER},\ \mathrm{HOURS})}(\mathrm{EMP\_PROJ\_NESTED})$$

Here, the PROJS nested attribute is flattened into its components PNUMBER, HOURS.

22.7 SUMMARY

In this chapter, we first gave an overview of the object-oriented features in SQL-99, which are applicable to object-relational systems. Then we discussed the history and current trends in database management systems that led to the development of object-relational DBMSs (ORDBMSs). We then focused on some of the features of Informix Universal Server and of Oracle 8 in order to illustrate how commercial RDBMSs are being extended with object features. Other commercial RDBMSs are providing similar extensions. We saw that these systems also provide Data Blades (Informix) or Cartridges (Oracle) that provide specific type extensions for newer application domains, such as spatial, time series, or text/document databases. Because of the extendibility of ORDBMSs, these packages can be included as abstract data type (ADT) libraries whenever the users need to implement the types of applications they support. Users can also implement their own extensions as needed by using the ADT facilities of these systems. We briefly discussed some implementation issues for ADTs. Finally, we gave an overview of the nested relational model, which extends the flat relational model with hierarchically structured complex objects.

Selected Bibliography

The references provided for the object-oriented database approach in Chapters 11 and 12 are also relevant for object-relational systems. Stonebraker and Moore (1996) provides a comprehensive reference for object-relational DBMSs. The discussion of concepts related to Illustra in that book is mostly applicable to the current Informix Universal Server. Kim (1995) discusses many issues related to modern database systems that include object orientation.
For the most current information on Informix and Oracle, consult their Web sites: www.informix.com and www.oracle.com, respectively. The SQL3 standard is described in various publications of the ISO WG3 (Working Group 3) reports; for example, see Kulkarni et al. (1995) and Melton et al. (1991). An excellent tutorial on SQL3 was given at the Very Large Data Bases Conference by Melton and Mattos (1996). Ullman and Widom (1997) have a good discussion of SQL3 with examples. For issues related to rules and triggers, Widom and Ceri (1995) have a collection of chapters on active databases. Some comparative studies, for example Ketabchi et al. (1990), compare relational DBMSs with object DBMSs; their conclusion shows the superiority of the object-oriented approach for nonconventional applications. The nested relational model is discussed in Schek and Scholl (1985), Jaeshke and Schek (1982), Chen and Kambayashi (1991), and Makinouchi (1977), among others. Algebras and query languages for nested relations are presented in Paredaens and VanGucht (1992), Pistor and Andersen (1986), Roth et al. (1988), and Ozsoyoglu et al. (1987), among others. Implementation of prototype nested relational systems is described in Dadam et al. (1986), Deshpande and VanGucht (1988), and Schek and Scholl (1989).

7 FURTHER TOPICS

Chapter 23 Database Security and Authorization

This chapter discusses the techniques used for protecting the database against persons who are not authorized to access either certain parts of a database or the whole database. Section 23.1 provides an introduction to security issues and the threats to databases and an overview of the countermeasures that are covered in the rest of this chapter. Section 23.2 discusses the mechanisms used to grant and revoke privileges in relational database systems and in SQL, mechanisms that are often referred to as discretionary access control.
Section 23.3 offers an overview of the mechanisms for enforcing multiple levels of security, a more recent concern in database system security that is known as mandatory access control. It also introduces the more recently developed strategy of role-based access control. Section 23.4 briefly discusses the security problem in statistical databases. Section 23.5 introduces flow control and mentions problems associated with covert channels. Section 23.6 is a brief summary of encryption and public key infrastructure schemes. Section 23.7 summarizes the chapter. Readers who are interested only in basic database security mechanisms will find it sufficient to cover the material in Sections 23.1 and 23.2.

23.1 INTRODUCTION TO DATABASE SECURITY ISSUES

23.1.1 Types of Security

Database security is a very broad area that addresses many issues, including the following:

• Legal and ethical issues regarding the right to access certain information. Some information may be deemed to be private and cannot be accessed legally by unauthorized persons. In the United States, there are numerous laws governing privacy of information.
• Policy issues at the governmental, institutional, or corporate level as to what kinds of information should not be made publicly available, for example, credit ratings and personal medical records.
• System-related issues such as the system levels at which various security functions should be enforced, for example, whether a security function should be handled at the physical hardware level, the operating system level, or the DBMS level.
• The need in some organizations to identify multiple security levels and to categorize the data and users based on these classifications, for example, top secret, secret, confidential, and unclassified. The security policy of the organization with respect to permitting access to various classifications of data must be enforced.

Threats to Databases.
Threats to databases result in the loss or degradation of some or all of the following security goals: integrity, availability, and confidentiality.

• Loss of integrity: Database integrity refers to the requirement that information be protected from improper modification. Modification of data includes creation, insertion, modification, changing the status of data, and deletion. Integrity is lost if unauthorized changes are made to the data by either intentional or accidental acts. If the loss of system or data integrity is not corrected, continued use of the contaminated system or corrupted data could result in inaccuracy, fraud, or erroneous decisions.
• Loss of availability: Database availability refers to making objects available to a human user or a program to which they have a legitimate right.
• Loss of confidentiality: Database confidentiality refers to the protection of data from unauthorized disclosure. The impact of unauthorized disclosure of confidential information can range from violation of the Data Privacy Act to the jeopardization of national security. Unauthorized, unanticipated, or unintentional disclosure could result in loss of public confidence, embarrassment, or legal action against the organization.

To protect databases against these types of threats, four kinds of countermeasures can be implemented: access control, inference control, flow control, and encryption. We discuss each of these in this chapter.

In a multiuser database system, the DBMS must provide techniques to enable certain users or user groups to access selected portions of a database without gaining access to the rest of the database. This is particularly important when a large integrated database is to be used by many different users within the same organization. For example, sensitive information such as employee salaries or performance reviews should be kept confidential from most of the database system's users.
A DBMS typically includes a database security and authorization subsystem that is responsible for ensuring the security of portions of a database against unauthorized access. It is now customary to refer to two types of database security mechanisms:

• Discretionary security mechanisms: These are used to grant privileges to users, including the capability to access specific data files, records, or fields in a specified mode (such as read, insert, delete, or update).
• Mandatory security mechanisms: These are used to enforce multilevel security by classifying the data and users into various security classes (or levels) and then implementing the appropriate security policy of the organization. For example, a typical security policy is to permit users at a certain classification level to see only the data items classified at the user's own (or lower) classification level. An extension of this is role-based security, which enforces policies and privileges based on the concept of roles.

We discuss discretionary security in Section 23.2 and mandatory and role-based security in Section 23.3.

A second security problem common to all computer systems is that of preventing unauthorized persons from accessing the system itself, either to obtain information or to make malicious changes in a portion of the database. The security mechanism of a DBMS must include provisions for restricting access to the database system as a whole. This function is called access control and is handled by creating user accounts and passwords to control the login process by the DBMS. We discuss access control techniques in Section 23.1.3.

A third security problem associated with databases is that of controlling the access to a statistical database, which is used to provide statistical information or summaries of values based on various criteria.
For example, a database for population statistics may provide statistics based on age groups, income levels, size of household, education levels, and other criteria. Statistical database users such as government statisticians or market research firms are allowed to access the database to retrieve statistical information about a population but not to access the detailed confidential information on specific individuals. Security for statistical databases must ensure that information on individuals cannot be accessed. It is sometimes possible to deduce or infer certain facts concerning individuals from queries that involve only summary statistics on groups; consequently, this must not be permitted either. This problem, called statistical database security, is discussed briefly in Section 23.4. The corresponding countermeasures are called inference control measures.

Another security issue is that of flow control, which prevents information from flowing in such a way that it reaches unauthorized users. It is discussed in Section 23.5. Channels that are pathways for information to flow implicitly in ways that violate the security policy of an organization are called covert channels. We briefly discuss some issues related to covert channels in Section 23.5.1.

A final security issue is data encryption, which is used to protect sensitive data (such as credit card numbers) that is being transmitted via some type of communications network. Encryption can be used to provide additional protection for sensitive portions of a database as well. The data is encoded using some coding algorithm. An unauthorized user who accesses encoded data will have difficulty deciphering it, but authorized users are given decoding or decrypting algorithms (or keys) to decipher the data. Encrypting techniques that are very difficult to decode without a key have been developed for military applications.
Section 23.6 briefly discusses encryption techniques, including popular techniques such as public key encryption, which is heavily used to support Web-based transactions against databases, and digital signatures, which are used in personal communications.

A complete discussion of security in computer systems and databases is outside the scope of this textbook. We give only a brief overview of database security techniques here. The interested reader can refer to several of the references discussed in the selected bibliography at the end of this chapter for a more comprehensive discussion.

23.1.2 Database Security and the DBA

As we discussed in Chapter 1, the database administrator (DBA) is the central authority for managing a database system. The DBA's responsibilities include granting privileges to users who need to use the system and classifying users and data in accordance with the policy of the organization. The DBA has a DBA account in the DBMS, sometimes called a system or superuser account, which provides powerful capabilities that are not made available to regular database accounts and users.[1] DBA-privileged commands include commands for granting and revoking privileges to individual accounts, users, or user groups and for performing the following types of actions:

1. Account creation: This action creates a new account and password for a user or a group of users to enable access to the DBMS.
2. Privilege granting: This action permits the DBA to grant certain privileges to certain accounts.
3. Privilege revocation: This action permits the DBA to revoke (cancel) certain privileges that were previously given to certain accounts.
4. Security level assignment: This action consists of assigning user accounts to the appropriate security classification level.

The DBA is responsible for the overall security of the database system.
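The four DBA actions listed above can be pictured with a small in-memory model. The sketch below is purely illustrative: the class name AuthorizationSketch, its method names, and the representation of a privilege as an (account, object, mode) triple are assumptions made for this example and do not correspond to the interface of any particular DBMS. The classification levels are the four named in Section 23.1.1.

```python
# Classification levels from lowest to highest (as named in Section 23.1.1).
LEVELS = ("unclassified", "confidential", "secret", "top secret")


class AuthorizationSketch:
    """Toy model of the four DBA actions; not a real DBMS interface."""

    def __init__(self):
        self.accounts = set()    # action 1: known accounts
        self.privileges = set()  # actions 2 and 3: (account, object, mode)
        self.levels = {}         # action 4: account -> classification level

    def _require(self, account):
        if account not in self.accounts:
            raise ValueError(f"unknown account: {account}")

    def create_account(self, account):
        # Password handling is deliberately omitted from this sketch.
        self.accounts.add(account)

    def grant(self, account, obj, mode):
        self._require(account)
        self.privileges.add((account, obj, mode))

    def revoke(self, account, obj, mode):
        self._require(account)
        self.privileges.discard((account, obj, mode))

    def assign_level(self, account, level):
        self._require(account)
        if level not in LEVELS:
            raise ValueError(f"unknown classification level: {level}")
        self.levels[account] = level

    def is_allowed(self, account, obj, mode):
        """Discretionary check: was this exact privilege granted?"""
        return (account, obj, mode) in self.privileges


auth = AuthorizationSketch()
auth.create_account("A1")                 # action 1
auth.grant("A1", "EMPLOYEE", "read")      # action 2
print(auth.is_allowed("A1", "EMPLOYEE", "read"))
auth.revoke("A1", "EMPLOYEE", "read")     # action 3
print(auth.is_allowed("A1", "EMPLOYEE", "read"))
auth.assign_level("A1", "secret")         # action 4
```

Keeping privileges as a set of triples makes granting and revoking simple set operations; a real authorization subsystem would also have to record who granted each privilege, which this sketch does not attempt.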
Action 1 in the preceding list is used to control access to the DBMS as a whole, whereas actions 2 and 3 are used to control discretionary database authorization, and action 4 is used to control mandatory authorization.

23.1.3 Access Protection, User Accounts, and Database Audits

Whenever a person or a group of persons needs to access a database system, the individual or group must first apply for a user account. The DBA will then create a new account

[1] This account is similar to the root or superuser accounts that are given to computer system administrators, allowing access to restricted operating system commands.

[...]

... security in databases and computer systems in general, including the books by Leiss (1982a) and Fernandez et al. (1981). Denning and Denning (1979) is a tutorial paper on data security. Many papers discuss different techniques for the design and protection of statistical databases. These include McLeish (1989), Chin and Ozsoyoglu (1981), Leiss (1982), Wong (1984), and Denning (1980). Ghosh (1984) discusses ...

... systems is mathematical logic, such rules are often referred to as logic databases. Other types of systems, referred to as expert database systems or knowledge-based systems, also incorporate reasoning and inferencing capabilities; such systems use techniques that were developed in the field of artificial intelligence, including semantic networks, frames, production systems, or rules for capturing domain-specific ...

... access the database. Section 24.3 will give a brief overview of spatial and multimedia databases. Spatial databases provide concepts for databases that keep track of objects in a multidimensional space. For example, cartographic databases that store maps include two-dimensional spatial positions of their objects, which include countries, states, rivers, cities, roads, seas, and so on. Other databases, ...

... deductive databases,
an area that is at the intersection of databases, logic, and artificial intelligence or knowledge bases. A deductive database system is a database system that includes capabilities to define (deductive) rules, which can deduce or infer additional information from the facts that are stored in a database. Because part of the theoretical foundation for some deductive database systems ...

... canceled, the corresponding record must be deleted from the table. The database system must also keep track of all operations on the database that are applied by a certain user throughout each login session, which consists of the sequence of database interactions that a user performs from the time of logging in to the time of logging off. When a user logs in, the DBMS can record the user's account number ...

... responsibility of a database management system to ensure the confidentiality of information about individuals, while still providing useful statistical summaries of data about those individuals to users. Provision of privacy protection of users in a statistical database is paramount; its violation is illustrated in the following example. In some cases it is possible to infer the values of individual tuples ...

... that we discussed in Chapters 17 and 18 prevent concurrent writing of the information by users with different security levels into the same objects, preventing the storage-type covert channels. Operating systems and distributed databases provide control over the multiprogramming of operations that allow a sharing of resources without the possibility of encroachment of one program or process into another's ...
... combination of two of the fundamental building blocks of encryption: substitution and permutation (transposition). The algorithm derives its strength from repeated application of these two techniques for a total of 16 cycles. Plaintext (the original form of the message) is encrypted as blocks of 64 bits. Although the key is 64 bits long, in effect the key can be any 56-bit number. After questioning the adequacy of ...

... and a unique secret number of the signer. The verifier of the signature, however, should not need to know any secret number. Public key techniques are the best means of creating digital signatures with these properties.

23.7 SUMMARY

This chapter discussed several techniques for enforcing security in database systems. It presented the different threats to databases in terms of loss of integrity, availability, ...

... some general design and implementation issues for active databases. We then give examples of how active databases are implemented in the STARBURST experimental DBMS in Section 24.1.3, since STARBURST provides for many of the concepts of generalized active databases within its framework. Section 24.1.4 discusses possible applications of active databases. Finally, Section 24.1.5 describes how triggers are ...