Document: Managing Time in Relational Databases - P11


appears in it as a non-key column. Wellness program name is left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are augmented and modified, as required.

Wellpgmcat_cd appears in the logical data model as a foreign key to the Wellness Program table, and so the AVF must convert it into a temporal foreign key. The foreign key declaration is dropped from the DDL, the wellness program category code column is also dropped, and a wellpgmcat_oid column replaces it. With these changes, the temporalization of this table is complete.

The Wellness Program Enrollment Table. Unlike the other tables in this sample database, Wellness Program Enrollment is an associative table, commonly called an “xref table”. But its conversion to a temporal table follows the pattern we have already seen. The only difference is that this table has two foreign keys to convert to temporal foreign keys, not just one, and two columns in its original primary key.

According to the Table Type metadata table, the Wellness Program Enrollment table is an asserted version table. Prior to temporalization, the primary key of this table consisted of the two foreign keys client_nbr and wellpgm_nbr. But asserted version tables must have single-column object identifiers, and so instead of creating an object identifier for both client and wellness program, we create a single object identifier and name it client_wellpgm_oid. We then add effective begin date and assertion begin date as the other two primary key columns.

As we see in Figure 8.8, the business key of this table is the pair of temporal foreign keys. The other four non-key columns are left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are augmented and modified, as required.
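As a rough illustration of the temporalized Wellness Program Enrollment table described above, the following sketch creates it in an in-memory SQLite database. Only client_wellpgm_oid, client_oid and wellpgm_oid are names taken from the text; the table name, the date column names, the data types and the helper function are assumptions made for the example, and the four unchanged non-key columns are omitted because the text does not name them.

```python
import sqlite3

# Hypothetical DDL: a single-column oid plus effective begin date and
# assertion begin date form the primary key. No FOREIGN KEY clauses are
# declared, since the two temporal foreign keys are enforced by the
# framework, not by the DBMS.
DDL = """
CREATE TABLE wellness_program_enrollment (
    client_wellpgm_oid INTEGER NOT NULL,  -- single-column object identifier
    eff_beg_dt         TEXT    NOT NULL,  -- effective begin date
    asr_beg_dt         TEXT    NOT NULL,  -- assertion begin date
    client_oid         INTEGER NOT NULL,  -- temporal foreign key (replaces client_nbr)
    wellpgm_oid        INTEGER NOT NULL,  -- temporal foreign key (replaces wellpgm_nbr)
    epis_beg_dt        TEXT    NOT NULL,  -- episode begin date
    eff_end_dt         TEXT    NOT NULL,  -- effective end date
    asr_end_dt         TEXT    NOT NULL,  -- assertion end date
    row_crt_dt         TEXT    NOT NULL,  -- row create date
    PRIMARY KEY (client_wellpgm_oid, eff_beg_dt, asr_beg_dt)
)
"""


def primary_key_columns(conn):
    """Return the primary key column names in key order."""
    rows = conn.execute(
        "PRAGMA table_info(wellness_program_enrollment)"
    ).fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position in the primary key, or 0 for non-key columns.
    keyed = sorted((row[5], row[1]) for row in rows if row[5] > 0)
    return [name for _, name in keyed]


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
print(primary_key_columns(conn))
```

The date columns are stored as ISO-format text only to keep the sketch portable; a production schema would use whatever date or timestamp type the target DBMS provides.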
Client_nbr and wellpgm_nbr appear in the logical data model as foreign keys to the Client and Wellness Program tables, respectively. The foreign key declarations are dropped from the DDL, the client number and wellness program number columns are also dropped, and the client_oid and wellpgm_oid columns, respectively, replace them. With these changes, the temporalization of this table is complete.

In fact, the temporalization of the entire physical data model is now complete. The result is the Asserted Versioning physical data model shown in Figure 8.8. But an asserted version database is not simply one that contains one or more temporal tables. It is also a database that includes the code which enforces the semantic constraints without which those tables would just be a collection of columns with nothing particularly temporal about them at all.

Chapter 8: Designing and Generating Asserted Versioning Databases

Generating Temporal Entity and Temporal Referential Integrity Constraints

If this temporalized physical data model were submitted to the DBMS, and an empty database were created from it, we could begin to populate the tables in the database right away. We could populate them using conventional SQL insert, update and delete statements. But we would have to be very careful.

We already have some idea of what temporal entity integrity and temporal referential integrity are, but we have yet to see these integrity constraints at work. Some of the work they do is quite complex. The AVF enforces temporal integrity as data is being updated, not as it is being read. Today’s DBMSs do not support temporal integrity constraints on versions and episodes, so it is the AVF, or a developer-written framework, that must do it. Applying those constraints, the AVF would reject some temporal transactions because they would violate one or both of those constraints.
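To give a feel for the kind of check involved, here is a minimal sketch of one common reading of temporal entity integrity: no two versions of the same object may overlap in effective time while they also share assertion time. The Version class, its field names and the closed-open treatment of periods are assumptions made for illustration; this is not the AVF's actual code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    """A hypothetical bi-temporal row; all periods are closed-open."""
    oid: str
    eff_beg: date
    eff_end: date
    asr_beg: date
    asr_end: date


def overlaps(beg1, end1, beg2, end2):
    """True if two closed-open periods share at least one clock tick."""
    return beg1 < end2 and beg2 < end1


def violates_tei(existing, new):
    """True if `new` collides with an existing version of the same object
    in both effective time and assertion time."""
    return any(
        v.oid == new.oid
        and overlaps(v.eff_beg, v.eff_end, new.eff_beg, new.eff_end)
        and overlaps(v.asr_beg, v.asr_end, new.asr_beg, new.asr_end)
        for v in existing
    )


FOREVER = date.max
table = [Version("P861", date(2010, 1, 1), date(2010, 7, 1),
                 date(2010, 1, 1), FOREVER)]
candidate = Version("P861", date(2010, 6, 1), date(2010, 9, 1),
                    date(2010, 1, 1), FOREVER)
print(violates_tei(table, candidate))
```

A framework would run a check of this kind before applying a transaction and reject the transaction when it returns True.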
But if we write our transactions in native SQL, then whenever we do maintenance to the database, we will have to manually check the contents of the database, compare each transaction to those contents, and determine for ourselves whether or not the transactions both did what they were intended to do, and resulted in a temporally valid database state.

Past experience has shown us that doing our own application-developed bi-temporal data maintenance, using standard SQL, is resource-intensive and error-prone. It is a job for a company’s most experienced DBAs, and even they will have a difficult time with it. Having an enterprise standard framework like the AVF to carry out these operations significantly reduces the work involved in maintaining temporal data, and will eliminate the errors that would otherwise inevitably happen as temporal data is maintained.

Using a framework like the AVF, temporal transactions will be no more difficult to write than conventional transactions. The reason is that the AVF supports a temporal insert, temporal update and temporal delete transaction in which all temporal qualifiers on the transaction are expressed declaratively.

These transactions also preserve a fundamentally important feature of standard insert, update and delete transactions. They allow one bi-temporal semantic unit of work to be expressed in one transaction.

Typically, a single standard SQL transaction will insert, update or delete a single row in a conventional table. And typically, the corresponding temporal transaction will require two or three physical transactions to complete. In addition, many temporal update transactions, as we will see, and many temporal delete cascade transactions too, can require a dozen or more physical transactions to complete.
If we attempt to maintain a bi-temporal database ourselves, using standard SQL, then for each semantic intention we want to express in the database, we will have to figure out and write these multiple physical transactions ourselves. As Chapter 7 indicated, and as Chapters 9 through 12 will make abundantly clear, that is a daunting task.

Redundancies in the Asserted Versioning Bi-Temporal Schema

An Asserted Versioning database is a physical implementation of a logical data model, a logical model which does not contain any mention of temporal data in the model itself. In fact, the logical data models of Asserted Versioning databases are indistinguishable from the logical data models of conventional databases.

Apparent Redundancies in the Asserted Versioning Schema

However, some data modelers have objected to an apparent third normal form (3NF) violation in the bi-temporal schema common to all asserted version tables. They point to the effective end date, the assertion end date and the row creation date to support their claims. Their objections, in summary, are one or more of the following:

(i) The effective end date is redundant because it can be inferred from the effective begin date of the following version.
(ii) The assertion end date is redundant because it can be inferred from the assertion begin date of the next assertion of a version.
(iii) The row create date is redundant because it is the same as the assertion begin date.

Now in fact, none of these objections is correct. As for the first objection, an effective end date would be redundant if every version of an object followed immediately after the previous version. If we could depend on that being true, which means if we could depend on there never being a requirement to support multiple episodes of the same object, then the effective end date would be redundant.
One could make the argument that all versions within one episode have versions that [meet] and so, within each episode, the end date could be inferred. Although that is true, we would still need an episode end date to mark the end of the episode. Furthermore, the end dates on each version significantly improve performance because both dates are searched on the same row, reducing the need, otherwise, for expensive subselects on every read.

Also, we are not interested in implementing just the minimal temporal requirements a specific business use may require, especially when it would be difficult and expensive to add additional functionality, such as support for multiple episodes (i.e. for temporal gaps between some adjacent versions of the same object), to a database already built and populated, and to a set of maintenance transactions and queries already written and in use.

All asserted version tables are ready to support gaps between versions. On the other hand, as long as temporal transactions issued to the AVF do not specify an effective begin date, that capability of Asserted Versioning will remain unused and the mechanics of its use will remain invisible.

As for the second objection, an assertion end date would be redundant with the following asserted version’s assertion begin date only if every assertion of a version followed the previous one without a gap of even a single clock tick in assertion time. But once again, we are not interested in implementing just the minimal temporal requirements a specific business use may require.

All asserted version tables are ready to support deferred assertions, and deferred assertions may involve a gap in assertion time. On the other hand, as long as temporal transactions issued to the AVF do not specify an assertion begin date, that capability of Asserted Versioning will remain unused and the mechanics of its use will remain invisible.
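The point made above about effective end dates can be sketched as follows. The first function answers an as-of question from the dates stored on each row; the second has to look at the next version to infer an end date, and gives a wrong answer as soon as there is a gap between episodes. The sample rows, field names and function names are hypothetical, and periods are treated as closed-open.

```python
from datetime import date

# Hypothetical versions of one object. The third version begins a second
# episode after a gap from July to October 2010.
VERSIONS = [
    {"oid": "P861", "eff_beg": date(2010, 1, 1), "eff_end": date(2010, 4, 1)},
    {"oid": "P861", "eff_beg": date(2010, 4, 1), "eff_end": date(2010, 7, 1)},
    {"oid": "P861", "eff_beg": date(2010, 10, 1), "eff_end": date(2011, 1, 1)},
]


def in_effect_stored(rows, oid, as_of):
    """Both dates are read from the same row: a single-row predicate."""
    return [r for r in rows
            if r["oid"] == oid and r["eff_beg"] <= as_of < r["eff_end"]]


def in_effect_inferred(rows, oid, as_of):
    """Pretend eff_end is not stored and infer it from the next version's
    begin date, which requires looking at other rows."""
    same = sorted((r for r in rows if r["oid"] == oid),
                  key=lambda r: r["eff_beg"])
    hits = []
    for i, row in enumerate(same):
        inferred_end = same[i + 1]["eff_beg"] if i + 1 < len(same) else date.max
        if row["eff_beg"] <= as_of < inferred_end:
            hits.append(row)
    return hits


in_gap = date(2010, 8, 15)
print(in_effect_stored(VERSIONS, "P861", in_gap))    # nothing in effect
print(in_effect_inferred(VERSIONS, "P861", in_gap))  # wrongly finds a version
```

In SQL terms, the second function corresponds to the subselect that the stored end date makes unnecessary.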
In addition, as we will see in following chapters, single versions can be replaced by multiple versions as new assertions are made, and vice versa. In that case, the logic for inferring assertion end dates from the assertion begin dates of other versions could become quite complex. This complexity could affect the performance, not only of maintenance transactions, but also of queries. The reason is that, if we followed this suggestion, it would be impossible to determine, from just the data on any one row, whether or not that row has an Allen relationship with the assertion time specified on a query. To determine that, we would need to know the assertion time period of the row, not just when that time period began.

As for the third objection, a row create date would be redundant with an assertion begin date if Asserted Versioning did not support deferred assertions. In fact, neither the standard temporal model, nor any more recent computer science research that we are aware of, includes deferred assertions. But Asserted Versioning does. Because it does, the AVF may insert rows into asserted version tables whose assertion begin dates are later than their row creation dates.

A Real Redundancy in the Asserted Versioning Schema

But there is one redundancy that we did introduce into the Asserted Versioning schema. It was to add the episode begin date to every row. The episode begin date, as we all know by now, is the effective begin date of the effective-time earliest version of an episode. So it is not functionally dependent on the primary key of any row which is not the initial version of an episode. [2]

The primary use of this column is to indicate, for any version, when the episode that version is a part of began. It efficiently associates every version with the one episode it belongs to.
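The definition of the episode begin date given above can be sketched in a few lines. The function below takes the versions of a single object, all assumed to exist in current assertion time and not to overlap, and derives for each version the begin date of its episode by checking whether it [meets] the version before it. The function name and the tuple representation are assumptions made for the example.

```python
from datetime import date


def episode_begin_dates(versions):
    """Map each (eff_beg, eff_end) closed-open pair of ONE object to the
    effective begin date of the earliest version in its episode."""
    result = {}
    episode_beg = None
    prev_end = None
    for beg, end in sorted(versions):
        # A version that does not start exactly where the previous one
        # ended does not [meet] it, so it opens a new episode.
        if prev_end is None or beg != prev_end:
            episode_beg = beg
        result[(beg, end)] = episode_beg
        prev_end = end
    return result


versions = [
    (date(2010, 1, 1), date(2010, 4, 1)),
    (date(2010, 4, 1), date(2010, 7, 1)),
    (date(2010, 10, 1), date(2011, 1, 1)),  # starts after a gap
]
for period, epis_beg in episode_begin_dates(versions).items():
    print(period[0], "->", epis_beg)
```

Storing the result on every row, as the schema does, avoids repeating this walk through adjacent versions each time the episode of a version is needed.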
Lacking this column, we would only be able to find all versions of an episode by looking for versions with the same oid that [meet], and we would only be able to distinguish one episode from the next one by looking for a [before] or [before⁻¹] relationship between adjacent versions with the same oid.

Together with that version’s own effective end date, the episode begin date tells us that the object that version designates has been continuously represented, in current assertion time, from the effective-time beginning of that version’s episode to the effective-time end of that version. Since the parent managed object in a temporal referential integrity relationship is an episode, this means that when we are validating temporal referential integrity on a child version, all we need to do is find one parent version whose effective end date is not earlier than the effective end date of the new child version, and whose episode begin date is not later than the effective begin date of the new version. In other words, it enables us to do TRI checking from one parent-side row, rather than having to go back and find the row that begins that parent episode. This significantly improves performance for temporal referential integrity checking.

[2] Interestingly enough, although clearly redundant, this replication of the effective begin date of each episode’s initial version onto all other versions of the episode is not a violation of any relational normal form. Its presence involves no partial, transitive or multi-valued dependencies. For other examples of redundancies that are not caught by fully normalizing a database, see Johnston’s articles in the archives at Information_Management.com (formerly DM Review), with links listed in the bibliography.
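The single-row TRI check described above can be sketched as follows. The ParentVersion class and the function signature are hypothetical, all rows are assumed to exist in current assertion time, and periods are closed-open; the two comparisons are the ones stated in the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ParentVersion:
    """A hypothetical parent-side row carrying its episode begin date."""
    oid: str
    epis_beg: date
    eff_beg: date
    eff_end: date


def tri_satisfied(parents, parent_oid, child_eff_beg, child_eff_end):
    """True if one parent version alone shows that the parent episode
    covers the child's effective time period."""
    return any(
        p.oid == parent_oid
        and p.eff_end >= child_eff_end   # parent version ends no earlier
        and p.epis_beg <= child_eff_beg  # parent episode begins no later
        for p in parents
    )


parents = [
    ParentVersion("C882", date(2010, 1, 1), date(2010, 1, 1), date(2010, 4, 1)),
    ParentVersion("C882", date(2010, 1, 1), date(2010, 4, 1), date(2010, 7, 1)),
]
print(tri_satisfied(parents, "C882", date(2010, 2, 1), date(2010, 6, 1)))
```

Note that the child period in the usage example spans two parent versions, yet the second parent row is enough to validate it because it carries the begin date of the whole episode.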
The result of TRI enforcement is to guarantee that the effective-time extent of any version representing a TRI child object completely [fills] the effective-time extent of one set of contiguous versions representing a TRI parent object.

In addition, note that the presence of this redundant column has little maintenance cost associated with it. As new versions are added to an episode, the episode begin date of the previous version is just copied onto that of the new version. Only in the rare cases in which an episode’s begin date is changed will this redundancy require us to update all the versions in the episode.

Glossary References

Glossary entries whose definitions form strong interdependencies are grouped together in the following list. The same glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter. There will usually be several other, and often many other, glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered.
Allen relationships
contiguous
filled by
include
asserted version table
Asserted Versioning
Asserted Versioning database
Asserted Versioning Framework (AVF)
assertion begin date
assertion end date
assertion time
assertion time period
business key
reliable business key
unreliable business key
child object
clock tick
closed-open
granularity
conventional database
conventional table
conventional transaction
deferred assertion
design encapsulation
maintenance encapsulation
query encapsulation
effective begin date
effective end date
effective time
effective time period
episode
episode begin date
existence dependency
managed object
mechanics
object
object identifier
oid
parent episode
parent object
PERIOD datatype
represented
row creation date
temporal database
temporal entity integrity (TEI)
temporal foreign key (TFK)
temporal referential integrity (TRI)
temporal transaction
temporal update transaction
temporalize
version

Chapter 9: An Introduction to Temporal Transactions

Contents
Effective Time Within Assertion Time
Explicitly Temporal Transactions: The Mental Model
A Taxonomy of Temporal Extent State Transformations
The Asserted Versioning Temporal Transactions
    The Temporal Insert Transaction
    The Temporal Update Transaction
    The Temporal Delete Transaction
Glossary References

Temporal transactions are inserts, updates or deletes whose targets are asserted version tables. But temporal transactions are not submitted directly to the DBMS. The work that has to be done to manage conventional tables is straightforward enough that we can let users directly manipulate those tables. But bi-temporal tables, including asserted version tables, are too complex to expose to the transaction author. The difference between what the user wants done, and what has to take place to accomplish it, is too great.
And so temporal transactions are the way that the query author tells us what she wants done to the database, without having to tell us how to do it. The mechanics of how her intentions are carried out are encapsulated within our Asserted Versioning Framework. All that the application accepting the transaction has to do is to pass it on to the AVF.

A DBMS can enforce such constraints as entity integrity and referential integrity, but it cannot enforce the significantly more complex constraints of their temporal analogs. It is the AVF which enforces temporal entity integrity and temporal referential integrity. It is the AVF which rejects any temporal transactions that violate the semantic constraints that give bi-temporal data its meaning. It is the AVF that gives the user a declarative means of expressing her intentions with respect to the transactions she submits.

In the Asserted Versioning temporal model, the two bi-temporal dimensions are effective time and assertion time. If assertion time were completely equivalent to the standard temporal model’s transaction time, then every row added to an asserted version table would use the date the transaction was physically applied as its assertion begin date. Important additional functionality is possible, however, if we permit rows to be added with assertion begin dates in the future. This is functionality not supported by the standard temporal model. But it comes at the price of additional complexity, both in its semantics and in its implementation.

Managing Time in Relational Databases. Doi: 10.1016/B978-0-12-375041-9.00009-1 Copyright © 2010 Elsevier Inc. All rights of reproduction in any form reserved.
Fortunately, it is possible to segregate this additional functionality, which is based on what we call deferred transactions and deferred assertions, and to discuss Asserted Versioning as though both its temporal dimensions are strictly analogous to the temporal dimensions of the standard temporal model. This makes the discussion easier to follow, and so this is the approach we will adopt. Deferred assertions, then, will not be discussed until Chapter 12.

Effective Time Within Assertion Time

A row in a conventional table makes a statement. Such a row, in a conventional Policy table, is shown in Figure 9.1. This row makes the following statement: “I represent a policy which has an object identifier of P861, a client of C882, a type of HMO and a copay of $15.” The statement makes no explicit reference to time. But we all understand that it means “I represent a policy which exists at the current moment, and which at the current moment has an object identifier of ...”.

Figure 9.1 A Non-Temporal Row.
oid    client  type  copay
P861   C882    HMO   $15

This same row, with an effective time period attached, is shown in Figure 9.2. It makes the following statement: “I represent a policy which has an object identifier of P861 and which, from January 2010 to July 2010, has a client of C882, a type of HMO and a copay of $15.” In other words, the row shown in Figure 9.2 has been placed in a temporal container, and is treated as representing the object as it exists within that container, but as saying nothing about the object as it may exist outside that container.

If we were managing uni-temporal versioned data, that would be the end of the story. But if we are managing bi-temporal data, there is one more temporal tag to add. This same row, with an assertion time period attached, is shown in Figure 9.3.
It makes the following statement: “I represent the assertion, made on January 2010 but withdrawn on October 2010, that this row represents a policy which has an object identifier of P861 and which, from January 2010 to July 2010, has a client of C882, a type of HMO and a copay of $15.” In other words, the row shown in Figure 9.2, as included in its first temporal container, has been placed in a second temporal container, and is treated as representing what we claim, within that second container, is true of the object as it exists within that first container, but as saying nothing about what we might claim about the object within its first container outside that second container.

Figure 9.2 A Uni-Temporal Version.
oid    eff-beg  eff-end  client  type  copay
P861   Jan10    Jul10    C882    HMO   $15

Figure 9.3 A Bi-Temporal Row.
oid    eff-beg  eff-end  asr-beg  asr-end  client  type  copay
P861   Jan10    Jul10    Jan10    Oct10    C882    HMO   $15

From January to July, this statement makes a current claim about what P861 is like during that period of time. From July to October, this statement makes an historical claim, a claim about what P861 was like at that time. But from October on, this statement makes no claim at all, not even an historical one. It is simply a record of what we once claimed was true, but no longer claim is true.

All this is another way of saying (i) that a non-temporal row represents an object; (ii) that when that row is tagged with an effective time period, it represents that object as it exists during that period of time (January to July in our example); and (iii) that when that tagged row receives an additional time period tag, it represents our assertion, during the indicated period of time [...]
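The three phases of the Figure 9.3 row described above (current claim, historical claim, no claim) can be sketched as a small function. The dictionary layout, the function name and the choice of the first day of each month for Jan10, Jul10 and Oct10 are assumptions made for the example, periods are treated as closed-open, and the two cases the text does not discuss (a moment before the assertion begins, or before the effective period begins) are handled by assumption as well.

```python
from datetime import date

# The bi-temporal row of Figure 9.3, with month values mapped to the
# first day of the month (an assumption for this sketch).
ROW = {
    "oid": "P861",
    "eff_beg": date(2010, 1, 1), "eff_end": date(2010, 7, 1),
    "asr_beg": date(2010, 1, 1), "asr_end": date(2010, 10, 1),
    "client": "C882", "type": "HMO", "copay": 15,
}


def claim_status(row, now):
    """Describe what kind of claim the row makes at the moment `now`."""
    if not (row["asr_beg"] <= now < row["asr_end"]):
        # Outside assertion time the row is only a record of a claim
        # (treating "not yet asserted" the same way is an assumption).
        return "no claim"
    if now < row["eff_beg"]:
        return "claim about the future"  # not covered in the text
    if now < row["eff_end"]:
        return "current claim"
    return "historical claim"


for moment in (date(2010, 3, 15), date(2010, 8, 15), date(2010, 11, 1)):
    print(moment, claim_status(ROW, moment))
```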
neither [intersect] nor do not [intersect]. During those two periods of assertion time, the comparison doesn’t apply. During those times, those two versions are what philosophers call “incommensurable”. In the following discussion of temporal integrity constraints, we will assume that all the rows involved exist in shared assertion time. Note that it is effective time that exists with assertion time, and ...

... tag qualifies the effective-time qualified representation of an object. Effective time containment turns a row representing an object into a version. Assertion time containment turns a row representing a version into an assertion of a version, i.e. into a temporally delimited truth claim. [1] This is illustrated in Figure 9.4. Temporal integrity constraints govern the effective time relationships among bi-temporal ...

... object into one or more clock ticks of effective time.
(ii) A temporal update replaces business data representing an object in one or more clock ticks of effective time.
(iii) A temporal delete removes business data representing an object from one or more clock ticks of effective time.
In all three cases, those clock ticks are contiguous with one another, as they must be since they constitute a continuous ...

... its instances. We can create an instance of it, modify an existing instance, or remove an instance. This is reflected in the three nodes of the first level underneath the root node. Of course, in the case of episodes, the {erase} transformation is neither a physical nor a logical deletion. Instead, it is the action in which the entire episode is withdrawn from current assertion time into past assertion time. ...
transformation that is its inverse. A third way to assure ourselves of completeness is to analyze the taxonomy in terms of its topology. On a line representing a timeline, we can place a line segment representing an episode. We can also remove a line segment from that line. Given a line segment, we can either lengthen it forwards or backwards, shorten it forwards or backwards, or split it. Given two line segments with ...

... would be like an insert to a conventional table with a missing or incomplete primary key.
(ii) No oid, no business key, business key is not reliable. In this case, the AVF accepts the insert and assigns it a new oid. The reason is that if a business key is not reliable, it is not required on an insert transaction. Since no business key match logic ...

... Note that in this case, multiple temporal inserts which lack an oid but which contain the same business key value will result in multiple object identifiers all using that same value. Semantically, the business key will be a homonym, a single value designating multiple different objects. This, of course, is precisely what the “unreliable” means in “unreliable business key”.
(v) Oid present, no business key, ...

... business key, business key is reliable. In this case, the AVF rejects the insert. The reason is the same as it was for case (i); if the business key is reliable, an insert must provide it. Otherwise, it would be like an insert to a conventional table with a missing or incomplete primary key.
(vi) Oid present, no business key, business key is not reliable. In this case, the AVF accepts the insert and uses ...
vice versa. If the semantic containment were reversed, ...

[1] And if there were no versioning, and non-temporal statements were contained directly in assertion time, i.e. non-temporal rows were given an assertion time tag but not an effective time tag, then assertion time containment would turn non-temporal statements directly into temporally delimited truth claims.

... Model

In every clock tick within a continuous period of effective time, an object is either represented by a row or not represented. If it is represented, there is business data which describes what that object is like during that clock tick. So our three temporal transactions affect the representation of an object in a period of time as follows: (i) A temporal insert places business data representing ...

Posted: 21/01/2014, 08:20
