Document: Managing Time in Relational Databases, P14 (ppt)

Chapter 11 TEMPORAL TRANSACTIONS ON MULTIPLE TABLES

date of Now(). [2] So if it were those versions that were the parents in a TRI relationship, this process would continually invalidate temporal foreign keys (TFKs) by ending the assertion time of the versions they refer to.

[2] This, of course, is a description of a basic temporal update transaction. But a similar description of the mechanics of non-basic temporal updates leads to the same conclusion, that TFKs do not point to specific versions in a parent asserted version table.

Temporal Referential Integrity: The Basic Diagram

Figure 11.1 is the basic diagram we will use in our discussion of temporal referential integrity. It consists of timelines for three objects. Besides policy P861, there is a timeline for client C882 and for client C903. The dotted-line vertical arrows represent temporal foreign key (TFK) relationships from a child version to a parent episode. Parent episodes are underlined to emphasize that those vertical arrows are not pointing to specific versions, but rather to entire episodes.

The shaded rectangle on the left covers the effective time period of version 2 of episode P861-A, which extends from July 2010 to May 2011. It graphically illustrates that the effective time period of this version is wholly included in the effective time period of an episode of its parent object, client C903, that episode being C903-A. It also graphically shows why a TRI relationship is between a child version and a parent episode. No single version of C903-A could be a TRI parent to P861-A(2), because no single version of C903-A covers [Jul 2010 – May 2011], the effective time period for P861-A(2). [3]

[3] We use the notation X-{A, B, …, Z} to denote an episode of an object. Thus, C882-B denotes episode B of client C882. We use the notation E(n) to denote a version of an episode. Thus, P861-A(2) denotes version 2 of policy P861, included within episode A. Note, however, that it only happens to denote the second version of that episode. For example, P861-C(8) denotes version 8 of that policy, but that version is the second version of that episode, not the eighth one.

The shaded rectangle on the right covers [Oct 2013 – 12/31/9999]. This is the effective time period of P861-C(8). In this case, a single parent version effective time includes (i.e. [fills⁻¹]) that child version, but that is merely happenstance. For example, suppose that we wanted to change client C882’s name from “Smith” to “Jones”, effective May 2014. This would make the effective time period of C882-C(4) [Sep 2013 – May 2014]. But if that happens, there would be no version of C882-C that could be a TRI parent to P861-C(8). The new C882-C(5) goes into effect on May 2014, so its effective time period does not cover the earlier clock ticks in P861-C(8). And C882-C(4) ends its effectivity on May 2014, so its effective time period does not cover the ongoing effectivity of P861-C(8), whose effective time period is, once again, [Oct 2013 – 12/31/9999].

As in the previous chapter, we assume for now that all relationships exist within current assertion time, and that all temporal transactions specify an assertion time of [Now() – 12/31/9999]. We also assume that delete transactions against clients cascade down to the policies that they own, in accordance with the metadata declaration made in the Temporal Foreign Key metadata table, shown in Figure 8.4.

We can read the somewhat schizophrenic history of policy P861 from this diagram. [4] Think of a vertical line running from the top to the bottom of the diagram, and initially positioned at January 2010. As time passes, this line moves to the right. The history of P861 is recorded in the begin and end dates of its versions. So as that line reaches each such date, there is a change in the state of P861.

As Figure 11.1 shows, the policy was originally owned by client C882.
The only episode of C882 whose effective time period included that of P861, at the time P861-A(1) was created, was C882-A. And so that became the episode of client C882 that the policy pointed to.

The next thing that happened was that, on July 2010, P861 changed hands. At that time, ownership was transferred to client C903. The only episode of C903 that existed at that time was C903-A, and so that became the parent episode to P861, beginning on that date. This change of ownership is recorded in version 2 of P861-A.

Note that C903-A became effective on April 2010, two months after P861-A did. If episodes were the child managed objects in TRI relationships, then this relationship would be invalid. But they are not. C882-A is the parent to P861-A(1). C903-A is the parent to P861-A(2).

The third event in the life of P861 was a delete cascade issued against client C903. As of May 2011, C903 was no longer a client. Because C903 owned policy P861 at that point in time, the policy’s existence was terminated on that same date, May 2011.

[4] Schizophrenic in that the policy can’t make up its mind which client it belongs to. As unlikely as such a policy history might be, in the real world, it will have to serve as an example of how TRI relationships are managed.

The next event in the life of this policy occurred in November 2011. It took place as part of the same event in which client C882 was reinstated. On that date, a second episode of client C882 began, and a second episode of policy P861 began also, and was designated as a policy owned by C882. After that, three changes occurred to the policy between November 2011 and January 2013, but none of them changed the ownership of the policy.

The fifth event in the life of the policy was that client C882 asked to terminate her relationship with our company as of January 2013.
Since she owned P861 at that time, and would still own it on that termination date, the policy was terminated along with the client.

Four months later, on May 2013, policy P861 was reinstated and assigned to client C903. So a third episode of the policy was created, P861-C. It was an open-ended episode, one with an effective end date of 12/31/9999, and so the only owner that could be assigned to it would be one with an open-ended episode that began on or before May 2013. Fortunately, client C903 had such an episode, having been reinstated, after a 5-month absence, with episode C903-C.

With this information as part of our production data, we know, at any point in the history of policy P861, who its owner was and when and for how long she had been the owner. For any claims submitted for medical services provided to either C903 or C882, no matter how delayed the filing of those claims may have been, we know exactly when each client was covered by that policy and exactly when she was not covered by it, an essential piece of information needed to pay claims correctly. And we don’t have to go digging in archival storage, or historical data warehouses, for that information, which, in a high transaction volume claims processing system, is a very good thing.

That historical data exists in the same table as data about current policies and their current owners. The service date on the claim selects the correct version of the policy, and that version points to its owner. If its owner is not the person for whom the claim is submitted, the claim is rejected.

Foreign Keys and Temporal Foreign Keys

Before proceeding, let’s remind ourselves of the difference between (i) foreign keys (FKs), the relationships they implement and the constraints they impose, and (ii) temporal foreign keys (TFKs), the relationships they implement and the constraints they impose.
A foreign key is a column in a relational table whose job is to relate rows to other rows. [5] If the foreign key column is declared to the DBMS to be nullable, then any row in that table may or may not contain a value in its instance of that column. But if it does contain a value, that value must match the value of the primary key of a row in the table declared as the target table for that foreign key. For non-nullable foreign keys, of course, every row in the source table must contain a valid value in its foreign key column.

In addition, once the FK relationship is declared to the DBMS, the DBMS is able to guarantee that the two managed objects (the child row and the parent row) accurately reflect the existence dependency between the objects they represent. It does so by enforcing the constraint expressed in the declaration, the constraint that if the child row’s FK points to a parent row, that parent row must have existed in its table at the time the child row was added to its table, and must continue to exist in the parent table for as long as the child row exists in its table and continues to point to that same parent.

This is a somewhat elaborate way of describing something that most of us already understand quite well, and that few of us may think is worth describing quite so carefully: that foreign keys relate child rows to parent rows and that, in doing so, they reflect a relationship that exists in the real world. We have gone to this length in order to be very clear about both the semantics and the mechanics of foreign keys (semantics described in our talk about objects, and mechanics in our talk about managed objects) and to place the descriptions at a level of generality where the semantics and mechanics of TFKs can be seen as analogous to those of the more familiar FKs.
So if we use an “X/Y” notation in which the “X” term is part of the referential integrity description and the “Y” term is part of the temporal referential integrity description, we have a description which makes it clear that temporal referential integrity really is temporalized referential integrity, that TRI is RI as it applies to temporal data. That description is given in the following paragraph.

Once the FK/TFK relationship is declared to the DBMS/AVF, the DBMS/AVF is able to guarantee that the two managed objects (the child row/version and the parent row/episode) accurately reflect the existence dependency between the objects they represent. Each does so by enforcing the constraint expressed in the declaration, the constraint that if the FK/TFK in the child row/version points to a parent row/episode, that parent row/episode must have existed in its table/be currently asserted and currently effective at the time the child row/version was added to its table, and must continue to exist/be currently asserted and currently effective in the parent table for as long as the child row/version exists/is currently asserted and currently effective in its table and continues to point to that same parent.

[5] We will assume that all primary and foreign keys consist of single columns, since the complications that arise with multi-column keys are irrelevant to this discussion.

TFKs: A Data Part and a Function Part

As a data element, a TFK is a column in an asserted version table whose job is to relate child managed objects to parent managed objects. Of course, the same may be said of FKs. The difference is that the parent managed object of a FK is a non-temporal row, while the parent managed object of a TFK is a group of possibly many rows.

A TRI child table is an asserted version table that contains a TFK. A TRI parent table is an asserted version table referenced by a TFK.
The FK reference is a data value, and is unambiguous; but the TFK reference, as a data value, is not unambiguous. So as a data element, all a TFK can do is designate the object on which the object represented by its own row is existence dependent. There may be any number of versions representing that object in the parent table, and those versions may be grouped into any number of episodes scattered along the assertion and effective time timelines.

So as a data value, a TFK reference is incomplete. For example, a TFK data value in a Policy table references all the episodes in a Client table which represent the client on which that policy is existence dependent, that being the client whose oid matches the data value in the TFK. To complete the reference, we need to identify, from among those episodes, the one episode which was in effect when the policy version went into effect, and will remain in effect as long as that policy version remains in effect.

What is needed to complete the reference is a function. We will name this function fTRI. It has the following syntax:

    fTRI(PTN, TFK, [eff-beg-dt – eff-end-dt])

PTN is the name of the parent table which this TFK points to. Given the TFK and effective time period of a version in a TRI child table, the AVF searches the parent table for an episode whose versions have that oid as part of their primary key, and whose effective time period fully includes the effective time period designated by the function. If there is such an episode, it is the TRI parent episode of that version, and the fTRI function evaluates to True. If there is no such episode, then the function evaluates to False, and that version will never be added to the database because if it were, it would violate TRI.

If the AVF finds such an episode, in carrying out this function, it does not have to check further to ensure that there is only one such episode.
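
The logic of fTRI can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the AVF's implementation: the `Episode` class, the `f_tri` name, the (year, month) representation of clock ticks, and the closed-open treatment of periods are all choices made for this example. The three C903 episodes are the ones the text reads off Figure 11.1.

```python
from dataclasses import dataclass

# Assumption: clock ticks are months, written as (year, month) tuples,
# and (9999, 12) stands in for the book's 12/31/9999.
END_OF_TIME = (9999, 12)


@dataclass(frozen=True)
class Episode:
    """Hypothetical stand-in for an episode in a parent asserted version table."""
    oid: str
    eff_begin: tuple  # first clock tick in the episode
    eff_end: tuple    # first clock tick after the episode (closed-open)


def f_tri(parent_episodes, tfk, eff_begin, eff_end):
    """Return True if some episode of the object named by `tfk`
    fully includes the effective time period [eff_begin, eff_end)."""
    return any(
        ep.oid == tfk and ep.eff_begin <= eff_begin and eff_end <= ep.eff_end
        for ep in parent_episodes
    )


# Episodes of client C903, as listed in the text for Figure 11.1.
client_table = [
    Episode("C903", (2010, 4), (2011, 5)),
    Episode("C903", (2012, 4), (2012, 9)),
    Episode("C903", (2013, 2), END_OF_TIME),
]

# P861-A(2), effective [Jul 2010 - May 2011], has a TRI parent episode.
p861_a2_valid = f_tri(client_table, "C903", (2010, 7), (2011, 5))

# A period that straddles two episodes has no single parent episode.
straddling_valid = f_tri(client_table, "C903", (2011, 1), (2012, 6))
```

Because a single `any` over the episodes is enough, the sketch mirrors the point that the check does not need to confirm that the matching episode is unique.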
If there were more than one, then those episodes would be in TEI conflict across all their clock ticks which [intersect]. The AVF does not allow TEI violations to occur, so if there is a TRI parent episode for the TFK reference, there is only one of them.

For example, the oid value in the TFK of P861-A(2) picks out client C903. Before the AVF added that version to the database, it used the fTRI function to determine whether or not it was referentially valid. [6] That TRI validation check would look something like this:

    IF ISTRUE(fTRI(Client, C903, [Jul10 – 9999]))
    THEN {add the version}
    ELSE {notify the calling program of a TRI error}
    ENDIF

[6] This is a logical description of what the AVF does. It does not imply that the AVF code makes a single function call to carry out its TRI checks, let alone that it calls a function named fTRI.

Together, the explicit and implicit parts of the TFK, its data element part and its function part, complete an unambiguous reference from a TFK to the one episode which satisfies the TRI constraint on the relationship from that version to that episode.

Note that this description of a TFK is a semantic description, not an implementation-level description. The fTRI function is one component of a TFK. Its representation here is obviously not source code that could be compiled or interpreted. But however it is expressed, whether in the AVF or in some other framework based on these concepts, it is a function; and without it, the columns of data we call TFKs are not TFKs. Those columns of data are simply those components of TFKs which can be expressed as data.

Temporal Transactions and Associative Tables

In a non-temporal database, an associative table, often informally referred to as an xref table, implements a many-to-many relationship between two other tables. Each of those other tables is a parent to the xref table, which is thus RI dependent on both of them.
Each row in the xref table has two FKs, one to a parent row in one table and one to a parent row in another table (or, possibly, in the same table). As we already know, this dual RI dependency means that a row cannot be inserted into the xref table unless both its parent rows already exist in the database, and neither parent row can be deleted as long as that xref row remains in the database.

TRI with Multiple TFKs

If a child version has two or more TFKs, the effective timespan of an episode of each of the objects which those TFKs reference must fully include the effective timespan of the version. If either of them did not, that would be a TRI violation.

So consider an associative asserted version table, whose versions each contain two TFKs. What of the Allen relationships between the two parent episodes related by any version in this table? Are there any constraints on those parent episodes?

In fact, there are. Those two effective timespans must [intersect]. If they did not [intersect], then there would be no clock tick when both were in effect, and so no clock tick in which an xref row, TRI dependent on both parents, could exist.

Consider an example in which we have a customer episode C773-B with an effective timespan from March 2013 until further notice, which we will write as C773-B[Mar 2013 – 12/31/9999], and also a salesperson episode S217-D[Sep 2013 – Dec 2013]. What can we say of the effective timespan of a version in an asserted version associative table relating that customer episode to that salesperson episode? [7]

First, that associative table version cannot have an effective begin date prior to September 2013 because that would make the start of its effective time period earlier than the start of S217-D. By the same token, that version cannot have an effective end date after December 2013 because that would make the end of its effective time period later than the end of S217-D.
So knowing what we do of the two parent episodes, what is the maximum effective timespan that would be valid for the child version? It is the later of the two parents’ begin dates, and the earlier of their end dates. This gives a maximum effective timespan of the xref table child version of [Sep 2013 – Dec 2013], which happens to be the effective timespan of its parent salesperson episode. This is because the salesperson episode occurs [during] the customer episode.

[7] As a complete aside, we note that the in-line notations developed in Chapter 6 and elsewhere in this book, for example the S217-D[Sep 2013 – Dec 2013] notation developed in this chapter, might be the basis for a degree of automated semantic interoperability between structured and semi-structured representations of temporal data.

Next, let’s consider an example that does not involve 12/31/9999. Suppose that the effective timespans of our parent episodes are like this: C773-B[Mar 2013 – Jun 2013] and S217-D[Sep 2013 – Dec 2013]. Using our earlier/later rule, the maximum effective timespan of the xref version happens to be the same as it was in the previous case: [Sep 2013 – Dec 2013].

But this isn’t the end of the story. In our first example, the two parent episodes [intersected], and the timespan during which they intersected was that widest timespan possible for the child version. But in this second example, the parent episodes do not [intersect]. C773-B ceases being in effect three months before S217-D begins to be in effect. An associative table version cannot have two non-intersecting TRI parents because there would then be no effective time clock ticks shared by the parents, and therefore no clock ticks in which both TRI relationships are satisfied.

In summary: the effective timespan of an xref row must be fully included in the effective timespans of both of its parent episodes.
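
The requirement that both parent episodes share clock ticks can be sketched in a few lines of Python. This is a hedged illustration, not the AVF's method: the function name `max_child_timespan`, the (year, month) clock ticks, and the closed-open periods are assumptions made for the example. The sketch takes the later begin date and the earlier end date, and reports no valid timespan when the result contains no clock ticks.

```python
# Assumption: clock ticks are months, written as (year, month) tuples,
# periods are closed-open, and (9999, 12) stands in for 12/31/9999.
END_OF_TIME = (9999, 12)


def max_child_timespan(parent_a, parent_b):
    """Return the widest effective timespan a version that is TRI dependent
    on both parent episodes could have, or None if the parents share no
    clock ticks."""
    begin = max(parent_a[0], parent_b[0])  # the later of the two begin dates
    end = min(parent_a[1], parent_b[1])    # the earlier of the two end dates
    if begin >= end:
        return None  # the parent episodes do not intersect
    return (begin, end)


s217_d = ((2013, 9), (2013, 12))

# First example: C773-B[Mar 2013 - 12/31/9999] and S217-D[Sep 2013 - Dec 2013].
open_ended = max_child_timespan(((2013, 3), END_OF_TIME), s217_d)

# Second example: C773-B[Mar 2013 - Jun 2013] ends before S217-D begins.
disjoint = max_child_timespan(((2013, 3), (2013, 6)), s217_d)
```

In the first call the result is the salesperson episode's own timespan; in the second there is no timespan in which a dependent version could exist.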
It follows that if there are no effective time clock ticks which those parent episodes have in common, no version which is TRI dependent on both of them can exist in the database. It also follows that if there are one or more clock ticks which those two parent episodes do have in common, the widest extent of the effective time period of the TRI dependent version is precisely that set of [intersecting] clock ticks.

Temporal Delete Options

The three options for standard delete transactions are (i) RESTRICT, (ii) SET NULL, and (iii) CASCADE. As applied to temporal delete transactions, the RESTRICT option is straightforward. For example, suppose there is a RESTRICT option on deletes applied to the Client table, and suppose that the database is populated as shown in Figure 11.1. Episode C903-B could be deleted in its entirety because no policies are dependent on it. Episode C882-A could be deleted from the single clock tick January 2010, or from July 2010 through April 2011 because the resulting episode, removed from any of those months, will still satisfy the TRI relationship from P861-A(1). But an attempt to remove client C903 from January 2011, for example, would be restricted because a dependent child, P861-A(2), is TRI dependent on it during that month.

As for the SET NULL option, its temporal form is not as straightforward. It means that if a temporal delete would violate a TRI constraint, and the SET NULL option is in effect for that table, then the TFK in the child row that would otherwise be orphaned will be set to NULL. In the last example just mentioned, if the delete option was SET NULL, episode C903-A would be split into two episodes by removing it from January 2011. P861-A(2) would be split into three versions, with effective time periods of [Jul 2010 – Jan 2011], [Jan 2011 – Feb 2011] and [Feb 2011 – May 2011]. The TFK in the middle of the three versions would then be set to NULL.
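
The three-way split described for SET NULL can be sketched as follows. This is an illustrative sketch only: `split_for_set_null`, the (year, month) clock ticks, and the closed-open periods are assumptions introduced here, and the sketch ignores assertion time entirely. Each returned piece carries a flag saying whether its TFK would be set to NULL.

```python
def split_for_set_null(child_period, delete_period):
    """Split a child version's effective period around a parent delete.

    Returns a list of (begin, end, tfk_is_null) pieces. Periods are
    closed-open pairs of (year, month) clock ticks (an assumption of
    this sketch)."""
    c_begin, c_end = child_period
    d_begin, d_end = delete_period
    cut_begin = max(c_begin, d_begin)
    cut_end = min(c_end, d_end)
    if cut_begin >= cut_end:
        # The delete does not touch this version; nothing is orphaned.
        return [(c_begin, c_end, False)]
    pieces = []
    if c_begin < cut_begin:
        pieces.append((c_begin, cut_begin, False))  # before the deleted ticks
    pieces.append((cut_begin, cut_end, True))       # orphaned ticks: TFK -> NULL
    if cut_end < c_end:
        pieces.append((cut_end, c_end, False))      # after the deleted ticks
    return pieces


# P861-A(2) is effective [Jul 2010 - May 2011]; C903 is removed from
# the single clock tick January 2011, i.e. [Jan 2011 - Feb 2011].
pieces = split_for_set_null(((2010, 7), (2011, 5)), ((2011, 1), (2011, 2)))
```

For the example in the text, `pieces` holds the three periods [Jul 2010 – Jan 2011], [Jan 2011 – Feb 2011] and [Feb 2011 – May 2011], with only the middle one flagged for a NULL TFK.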
But the temporal form of the CASCADE option is both mechanically and semantically even more complex than this. As for its semantics, a temporal delete cascade will attempt to remove both the parent object, and all its dependent children, from the clock ticks specified in the transaction.

For example, if we specified a temporal delete cascade on client C882 for the effective time period [Jul 2012 – Jan 2013], we would find that episode P861-B would be subject to a {shorten backwards} transformation for those six clock ticks. This would remove P861-B(6) from current assertion time, and would also shorten P861-B(5) by one clock tick. But this should cause no concern. We already understand the mechanics of temporal extent state transformations.

Temporal Referential Integrity Applied to Temporal Transactions

A Temporal Insert Transaction

Let’s assume that the Client and Policy tables are as shown in Figure 11.1, and let’s begin by considering a temporal insert of P861 which has a TFK of C903. In order to satisfy TRI constraints, every clock tick in the effective time period specified on the transaction must already be occupied by C903. So there are only a limited number of effective time spans that can validly be specified by a temporal insert transaction, in this situation. They are:

(i) The three months of [Feb 2013 – May 2013], or the two months of [Mar 2013 – May 2013] or the month of [Apr 2013 – May 2013], each of which will {lengthen P861-C backwards}.

(ii) The two months of [Feb 2013 – Apr 2013], which will create a new episode between P861-B and P861-C.

Let’s be sure we understand why these are the only possibilities. To begin with, the existing episodes of C903, the parent object, cover the effective time clock ticks [Apr 2010 – May 2011], [Apr 2012 – Sep 2012] and [Feb 2013 – 12/31/9999].
So if all the clock ticks in a new version of P861 fall anywhere within any one of those three ranges, that version will satisfy TRI; and otherwise, it won’t.

However, this is a temporal insert transaction, and therefore none of the clock ticks in the new version being created can already be occupied by another version of P861. This is the TEI constraint applied to temporal insert transactions. This rules out [Feb 2010 – May 2011], [Nov 2011 – Jan 2013] and [May 2013 – 12/31/9999]. So, eliminating these clock ticks that are already occupied by P861 from the clock ticks occupied by C903, we are left with only the three clock ticks of February, March and April 2013.

A Temporal Update Transaction

By definition, temporal updates neither add a representation of an object to a clock tick nor remove a representation of an object from a clock tick. But they can still cause temporal referential constraints to be violated. They can do so by changing the TFK value in one or more clock ticks.

For example, suppose a temporal update is submitted which specifies that in November and December of 2012, P861’s owning client should be C903. The transaction looks like this:

    UPDATE Policy [P861, C903,, ] Nov 2012, Jan 2013

The problem is that there is no representation of C903 in either of those two clock ticks. The function fTRI(Client, C903, [Nov12 – Jan13]) will evaluate to False. Therefore, the AVF will restrict this transaction because of TRI constraints. This is the equivalent of working with a non-temporal table, and trying to change a FK value to point to a parent row that does not, at that time, exist.

A Temporal Delete Transaction

A temporal delete withdraws its target object from one or more effective time clock ticks. In the process, it may {erase} an entire episode from current assertion time, or {split} an episode in two, or {shorten} an episode either forwards or backwards, or do several of these things to one or more episodes with one and the same transaction.
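
Returning to the temporal insert and update examples, the elimination of clock ticks can be checked with a small Python sketch. It is a sketch under assumptions, not a description of the AVF: clock ticks are (year, month) tuples, periods are closed-open, and open-ended periods are enumerated only up to an arbitrary `HORIZON` chosen for this example. The episode periods are the ones quoted in the text for Figure 11.1.

```python
# Assumptions of this sketch: monthly clock ticks as (year, month),
# closed-open periods, and an arbitrary horizon for enumerating
# periods that end at 12/31/9999.
END_OF_TIME = (9999, 12)
HORIZON = (2015, 1)


def months(begin, end):
    """Yield every clock tick in [begin, end), stopping at HORIZON."""
    end = min(end, HORIZON)
    year, month = begin
    while (year, month) < end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def ticks(periods):
    """Return the set of clock ticks covered by a list of periods."""
    return {t for begin, end in periods for t in months(begin, end)}


# Effective time covered by the parent object, client C903.
C903_EPISODES = [((2010, 4), (2011, 5)), ((2012, 4), (2012, 9)), ((2013, 2), END_OF_TIME)]
# Effective time already occupied by policy P861.
P861_EPISODES = [((2010, 2), (2011, 5)), ((2011, 11), (2013, 1)), ((2013, 5), END_OF_TIME)]

# TRI needs the parent present; TEI needs the child absent.
insertable = sorted(ticks(C903_EPISODES) - ticks(P861_EPISODES))

# The update to C903 for [Nov 2012 - Jan 2013] needs C903 in both ticks.
update_ok = ticks([((2012, 11), (2013, 1))]) <= ticks(C903_EPISODES)
```

Here `insertable` contains only February, March and April 2013, and `update_ok` is False, matching the restriction described for the update transaction. The set-difference form is a convenience of this sketch; a tick-by-tick subset test is used for the update only because the two ticks are contiguous.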
[...] in those tables represent both what things are currently like and also what we currently believe those things are like. They represent both what things are like now and what we now believe they are like. There is a timeline along which persistent objects are located, and a timeline along which we hold various beliefs. Data in conventional tables is “pinned”, along both timelines, to the moving point in ...

... standard temporal model, the rows inserted into bi-temporal tables begin to be asserted on the date they are physically inserted into the database. With Asserted Versioning, this is the default for those rows; but Asserted Versioning permits ...

Managing Time in Relational Databases. DOI: 10.1016/B978-0-12-375041-9.00012-1. Copyright © 2010 Elsevier Inc. All rights of reproduction in any form reserved.

Chapter ...
... Future
Approving a Deferred Assertion
Deferred Assertions and Temporal Referential Integrity
Glossary References

We normally think of inserting a row into a table as the same thing as claiming, or asserting, that the statement which that row makes is true. From that point of view, a distinction between the physical act of creating a row in a table, and the semantic act of claiming that what ...

... a true statement of what that object is like. And (ix) What we think is a true statement of what that object is like. Whatever semantic differences there may be between accepting, agreeing, assenting, asserting, believing, claiming, knowing, saying and thinking (and such differences are of great importance in such fields as epistemology, linguistics and the foundations of logic), these differences make no ...

... point in time we call “the present” and which, in this book, we designate as Now(). The maintenance of conventional data is an ongoing effort to keep up with the changes that follow in the trail of that moving point. But as well as the present, there are the past and the future. So if we “unpin” data along both these timelines, we end up with nine possible ways that data and time may be related. In this ...

... sometimes be wrong; but they assume that our intention is to be truthful, and that we take reasonable care to be accurate. Without those assumptions, the creation and maintenance of data would be a pointless activity. So underlying the activity of creating, maintaining and consuming data lies the matter of what we claim or assert to be true. For purposes of this discussion, we will take the following ...

... Client table is shown in the upper table in Figure 11.3. C903(r3 & r4) have been withdrawn into past assertion time. They are now part of the assertion history of this table, a record of what we used to assert is true, but no longer do. In their place are C903(r11 & r12). Everything, in current assertion time, is as it was except that a “hole” has been created in C903’s effective time. C903 is no longer ...

... Policy table. Here there is a single version of a policy owned by C903 that exists in the transaction’s timespan. P861(r2)’s effective time begins prior to the transaction’s timespan, and extends past the end of the transaction’s timespan. The transaction thus splits the version which, in turn, {splits} the episode. The first step is to withdraw P861(r2) into past assertion time. The second step is to replace ...

... effective begin date end date time time period

12 DEFERRED ASSERTIONS AND OTHER PIPELINE DATASETS
CONTENTS
The Semantics of Deferred Assertion Time
Assertions, Statements and Time
The Internalization of Pipeline Datasets
Deferred Assertions
A Deferred Update to a Current Episode
A Deferred Update to a Deferred Assertion
Reflections on Empty Assertion Time
Completing the Deferred ...

... entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered. We note, in particular, that none of the nodes in the two taxonomies referenced in this chapter are included in this list. 8 Jan 2014

In general, we leave taxonomy nodes out of these lists since they are long enough ...

Date posted: 21/01/2014, 08:20
