transactions until it is the right time to apply them, Asserted
Versioning applies them right away, but does not immediately
assert them. These deferred assertions may themselves be
updated or deleted, and the moment on which their assertion
periods become current is the moment on which we begin to
claim that the world was, is or will be as they describe it.
Just as deferred assertions replace collections of transactions
that have not yet been applied to the database, bi-temporal data
in any of the other seven categories replaces other physically
external datasets. Asserted version tables contain data in all
these temporal categories and, in doing so, internalize what
would otherwise be physically distinct datasets, ones whose
management costs are obviously significant.
In Chapter 13, we look more closely at the entire family of
pipeline datasets. We distinguish eight logical categories of pipe-
line datasets, based on where in a combination of past, present
or future assertion and effective time their data is located. Hav-
ing previously shown how to eliminate these physically distinct
datasets by bringing them into the production tables which are
their destinations and points of origin, we now discuss each of
them and show how queries and views can reassemble, as
queryable objects, exactly the data that had existed in those
datasets. This demonstrates that while eliminating the manage-
ment costs associated with this data, we can still make this data
available in whatever combinations it is needed.
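As a rough illustration of the "past, present or future assertion and effective time" combinations mentioned above, the following Python sketch classifies a row by where its two time periods fall relative to a given moment. The column names are borrowed from the sample physical data model; the closed-open treatment of periods, the helper names, and the choice to treat the (current, current) combination as ordinary production data rather than a pipeline category are assumptions made for this sketch.

```python
from datetime import date
from itertools import product

END_OF_TIME = date(9999, 12, 31)  # conventional "until further notice" value


def tense(begin: date, end: date, now: date) -> str:
    """Classify a closed-open period [begin, end) relative to `now`."""
    if end <= now:
        return "past"
    if begin > now:
        return "future"
    return "current"


def category(row: dict, now: date) -> tuple:
    """Return (assertion tense, effective tense) for one row."""
    return (
        tense(row["asr_beg_dt"], row["asr_end_dt"], now),
        tense(row["eff_beg_dt"], row["eff_end_dt"], now),
    )


# All nine combinations. Eight remain if the (current, current) combination
# is set aside as ordinary production data (an assumption of this sketch).
ALL_CATEGORIES = list(product(("past", "current", "future"), repeat=2))
PIPELINE_CATEGORIES = [c for c in ALL_CATEGORIES if c != ("current", "current")]
```

Under these assumptions, a deferred assertion is simply a row whose assertion tense is "future", whatever its effective tense happens to be.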
In Chapter 14, we discuss how to query asserted version
tables. As we said before, many queries, especially the ad hoc
queries written by non-technical database users, will be
directed against non-temporal or uni-temporal views of
asserted version tables, not against those bi-temporal tables
themselves. But many queries will be written directly against
those physical tables, especially those we call production
queries. In that case, the effective time period specified on the
query, and which qualifies the result set, will have to be
compared to the effective time periods of the rows targeted by the
query; and as we know from our review of the Allen
relationships, there are 13 different ways in which those two
time periods may be positioned with respect to one another.
And when those queries involve joins across two (or more)
asserted version tables, then the Allen relationship issues can
become even more difficult.
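To make the 13 possible positions concrete, here is a minimal Python sketch that names the Allen relationship between two time periods. It assumes closed-open periods (begin included, end excluded) and uses Allen's standard relationship names, which may differ from the bracketed names used elsewhere in this book; the function name is hypothetical.

```python
from datetime import date


def allen_relation(a_beg, a_end, b_beg, b_end) -> str:
    """Name the Allen relationship of period A to period B.

    Both periods are closed-open, [beg, end), and must satisfy beg < end.
    """
    if a_beg >= a_end or b_beg >= b_end:
        raise ValueError("each period must have beg < end")
    if a_end < b_beg:
        return "before"
    if a_end == b_beg:
        return "meets"
    if b_end < a_beg:
        return "after"
    if b_end == a_beg:
        return "met-by"
    # From here on the two periods share at least one point in time.
    if a_beg == b_beg and a_end == b_end:
        return "equals"
    if a_beg == b_beg:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_beg > b_beg else "finished-by"
    if b_beg < a_beg and a_end < b_end:
        return "during"
    if a_beg < b_beg and b_end < a_end:
        return "contains"
    return "overlaps" if a_beg < b_beg else "overlapped-by"
```

A production query would typically compare the period specified on the query (A) with the effective time period of each candidate row (B), and keep the rows whose relationship is one of those the query is interested in.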
In Chapter 15, we discuss how to optimize the performance
of Asserted Versioning databases. Our focus is on optimizing
access to currently asserted current versions, i.e. to the rows that
correspond to rows in a conventional table of persistent objects.
164 Part 3 DESIGNING, MAINTAINING AND QUERYING ASSERTED VERSION DATABASES
In this chapter, we focus on index design, although a wide range
of other optimization techniques are also considered.
In Chapter 16, we conclude our presentation of Asserted
Versioning. We discuss each of the four objectives we had for
Asserted Versioning, and which we described in the Preface,
and explain why we think those objectives have been met. We
point out that Asserted Versioning has value both as a bridge to
a future standards-based and vendor-provided implementation
of bi-temporal data, and as a destination, being itself a semanti-
cally complete implementation of bi-temporal data which works
with today’s SQL and today’s databases. In the last section,
we discuss ongoing research and development at Asserted
Versioning LLC, and explain how interested readers can learn
more about Asserted Versioning.
Glossary References
Glossary entries whose definitions form strong inter-
dependencies are grouped together in the following list. The
same Glossary entries may be grouped together in different ways
at the end of different chapters, each grouping reflecting the
semantic perspective of each chapter. There will usually be sev-
eral other, and often many other, Glossary entries that are not
included in the list, and we recommend that the Glossary be
consulted whenever an unfamiliar term is encountered.
We note, in particular, that none of the nodes in our taxonomy
of data management methods, or our state transformation
taxonomy, are included in this list. In general, we leave taxonomy
nodes out of these lists, but recommend that the reader
look them up in the Glossary.
Allen relationships
asserted version table
Asserted Versioning Framework (AVF)
assertion time
transaction time
bi-temporal
uni-temporal
deferred assertion
deferred transaction
effective time
valid time
object
persistent object
physical transaction
temporal transaction
temporal parameter
pipeline dataset
production table
temporal entity integrity (TEI)
temporal referential integrity (TRI)
temporal extent state transformation
the standard temporal model
8
DESIGNING AND GENERATING
ASSERTED VERSIONING
DATABASES
CONTENTS
Translating a Non-Temporal Logical Data Model into a Temporal Physical Data Model
The Logical Data Model
Referential Constraints Between Non-Temporal and Bi-Temporal Tables
Asserted Versioning Metadata
The Physical Data Model
Generating an Asserted Versioning Database from a Physical Data Model and Metadata
Temporalizing the Physical Data Model
Generating Temporal Entity and Temporal Referential Integrity Constraints
Redundancies in the Asserted Versioning Bi-Temporal Schema
Apparent Redundancies in the Asserted Versioning Schema
A Real Redundancy in the Asserted Versioning Schema
Glossary References
An Asserted Versioning database is one that contains at least
one asserted version table. An asserted version table is one
whose schema is that shown in Chapter 6, and on which the
two temporal integrity constraints are enforced.
Figure 8.1 shows how Asserted Versioning databases are generated
from the combination of a conventional logical data model
and a set of metadata entries. Note that the logical data model
has no temporal features. This means that logical data models of
conventional databases, developed perhaps years ago, do not
have to be changed if a decision is made to convert one or
more of the tables in those databases into bi-temporal asserted
version tables. This means that when building new logical data
models, or extending old ones, data modelers can ignore temporal
requirements and focus on design issues which are often
complex enough without introducing temporal considerations.
It means that temporal requirements can be expressed declaratively,
in metadata associated with a conventional data model,
rather than by hardcoding those requirements in the data model
itself.
Managing Time in Relational Databases. Doi: 10.1016/B978-0-12-375041-9.00008-X
Copyright © 2010 Elsevier Inc. All rights of reproduction in any form reserved.
This greatly simplifies the work of the data modeler. Her
work, as far as temporality is concerned, is not to translate
temporal requirements into data model constructs. Instead, it
becomes that of simply expressing business requirements for
temporal data as a set of metadata associated with the data
model.
As well as developing the logical model, the other task for the
data modeler is to translate business requirements for temporal
information into metadata. There are metadata entries for each
table in the data model which is to be generated as an asserted
version table. For these tables, there are entries to specify which
business column or columns make up the business key for the
table. This metadata also provides the information which the
AVF needs to enforce temporal entity integrity and temporal
referential integrity.
Once the logical model and its associated metadata are com-
plete, the next step is to generate a physical data model from the
logical model. At this point, of course, the physical model that is
[Figure 8.1 Designing and Generating an Asserted Versioning Database. Diagram labels: Logical Data Model; Temporal Requirements; Physical Data Model; Temporal Metadata; An Asserted Versioning Database; Non-Temporal Tables; Asserted Versioning Tables; TEI Enforcement; TRI Enforcement.]
168 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES
generated has no temporal features; all of its tables are conven-
tional non-temporal tables.
The final step is a process in which a team consisting of the
data modeler and a DBA uses the temporal metadata to modify
the physical data model, changing specific tables into asserted
version tables. In this process, pairs of date columns are added
to implement assertion time and effective time. Surrogate pri-
mary keys are created as object identifiers. Physical primary keys
are converted into Asserted Versioning business keys, and physi-
cal foreign keys into Asserted Versioning temporal foreign keys.
However, for organizations using the ERwin data modeling
tool, this manual process is unnecessary. In the first release of
the AVF, we provide ERwin user-defined properties (UDPs) to
hold all temporal metadata, and ERwin scripting macros which
use these UDPs to generate a physical data model in which all
the temporal conversion work has already been done.
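The manual conversion described above can be sketched as a small DDL generator. This is only an illustration in Python, not the AVF's ERwin macros: the function name and its parameters are hypothetical, the added column names are taken from the sample physical data model, and the three-column primary key (object identifier, effective begin date, assertion begin date) is an assumption of the sketch.

```python
def temporalize(table, oid_name, columns, business_key, foreign_keys):
    """Build illustrative DDL for an asserted version table.

    columns      -- {column name: SQL type} from the non-temporal model
    business_key -- names of the former primary key columns
    foreign_keys -- {FK column name: parent oid column that replaces it}
    """
    lines = [
        f"  {oid_name} bigint NOT NULL",       # surrogate object identifier
        "  eff_beg_dt datetime NOT NULL",      # effective time begins
        "  asr_beg_dt datetime NOT NULL",      # assertion time begins
    ]
    for name, sql_type in columns.items():
        if name in foreign_keys:
            # A conventional FK becomes a temporal foreign key (TFK)
            # that holds the parent's object identifier.
            lines.append(f"  {foreign_keys[name]} bigint")
        elif name in business_key:
            # The former primary key survives as the business key.
            lines.append(f"  {name} {sql_type} NOT NULL")
        else:
            lines.append(f"  {name} {sql_type}")
    lines += [
        "  epis_beg_dt datetime NOT NULL",
        "  eff_end_dt datetime NOT NULL",      # effective time ends
        "  asr_end_dt datetime NOT NULL",      # assertion time ends
        "  row_crt_dt datetime NOT NULL",
        f"  PRIMARY KEY ({oid_name}, eff_beg_dt, asr_beg_dt)",
    ]
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


if __name__ == "__main__":
    print(temporalize(
        "Policy", "policy_oid",
        {"policy_nbr": "char(10)", "client_nbr": "char(10)",
         "policy_type": "char(3)", "copay_amt": "money"},
        business_key=["policy_nbr"],
        foreign_keys={"client_nbr": "client_oid"},
    ))
```

The generated statement declares no foreign key constraint for the TFK column, in keeping with the point made later in this chapter that current DBMSs cannot recognize temporal foreign keys.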
Note also that the Asserted Versioning database is more than
a set of entries in a database catalog, and more than the temporal
data schemas shown in Figure 8.1. It is also the stored procedures,
triggers or other code that enforces temporal integrity
constraints on temporal tables.
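As a hedged illustration of what such enforcement code has to check, the Python sketch below looks for temporal entity integrity violations in a set of rows. It assumes one reading of TEI that is not spelled out in this section: two rows for the same object must not overlap in effective time during any assertion time they share. The function names and the dictionary row format are hypothetical.

```python
from datetime import date


def periods_overlap(beg1, end1, beg2, end2) -> bool:
    """True if two closed-open periods share at least one point in time."""
    return beg1 < end2 and beg2 < end1


def tei_violations(rows):
    """Return pairs of rows for the same object that collide in both
    assertion time and effective time."""
    violations = []
    for i, r in enumerate(rows):
        for s in rows[i + 1:]:
            same_object = r["oid"] == s["oid"]
            eff_clash = periods_overlap(
                r["eff_beg_dt"], r["eff_end_dt"],
                s["eff_beg_dt"], s["eff_end_dt"])
            asr_clash = periods_overlap(
                r["asr_beg_dt"], r["asr_end_dt"],
                s["asr_beg_dt"], s["asr_end_dt"])
            if same_object and eff_clash and asr_clash:
                violations.append((r, s))
    return violations
```

The pairwise loop is quadratic and is meant only to show the rule; a stored procedure or trigger would more plausibly check just the rows that share an object identifier with the row being inserted.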
In the Preface, we stated that Asserted Versioning simplifies
the management of temporal databases by providing maintenance
encapsulation, query encapsulation and design encapsulation.
What we have just described here is how Asserted
Versioning provides design encapsulation. In the rest of this
chapter, we will see how design encapsulation works.
Translating a Non-Temporal Logical Data
Model into a Temporal Physical Data Model
The Logical Data Model
Figure 8.2 is the logical data model (LDM) of a sample data-
base we have constructed, and which can be accessed at
AssertedVersioning.com. The most important thing to notice
about this LDM is that there is nothing special about it. In par-
ticular, there is nothing explicitly temporal about it. And yet
from this model, supplemented with metadata provided by the
data modeler, the AVF will create an Asserted Versioning data-
base in which all of the tables are bi-temporal tables.
There may be other tables in an Asserted Versioning database
which are non-temporal tables. But we are not concerned with
them. The DBMS enforces entity integrity on them, while the
AVF enforces temporal entity integrity on its tables. The DBMS
enforces referential integrity on them, while the AVF enforces
temporal referential integrity on its tables. Later on, additional
non-temporal tables may be converted to asserted version
tables, and this can be done without making any changes to
the logical data models of those databases. Temporality is
introduced “downstream” from the logical data models, by
making entries in asserted version metadata tables, and then
by modifying DDL in accordance with this metadata before that
DDL is submitted to the DBMS.
This particular logical data model is a simple one. In it, a client
may own any number of policies, each of which must be
owned by exactly one client. Each policy may be amended by
any number of policy amendments, each of which amends
exactly one policy.[1] A wellness program category categorizes
any number of wellness programs, each of which is categorized
by exactly one wellness program category. A client may be
enrolled in any number of wellness programs, each of which
may enroll any number of clients. Thus, the entity Wellness
Program Enrollment is an associative entity, implementing a
many-to-many relationship between clients and programs.
Figure 8.2 The Sample Database Logical Data Model.
Client
  client-nbr: CHAR(10)
  client-nm: VARCHAR(40)
Policy
  policy-nbr: CHAR(10)
  client-nbr: CHAR(10) (FK)
  policy-type: CHAR(3)
  copay-amt: MONEY
Policy-Amendment
  policy-amend-nbr: CHAR(10)
  policy-nbr: CHAR(10) (FK)
  policy-amend-txt: VARCHAR(100)
Wellness-Program-Category
  wellpgmcat-cd: CHAR(4)
  wellpgmcat-nm: VARCHAR(50)
Wellness-Program
  wellpgm-nbr: CHAR(10)
  wellpgmcat-cd: CHAR(4) (FK)
  wellpgm-nm: VARCHAR(50)
Wellness-Program-Enrollment
  client-nbr: CHAR(10) (FK)
  wellpgm-nbr: CHAR(10) (FK)
  wellpgm-enroll-begin-wgt: SMALLINT
  wellpgm-enroll-end-wgt: SMALLINT
  wellpgm-enroll-begin-a1c-nbr: DECIMAL(2,1)
  wellpgm-enroll-end-a1c-nbr: DECIMAL(2,1)
Relationship labels: Client "may own" Policy; Policy "may be amended by" Policy-Amendment; Wellness-Program-Category "may categorize" Wellness-Program; Client "may be enrolled in" and Wellness-Program "may enroll", both through Wellness-Program-Enrollment.
[1] “Any number of” is our substitute for the less graceful expression “zero, one or more”.
The two expressions mean the same thing.
The business meaning of the entities, attributes and relationships
should need no explanation, with the possible exception of
the two attributes with a suffix of “a1c”. As all diabetics know,
a1c is a blood test that measures what percentage of a person’s
hemoglobin has glucose attached to it.
As ERwin data modelers will immediately recognize, primary
keys are shown above the horizontal line in each entity. Foreign
keys, of course, have “(FK)” as a separate suffix. Since all of these
entities will be generated as temporal tables, all these FKs will be
replaced by temporal foreign keys, by TFKs.
As we said earlier, the current implementation of Asserted
Versioning uses ERwin’s user-defined properties to capture the
metadata needed to generate a bi-temporal database schema
from a non-temporal data model. In this chapter, however, we
will organize that metadata as a set of five metadata tables.
Referential Constraints Between Non-Temporal
and Bi-Temporal Tables
There is nothing semantically wrong about a bi-temporal
table being the child table in a referential integrity relationship.
In that case, the bi-temporal table will contain a conventional
foreign key which points to a row in a parent non-temporal
table. Conversely, there is nothing semantically wrong about a
non-temporal table being the child table in a temporal referen-
tial integrity relationship. In that case, the non-temporal table
will contain a temporal foreign key which points to an episode
in a parent bi-temporal table.
In both cases, the referential relationships reflect an existence
dependency between the objects involved. When both tables are
non-temporal, we represent that existence dependency as a ref-
erential integrity dependency. When both tables are bi-temporal,
we represent it as a temporal referential integrity dependency.
When one table is non-temporal and the other bi-temporal, the
existence dependency between their objects isn’t somehow
nullified because of our choice of how to represent it. And so
our managed objects should be able to express that dependency
even in that “mixed” case.
As bi-temporal theory, Asserted Versioning interprets non-
temporal tables as tables whose rows are bi-temporal, but
implicitly so. Rows in non-temporal tables exist in an assertion
time which is co-temporal with their physical presence, and so
too for effective time. In other words, non-temporal rows are
asserted for as long as they physically exist, and are versions
which describe what their objects are currently like for as long
as those rows physically exist. Their assertion time periods and
their effective time periods are fixed; both are always [row create
date – 12/31/9999].
In an alternative interpretation, non-temporal rows are
asserted for as long as they physically exist in their current form,
and are versions which describe their objects for as long as those
rows physically exist in their current form. Each time a row is
updated, its old form, i.e. an exact image of all of the data in that
row, is lost because at least some of it is overwritten. In this
interpretation, those rows must have a last update date, in which
case their assertion time periods and their effective time periods
are not fixed because both are [last update date – 12/31/9999].
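The two interpretations differ only in which date opens the implicit time periods. A minimal Python sketch, with hypothetical function names and 12/31/9999 standing for the end of time as in the text:

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)


def fixed_period(row_create_date: date):
    """First interpretation: the period never changes once the row exists."""
    return (row_create_date, END_OF_TIME)


def current_form_period(last_update_date: date):
    """Alternative interpretation: the period restarts at each update."""
    return (last_update_date, END_OF_TIME)


def implicit_periods(row_create_date: date, last_update_date: date):
    """Show both readings side by side for one non-temporal row.

    Under either reading, the assertion time period and the effective
    time period of the row are the same period.
    """
    return {
        "fixed": fixed_period(row_create_date),
        "current form": current_form_period(last_update_date),
    }
```

For a row created on 1 March 2009 and last updated on 15 June 2010, the first reading gives a period starting in 2009 and the second a period starting in 2010; for a row that has never been updated, the sketch assumes the caller passes the create date as the last update date, so both readings coincide.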
In our initial release of the AVF, however, we will not support
mixed referential relationships. One of these relationships won’t
work, and the other one is dangerous. The relationship that
won’t work is the one in which the child table is a non-temporal
table, and contains a temporal foreign key. This temporal foreign
key is not declared in DDL because current DBMSs cannot rec-
ognize it. This temporal foreign key cannot be managed by the
DBMS because, unlike normal foreign keys, it does not point to
a specific row in the parent table.
The relationship that is dangerous is the one in which the
child table is an asserted version table, and contains a conven-
tional foreign key. This foreign key is declared in DDL, and
the DBMS can recognize it. The danger lies in the fact that the
DBMS can then carry out a delete cascade from the parent table
to the child table, if it is so directed.
This delete cascade, however, is unaware of the temporal
semantics of the child table. It will simply find every physical
row in that child table that contains the referenced foreign key
value, and will then physically delete that row. This is meat
cleaver work where delicate surgery is required. It can destroy
past, current and future episodes in the child table, leaving col-
lections of versions which are semantically invalid, and which
the AVF will be unable to manage. It will physically remove both
version history and assertion history, whereas bi-temporal data
management is a promise to preserve both. The conventional
delete set null rule would be a safer alternative because episode
timelines would not be destroyed. Nonetheless, column-level
history would still be lost.
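The danger described above is easy to reproduce. The following Python sketch uses an in-memory SQLite database and a deliberately simplified, hypothetical schema in which Client is non-temporal and Policy is a cut-down asserted version table carrying a conventional foreign key. It is an illustration only, not the AVF's schema.

```python
import sqlite3


def demo_delete_cascade():
    """Return (policy rows before, policy rows after) a parent delete."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Client (
            client_nbr TEXT PRIMARY KEY,
            client_nm  TEXT
        );
        CREATE TABLE Policy (
            policy_oid  INTEGER NOT NULL,
            eff_beg_dt  TEXT NOT NULL,
            asr_beg_dt  TEXT NOT NULL,
            eff_end_dt  TEXT NOT NULL,
            asr_end_dt  TEXT NOT NULL,
            client_nbr  TEXT REFERENCES Client (client_nbr) ON DELETE CASCADE,
            policy_type TEXT,
            PRIMARY KEY (policy_oid, eff_beg_dt, asr_beg_dt)
        );
        INSERT INTO Client VALUES ('C1', 'Smith');
        -- Two versions of the same policy: a past one and a current one.
        INSERT INTO Policy VALUES
            (1, '2009-01-01', '2009-01-01', '2010-01-01', '9999-12-31', 'C1', 'HMO'),
            (1, '2010-01-01', '2010-01-01', '9999-12-31', '9999-12-31', 'C1', 'PPO');
    """)
    before = conn.execute("SELECT COUNT(*) FROM Policy").fetchone()[0]
    conn.execute("DELETE FROM Client WHERE client_nbr = 'C1'")
    after = conn.execute("SELECT COUNT(*) FROM Policy").fetchone()[0]
    conn.close()
    return before, after


if __name__ == "__main__":
    print(demo_delete_cascade())
```

The DBMS removes every physical row that references the deleted client, past and current versions alike, because it has no notion of effective or assertion time.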
Mixed referential relationships should be addressed, but they
will not be addressed in the first release of the AVF. And so, in the
remainder of this chapter, and in most of the remainder of this
book, we will not discuss them.
Asserted Versioning Metadata
Figures 8.3 through 8.7 show the metadata needed by the AVF
to generate an Asserted Versioning database from the LDM
shown in Figure 8.2. As with other figures showing tables, we
indicate foreign keys by italicizing the column heading, and
primary keys by underlining the column heading.
We show these metadata tables as themselves conventional
tables, and therefore all relationships as ones implemented with
conventional foreign keys. This simplifies the discussions in this
chapter, and allows us to concentrate on the metadata without
being concerned about keeping a bi-temporal history of changes
to that data.
Table Type Metadata
In a logical data model that will generate an Asserted
Versioning database, we need a metadata list of which entities
to generate as non-temporal tables and which entities to generate
as asserted version tables. This metadata table lists all the
tables that will be generated as asserted version tables, as shown
in Figure 8.3. For this data model, we will generate all its entities
as asserted version tables.
The non-key column in this metadata table is the business
key flag. If it is set to ‘Y’, then the table is considered to have a
reliable business key. Otherwise, it is set to ‘N’, indicating that
the business key for the table is not reliable.
Table-Type
tbl-nm                         bus-key-rlb-flag
Client                         Y
Policy                         Y
Wellness_Program               Y
Wellness_Program_Category      Y
Wellness_Program_Enrollment    Y
Policy_Amendment               Y
Figure 8.3 The Table Type Metadata Table.
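As a small sketch of how this metadata might be consumed, the Python below holds the Figure 8.3 entries in a dictionary and splits a model's entities into those to be generated as asserted version tables and those left non-temporal. The dictionary layout and function names are hypothetical; only the table names and flag values come from the figure.

```python
# tbl-nm -> bus-key-rlb-flag, copied from Figure 8.3.
TABLE_TYPE = {
    "Client": "Y",
    "Policy": "Y",
    "Wellness_Program": "Y",
    "Wellness_Program_Category": "Y",
    "Wellness_Program_Enrollment": "Y",
    "Policy_Amendment": "Y",
}


def partition_entities(entities, table_type=TABLE_TYPE):
    """Split entities into (asserted version tables, non-temporal tables).

    An entity listed in the metadata is generated as an asserted version
    table; any entity not listed stays a conventional table.
    """
    asserted = [e for e in entities if e in table_type]
    non_temporal = [e for e in entities if e not in table_type]
    return asserted, non_temporal


def has_reliable_business_key(tbl_nm, table_type=TABLE_TYPE) -> bool:
    """True if the table's business key flag is set to 'Y'."""
    return table_type[tbl_nm] == "Y"
```

With the sample data model every entity lands in the first list; an entity missing from the metadata, such as a made-up "Audit_Log", would land in the second.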
[...]
... datetime, asr_end_dt: datetime, row_crt_dt: datetime
Wellness_Program_Category: wellpgmcat_oid: bigint, eff_beg_dt: datetime,
asr_beg_dt: datetime, wellpgmcat_cd: char(4), epis_beg_dt: datetime,
wellpgmcat_nm: varchar(50), eff_end_dt: datetime, asr_end_dt: datetime,
row_crt_dt: datetime
Policy: policy_oid: bigint, eff_beg_dt: datetime, asr_beg_dt: datetime,
policy_nbr: char(10), epis_beg_dt: datetime, client_oid: bigint ...
... Business Key Metadata Table
But in a temporal table, multiple rows may represent the same
object, and so all of those rows will have the same business key.
Consequently, we cannot guarantee that each business key points
to one and only one object by defining a unique index on it. Nor
can we simply extend the scope of the index by defining ...
... between them or not. By using the same granularity for all
asserted version tables in the same database, it is easy to spot
two versions of the same object that are contiguous in either
assertion or in effective time. Because of the closed-open
convention, two time periods [meet] (are contiguous) if and only if
the end point in time of one has the same value as the begin point
in time of the other. This ...
... client_nbr: char(10), epis_beg_dt: datetime, client_nm: varchar(40),
eff_end_dt: datetime, asr_end_dt: datetime, row_crt_dt: datetime
Wellness_Program_Enrollment: client_wellpgm_oid: bigint,
eff_beg_dt: datetime, asr_beg_dt: datetime, client_oid: bigint,
wellpgm_oid: bigint, epis_beg_dt: char(18),
wellpgm_enroll_begin_wgt: smallint, wellpgm_enroll_end_wgt: smallint,
wellpgm_enroll_begin_a1c_nbr: decimal(2,1), wellpgm_enroll_end_a1c_nbr: ...
... by defining a unique index on them. Nonetheless, they have an
important role to play. We discuss business keys, how the AVF’s
enforcement of temporal entity integrity guarantees that no two
objects will ever have the same business key, and how business
keys help the business user clarify her intentions when submitting
transactions to an Asserted Versioning database, in Chapter 9.
Foreign Key Mapping Metadata ...
... datetime, asr_end_dt: datetime, row_crt_dt: datetime
Policy_Amendment: policy_amend_oid: bigint, eff_beg_dt: datetime,
asr_beg_dt: datetime, policy_oid: bigint, policy_amend_nbr: char(10),
epis_beg_dt: datetime, policy_amend_txt: varchar(100),
eff_end_dt: datetime, asr_end_dt: datetime, row_crt_dt: datetime
Figure 8.8 The Sample Database Physical Data Model.
Wellness_Program: wellpgm_oid: bigint, eff_beg_dt: datetime ...
... to Asserted Versioning’s assertion time. And in our own prior
implementations of bi-temporal data management, we have used
dates for effective time and microsecond timestamps for assertion
time. By using the same granularity for all assertion times in the
same database, and the same granularity for all effective times, it
is easy to determine the Allen relationship between any two time
periods. So suppose ...
... are two time periods which start at the same time, one of
which is delimited by dates and the other by timestamps. The
values, each of which designate the same point in time, are not
identical. But if the same granularity is used, the EQUALS operator
will tell us whether or not those time periods begin at the same
time. Of particular importance is whether or not two time periods
have a gap in time between ...
... of business keys in asserted version tables is to identify the
object represented by each row in the same way that object would
be identified, or was identified, in a conventional table. Most of
the time, business keys are reliable. In other words, most of the
time, each business key value is a unique identifier for one and
only one object. So in a non-temporal table, it would be possible
to define a ...
... TFK in each row of the table must, at all times, contain an oid
to an object one of whose episodes has an effective time period
that includes ([fills-1]) the effective time period of the row that
contains the TFK. If the TFK is not required, the TFK in each row
must either contain a valid oid reference, or be null. In our
sample database, we have made the TFK to Wellness ...
... compare an assertion time
period or point in time to an effective time period or point in
time.
One final point. We recommend that assertion time granularity ...
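The temporal foreign key rule quoted in the fragments above can be sketched in Python as a containment check over the parent's episodes. The function name and the dictionary format for rows and episodes are hypothetical, periods are assumed to be closed-open, and assertion time is left out of the check for brevity, so this is a simplification rather than the AVF's actual enforcement logic.

```python
from datetime import date


def tfk_is_valid(child_row, parent_episodes, tfk_column, required=True):
    """Check one child row's temporal foreign key (TFK).

    child_row       -- dict with the TFK column, eff_beg_dt and eff_end_dt
    parent_episodes -- dicts with oid, eff_beg_dt and eff_end_dt, one per
                       episode of each parent object
    required        -- if False, a null (None) TFK is acceptable
    """
    oid = child_row[tfk_column]
    if oid is None:
        return not required
    # Valid if some episode of the referenced object includes the whole
    # effective time period of the child row.
    return any(
        ep["oid"] == oid
        and ep["eff_beg_dt"] <= child_row["eff_beg_dt"]
        and child_row["eff_end_dt"] <= ep["eff_end_dt"]
        for ep in parent_episodes
    )
```

Note that the check is against episodes, not individual parent rows, which is why a conventional DBMS foreign key constraint cannot express it.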