Beginning Database Design - P23


Glossary

Materialized view: A physically preconstructed view of data, containing data copied into the materialized view. Materialized views can be highly efficient in read-only environments and are often used for replication, distribution, and in data warehouses.
Metadata: The tables and the fields defining the structure of the data; the data about the data.
Method: The equivalent of a relational database stored procedure, except that it executes on the data contents of an object, within the bounds of that object.
Microsoft Windows: The Microsoft Windows operating system.
Multi-valued dependency: A field containing a comma-delimited list or collection of some kind. A collection could be an array of values of the same type. Those multiple values are dependent as a whole on the primary key, the whole meaning the entire collection in the comma-delimited list. Each individual value is not dependent on the primary key.
Nested query: A query executed from within another query. In theory, queries can be nested up to any number of hierarchical layers. The only limitation is on complexity and the abilities of the programmer.
Network: A system of connected computers. A local area network (LAN) is contained within a single company, in a single office. A wide area network (WAN) is generally distributed across a geographical area, even globally. The Internet is a very loosely connected network, meaning that it is usable by anyone and everyone.
Network database model: Essentially a refinement of the hierarchical database model. The network model allows child tables to have more than one parent, thus creating a network-like table structure. Multiple parent tables for each child allow for many-to-many relationships, in addition to one-to-many relationships.
Non-trivial multi-valued dependency: A multi-valued dependency with more than two fields in the table. (See Multi-valued dependency.)
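As a rough illustration of the Nested query entry above, the sketch below runs one query from inside another. Python's built-in sqlite3 module is used here only as a convenient way to execute the SQL, and the Band and Track tables with their rows are made-up sample data, not something the glossary defines.

```python
import sqlite3

# Assumption: an in-memory SQLite database with hypothetical sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Band (band_id INTEGER PRIMARY KEY, band_name TEXT);
    CREATE TABLE Track (track_id INTEGER PRIMARY KEY, band_id INTEGER, track_name TEXT);
    INSERT INTO Band VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO Track VALUES (10, 1, 'Intro'), (11, 1, 'Outro');
""")

# The inner SELECT is the nested query: it produces the band_id values
# that the outer query then uses as a filter.
rows = conn.execute("""
    SELECT band_name
    FROM Band
    WHERE band_id IN (SELECT band_id FROM Track)
    ORDER BY band_name
""").fetchall()

print(rows)  # only bands that have at least one track
```

Only 'Alpha' has tracks in this sample data, so it is the only band the outer query returns.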
Non-identifying relationship: The child table is not dependent on the parent table, such that the child table includes the parent table primary key as a foreign key, but not as part of the child table's primary key. In other words, the parent record does not require that a related record exists in the child table. A foreign key field can contain a NULL value, and it can't be a part of the primary key because a primary key requires a unique, non-NULL value.
Normal Forms: The steps contained within the process of Normalization. Normal Forms are cumulative, such that a database model in 3rd Normal Form is in both 2nd and 1st Normal Forms, but not necessarily in Boyce-Codd (which can be the same as 3rd Normal Form), 4th, or 5th Normal Form, or Domain Key Normal Form.
Normalization: The process of simplifying the structure of data. Normalization increases granularity, where granularity is the scope of a definition for any particular thing. The more granular a data model is, the easier it becomes to manage, up to a point, depending, of course, on the application of the database model.
NOT NULL constraint: A constraint that implies a field must have a value placed into it; otherwise, an error is returned.
NULL: A field that has never been initialized with any value. A NULL field setting allows a field to contain nothing when a record is created or changed in a table.
Number: A numeric datatype allowing only numbers of various formats.
Number crunching: Computer jargon for large quantities of extremely complex calculations.
Object: In object methodology, the creation (instantiation) of a class at run-time, such that multiple object instances can be created from a class. An object is also a generic term applied to anything tangible, such as a table in a relational database.
Object database model: A model that provides a three-dimensional structure to data where any item in a database can be retrieved from any point very rapidly.
Whereas the relational database model lends itself to retrieval of groups of records in two dimensions, the object database model is very efficient for finding unique items. Consequently, the object database model performs very poorly when retrieving more than a single item, a task at which the relational database model is very good.
Object-relational database model: The object-relational database model incorporates minimal aspects of the object database model into the relational database model. In some respects, the object-relational database model was created in answer to conflicting capabilities of relational and object database models, and also as a commercial competitor to the object database model. The object database model is somewhat spherical in nature, allowing access to unique elements anywhere within a database structure, with extremely high performance. The object database model performs extremely poorly when retrieving more than a single data item. The relational database model, on the other hand, contains records of data in tables across two dimensions. The relational database model is best suited for retrieval of groups of data but can also be used to access unique data items fairly efficiently.
OLAP: See Online Analytical Processing.
OLTP: See Online Transaction Processing.
ON clause: The ON clause is an ANSI standard join format that allows exact field join specifications when you want to include one or more fields in a join that have different names in different tables.
One-to-many relationship: The relationship between two tables dictated by having one record in one table, and many related records in another table.
One-to-one relationship: The relationship between two tables dictated by having one record in each table, and not more than one record in either table, related back to the other table.
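The ON clause and One-to-many relationship entries above can be sketched together. This is a minimal example under stated assumptions: it uses Python's built-in sqlite3 module to run the SQL, and the customer_id and buyer_id column names are hypothetical, chosen only so that the join fields have different names in the two tables.

```python
import sqlite3

# Assumption: in-memory SQLite database; tables, columns and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    -- The foreign key is deliberately named buyer_id, not customer_id.
    CREATE TABLE Sale_Order (order_id INTEGER PRIMARY KEY, buyer_id INTEGER, total REAL);
    INSERT INTO Customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Sale_Order VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 10.0);
""")

# The ON clause names the exact fields to join on, which is needed here
# because the two tables do not share a column name.
rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM Customer c
    JOIN Sale_Order o ON (c.customer_id = o.buyer_id)
    ORDER BY o.order_id
""").fetchall()

for customer_name, order_id in rows:
    print(customer_name, order_id)
```

One customer record ('Acme') matches two order records, which is the one-to-many relationship showing up in the join result.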
Online Analytical Processing (OLAP): A functionality that provides rapid interactive analysis of data into multiple dimensions, usually involving extremely large databases. The objective of analysis is to highlight trends, patterns, and exceptions.
Online Transaction Processing (OLTP): Databases that were devised to cater for the enormous concurrency requirements of Internet (online) applications. OLTP databases cause problems with concurrency. The number of users that can be reached over the Internet is an unimaginable order of magnitude larger than that of an in-house company client-server database. Thus, the concurrency requirements for OLTP database models explode, well beyond the scope of previous experience with client-server databases.
Operating system: The lowest level of software on a computer, generally managing the interface and the hardware. Windows, UNIX, and Linux are all operating systems.
Operations: A term describing what a company does to make a profit.
Optimizer: A term applied to a process, within a database engine, that attempts to find the fastest method of executing a SQL command against a database.
ORDER BY clause: An optional addition to the SELECT command that re-sorts (reorders) records as they are returned from a query to a database.
Outer join: An intersection plus rows outside the intersection, in one table and not in the other table of a join.
Overflow: A situation where new data is added to a table or index, but outside of the most effective structure, making subsequent reads potentially very inefficient. Certain types of indexes are subject to overflow.
Paper trail: The pieces of paper a company produces, and those passing through it, while it conducts its day-to-day affairs. A company in the process of performing its day-to-day business is likely to have a paper trail of orders, invoices, bills, checks, and so on.
Analysis can gain copious amounts of information from a company paper trail. Following the paper trail is a very useful method of gathering analytical details of the business operational processes of a company.
Parallel processing: Execution of more than one thing at the same time, typically using multiple CPUs (but not always). Additionally, parallel processing used hand in hand with partitioning can result in some very effective performance improvements.
Partitioning: Physical splitting of tables into separate sections (partitions), including parallel processing on multiple partitions and individual operations on individual partitions. One particularly efficient aspect is the capability, when querying a table, to read fewer than all the partitions making up a table, perhaps even a single partition. This is also known as partition pruning.
Performance: A measure of how fast a database services applications, and ultimately end-users.
Planning: A process whereby a project plan and timeline are used for larger projects. Project plans typically include who does what and when. More sophisticated plans integrate multiple tasks, sharing them out among many people, ensuring dependencies are catered for. For example, if task B requires completion of task A, the same person can do both tasks A and B. If there is no dependency, two people can do both tasks A and B at the same time.
Power-user: A user who is between an end-user and an expert computer programmer, in terms of knowing how to use a computer. An end-user uses a computer as a tool to solve business problems. A computer programmer writes the software that end-users make use of. A power user is someone in between, typically an end-user who writes his or her own software.
Precedence: The order of resolution of an expression, which generally acts from left to right across an expression.
Primary key: A key uniquely identifying each row in a table.
The entity on the many side of the relationship has a foreign key. The foreign key column contains primary key values of the entity on the one side of the relationship.
Projection Normal Form (PJNF): See 5th Normal Form.
Query: A statement interrogating the database and returning information. Most often tables are interrogated and records from those tables are returned. Queries can be both simple and complex. A query is executed using the SQL SELECT command.
Random access memory (RAM): The memory chips inside your computer. RAM provides an ultra-fast buffering storage area between CPU (the processor) and your I/O devices (disks).
RDBMS: See Relational Database Management System.
Record: A repetition of a field structure across a table. Records repeat field structure in a table, where each repeated field can (and sometimes should) have a different value. Tables are divided into fields and records. Fields impose structure and datatype specifics onto each of the field values, in each record.
Redundant Array of Inexpensive Disks (RAID): A bunch of small, cheap disks. A RAID array is a group of disks used together as a single unit logical disk. RAID arrays can help with storage capacity, recoverability and performance, using what are called mirroring and striping. Mirroring creates duplicate copies of all physical data. Striping breaks data into many small pieces, where those small pieces can be accessed in parallel.
Referential integrity: A process (usually contained within a relational database model) of validation between related primary and foreign key field values. For example, a foreign key value cannot be added to a table unless the related primary key value exists in the parent table. Similarly, deleting a primary key value necessitates removing all records in subsidiary tables, containing that primary key value in foreign key fields.
Additionally, it follows that deleting a primary key record is not allowed while a foreign key value referencing it exists elsewhere.
Relational Database Management System (RDBMS): A system that uses a database that contains tables with data. The management system part is the part allowing you access to that database, and the power to manipulate both the database and the data contained within it.
Relational database model: A model that provides a two-dimensional structure to data. The relational database model more or less throws out the window the concept and restriction of a hierarchical structure, but does not completely abandon data hierarchies. Any table can be accessed directly without having to access all parent objects. Precise data values (such as primary keys) are required to facilitate skirting the hierarchy (to find individual records) in specific tables.
Replication: A method used to duplicate (replicate) and distribute data from a primary or master database, out to a number of other copies of the master database. Those copies can be fully dependent slave databases, or even other master databases, capable of passing their own changes back.
Right outer join: A query finding the combination of intersection, plus records in the right-sided table, but not in the left-sided table.
ROLLBACK: This command undoes any database changes not yet committed to the database using the COMMIT command.
SDK: A software development kit is a tool containing a programming language (Java, for example). SDKs are often used to build applications software.
Secondary index: See Alternate index.
SELECT command: A command used to execute a query on a database. A SELECT command contains all the fields to be retrieved from tables. Additionally, a SELECT command can have optional additions used to perform special alterations to queries, such as filtering using a WHERE clause, and sorting using an ORDER BY clause.
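The two rules in the Referential integrity entry above (no foreign key value without a matching primary key, and no deleting a primary key record that is still referenced) can be observed directly. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module, hypothetical Band and Track rows, and SQLite's `PRAGMA foreign_keys`, which has to be switched on because SQLite does not enforce foreign keys by default. Other database engines enforce them without such a setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Band (band_id INTEGER PRIMARY KEY, band_name TEXT);
    CREATE TABLE Track (
        track_id INTEGER PRIMARY KEY,
        band_id  INTEGER NOT NULL REFERENCES Band (band_id)
    );
    INSERT INTO Band VALUES (1, 'Alpha');
    INSERT INTO Track VALUES (10, 1);
""")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A child record pointing at a non-existent parent (band 99) is refused.
orphan_insert_rejected = rejected("INSERT INTO Track VALUES (11, 99)")
# A parent record that still has child records cannot be deleted.
parent_delete_rejected = rejected("DELETE FROM Band WHERE band_id = 1")

print(orphan_insert_rejected, parent_delete_rejected)
```

Deleting the child records first, or declaring the foreign key with a cascading delete, are the usual ways to make the parent deletion succeed.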
Self join: Joins records in a table to the same table. Typically used for a table containing hierarchically structured records, such as a family tree.
Semi-join: Joins two tables using a subquery, but not necessarily returning any field values to the calling query. Semi-joins occur when using IN and EXISTS operators.
Sequence: Allows automated generation of sequences of numbers, usually one after the other, such as 101, 102, 103, and so on. Some database engines call these auto counters.
Simple datatype: A term used to describe the most basic of datatypes, containing a simple value, such as an integer or a string.
Snowflake schema: A data warehouse, single fact table structure, with dimension tables in multiple layered hierarchies of dimensional tables.
Sorted query: See ORDER BY clause.
SQL: See Structured Query Language.
Standby database: A failover database. A standby database has minimal activity, usually only adding new records, changing existing records, and deleting existing records. Some database engines, however, allow standby databases to be utilized as secondary, active database platforms.
Star schema: A single fact table surrounded by a single hierarchical layer of dimensional tables, in a data warehouse database.
Static data: Data that does not change significantly.
Stored function: The same as a stored procedure, except that it returns a single value.
Stored procedure: Also called a database procedure, a chunk of code stored within and executed from within a database, typically on data stored in a database (but not always).
String: A simple datatype containing a sequence of alphanumeric characters.
Structured Query Language (SQL): A non-procedural language that does not allow dependencies between successive commands. SQL is the language used to access data in a relational database.
Generally, for any relational database other than Microsoft SQL-Server, SQL is pronounced "ess-queue-ell" and not "sequel."
Surrogate key: Used as a replacement or substitute for a descriptive primary key, allowing for better control, better structure, less storage space, more efficient indexing, and absolute surety of uniqueness. Surrogate keys are usually integers, and usually automatically generated using auto counters or sequences.
Table: An entity that is divided into fields and records. Fields impose structure and datatype specifics onto each of the field values in a record.
Tertiary index: See Alternate index.
Time dimension: Used for temporal analysis in data warehouses.
Timeline: For a project plan, a plotting of who does what and when. A project plan and timeline are useful for larger projects. Project plans typically include who does what and when. More sophisticated plans integrate multiple tasks, sharing them out among many people, ensuring dependencies are catered for. For example, if task B requires completion of task A, the same person can do both tasks A and B. If there is no dependency, two people can do both tasks A and B at the same time.
Timestamp: A datatype used to store date values, with a time of day attached as well.
Transaction: In SQL, a sequence of one or more commands where changes are not as yet committed permanently to a database. A transaction is completed once changes are committed or undone (rolled back).
Transactional control: A transaction comprises one or more database change commands, which make database changes. A transaction is completed on the execution of a COMMIT or ROLLBACK command, manually or automatically. The concept of transactional control is that SQL allows sets of commands to be permanently stored all at once, or undone all at once.
Transactional data: Data about the day-to-day dynamic activities of a company, such as invoices.
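The Transaction and Transactional control entries above describe sets of commands being stored all at once or undone all at once. A minimal sketch of that behavior follows, assuming Python's built-in sqlite3 module and a hypothetical Invoice table. The connection is opened with `isolation_level=None` so that the module issues no implicit BEGIN and every transaction boundary in the example is explicit.

```python
import sqlite3

# Assumption: in-memory SQLite database; Invoice is a made-up sample table.
# isolation_level=None means transactions are controlled only by the
# BEGIN / COMMIT / ROLLBACK statements written below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Invoice (invoice_id INTEGER PRIMARY KEY, amount REAL)")


def invoice_count():
    """Number of invoice records currently visible in the table."""
    return conn.execute("SELECT COUNT(*) FROM Invoice").fetchone()[0]


# Two changes made inside one transaction, then undone together.
conn.execute("BEGIN")
conn.execute("INSERT INTO Invoice VALUES (1, 99.50)")
conn.execute("INSERT INTO Invoice VALUES (2, 10.00)")
conn.execute("ROLLBACK")
count_after_rollback = invoice_count()

# One change made inside a transaction, then stored permanently.
conn.execute("BEGIN")
conn.execute("INSERT INTO Invoice VALUES (1, 99.50)")
conn.execute("COMMIT")
count_after_commit = invoice_count()

print(count_after_rollback, count_after_commit)
```

After the ROLLBACK neither insert survives; after the COMMIT the single insert is permanent.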
Transitive dependence: Z is transitively dependent on X when X determines Y and Y determines Z. Transitive dependence thus describes that Z is indirectly dependent on X through its relationship with Y.
Trigger: A chunk of code that executes when a specified event occurs, usually before or after an INSERT, UPDATE, or DELETE command.
Trivial multi-valued dependency: A multi-valued dependency with only two fields in the table. (See Multi-valued dependency.)
Truncate: A term implying the removal of characters from a value, typically a number, where no rounding occurs.
Tuple: See Record.
Unique key: A key created on a field containing only unique values throughout an entire table.
UNIX: An operating system that is far more complex and far more difficult to manage than an operating system like Microsoft Windows. UNIX is, however, far more versatile and far more powerful but also much more expensive.
UPDATE: The command used to change data in records in tables.
Update anomaly: An error that occurs when a database allows an incorrect update across a primary and foreign key relationship. A record cannot be updated in a master table unless all sibling records, in all related child tables, are updated first. Note that changes can be propagated to sibling records in child tables, using cascading.
User: See End-user.
User-friendly: Describes a software application (or otherwise) that allows ease of use for the non-computer literate, or end-user population.
Validation check: See Check constraint.
Variable-length records: Every record in a table does not have to be the same byte-length. This allows use of datatypes such as variable-length strings (CHAR VARYING(nn)). Most modern relational database engines use variable-length records.
Variable-length string: A string with 0 or more characters, up to a maximum length of characters.
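To make the Trigger entry above concrete, the sketch below defines a chunk of code that the database runs by itself after an UPDATE command. It is only an illustration under assumptions: Python's built-in sqlite3 module executes the SQL, the trigger syntax shown is SQLite's, and the Stock_Item and Price_Audit tables and the trigger name are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite database; all object names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock_Item (stock_id INTEGER PRIMARY KEY, unit_price REAL);
    CREATE TABLE Price_Audit (stock_id INTEGER, old_price REAL, new_price REAL);

    -- The event is an UPDATE of unit_price; the trigger body runs after it,
    -- once per changed record, and can see the OLD and NEW field values.
    CREATE TRIGGER trg_price_audit
    AFTER UPDATE OF unit_price ON Stock_Item
    BEGIN
        INSERT INTO Price_Audit VALUES (OLD.stock_id, OLD.unit_price, NEW.unit_price);
    END;

    INSERT INTO Stock_Item VALUES (1, 5.0);
    UPDATE Stock_Item SET unit_price = 6.5 WHERE stock_id = 1;
""")

# Nothing inserted into Price_Audit directly: the row comes from the trigger.
audit_rows = conn.execute("SELECT * FROM Price_Audit").fetchall()
print(audit_rows)
```

The application only issued the UPDATE; the audit record was written by the trigger as a side effect of that event.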
View: A logical overlay containing a query, executed whenever the view is accessed. Repeated query execution can make views very inefficient in busy environments.
WHERE clause: An optional part of the SELECT, UPDATE, and DELETE commands. The WHERE clause allows inclusion of wanted records, and filtering out of unwanted records.
Windows Explorer: A Microsoft Windows tool used to view and access files on disk.

Appendix A: Exercise Answers

This appendix contains all the answers to the exercises appearing at the ends of chapters.

Chapter 3

Exercise 1 solution

Two CREATE TABLE commands:

    CREATE TABLE Band
    (
        band_id INTEGER NOT NULL,
        band_name VARCHAR(32) NULL,
        CONSTRAINT XPK_Band PRIMARY KEY (band_id),
        CONSTRAINT XUK_B_Name UNIQUE (band_name)
    );

    CREATE TABLE Track
    (
        track_id INTEGER NOT NULL,
        band_id INTEGER NOT NULL,
        track_name VARCHAR(32) NULL,
        description VARCHAR(256) NULL,
        CONSTRAINT XPK_Track PRIMARY KEY (track_id),
        CONSTRAINT FK_T_Band FOREIGN KEY (band_id) REFERENCES Band,
        CONSTRAINT XUK_T_Name UNIQUE (track_name)
    );

Exercise 2 solution

One CREATE INDEX command:

    CREATE INDEX XFK_T_Band ON Track(band_id);

Chapter 4

Exercise 1 solution

Five CREATE TABLE commands are shown here.
Note that the order in which tables are created is important, as assignment of foreign key columns requires that primary keys in parent tables already exist:

    CREATE TABLE Customer
    (
        customer_name VARCHAR(32) PRIMARY KEY,
        customer_address VARCHAR(256),
        customer_phone VARCHAR(32)
    );

    CREATE TABLE Stock_Source_Department
    (
        stock_source_department VARCHAR(32) PRIMARY KEY,
        stock_source_city VARCHAR(32) NOT NULL
    );

    CREATE TABLE Stock_Item
    (
        stock# INTEGER PRIMARY KEY,
        stock_source_department VARCHAR(32) NOT NULL
            REFERENCES Stock_Source_Department,
        stock_description VARCHAR(256),
        stock_unit_price FLOAT
    );

    CREATE TABLE Sale_Order
    (
        order# INTEGER PRIMARY KEY,
        customer_name VARCHAR(32) REFERENCES Customer,
        dte DATE,
        sales_tax_percentage FLOAT
    );

    CREATE TABLE Sale_Order_Item
    (
        order# INTEGER NOT NULL REFERENCES Sale_Order,
        stock# INTEGER NOT NULL REFERENCES Stock_Item,
        stock_quantity INTEGER,
        -- One record per stock item per order, so the key spans both columns.
        PRIMARY KEY (order#, stock#)
    );

Note how most of the PRIMARY KEY and FOREIGN KEY specifications are included within the specification of the column. These are called inline constraint definitions. This is a different method of definition from that of the exercises in Chapter 3, which uses what are called out-of-line constraint definitions. Inline constraint definitions can only be used for constraints on single columns. For example, a multiple-column primary key, such as the one on Sale_Order_Item, has to be defined as an out-of-line constraint. It is also possible to define primary and foreign key constraints using an ALTER TABLE command to change a table specification, after the table has already been created.

[...]
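The distinction drawn above between inline and out-of-line constraint definitions can be tried out with a small sketch. It rests on several assumptions: Python's built-in sqlite3 module runs the SQL, the columns are renamed order_id and stock_id because SQLite does not accept `#` in an unquoted identifier, and the constraint name XPK_Sale_Order_Item and the sample rows are hypothetical. The ALTER TABLE route mentioned in the text is not shown, since SQLite's ALTER TABLE cannot add key constraints.

```python
import sqlite3

# Assumption: in-memory SQLite database; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Inline: the constraint is written as part of the one column it covers.
    CREATE TABLE Sale_Order (order_id INTEGER PRIMARY KEY);

    -- Out of line: a key spanning two columns cannot be attached to either
    -- column alone, so it is declared separately after the column list.
    CREATE TABLE Sale_Order_Item (
        order_id INTEGER NOT NULL REFERENCES Sale_Order (order_id),
        stock_id INTEGER NOT NULL,
        stock_quantity INTEGER,
        CONSTRAINT XPK_Sale_Order_Item PRIMARY KEY (order_id, stock_id)
    );

    INSERT INTO Sale_Order VALUES (1);
    INSERT INTO Sale_Order_Item VALUES (1, 500, 2);
    INSERT INTO Sale_Order_Item VALUES (1, 501, 1);
""")

# The same order may hold several stock items, but not the same item twice.
try:
    conn.execute("INSERT INTO Sale_Order_Item VALUES (1, 500, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM Sale_Order_Item").fetchone()[0]
print(duplicate_rejected, item_count)
```

The two-column key lets one order carry many items while still refusing a repeated order and stock combination, which a single-column inline key could not express.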
instantaneous reaction to database changes and activities are essential. If you withdraw cash from an ATM at your bank and then check your statement online in an hour or so, you would expect to see the transaction. Similarly, if you purchase something online, you would hope to see the transaction on your credit card account within minutes, if not seconds.

Exercise 2 solution

Very large database is the only correct answer:

❑ Frightening Database Size: Data warehouses can become incredibly large. Administrators and developers have to decide how much detail to retain, when to remove data, when to summarize, and what to summarize. A lot of these decisions are made during production, when the data warehouse is in use. Also, ad-hoc queries can cause serious problems if the database is very large. User education ...

Date posted: 03/07/2014, 01:20
