
O'Reilly: SQL and Relational Theory, Feb 2009, ISBN 0596523068


DOCUMENT INFORMATION

Basic information

Page count 769
File size 4.03 MB

Contents

SQL and Relational Theory, 1st Edition
by C.J. Date
Publisher: O'Reilly Media, Inc.
Pub Date: February 5, 2009
Print ISBN-13: 978-0-596-52306-0
Pages: 432

Overview

Understanding SQL's underlying theory is the best way to guarantee that your SQL code is correct and your database schema is robust and maintainable. On the other hand, if you're not well versed in the theory, you can fall into several traps. In SQL and Relational Theory, author C.J. Date demonstrates how you can apply relational theory directly to your use of SQL. With numerous examples and clear explanations of the reasoning behind them, you'll learn how to deal with common SQL dilemmas, such as:

Should database access be granted through views instead of base tables?
Nulls in your database are causing you to get wrong answers. Why? What can you do about it?
Could you write an SQL query to find employees who have never been in the same department for more than six months at a time?
SQL supports "quantified comparisons," but they're better avoided. Why? How do you avoid them?
Constraints are crucially important, but most SQL products don't support them properly. What can you do to resolve this situation?
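The overview's question about nulls and wrong answers can be made concrete with a small sketch. The example below is not taken from the book: it uses Python's standard sqlite3 module, and the tables `employee` and `closed_dept` and their contents are hypothetical. It shows one well-known case, a NOT IN subquery whose result contains a null. Under SQL's three-valued logic the comparison against the null evaluates to UNKNOWN rather than TRUE, so the row for Bob is silently dropped.

```python
import sqlite3

# Hypothetical tables, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT);
    CREATE TABLE closed_dept (dept TEXT);
    INSERT INTO employee VALUES ('Ann', 'D1'), ('Bob', 'D2');
    INSERT INTO closed_dept VALUES ('D1'), (NULL);
""")

# "Employees whose department is not in the closed list."
query = """
    SELECT name FROM employee
    WHERE dept NOT IN (SELECT dept FROM closed_dept)
    ORDER BY name
"""

# With a null in closed_dept, 'D2' NOT IN ('D1', NULL) is UNKNOWN,
# so Bob does not qualify even though D2 is not listed as closed.
with_null = [row[0] for row in conn.execute(query)]

# Remove the null and run the same query again.
conn.execute("DELETE FROM closed_dept WHERE dept IS NULL")
without_null = [row[0] for row in conn.execute(query)]

print(with_null)     # []
print(without_null)  # ['Bob']
```

The query text is identical in both runs; only the presence of a single null in the subquery's table changes the answer.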
Database theory and practice have evolved since Edgar Codd originally defined the relational model back in 1969. Independent of any SQL products, SQL and Relational Theory draws on decades of research to present the most up-to-date treatment of the material available anywhere. Anyone with a modest to advanced background in SQL will benefit from the many insights in this book.

Copyright

Copyright © 2009, O'Reilly Media. All rights reserved.

Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. SQL and Relational Theory and related trade dress are trademarks of O'Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

To all those who think an exercise like this one is worthwhile, and in particular to the memory of Lex de Haan, who is very much missed

Those who are enamored of practice without theory are like a pilot who goes into a ship without rudder or compass and never has any certainty where he is going. Practice should always be based upon a sound knowledge of theory.
Leonardo da Vinci (1452–1519)

The trouble with people is not that they don't know but that they know so much that ain't so.
Josh Billings (1818–1885)

Languages die… mathematical ideas do not.
G. H. Hardy

Unfortunately, the gap between theory and practice is not as wide in theory as it is in practice.
Anonymous

Preface

SQL is ubiquitous. But SQL is hard to use: It's complicated, confusing, and error prone (much more so, I venture to suggest, than its apologists would have you believe). In order to have any hope of writing SQL code that you can be sure is accurate, therefore (meaning it does exactly what it's supposed to do, no more and no less), you must follow some appropriate discipline, and it's the thesis of this book that using SQL relationally is the discipline you need. But what does this mean? Isn't SQL relational anyway?

Well, it's true that SQL is the standard language for use with relational databases, but that fact in itself doesn't make it relational. The sad truth is, SQL departs from relational theory in all too many ways; duplicate rows and nulls are two obvious examples, but they're not the only ones. As a consequence, it gives you rope to hang yourself with, as it were. So if you don't want to hang yourself, you need to understand relational theory (what it is and why); you need to know about SQL's departures from that theory; and you need to know how to avoid the problems they can cause. In a word, you need to use SQL relationally. Then you can behave as if SQL truly were relational, and you can enjoy the benefits of working with what is, in effect, a truly relational system.

Now, a book like this wouldn't be needed if everyone was using SQL relationally already, but they aren't. On the contrary, I observe much bad practice in current SQL usage. I even observe such practice being recommended, in textbooks and similar publications, by writers who really ought to know better (no names, no pack drill); in fact, a review of the literature in this regard is a pretty dispiriting exercise. The relational model first saw the light of day in 1969, and yet here we are, almost 40 years on, and it still doesn't seem to be very well understood by the database community at large. Partly for such reasons, this book uses the relational model itself as an organizing principle; it explains various features of the model in depth, and shows in every case how best to use SQL to implement the feature in question.

P.1 Prerequisites

I assume you're a database practitioner and therefore reasonably familiar with SQL already. To be specific, I assume you have a working knowledge of either the SQL standard or (perhaps more likely in practice) at least one SQL product. However, I don't assume you have a deep knowledge of relational theory as such (though I do hope you understand that the relational model is a good thing in general, and adhering to it wherever possible is a desirable goal). In order to avoid misunderstandings, therefore, I'll be describing various features of the relational model in detail, as well as showing how to use SQL to conform to those features. But what I won't do is attempt to justify all of those features; rather, I'll assume you're sufficiently experienced in database matters to understand why, e.g., the notion of a key makes sense, or why you sometimes need to do a join, or why many-to-many relationships need to be supported. (If I were to include such justifications, this would be a very different book; quite apart from anything else, it would be much bigger than it already is. In any case, that book has already been written.)

I've said I expect you to be reasonably familiar with SQL. However, I should add that I'll be explaining certain aspects of SQL in detail anyway, especially aspects that might be encountered less frequently in practice. (The SQL notion of "possibly nondeterministic expressions" is a case in point here. See Chapter 12.)
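The preface names duplicate rows as one of SQL's departures from relational theory. As a rough illustration (again a sketch using Python's standard sqlite3 module, with a hypothetical `supplier` table that is not taken from the book), projecting a table onto a non-key column in SQL can return the same row more than once. A relation is a set and so cannot contain duplicates, and adding DISTINCT is one common way to get the set-valued result.

```python
import sqlite3

# Hypothetical table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sno TEXT PRIMARY KEY, city TEXT NOT NULL);
    INSERT INTO supplier VALUES
        ('S1', 'London'), ('S2', 'Paris'), ('S3', 'Paris');
""")

# A plain SELECT on a non-key column keeps duplicate rows:
# the result is a bag (multiset), not a set.
with_dups = sorted(r[0] for r in conn.execute("SELECT city FROM supplier"))

# SELECT DISTINCT eliminates the duplicates, giving a set of rows.
as_set = sorted(r[0] for r in conn.execute("SELECT DISTINCT city FROM supplier"))

print(with_dups)  # ['London', 'Paris', 'Paris']
print(as_set)     # ['London', 'Paris']
```

The results are sorted in Python only to make the printed output predictable; SQL itself guarantees no row order without ORDER BY.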
P.2 Database in Depth

This book is based on, and intended to replace, an earlier one with the title Database in Depth: Relational Theory for Practitioners (O'Reilly, 2005). My aim in that earlier book was as follows (this is a quote from the preface):

After many years working in the database community in various capacities, I've come to realize there's a real need for a book for practitioners (not novices) that explains the basic principles of relational theory in a way not tainted by the quirks and peculiarities of existing products, commercial practice, or the SQL standard. I wrote this book to fill that need. My intended audience is thus experienced database practitioners who are honest enough to admit they don't understand the theory underlying their own field as well as they might, or should. That theory is, of course, the relational model, and while it's true that the fundamental ideas of that theory are all quite simple, it's also true that they're widely misrepresented, or underappreciated, or both. Often, in fact, they don't seem to be understood at all. For example, here are a few relational questions[1]… How many of them can you answer?

What exactly is first normal form?
What's the connection between relations and predicates?
What's semantic optimization?
What's an image relation?
Why is semidifference important?
Why doesn't deferred integrity checking make sense?
What's a relation variable?
What's prenex normal form?
Can a relation have an attribute whose values are relations?
Is SQL relationally complete?
Why is The Information Principle important?
How does XML fit with the relational model?
This book provides answers to these and many related questions. Overall, it's meant to help database practitioners understand relational theory in depth and make good use of that understanding in their professional day-to-day activities.

[1] For reasons that aren't important here, I've replaced a few of the questions in this list by new ones.

As the final sentence in this extract indicates, it was my hope that readers of that book would be able to apply its ideas for themselves, without further assistance from me as it were. But I've since come to realize that, contrary to popular opinion, SQL is such a difficult language that it can be far from obvious how to use it without violating relational principles. I therefore decided to expand the original book to include explicit, concrete advice on exactly that issue (how to use SQL relationally, I mean). So my aim in the present book is still the same as before (i.e., I want to help database practitioners understand relational theory in depth and make good use of that understanding in their professional activities), but I've tried to make the material a little easier to digest, perhaps, and certainly easier to apply. In other words, I've included a great deal of SQL-specific material (and it's this fact, more than anything else, that accounts for the increase in size over the previous version).

P.3 Further Remarks on the Text

I need to take care of several further preliminaries. First of all, my own understanding of the relational model has evolved over the years, and continues to do so. This book represents my very latest thinking on the subject; thus, if you detect any technical discrepancies (and there are a few) between this book and other books you might have seen by myself (including in particular the one this book is meant to replace), the present book should be taken as superseding. Though I hasten to add that such discrepancies are mostly of a fairly minor nature; what's more, I've taken care always to relate new terms and concepts to earlier ones, wherever I felt it was necessary to do so.

Second, I will, as advertised, be talking about theory, but it's an article of faith with me that Theory is practical. I mention this point explicitly because so many people seem to believe the opposite: namely, that if something's theoretical, it can't be practical. But the truth is that theory (at least, relational theory, which is what I'm talking about here) is most definitely very practical indeed. The purpose of that theory is not just theory for its own sake; the purpose of that theory is to allow us to build systems that are 100 percent practical. Every detail of the theory is there for solid practical reasons. As one reviewer of the earlier book, Stéphane Faroult, wrote: "When you have a bit of practice, you realize there's no way to avoid having to know the theory." What's more, that theory is not only practical, it's fundamental, straightforward, simple, useful, and it can be fun (as I hope to demonstrate in the course of this book).

Of course, we really don't have to look any further than the relational model itself to find the most striking possible illustration of the foregoing thesis. In fact, it really shouldn't be necessary to have to defend the notion that theory is practical, in a context such as ours: namely, a multibillion dollar industry totally founded on one great theoretical idea. But I suppose the cynic's position would be "Yes, but what has theory done for me lately?" In other words, those of us who do think theory is important must be continually justifying ourselves to our critics, which is another reason why I think a book like this one is needed.

Third, as I've said, the book does go into a fair amount of detail regarding features of SQL or the relational model or both. (It deliberately has little to say on topics that aren't particularly relational; for example, there isn't much on transactions.)
Throughout, I've tried to make it clear when the discussions apply to SQL specifically, when they apply to the relational model specifically, and when they apply to both. I should emphasize, however, that the SQL discussions in particular aren't meant to be exhaustive. SQL is such a complex language, and provides so many different ways of doing the same thing, and is subject to so many exceptions and special cases, that to be exhaustive (even if it were possible, which I tend to doubt) would be counterproductive; certainly it would make the book much too long. So I've tried to focus on what I think are the most important issues, and I've tried to be as brief as possible on the issues I've chosen to cover. And I'd like to claim that if you do everything I tell you, and don't do anything I don't tell you, then to a first approximation you'll be safe: You'll be using SQL relationally. But whether that claim is justified, or to what extent it is, must be for you to judge.

To the foregoing I have to add that, unfortunately, there are some situations in which SQL just can't be used relationally. For example, some SQL integrity checking simply has to be deferred (usually to commit time), even though the relational model rejects such checking as logically flawed. The book does offer advice on what to do in such cases, but I fear it often boils down to just Do the best you can. At least I hope you'll understand the risks involved in departing from the model.

I should say too that some of the recommendations offered ...

Posted: 26/03/2019, 17:07