SQL and Relational Theory
How to Write Accurate SQL Code
SECOND EDITION
C. J. Date

SQL and Relational Theory: How to Write Accurate SQL Code (2nd edition)
by C. J. Date

Copyright © 2012 C. J. Date. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.

Printing History:
    January 2009: First Edition.
    December 2011: Second Edition.

Revision History:
    2011-12-08 First release

See http://oreilly.com/catalog/errata.csp?isbn=9781449316402 for release details.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. SQL and Relational Theory: How to Write Accurate SQL Code and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

ISBN: 978-1-449-31640-2
[LSI]

Those who are enamored of practice without theory are like a pilot who goes into a ship without rudder or compass and never has any certainty where he is going. Practice should always be based upon a sound knowledge of theory.
—Leonardo da Vinci (1452–1519)

The trouble with people is not that they don’t know but that they know so much that ain’t so.
—Josh Billings (1818–1885)

Languages die; mathematical ideas do not.
—G. H. Hardy (1877–1947)

Unfortunately, the gap between theory and practice is not as wide in theory as it is in practice.
—Anon.

These are my principles. If you don’t like them, I have others.
—Groucho Marx (1890–1977)

There is no royal road to geometry.
—Euclid (c. 365–275 BCE), attrib.

To all those who think an exercise like this one is worthwhile, and in particular to the memory of Lex de Haan, who is very much missed

About the Author

C. J. Date is an independent author, lecturer, researcher, and consultant, specializing in relational database technology. He is best known for his book An Introduction to Database Systems, 8th edition (Addison-Wesley, 2004), which has sold some 850,000 copies at the time of writing and is used by several hundred colleges and universities worldwide. He is also the author of many other books on database management, including most recently:

    From Addison-Wesley: Databases, Types, and the Relational Model: The Third Manifesto, 3rd edition (coauthored with Hugh Darwen, 2006)
    From Apress: Date on Database: Writings 2000–2006 (2006)
    From Trafford: Logic and Databases: The Roots of Relational Theory (2007)
    From Apress: The Relational Database Dictionary, Extended Edition (2008)
    From Trafford: Database Explorations: Essays on The Third Manifesto and Related Topics (coauthored with Hugh Darwen, 2010)
    From Ventus: Go Faster! The TransRelational™ Approach to DBMS Implementation (2011)

Another book, Normal Forms and All That Jazz: A Database Professional’s Guide to Database Design Theory (a companion to the present book), is also due for publication in the near future.

Mr. Date was inducted into the Computing Industry Hall of Fame in 2004.
He enjoys a reputation that is second to none for his ability to explain complex technical subjects in a clear and understandable fashion.

Contents

Preface to the First Edition
Preface to the Second Edition

Chapter 1  Setting the Scene
    The relational model is much misunderstood
    Some remarks on terminology
    Principles not products
    A review of the original model
    Model vs. implementation
    Properties of relations
    Base vs. derived relations
    Relations vs. relvars
    Values vs. variables
    Concluding remarks
    Exercises

Chapter 2  Types and Domains
    Types and relations
    Equality comparisons
    Data value atomicity
    What’s a type?
    Scalar vs. nonscalar types
    Scalar types in SQL
    Type checking and coercion in SQL
    Collations in SQL
    Row and table types in SQL
    Concluding remarks
    Exercises

Chapter 3  Tuples and Relations, Rows and Tables
    What’s a tuple?
    Rows in SQL
    What’s a relation?
    Relations and their bodies
    Relations are n-dimensional
    Relational comparisons
    TABLE_DUM and TABLE_DEE
    Tables in SQL
    Column naming in SQL
    Concluding remarks
    Exercises

Chapter 4  No Duplicates, No Nulls
    What’s wrong with duplicates?
    Duplicates: further issues
    Avoiding duplicates in SQL
    What’s wrong with nulls?
    Avoiding nulls in SQL
    A remark on outer join
    Concluding remarks
    Exercises

Chapter 5  Base Relvars, Base Tables
    Updating is set level
    Relational assignment
    More on candidate keys
    More on foreign keys
    Relvars and predicates
    Relations vs. types
    Exercises

Chapter 6  SQL and Relational Algebra I: The Original Operators
    Some preliminaries
    More on closure
    Restriction
    Projection
    Join
    Union, intersection, and difference
    Which operators are primitive?
    Formulating expressions one step at a time
    What do relational expressions mean?
    Evaluating SQL table expressions
    Expression transformation
    The reliance on attribute names
    Exercises

Chapter 7  SQL and Relational Algebra II: Additional Operators
    Exclusive union
    Semijoin and semidifference
    Extend
    Image relations
    Divide
    Aggregate operators
    Image relations bis
    Summarization
    Summarization bis
    Group, ungroup, and relation valued attributes
    “What if” queries
    A note on recursion
    What about ORDER BY?
    Exercises

Chapter 8  SQL and Constraints
    Type constraints
    Type constraints in SQL
    Database constraints
    Database constraints in SQL
    Transactions
    Why database constraint checking must be immediate
    But doesn’t some checking have to be deferred?
    Constraints and predicates
    Miscellaneous issues
    Exercises

Chapter 9  SQL and Views
    Views are relvars
    Views and predicates
    Retrieval operations
    Views and constraints
    Update operations
    What are views for?
    Views and snapshots
    Exercises

Chapter 10  SQL and Logic
    Why do we need logic?
    Simple and compound propositions
    Simple and compound predicates
    Quantification
    Relational calculus
    More on quantification
    Some equivalences
    Concluding remarks
    Exercises

Chapter 11  Using Logic to Formulate SQL Expressions
    Some transformation laws
    Example 1: Logical implication
    Example 2: Universal quantification
    Example 3: Implication and universal quantification
    Example 4: Correlated subqueries
    Example 5: Naming subexpressions
    Example 6: More on naming subexpressions
    Example 7: Dealing with ambiguity
    Example 8: Using COUNT
    Example 9: Join queries
    Example 10: UNIQUE quantification
    Example 11: ALL or ANY comparisons
    Example 12: GROUP BY and HAVING
    Exercises

Chapter 12  Miscellaneous SQL Topics
    SELECT *
    Explicit tables
    Name qualification
    Range variables
    Subqueries
    “Possibly nondeterministic” expressions
    Empty sets
    A simplified BNF grammar
    Exercises

Appendix A  The Relational Model
    The relational model vs. others
    The significance of theory
    The relational model defined
    Database variables
    Objectives of the relational model
    Some database principles
    What remains to be done?

Appendix B  SQL Departures from the Relational Model

Appendix C  A Relational Approach to Missing Information
    Vertical decomposition
    Horizontal decomposition
    What do the shaded entries mean?
    Constraints

[...]

... do, no more and no less—you must follow some appropriate discipline. And it’s the thesis of this book that using SQL relationally is the discipline you need. But what does this mean? Isn’t SQL relational anyway?
Well, it’s true that SQL is the standard language for use with relational databases—but that fact in itself doesn’t make it relational. The sad truth is, SQL departs from relational theory in all...

... rows and nulls are two obvious examples, but they’re not the only ones. As a consequence, the language gives you rope to hang yourself with, as it were. So if you don’t want to hang yourself, you need to understand relational theory (what it is and why); you need to know about SQL’s departures from that theory; and you need to know how to avoid the problems they can cause. In a word, you need to use SQL relationally...

... database practitioner and therefore reasonably familiar with SQL already. To be specific, I assume you have a working knowledge of either the SQL standard or (perhaps more likely in practice) at least one SQL product. However, I don’t assume you have a deep knowledge of relational theory as such (though I do hope you understand that the relational model is a good thing in general, and adherence to it wherever...

... about SQL; but (and I apologize for the possibly offensive tone here) if your knowledge of the relational model derives only from your knowledge of SQL, then I’m afraid you won’t know the relational model as well as you should, and you’ll probably know “some things that ain’t so.” I can’t say it too strongly: SQL and the relational model aren’t the same thing. Here by way of illustration are some relational...

... or editions, over the years. The version current at the time of writing is SQL:2008 (a formal reference for which can be found in Appendix G); the previous version was SQL:2003, the one before that was SQL:1999, and the one before that was SQL:1992. Most of the SQL features discussed in this book were present in SQL:1992, and often in even earlier versions.

3 These particular limitations were added in SQL:2003;...
without using nulls); examples, exercises, and answers have been expanded and improved in various respects; and the treatment of SQL has been upgraded to cover recent changes to the SQL standard. A variety of corrections and numerous cosmetic improvements have also been made.2 (In particular, the Tutorial D examples—Tutorial D being the language I use to illustrate relational concepts—have been upgraded...

... principles of relational theory in a way not tainted by the quirks and peculiarities of existing products, commercial practice, or the SQL standard. I wrote this book to fill that need. My intended audience is thus experienced database practitioners who are honest enough to admit they don’t understand the theory underlying their own field as well as they might, or should. That theory is, of course, the relational...

... opinion, SQL is such a difficult language that it can be far from obvious how to use it without violating relational principles. I therefore decided to expand the original book to include explicit, concrete advice on exactly that issue (how to use SQL relationally, I mean). So my aim in the present book is still the same as before—I want to help database practitioners understand relational theory in depth and...

... of tuples, and sets in mathematics don’t contain duplicate elements. Now, SQL fails here, as I’m sure you know: SQL tables are allowed to contain duplicate rows and thus aren’t relations, in general. Please understand, therefore, that throughout this book I always use the term “relation” to mean a relation (without duplicate tuples, by definition) and not an SQL table. Please understand too that relational...
this list by new ones

theoretical, it can’t be practical. But the truth is that theory (at least, relational theory, which is what I’m talking about here) is most definitely very practical indeed. The purpose of that theory is not just theory for its own sake; the purpose of that theory is to allow us to build systems...