SQL: The Complete Reference,
Third Edition
Paul Weinberg
James Groff
Andrew Oppel
New York Chicago San Francisco Lisbon London Madrid Mexico City
Milan New Delhi San Juan Seoul Singapore Sydney Toronto
ISBN: 978-0-07-159256-7
MHID: 0-07-159256-3
The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-159255-0, MHID: 0-07-159255-5.
All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.
McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact a representative please e-mail us at bulksales@mcgraw-hill.com.
Information has been obtained by McGraw-Hill from sources believed to be reliable. However, because of the possibility of human or mechanical error by our sources, McGraw-Hill, or others, McGraw-Hill does not guarantee the accuracy, adequacy, or completeness of any information and is not responsible for any errors or omissions or the results obtained from the use of such information.
TERMS OF USE
This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.
…provider of in-memory SQL databases. He led TimesTen from its early days through eight years of growth and a successful acquisition by Oracle in 2005, where he served as a senior vice president, and Oracle TimesTen became Oracle’s flagship real-time database product. Groff was the cofounder, with Paul Weinberg, of Network Innovations Corporation, a developer of SQL-based networking software, and coauthor with him of Understanding UNIX: A Conceptual Guide as well as this book. Groff has also held senior division management and marketing positions at Apple Computer and Hewlett-Packard. He holds a BS in Mathematics from the Massachusetts Institute of Technology and an MBA from Harvard University.
Paul N. Weinberg is a senior vice president at SAP, where he runs core MDM (Master Data Management) development. Prior to working at SAP, Weinberg was president of A2i, Inc., which was acquired by SAP in 2004 for its enterprisewide platform for product content management and catalog publishing. Weinberg was the cofounder, with James Groff, of Network Innovations Corporation, a pioneer in client/server database access that was acquired by Apple Computer in 1988, and coauthor with him of Understanding UNIX: A Conceptual Guide as well as this book. He has also held software development and marketing positions at Bell Laboratories, Hewlett-Packard, and Plexus Computers. In 1981, he collaborated on The Simple Solution to Rubik’s Cube, the number-one best-selling book of that year, with over 6 million copies sold. He holds a BS from the University of Michigan and an MS from Stanford University, both in Computer Science.
Andrew J. (Andy) Oppel is lead data modeler at Blue Shield of California. In addition, he has served as a part-time instructor in database technology with the University of California at Berkeley, Extension for more than 20 years. Andy has designed and implemented hundreds of databases for a wide range of applications, including health care, banking, insurance, apparel manufacturing, telecommunications, wireless communications, and human resources. He is the author of Databases Demystified, SQL Demystified, and Databases: A Beginner’s Guide and is coauthor of SQL: A Beginner’s Guide. He holds a BA in Computer Science from Transylvania University (Lexington, KY).
About the Technical Editor
Aaron Davenport has been working with SQL-based RDBMS technologies for over ten years. He is currently a principal at LCS Technologies, Inc., a Sacramento and San Francisco Bay Area database consulting firm specializing in performance tuning, application development, and database architecture. Prior to joining LCS, Aaron had tenures at Yahoo!, Gap Inc., and Blue Shield of California.
9 Subqueries and Query Expressions
Part III Updating Data
10 Database Updates
Part V Programming with SQL
21 SQL and Data Warehousing
22 SQL and Application Servers
23 SQL Networking and Distributed Databases
High-Level, English-Like Structure
Interactive, Ad Hoc Queries
Programmatic Database Access
Multiple Views of Data
Complete Database Language
Dynamic Data Definition
Client/Server Architecture
Enterprise Application Support
Extensibility and Object Technology
Internet Database Access
3 SQL in Perspective
SQL and the Evolution of Database Management
A Brief History of SQL
The Early Years
Early Relational Products
IBM Products
Commercial Acceptance
SQL Standards
The ANSI/ISO Standards
Other Early SQL Standards
ODBC and the SQL Access Group
JDBC and Application Servers
SQL and Transaction Processing
SQL and Workgroup Databases
SQL, Data Warehousing, and Business Intelligence
SQL and Internet Applications
Summary
4 Relational Databases
Early Data Models
File Management Systems
Hierarchical Databases
Network Databases
The Relational Data Model
The Sample Database
Part II Retrieving Data
The SELECT Statement
The SELECT Clause
The FROM Clause
Query Results
Simple Queries
Calculated Columns
Selecting All Columns (SELECT *)
Duplicate Rows (DISTINCT)
Row Selection (WHERE Clause)
Search Conditions
The Comparison Test (=, <>, <, <=, >, >=)
The Range Test (BETWEEN)
The Set Membership Test (IN)
The Pattern Matching Test (LIKE)
The Null Value Test (IS NULL)
Compound Search Conditions (AND, OR, and NOT)
Sorting Query Results (ORDER BY Clause)
Rules for Single-Table Query Processing
Combining Query Results (UNION)*
Unions and Duplicate Rows*
Unions and Sorting*
Multiple UNIONs*
Summary
7 Multitable Queries (Joins)
A Two-Table Query Example
Simple Joins (Equi-Joins)
Parent/Child Queries
An Alternative Way to Specify Joins
Joins with Row Selection Criteria
Multiple Matching Columns
Natural Joins
Queries with Three or More Tables
Other Equi-Joins
Non-Equi-Joins
SQL Considerations for Multitable Queries
Qualified Column Names
All-Column Selections
Self-Joins
Table Aliases
Multitable Query Performance
The Structure of a Join
Table Multiplication
Rules for Multitable Query Processing
Outer Joins
Left and Right Outer Joins
Older Outer Join Notation*
Joins and the SQL Standard
Inner Joins in Standard SQL
Outer Joins in Standard SQL*
Cross Joins in Standard SQL*
Multitable Joins in Standard SQL
Summary
8 Summary Queries
Column Functions
Computing a Column Total (SUM)
Computing a Column Average (AVG)
Finding Extreme Values (MIN and MAX)
Counting Data Values (COUNT)
Column Functions in the Select List
NULL Values and Column Functions
Duplicate Row Elimination (DISTINCT)
Grouped Queries (GROUP BY Clause)
Multiple Grouping Columns
Restrictions on Grouped Queries
NULL Values in Grouping Columns
Group Search Conditions (HAVING Clause)
Restrictions on Group Search Conditions
NULL Values and Group Search Conditions
HAVING Without GROUP BY
Subquery Search Conditions
The Subquery Comparison Test (=, <>, <, <=, >, >=)
The Set Membership Test (IN)
The Existence Test (EXISTS)
Quantified Tests (ANY and ALL)*
Subqueries and Joins
SQL Queries: A Final Summary
Part III Updating Data
10 Database Updates
Adding Data to the Database
The Single-Row INSERT Statement
The Multirow INSERT Statement
Bulk Load Utilities
Deleting Data from the Database
The DELETE Statement
Deleting All Rows
DELETE with Subquery*
Modifying Data in the Database
The UPDATE Statement
Updating All Rows
UPDATE with Subquery*
Summary
11 Data Integrity
What Is Data Integrity?
Required Data
Simple Validity Checking
Column Check Constraints
Domains
Entity Integrity
Other Uniqueness Constraints
Uniqueness and NULL Values
Referential Integrity
Referential Integrity Problems
Delete and Update Rules*
Cascaded Deletes and Updates*
Referential Cycles*
Foreign Keys and NULL Values*
Advanced Constraint Capabilities
Triggers and Referential Integrity
Trigger Advantages and Disadvantages
Triggers and the SQL Standard
Summary
12 Transaction Processing
What Is a Transaction?
The ANSI/ISO SQL Transaction Model
The START TRANSACTION and SET TRANSACTION Statements
The SAVEPOINT and RELEASE SAVEPOINT Statements
The COMMIT and ROLLBACK Statements
Transactions: Behind the Scenes*
Transactions and Multiuser Processing
The Lost Update Problem
The Uncommitted Data Problem
The Inconsistent Data Problem
The Phantom Insert Problem
Creating a Table (CREATE TABLE)
Removing a Table (DROP TABLE)
Changing a Table Definition (ALTER TABLE)
Constraint Definitions
Assertions
Domains
Aliases and Synonyms (CREATE/DROP ALIAS)
Indexes (CREATE/DROP INDEX)
Managing Other Database Objects
Database Structure
Single-Database Architecture
Multidatabase Architecture
Multilocation Architecture
Databases on Multiple Servers
Database Structure and the ANSI/ISO Standard
View Updates and the ANSI/ISO Standard
View Updates in Commercial SQL Products
Checking View Updates (CHECK OPTION)
Dropping a View (DROP VIEW)
Views and SQL Security
Granting Privileges (GRANT)
Column Privileges
Passing Privileges (GRANT OPTION)
Revoking Privileges (REVOKE)
REVOKE and the GRANT OPTION
REVOKE and the ANSI/ISO Standard
Role-Based Security
Summary
16 The System Catalog
What Is the System Catalog?
The Catalog and Query Tools
The Catalog and the ANSI/ISO Standard
The SQL Information Schema
Other Catalog Information
Developing an Embedded SQL Program
Running an Embedded SQL Program
Simple Embedded SQL Statements
Declaring Tables
Error Handling
Using Host Variables
Data Retrieval in Embedded SQL
Single-Row Queries
Multirow Queries
Cursor-Based Deletes and Updates
Cursors and Transaction Processing
Summary
18 Dynamic SQL*
Limitations of Static SQL
Dynamic SQL Concepts
Dynamic Statement Execution (EXECUTE IMMEDIATE)
Two-Step Dynamic Execution
The PREPARE Statement
The EXECUTE Statement
Dynamic Queries
The DESCRIBE Statement
The DECLARE CURSOR Statement
The Dynamic OPEN Statement
The Dynamic FETCH Statement
The Dynamic CLOSE Statement
Dynamic SQL Dialects
Dynamic SQL in Oracle*
Dynamic SQL and the SQL Standard
Basic Dynamic SQL Statements
The Standard SQLDA
The SQL Standard and Dynamic SQL Queries
Summary
19 SQL APIs
API Concepts
The dblib API (SQL Server)
Basic SQL Server Techniques
SQL Server Queries
Positioned Updates
Dynamic Queries
ODBC and the SQL/CLI Standard
The Call-Level Interface Standardization
CLI Structures
CLI Statement Processing
CLI Errors and Diagnostic Information
CLI Attributes
CLI Information Calls
The ODBC API
The Structure of ODBC
ODBC and DBMS Independence
ODBC Catalog Functions
Extended ODBC Capabilities
The Oracle Call Interface (OCI)
Java Database Connectivity (JDBC)
JDBC History and Versions
JDBC Implementations and Driver Types
Using Stored Procedures
Creating a Stored Procedure
Calling a Stored Procedure
Stored Procedure Variables
Handling Error Conditions
Advantages of Stored Procedures
Stored Procedure Performance
System-Defined Stored Procedures
External Stored Procedures
Stored Procedures, Functions, Triggers, and the SQL Standard
The SQL/PSM Stored Procedures Standard
The SQL/PSM Triggers Standard
Summary
21 SQL and Data Warehousing
Data Warehousing Concepts
Components of a Data Warehouse
The Evolution of Data Warehousing
Database Architecture for Warehousing
22 SQL and Application Servers
SQL and Web Sites: Early Implementations
Application Servers and Three-Tier Web Site Architectures
Database Access from Application Servers
EJB Types
Session Bean Database Access
Entity Bean Database Access
EJB 2.0 Enhancements
EJB 3.0 Enhancements
Open Source Application Development
Application Server Caching
Summary
23 SQL Networking and Distributed Databases
The Challenge of Distributed Data Management
Distributing Data: Practical Approaches
Remote Database Access
Remote Data Transparency
Table Extracts
Table Replication
Updateable Replicas
Replication Trade-Offs
Typical Replication Architectures
Distributed Database Access
Remote Requests
Remote Transactions
Distributed Transactions
Distributed Requests
The Two-Phase Commit Protocol*
Network Applications and Database Architecture
Client/Server Applications and Database Architecture
Client/Server Applications with Stored Procedures
Enterprise Applications and Data Caching
High-Volume Internet Data Management
Summary
24 SQL and Objects
Object-Oriented Databases
Object-Oriented Database Characteristics
Pros and Cons of Object-Oriented Databases
Objects and the Database Market
Object-Relational Databases
Large Object Support
LOBs in the Relational Model
Specialized LOB Processing
Abstract (Structured) Data Types
Defining Abstract Data Types
Manipulating Abstract Data Types
Inheritance
Table Inheritance: Implementing Object Classes
Sets, Arrays, and Collections
Defining Collections
Querying Collection Data
Manipulating Collection Data
Collections and Stored Procedures
User-Defined Data Types
Methods and Stored Procedures
Object Support in the SQL Standard
XML and Metadata
Document Type Definitions (DTDs)
XML Schema
XML and Queries
XQuery Concepts
Query Processing in XQuery
XML Databases
Summary
26 Specialty Databases
Very Low Latency and In-Memory Databases
Anatomy of an In-Memory Database
In-Memory Database Implementations
Caching with In-Memory Databases
Complex Event-Processing and Stream Databases
Continuous Queries in Stream Databases
Stream Database Implementations
Stream Database Components
Embedded Databases
Embedded Database Characteristics
Embedded Database Implementations
Mobile Databases
Mobile Database Roles
Mobile Database Implementations
Summary
27 The Future of SQL
Database Market Trends
Enterprise Database Market Maturity
Market Diversity and Segmentation
Packaged Enterprise Applications
Software-as-a-Service (SaaS)
Hardware Performance Gains
Database Server Appliances
SQL Standardization
SQL in the Next Decade
Distributed Databases
Massive Data Warehousing for Business Optimization
Ultrahigh-Performance Databases
Internet and Network Services Integration
Embedded Databases
Object Integration
Cloud-Based and Horizontally Scalable Databases
Summary
Part VII Appendixes
A The Sample Database
B DBMS Vendor Profiles
C SQL Syntax Reference
Data Definition Statements
Access Control Statements
Basic Data Manipulation Statements
Transaction-Processing Statements
Cursor-Based Statements
Query Expressions
Search Conditions
Expressions
Statement Elements
Simple Elements
Index
Acknowledgments
Special thanks to Andy Oppel, our new coauthor for this third edition of SQL: The Complete Reference. His impressive high-level mastery of the subject matter coupled with his meticulous attention to detail made this a better book, and we are fortunate to have had his involvement.
—Jim and Paul
It’s an honor to join such an accomplished team of authors for this edition of SQL: The Complete Reference. My thanks to the entire McGraw-Hill team for their tireless support in this effort. In particular I wish to thank technical editor Aaron Davenport and copy editor Jan Jue for their persistence and attention to detail, which contributed so much to the overall quality of this book.
—Andy
SQL: The Complete Reference, Third Edition provides a comprehensive, in-depth treatment of the SQL language for both technical and nontechnical users, programmers, data processing professionals, and managers who want to understand the impact of SQL in today’s computer industry. This book offers a conceptual framework for understanding and using SQL, describes the history of SQL and SQL standards, and explains the role of SQL in various computer industry segments, from enterprise data processing to data warehousing to web site architectures. This new edition contains new chapters specially focused on the role of SQL in application server architectures, and the integration of SQL with XML and other object-based technologies.
This book will show you, step-by-step, how to use SQL features, with many illustrations and realistic examples to clarify SQL concepts. The book also compares SQL products from leading DBMS vendors, describing their advantages, benefits, and trade-offs, to help you select the right product for your application. Most of the examples in this book are based on the sample database described in Appendix A. The sample database contains data that supports a simple order-processing application for a small distribution company. Appendix A also contains instructions for downloading the SQL statements required to create and populate the sample database tables in a DBMS of your choice, such as Oracle, SQL Server, MySQL, and DB2. This allows you to try the examples in the book yourself and gain actual experience writing and running SQL statements.
In some of the chapters, the subject matter is explored at two different levels: a fundamental description of the topic, and an advanced discussion intended for computer professionals who need to understand some of the internals behind SQL. The more advanced information is covered in sections marked with an asterisk (*). You do not need to read these sections to obtain an understanding of what SQL is and what it does.
How This Book Is Organized
The book is divided into six parts that cover various aspects of the SQL language:
• Part I, “An Overview of SQL,” provides an introduction to SQL and a market perspective of its role as a database language. Its four chapters describe the history of SQL, the evolution of SQL standards, and how SQL relates to the relational data model and to earlier database technologies. Part I also contains a quick tour of SQL that briefly illustrates its most important features and provides you with an overview of the entire language early in the book.
• Part II, “Retrieving Data,” describes the features of SQL that allow you to perform database queries. The first chapter in this part describes the basic structure of the SQL language. The next four chapters start with the simplest SQL queries and progressively build to more complex queries, including multitable queries, summary queries, and queries that use subqueries.
• Part III, “Updating Data,” shows how you can use SQL to add new data to a database, delete data from a database, and modify existing database data. It also describes the database integrity issues that arise when data is updated, and how SQL addresses these issues. The last of the three chapters in this part discusses the SQL transaction concept and SQL support for multiuser transaction processing.
• Part IV, “Database Structure,” deals with creating and administering a SQL-based database. Its four chapters tell you how to create the tables, views, and indexes that form the structure of a relational database. It also describes the SQL security scheme that prevents unauthorized access to data, and the SQL system catalog that describes the structure of a database. This part also discusses the significant differences between the database structures supported by various SQL-based DBMS products.
• Part V, “Programming with SQL,” describes how application programs use SQL for database access. It discusses the embedded SQL specified by the ANSI standard and used by IBM, Oracle, Ingres, Informix, and many other SQL-based DBMS products. It also describes the dynamic SQL interface that is used to build general-purpose database tools, such as report writers and database browsing programs. Finally, this part describes the popular SQL APIs, including ODBC, the ISO-standard Call-Level Interface, and JDBC, the standard call-level interface for Java, as well as proprietary call-level interfaces such as Oracle’s OCI API.
• Part VI, “SQL Today and Tomorrow,” examines the use of SQL in several of today’s “hottest” application areas, and the current state of SQL-based DBMS products. Two chapters describe the use of SQL stored procedures and triggers for online transaction processing, and the contrasting use of SQL for data warehousing. Four additional chapters describe SQL-based distributed databases, the influence of object technologies on SQL, specialty databases, and the integration of SQL with XML technologies. Finally, the last chapter explores the future of SQL and some of the most important trends in SQL-based data management.
Conventions Used in This Book
SQL: The Complete Reference, Third Edition describes the SQL features and functions available in the most popular SQL-based DBMS products and those described in the ANSI/ISO SQL standards. Whenever possible, the SQL statement syntax described in this book and used in the examples applies to all dialects of SQL. When the dialects differ, the differences are pointed out in the text, and the examples follow the most common practice. In these cases, you may have to modify the SQL statements in the examples slightly to suit your particular brand of DBMS.
Throughout the book, technical terms appear in italics the first time they are used and defined. SQL language elements, including SQL keywords, table and column names, and sample SQL statements, appear in an UPPERCASE MONOSPACE font. SQL API function names appear in a lowercase monospace font. Program listings also appear in monospace font and use the normal case conventions for the particular programming language (uppercase for COBOL and FORTRAN, lowercase for C and Java). Note that these conventions are used solely to improve readability; most SQL implementations will accept either uppercase or lowercase statements. Many of the SQL examples include query results, which appear immediately following the SQL statement, as they would in an interactive SQL session. In some cases, long query results are truncated after a few rows; this is indicated by a vertical ellipsis (…) following the last row of query results.
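The remark that most implementations accept either uppercase or lowercase statements can be checked with a small sketch. It uses Python's standard sqlite3 module as one convenient SQL implementation; the OFFICES table, its columns, and its two rows are hypothetical and exist only for this illustration, and other DBMS products may treat the case of identifiers differently.

```python
import sqlite3

# In-memory SQLite database standing in for "a SQL implementation".
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for this sketch.
conn.execute("CREATE TABLE OFFICES (CITY TEXT, REGION TEXT)")
conn.executemany(
    "INSERT INTO OFFICES (CITY, REGION) VALUES (?, ?)",
    [("Denver", "Western"), ("New York", "Eastern")],
)

# The same query written in the book's uppercase convention and in lowercase.
upper_rows = conn.execute("SELECT CITY FROM OFFICES ORDER BY CITY").fetchall()
lower_rows = conn.execute("select city from offices order by city").fetchall()

# SQLite treats keywords and unquoted names case-insensitively,
# so both statements return the same rows.
print(upper_rows == lower_rows)
conn.close()
```

Only the case of keywords and names is at issue here; string data stored in the table keeps whatever case it was entered with.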
Why This Book Is for You
SQL: The Complete Reference, Third Edition is the right book for anyone who wants to understand and learn SQL, including database users, data processing professionals and architects, programmers, students, and managers. It describes what SQL is, why it is important, and how you use it, in simple, understandable language liberally illustrated with figures and examples. This book is not specific to one particular brand or dialect of SQL. Rather, it describes the standard, central core of the SQL language and then goes on to describe the differences among the most popular SQL products, including Oracle, Microsoft SQL Server, IBM’s DB2 Universal Database and Informix, Sybase, and MySQL. It also explains the importance of SQL-based standards, such as ODBC and JDBC, and the ANSI/ISO standards for SQL and SQL-related technologies. This third edition contains new chapters and sections that cover the latest SQL innovations, in the areas of object-relational technologies, XML, and application server architectures.
If you are new to SQL, this book offers comprehensive, step-by-step treatment of the language, building from simple queries to more advanced concepts. The structure of the book will allow you to quickly start using SQL, but the book will continue to be valuable as you begin to use the more complex features of the language. You can create the sample database using an SQL script available on the McGraw-Hill website (see Appendix A) and use it to try out the examples and build your SQL skills.
If you are a data processing professional, architect, or manager, this book will give you a perspective on the impact that SQL is having across the information technology industry, from personal computers to mainframes to data warehousing to Internet web sites and Internet-based distributed applications. The early chapters describe the history of SQL, its role in the market, and its evolution from earlier database technologies. Later chapters describe the future of SQL and the development of new database technologies, such as distributed databases, object-oriented extensions to SQL, business intelligence databases, and database/XML integration.
If you are a programmer, this book offers a very complete treatment of programming with SQL. Unlike the reference manuals of many DBMS products, it offers a conceptual framework for SQL programming, explaining the why as well as the how of developing a SQL-based application. It contrasts the SQL programming interfaces offered by all of the leading SQL products, including embedded SQL, dynamic SQL, ODBC, JDBC, and proprietary APIs such as the Oracle Call Interface. The description and comparison of programming techniques provides a perspective not found in any other book.
If you are selecting a DBMS product, this book offers a comparison of the SQL features, advantages, and benefits offered by the various DBMS vendors. The differences between the leading DBMS products are explained, not only in technical terms, but also in terms of their impact on applications and their evolving competitive position in the marketplace. The “sample database” can be used to try these features in a prototype of your own application.
In short, both technical and nontechnical users can benefit from this book. It is the most comprehensive source of information available about the SQL language, SQL features and benefits, popular SQL-based products, the history of SQL, and the impact of SQL on the future direction of the information technology industry.
PART I An Overview of SQL
The first four chapters of this book provide a perspective and a quick introduction to SQL. Chapter 1 describes what SQL is and explains its major features and benefits. In Chapter 2, a quick tour of SQL shows you many of its capabilities with simple, rapid-fire examples. Chapter 3 offers a market perspective of SQL by tracing its history, describing the SQL standards and the major vendors of SQL-based products, and by identifying the reasons for SQL’s prominence today. Chapter 4 describes the relational data model upon which SQL is based and compares it with earlier data models.
CHAPTER 1 Introduction
CHAPTER 2 A Quick Tour of SQL
CHAPTER 3 SQL in Perspective
CHAPTER 4 Relational Databases
Trang 321 Introduction
The SQL language and relational database systems based on it constitute one of the most important foundation technologies in the computer industry Over the last three decades, SQL has grown from its first commercial use into a computer product and services market segment worth tens of billions of dollars per year, and SQL stands today as
the standard computer database language Hundreds of database products now support SQL, running on computer systems from mainframes to personal computers A SQL-based database may even be embedded in your mobile phone or PDA, or in the entertainment system of your car An official international SQL standard has been adopted and expanded several times Every major enterprise software product relies on SQL for its data management, and SQL is at the core of the flagship database products from Microsoft, Oracle, and IBM, three of the largest software companies in the world SQL is also at the heart of open-source database products such as MySQL and Postgres that are helping to fuel the popularity of Linux and the open source movement From its obscure beginnings as an IBM research project, SQL has grown to become both an important piece of information technology and a powerful market force
What, exactly, is SQL? Why is it important? What can it do, and how does it work? If SQL is really a standard, why do we have so many different versions and dialects? How do popular SQL products like SQL Server, Oracle, MySQL, Sybase, and DB2 compare? How does SQL relate to Microsoft standards such as ODBC and NET? How does JDBC link SQL to the world of Java and object technology? What role does it play in the Service-Oriented Architecture (SOA) and web services being embraced by enterprise IT organizations? Does SQL really scale from mainframes to handheld devices? Has it really delivered the
performance needed for high-volume transaction processing? How will SQL impact the way you use computers, and how can you get the most out of this important data management tool? This book answers those questions by giving you a complete perspective and a solid working knowledge of SQL
The SQL Language
SQL is a tool for organizing, managing, and retrieving data stored by a computer database. The original name given it by IBM was Structured English Query Language, shortened to the acronym SEQUEL. When IBM discovered that SEQUEL was a trademark owned by the Hawker Siddeley Aircraft Company of the United Kingdom, they shortened the acronym to SQL. The word “English” was then dropped from the spelled-out name to match the new acronym. To this day, you will hear the acronym SQL pronounced as either a word (“sequel”) or as a string of letters (“S-Q-L”), and while the latter is generally preferred, both are considered correct. As the name implies, SQL is a computer language that you use to interact with a database. In fact, SQL works with one specific type of database, called a relational database, which has become the mainstream way to organize data across a very broad range of computer applications.
Figure 1-1 shows how SQL works. The computer system in the figure has a database that stores important information. If the computer system is in a business, the database might store inventory, production, sales, or payroll data. On a personal computer, the database might store data about the checks you have written, lists of people and their phone numbers, or data extracted from a larger computer system. The computer program that controls the database is called a database management system (DBMS).
When you need to retrieve data from a database, you use SQL to make the request. The DBMS processes the SQL request, retrieves the requested data, and returns it to you. This process of requesting data from a database and receiving the results is called a database query, hence the name Structured Query Language.
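The request-and-response cycle just described can be sketched in a few lines. This is only an illustration, not a reference implementation: it assumes Python's built-in sqlite3 module as a stand-in for the DBMS, and the offices table, its columns, and the sales figures are hypothetical values made up for the example.

```python
import sqlite3

# A hypothetical in-memory database standing in for the DBMS in Figure 1-1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offices (city TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO offices (city, sales) VALUES (?, ?)",
    [("Denver", 186042.0), ("New York", 692637.0), ("Chicago", 735042.0)],
)

# The SQL request: it describes the data wanted, not how to locate it.
query = "SELECT city, sales FROM offices WHERE sales > 500000 ORDER BY city"

# The DBMS processes the request and returns the matching rows.
rows = conn.execute(query).fetchall()
for city, sales in rows:
    print(city, sales)
```

The program never opens a file or scans records itself. It hands the SQL text to the DBMS and receives rows back, which is the whole of the query process.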
FIGURE 1-1 Using SQL for database access
“Structured Query Language” is actually somewhat of a misnomer. First of all, SQL is far more than a query tool, although that was its original purpose, and retrieving data is still one of its most important functions. SQL is used to control all of the functions that a DBMS provides for its users, including:
• Data definition: SQL lets a user define the structure and organization of the stored data and relationships among the stored data items.
• Data retrieval: SQL allows a user or an application program to retrieve stored data from the database and use it.
• Data manipulation: SQL allows a user or an application program to update the database by adding new data, removing old data, and modifying previously stored data.
• Access control: SQL can be used to restrict a user’s ability to retrieve, add, and modify data, protecting stored data against unauthorized access.
• Data sharing: SQL is used to coordinate data sharing by concurrent users, ensuring that changes made by one user do not inadvertently wipe out changes made at nearly the same time by another user.
• Data integrity: SQL defines integrity constraints in the database, protecting it from corruption due to inconsistent updates or system failures.
SQL is thus a comprehensive language for controlling and interacting with a database management system.
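Several of the functions listed above can be seen side by side in a short sketch. It assumes Python's built-in sqlite3 module as the DBMS, and the products table and its contents are hypothetical. Access control is not shown because SQLite does not implement the GRANT and REVOKE statements, and data sharing among concurrent users would need more than one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the stored data,
# including an integrity constraint on the quantity column.
conn.execute(
    """
    CREATE TABLE products (
        product_id  TEXT PRIMARY KEY,
        description TEXT NOT NULL,
        qty_on_hand INTEGER NOT NULL CHECK (qty_on_hand >= 0)
    )
    """
)

# Data manipulation: add new rows, modify a row, and remove a row.
conn.execute("INSERT INTO products VALUES ('A100', 'Widget', 25)")
conn.execute("INSERT INTO products VALUES ('B200', 'Bracket', 0)")
conn.execute(
    "UPDATE products SET qty_on_hand = qty_on_hand - 5 WHERE product_id = 'A100'"
)
conn.execute("DELETE FROM products WHERE product_id = 'B200'")
conn.commit()

# Data retrieval: read back what is stored.
remaining = conn.execute(
    "SELECT product_id, qty_on_hand FROM products"
).fetchall()

# Data integrity: the CHECK constraint rejects an inconsistent update.
try:
    conn.execute("UPDATE products SET qty_on_hand = -1 WHERE product_id = 'A100'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()  # discard the failed change

print(remaining, rejected)
```

All of these operations are expressed in the same language and sent through the same connection, which is the point of the list: one language covers definition, manipulation, retrieval, and integrity.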
Second, SQL is not really a complete computer language like COBOL, C, C++, or Java. Instead, SQL is a database sublanguage, consisting of about 40 statements specialized for database management tasks. These SQL statements can be embedded into another language such as COBOL or C to extend that language for use in database access. Alternatively, the statements can be explicitly sent to a database management system for processing, via a call-level interface from a language such as C, C++, or Java, or via messages sent over a network.
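As a rough illustration of the call-level approach, the sketch below passes SQL text to the DBMS through function calls in a host language. It assumes Python's built-in sqlite3 module rather than the C, C++, or Java interfaces named above, and the orders table and the orders_over helper are hypothetical names invented for the example.

```python
import sqlite3


def orders_over(conn, minimum):
    """Send one SQL statement to the DBMS through the call-level API."""
    # The statement text is ordinary SQL. The ? marker is bound to a
    # host-language value at the moment the statement is executed.
    cursor = conn.execute(
        "SELECT order_num, amount FROM orders WHERE amount > ? ORDER BY order_num",
        (minimum,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_num INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 150.0), (2, 3200.0), (3, 870.0)],
)

big_orders = orders_over(conn, 500.0)
print(big_orders)
```

The host language supplies the control flow, variables, and output handling that SQL lacks, while SQL supplies the database access. This division of labor is what the term sublanguage refers to.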
Finally, SQL is not a particularly structured language, especially when compared with highly structured languages such as C, Pascal, or Java. Instead, SQL statements resemble English sentences, complete with “noise words” that don’t add to the meaning of the statement but make it read more naturally. SQL has quite a few inconsistencies and also some special rules to prevent you from constructing SQL statements that look perfectly legal but that don’t make sense.
Despite the inaccuracy of its name, SQL has emerged as the standard language for using relational databases. SQL is both a powerful language and one that is relatively easy to learn. The quick tour of SQL in Chapter 2 will give you a good overview of the language and its capabilities.
The Role of SQL
SQL is not itself a database management system, nor is it a stand-alone product. You cannot go to a computer retailer or a web site selling computer software and buy SQL. Instead, SQL is an integral part of a database management system, a language and a tool for communicating with the DBMS. Figure 1-2 shows some of the components of a typical DBMS and how SQL links them together.
FIGURE 1-2 Components of a typical database management system
The database engine is the heart of the DBMS, responsible for actually structuring, storing, and retrieving the data in the database. It accepts SQL requests from other DBMS components (such as a forms facility, report writer, or interactive query facility), from user-written application programs, and even from other computer systems. As the figure shows, SQL plays many different roles:
• SQL is an interactive query language. Users type SQL commands into an interactive SQL program to retrieve data and display it on the screen, providing a convenient, easy-to-use tool for ad hoc database queries.
• SQL is a database programming language. Programmers embed SQL commands into their application programs to access the data in a database. Both user-written programs and database utility programs (such as report writers and data entry tools) use this technique for database access.
• SQL is a database administration language. The database administrator responsible for managing a minicomputer or mainframe database uses SQL to define the database structure and to control access to the stored data.
• SQL is a client/server language. Personal computer programs use SQL to communicate over a network with database servers that store shared data. This client/server architecture is used by many popular enterprise-class applications.
• SQL is an Internet data access language. Internet web servers that interact with corporate data and Internet application servers all use SQL as a standard language for accessing corporate databases, often by embedding SQL database access within popular scripting languages like PHP or Perl.
• SQL is a distributed database language. Distributed database management systems use SQL to help distribute data across many connected computer systems. The DBMS software on each system uses SQL to communicate with the other systems, sending requests for data access.
• SQL is a database gateway language. In a computer network with a mix of different DBMS products, SQL is often used in a gateway that allows one brand of DBMS to communicate with another brand.
SQL has thus emerged as a useful, powerful tool for linking people, computer programs, and computer systems to the data stored in a relational database.
SQL Success Factors
In historical terms, SQL has been an extraordinarily successful information technology. Think about the computer market in the mid-1980s, when SQL first started to become important. Mainframes and minicomputers dominated corporate computing. The IBM personal computer had been introduced only a few years before, and the MS-DOS command line was its user interface. IBM’s mainframe operating systems and minicomputer operating systems from Digital Equipment, Data General, Hewlett-Packard, and others dominated business computing. Proprietary networking schemes like IBM’s SNA or Digital Equipment’s DECnet linked computers together. The Internet was still a tool for collaboration among research labs, and the World Wide Web had not yet appeared on the scene. COBOL, C, and Pascal were dominant computer languages; object-oriented programming was only beginning to emerge; and Java had not been invented.
Across all of these areas of computer technology, from computer hardware to operating systems to networking to languages, the key technologies of the mid-1980s have faded or become obsolete, replaced by significant new ones. But in the world of data management, the relational database and SQL continue to dominate the landscape. They have expanded over the years to support new hardware, operating systems, networks, and languages, but despite many attempts to dethrone them, the core relational model and SQL have thrived and remain the dominant forces in data management. Here are some of the major features and market forces that have contributed to this success over the past 25 years:
• Vendor independence
• Portability across computer systems
• Official SQL standards
• Early IBM commitment
• Microsoft support
• Relational foundation
• High-level, English-like structure
• Interactive, ad hoc queries
• Programmatic database access
• Multiple views of data
• Complete database language
• Dynamic data definition
• Client/Server architecture
• Enterprise application support
• Extensibility and object technology
• Internet database access
• Java integration (JDBC)
• Open source support
• Industry infrastructure
The sections that follow briefly describe each of these and how they contributed to SQL’s success.
Vendor Independence
SQL is offered by all of the leading DBMS vendors, and no new database product over the last decade has been highly successful without SQL support. A SQL-based database and the programs that use it can be moved from one DBMS to another vendor’s DBMS with minimal conversion effort and little retraining of personnel. Database tools such as query tools, report writers, and application generators work with many different brands of SQL databases. The vendor independence thus provided by SQL was one of the most important reasons for its early popularity and remains an important feature today.
Portability Across Computer Systems
SQL-based database products run on computer systems ranging from mainframes and midrange systems to personal computers, workstations, a wide range of specialized server computers, and even handheld devices. They operate on stand-alone computer systems, in departmental local area networks, and in enterprisewide or Internetwide networks. SQL-based applications that begin on single-user or departmental server systems can be moved to larger server systems as they grow. Data from corporate SQL-based databases can be extracted and downloaded into departmental or personal databases. Finally, economical personal computers can be used to test a prototype of a SQL-based database application before moving it to an expensive multiuser system.
Early IBM Commitment
SQL was originally invented by IBM researchers and fairly quickly became a strategic product for IBM based on its flagship DB2 database. SQL support is available on all major IBM product families, from personal computers through midrange systems and UNIX-based servers to IBM mainframes. IBM’s initial work provided a clear signal of IBM’s direction for other database and system vendors to follow early in the development of SQL and relational databases. Later, IBM’s commitment and broad support speeded the market acceptance of SQL. In the 1970s, IBM was the dominant force in business computing, so its early and sustained support as the inventor and champion of SQL ensured its early importance.
Microsoft Support
Microsoft has long considered database access a key part of its Windows personal computer software architecture. Both desktop and server versions of Windows provide standardized relational database access through Open Database Connectivity (ODBC), a SQL-based call-level API (application programming interface). Leading Windows software applications (spreadsheets, word processors, databases, etc.) from Microsoft and other vendors support ODBC, and all leading SQL databases provide ODBC access. Microsoft has enhanced ODBC support with higher-level, more object-oriented database access layers over the years, including data management support in .NET today. But these new technologies could always interact with relational databases through the ODBC/SQL layers below. When Microsoft began its effort in the late 1980s to make Windows a viable server operating system, it introduced SQL Server as its own SQL-based offering. SQL Server continues today as a flagship Microsoft product and as a key component of the Microsoft .NET architecture for web services.
Relational Foundation
SQL is a language for relational databases, and it has become popular along with the relational database model. The tabular, row/column structure of a relational database is intuitive to users, keeping SQL simple and easy to understand. The relational model also has a strong theoretical foundation that has guided the evolution and implementation of relational databases. Riding a wave of acceptance brought about by the success of the relational model, SQL has become the database language for relational databases.
High-Level, English-Like Structure
SQL statements look like simple English sentences, making SQL relatively easy to learn and understand. This is in part because SQL statements describe the data to be retrieved, rather than specifying how to find the data. Tables and columns in a SQL database can have long, descriptive names. As a result, most SQL statements “say what they mean” and can be read as clear, natural sentences.
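The difference between describing the data and specifying how to find it is easiest to see when the two are placed next to each other. The sketch below is an illustration under stated assumptions: it uses Python's built-in sqlite3 module, and the salesreps table, its columns, and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesreps (name TEXT, region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO salesreps VALUES (?, ?, ?)",
    [
        ("Ann", "East", 410.0),
        ("Bob", "West", 120.0),
        ("Cho", "East", 95.0),
        ("Dee", "East", 300.0),
    ],
)

# Declarative: one statement that says which rows are wanted.
declarative = conn.execute(
    "SELECT name FROM salesreps "
    "WHERE region = 'East' AND sales > 200 "
    "ORDER BY name"
).fetchall()

# Procedural: the program itself walks every row, tests it, and sorts.
procedural = []
for name, region, sales in conn.execute(
    "SELECT name, region, sales FROM salesreps"
):
    if region == "East" and sales > 200:
        procedural.append((name,))
procedural.sort()

print(declarative == procedural)
```

Both approaches produce the same rows. The SQL statement reads close to the English request ("select the names of sales reps where the region is East and sales exceed 200"), while the loop spells out each step of the search.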
Interactive, Ad Hoc Queries
SQL is an interactive query language that gives users ad hoc access to stored data. Using SQL interactively, a user can get answers even to complex questions in minutes or seconds, in sharp contrast to the days or weeks it would take for a programmer to write a custom report program. Because of SQL’s ad hoc query power, data is more accessible and can be used to help an organization make better, more informed decisions. SQL’s ad hoc query capability was an important advantage over nonrelational databases early in its evolution and more recently has continued as a key advantage over pure object-based databases.
Programmatic Database Access
SQL is also a database language used by programmers to write applications that access a database. The same SQL statements are used for both interactive and programmatic access, so the database access parts of a program can be tested first with interactive SQL and then embedded into the program. In contrast, nonrelational or object-oriented databases provided one set of tools for programmatic access and a separate query facility for ad hoc requests, without any synergy between the two modes of access.
Multiple Views of Data
Using SQL, the creator of a database can give different users of the database different views of its structure and contents. For example, the database can be constructed so that each user sees data only for his or her department or sales region. In addition, data from several different parts of the database can be combined and presented to the user as a simple row/column table. SQL views can thus be used to enhance the security of a database and to tailor it to the particular needs of individual users while preserving the fundamental row/column structure of the data.
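A view of this kind might look like the following sketch. It assumes Python's built-in sqlite3 module, and the offices and salesreps tables, the eastern_reps view, and all of the data are hypothetical. The view both restricts the rows to one sales region and combines two tables into a single row/column result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offices (office INTEGER PRIMARY KEY, city TEXT, region TEXT)"
)
conn.execute("CREATE TABLE salesreps (name TEXT, office INTEGER, sales REAL)")
conn.executemany(
    "INSERT INTO offices VALUES (?, ?, ?)",
    [(11, "New York", "Eastern"), (21, "Los Angeles", "Western")],
)
conn.executemany(
    "INSERT INTO salesreps VALUES (?, ?, ?)",
    [("Ann", 11, 410.0), ("Bob", 21, 120.0), ("Dee", 11, 300.0)],
)

# The view joins the two tables and exposes only Eastern region rows.
conn.execute(
    """
    CREATE VIEW eastern_reps AS
    SELECT s.name, o.city, s.sales
    FROM salesreps AS s
    JOIN offices AS o ON s.office = o.office
    WHERE o.region = 'Eastern'
    """
)

# A user of the view queries it exactly like an ordinary table.
eastern_rows = conn.execute(
    "SELECT name, city, sales FROM eastern_reps ORDER BY name"
).fetchall()
print(eastern_rows)
```

In SQLite the view only shapes what is presented. In a DBMS with access control, the security benefit comes from granting users permission on the view while withholding it on the underlying tables.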
Complete Database Language
SQL was first developed as an ad hoc query language, but its powers now go far beyond data retrieval. SQL provides a complete, consistent language for creating a database, managing its security, updating its contents, retrieving data, and sharing data among many concurrent users. SQL concepts that are learned in one part of the language can be applied to other SQL commands, making users more productive.
Dynamic Data Definition
Using SQL, the structure of a database can be changed and expanded dynamically, even while users are accessing database contents. This is a major advance over static data definition languages, which prevented access to the database while its structure was being changed. SQL thus provides maximum flexibility, allowing a database to adapt to changing requirements while online applications continue uninterrupted.
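A small sketch can show the structure of a table changing after data has been stored in it. It assumes Python's built-in sqlite3 module, and the customers table, the credit_limit column, and the rows are hypothetical. The example uses a single connection, so it demonstrates only that existing data survives the change, not concurrent access by other users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (cust_num INTEGER PRIMARY KEY, company TEXT)"
)
conn.execute("INSERT INTO customers VALUES (2101, 'Jones Mfg.')")

# Change the table definition in place; the existing row is kept and
# picks up the default value for the new column.
conn.execute("ALTER TABLE customers ADD COLUMN credit_limit REAL DEFAULT 0.0")

# New rows can use the added column immediately.
conn.execute("INSERT INTO customers VALUES (2102, 'First Corp.', 65000.0)")

customer_rows = conn.execute(
    "SELECT cust_num, company, credit_limit FROM customers ORDER BY cust_num"
).fetchall()
print(customer_rows)
```

No unload, redefine, and reload cycle is needed. The definition change is just another SQL statement issued against the live database.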
function as front-ends to network servers or to larger minicomputer and mainframe databases, providing access to corporate data from personal computer applications.
Enterprise Application Support
The largest enterprise applications that support the daily operation of large companies and organizations all use SQL-based databases to store and organize their data. In the 1990s, driven by the impending deadline for supporting dates in the year 2000 and beyond (the so-called “Y2K” problem), large enterprises moved en masse to abandon their homegrown systems and convert to packaged enterprise applications from vendors like SAP, Oracle, PeopleSoft, Siebel, and others. The data processed by these applications (orders, sales amounts, customers, inventory levels, payment amounts, etc.) tends to have a structured, records-and-fields format, which converts easily into the row/column format of SQL. By constructing their applications to use enterprise-class SQL databases, the major application vendors eliminated the need to develop their own data management software and benefited from existing tools and programming skills. Because every major enterprise application requires a SQL-based database for its operation, new sales of enterprise applications automatically generate “drag-along” demand for new copies of database software.
Extensibility and Object Technology
The major challenge to SQL’s continued dominance as a database standard has come from the emergence of object-based programming through languages such as Java and C++, and from the introduction of object-based databases as an extension of the broad market trend toward object-based technology. SQL-based database vendors have responded to this challenge by slowly expanding and enhancing SQL to include object features. These “object/relational” databases, which continue to be based on SQL, have emerged as a more popular alternative to “pure object” databases and have perpetuated SQL’s dominance through the last decade. The newest wave of object technology, embodied in the XML standard and web services architectures, once again created a crop of “XML databases” and alternative query languages to challenge SQL in the early 2000s. But once again, the major vendors of SQL-based databases responded by adding XML-based extensions, meeting the challenge and securing SQL’s continuing importance. History suggests that this “extend and integrate” approach will be successful in warding off new challenges in the future as well.
Internet Database Access
With the exploding popularity of the Internet and the World Wide Web, and their standards-based foundation, SQL found a new role in the late 1990s as an Internet data access standard. Early in the development of the Web, developers needed a way to retrieve and present database information on web pages and used SQL as a common language for database gateways. More recently, the emergence of three-tiered Internet architectures with distinct thin client, application server, and database server layers has established SQL as the standard link between the application and database tiers. The role of SQL in multitier