Joe Celko’s SQL for Smarties - Advanced SQL Programming P83 doc

INDEX

  defined, 449
  for numeric values, 449
  with scalar subqueries, 454
  for strings, 449
  for temporal data types, 449
Median, 512–27
  Celko’s first, 514–16
  Celko’s second, 517–19
  Celko’s third, 522–26
  as central tendency measure, 513
  with characteristic function, 520–22
  Date’s first, 513–14
  Date’s second, 516
  defined, 512
  defining, 523
  financial, 521
  Henderson’s, 526–27
  Murchison’s, 516–17
  statistical, 512, 520
  Vaughan’s, with VIEWs, 519–20
  See also Statistics
MERGE statement, 232–34
  correlation name, 233
  syntax, 233
Metaphone, 177–81
  defined, 177
  Pascal version, 177–81
  See also Phonetic matching
MIN() function, 349, 449–50
  defined, 449
  for numeric values, 449
  for strings, 450
  for temporal data types, 449–50
Minimum subsets, 620
Missing tables, 187
Missing times in contiguous events, 652–56
  end date, 653, 655
  start date, 653, 655
  See also Time(s)
Missing values, 187–90
  in columns, 187–89
  context and, 189–90
  multiple, 199
  See also Values
Modes, 510–12
  changes, 511
  defined, 510
  derived tables for, 512
  multiple, 510
MOD() function, 114–15
  computation, 605
  odd/even determination, 472
Modifications
  audit log, 164
  bitemporal tables, 165–66
  current, 146–50
  nonsequenced, 155
  sequenced, 150–55
Moreno, Francisco, 364
Multiple aggregation levels, 431–35
  CASE expressions for, 434–35
  grouped VIEWs for, 432
  intent, 431
  subquery expressions for, 433–34
Multiple column data elements, 201–9
  currency conversion, 205–6
  distance functions, 201–2
  IP address storage, 202–5
  rational numbers, 209
  Social Security numbers, 206–9
Multiple criteria extrema functions, 460–62
  forms, 461
  ordering, 461
Multiple parameter auxiliary tables, 488–89
Multiple translation auxiliary tables, 487–88
Multivalued dependencies (MVDs), 75, 76
Multivariable descriptive statistics, 546–48
  covariance, 546–47
  NULLs in, 548
  Pearson’s r, 547
Murchison’s median, 516–17

N
Named columns, 576–79
Names
  column, 5
  SQL, 5–6
  table, 5
  use guidelines, 5–6
NaN (Not a Number), 103
NATURAL JOINs, 327
Natural keys, 89
NCHAR() data type, 169
Negative values, 114
Nested EXISTs, 410
Nested parenthesis, 565
Nested queries, 745–46
Nested set model, 631–39
  acyclic directed graphs, loading, 682
  adjacency list conversion to, 637–39
  containment property, 634–35
  converting, to adjacency list model, 635
  converting adjacency list model to, 637–39
  counting property, 633–34
  defined, 631
  deleting nodes and subtrees, 636–37
  hierarchical aggregations, 636
  results, 631, 632
  self-JOIN query, 634
  subordinates, 635
  See also Hierarchies
Nested sets, 458
Nested UNIQUE constraints, 18–22
  defined, 18
  example, 18–22
Nested VIEWs, 370, 377–79
  drawback, 378
  restrictions, 377–78
  See also VIEWs
Nesting
  aggregate functions, 431, 433, 434
  subqueries, 454
  VIEWs, 370
Net Present Value (NPV), 494, 497
Nodes
  all in graph, viewing, 682–83
  children, 623–24
  defined, 623
  deleting, 630–31, 636–37
  depth, 630
  descendents, 630
  duplicate, 695
  edges, 684
  finding, 629–30
  indegree, 684–85
  inserting, 628
  internal, 686
  isolated, 685–86
  leaf, 623, 625, 626
  outdegree, 684–85
  pairs, 693
  reachable, 683–84
  root, 629
  sink, 685
  source, 685
  splitting, 682
  total number, 691
  See also Graphs; Trees
Nonacyclic graphs, 703–4
Nonsequenced queries, 139, 144, 162, 163
Nonsequence modifications, 155
Nonsubversion Rule, 63
Normal forms, 64–87
  1NF, 64–69
  2NF, 70–71
  3NF, 71–72
  4NF, 75–76
  5NF, 76–78
  BCNF, 73–75
  defined, 64
  DKNF, 78–87
  EKNF, 72–73
Normalization, 61–99
  denormalization, 91–93
  key types, 88–99
  normal forms, 64–87
  practical hints, 87–88
NOT DETERMINISTIC option, 607
Not equal (<>) operator, 235
NOT EXISTS() predicate, 291–92, 559
  outer joins and, 303–4
  subquery expression, 435
  TRUE return, 302
NOT IN() predicate, 291–92, 294
NOT NULL constraint, 11–12
NULLIF() function, 110–11, 193, 251–52, 473
NULLs, 185–200
  arithmetic and, 109–10
  avoiding, 87
  BETWEEN predicate results, 274
  comparing, 190
  concept, 109
  converting values to/from, 110–13
  cosine of, 193
  in date fields, 196
  dates, 655
  design advice for, 195–98
  encoding schemes and, 196
  as eternity marker, 673
  EXISTS predicate and, 300–302
  FOREIGN KEYS and, 195
  functions and, 193–94
  general-purpose, 187, 346
  as global, 110
  groups and, 427
  host languages and, 194–95
  in host programs, 197–98
  IN() predicate and, 293–95
  INTERSECT/EXCEPT with, 600–601
  INTERSECT/EXCEPT without, 599–600
  introduction, 185–87
  logic and, 190–93
  math and, 193
  multiple values, 198–200
  in multivariable descriptive statistics, 548
  “Not Applicable,” 188
  not using, 186–87
  ORDER BY clause and, 329–33
  OUTER JOINs and, 342–44
  PRIMARY KEY columns and, 195
  propagation, 110
  quantities and, 196
  row comparisons and, 240
  rules, 186
  sources, 242
  in subquery predicates, 191–93
  values, treatment of, 62
Number generators, 45
Numbering regions, 551–52
Numbers
  approximate, 102
  converting to words, 117–18
  exact, 102
  lists, condensing, 567
  lists, folding, 567–68
  ordinal, 554
  rational, 209
  row, 606
  sequence, filling in, 560–62
  sequence, mapping to cycle, 481–83
  Social Security, 206–9
  summation, 444
Number theory operators, 113–16
Numeric data, 101–18
  MAX() function for, 449
  MIN() function for, 449
NUMERIC numeric type, 102
Numeric types, 101–6
  BIGINT, 102
  conversion, 105–7
  DECIMAL, 102
  INTEGER, 102
  NUMERIC, 102
  SMALLINT, 102
Numeric values
  approximate, 102–3
  exact, 102
NVARCHAR() data type, 169
NYSIIS algorithm, 181–82

O
OCTET_LENGTH() function, 173
ON clause
  join conditions and, 354
  OUTER JOINs and, 340
  search predicate in, 340–41
One-level SELECT statement, 317–24
  defined, 317–18
  execution order, 318–20
  FROM clause, 321, 324
  GROUP BY clause, 319, 323
  HAVING clause, 319
  ORDER BY clause, 328–36
  SELECT clause, 319–20
  starting tables, 320–21
  syntax, 318, 326–28
  WHERE clause, 318, 323
  See also SELECT statement
One-to-many relationships, 375
One True Lookup Table (OTLT), 491–93
  data type choices, 493
  defined, 491
  See also Lookup auxiliary tables
Online Analytic Processing (OLAP), 709–18
  CUBES function, 713–14
  defined, 709
  DENSE_RANK function, 711
  enterprise-wide dimensional layer, 717–18
  example, 716–17
  functionality, 711–18
  functions, specifying, 711
  GROUPING operators, 712–14
  languages, 710
  RANK function, 711
  ROLLUP function, 713, 716
  ROW_NUMBER function, 711–12
  Star Schema, 710–11
  window clause, 714–16
Online Transaction Processing (OLTP), 709
  data warehousing and, 709–10
  speed, 710
OPEN statement, 55
Optimistic concurrency control, 727–29
Optimization, 731–60
Optimizers
  <> comparison and, 736
  cost-based, 731
  defined, 731
  “hot spots,” 756
  JOIN orderings and, 740
  JOIN pairs, 739
  knowing, 754–56
  rule-based, 731
  types, 731
ORDER BY clause, 328–36
  CASE expression and, 333–36
  cursor and, 328
  execution expense, 437
  NULLs and, 329–33
  rules, applying, 331
  SELECT statement and, 328
  syntax, 328
Ordering
  default, 438
  multiple criteria, 461
  predicates, 460
  strings, 171
  subset, 437
Ordinal numbers, 554
ORed predicates, 292–93
OR function, 474–75
ORM (Object Role Model), 78
Outdegree, 684–85
OUTER JOINs, 336–51
  aggregate functions and, 348–49
  crosstabs by, 543–44
  execution order, 341
  FULL, 337, 349–50, 351
  functioning of, 337–38
  LEFT, 337, 343, 351
  multiple, 346–48
  NATURAL, 344–45
  NOT EXISTS predicate and, 303–4
  NULL result in, 349
  NULLs and, 342–44
  OLAP functions and, 342
  ON clause and, 340
  operators, 347
  as query within SELECT clause, 341
  RIGHT, 337, 351
  searched, 344–45
  self, 345–46
  syntax, 337–42
  table reconstruction from, 343
  universal use, 336
  WHERE clause and, 350–51
  See also JOINs
Overlapping keys, 22–25, 34
OVERLAPS predicate, 275–85, 667
  avoiding, 285
  defined, 273
  end point interval and, 278
  result, 276
  rules, 276
  time periods and, 275–85

P
Packing joins, 358–59
Pairs
  duration, 672–73
  grouping into, 436–37
  linear regression with, 548
  node, 693
Parallel processing, 2
Parsing lists, 68–69
Partitions, 401–23
  coverings and, 401–6
  by functions, 403–4
  by ranges, 402–3
  by sequences, 404–6
Path enumeration model, 628–31
  defined, 628
  deleting nodes/subtrees, 630–31
  finding levels/subordinates, 630
  finding subtrees/nodes, 629–30
  integrity constraints, 631
  See also Trees
Paths, 686–95
  cost, 692
  with CTE, 697–705
  eliminating, 694
  endpoints, 683
  finding, 686
  by iteration, 688–90
  least cost, 690
  lengths, 687, 692
  listing, 691–95
  shortest, 687–88
  shortest, without recursion, 689–90
  steps, 693
  tables holding, 691
  See also Graphs
Patterns
  % in front of, 263
  special symbols, 267–68
  tricks with, 262–64
  See also LIKE predicate
Pearson’s r, 547
Period of applicability (PA), 150
Period of validity (PV), 150
Persistent tables, 3
Personal calendars, 643–45
Pessimistic concurrency control, 726–27
Phonetic matching, 175–82
  Metaphone, 177–81
  NYSIIS algorithm, 181–82
  Soundex, 176–77
  Soundex functions, 175–76
Physical addresses, 39
Physical Data Independence Rule, 63
Physical grouping, 716
Pointer structures, 382–83
Points inside polygons, 706–7
Polygons
  convex, 706
  defined, 706
  points inside, 706–7
POSITION() function, 173, 259, 540
POWER() function, 116, 565
PRD() function, 468–73
  DISTINCT option, 469
  by expressions, 469–70
  by logarithms, 470–73
Preallocated values, 44–45
Predicates, 17
  ALL, 312, 313–14
  ANY, 312, 313
  BETWEEN, 240, 273–75
  in CHECK() constraint, 254
  CONTAINS, 613
  dummy, 736
  EXISTS, 216, 288, 299–308
  IN, 192, 287–97, 742–44
  IS [NOT] NORMALIZED, 244–45
  LIKE, 261–71
  NOT IN, 192, 291–92
  ordering, 460
  ORed, 292–93
  OVERLAPS, 275–85, 667
  quantified, 309–15
  SIMILAR TO, 267–69
  subquery, NULLs in, 191–93
  UNIQUE, 314–15
  valued, 241–45
  WHERE, 144, 212–16
PRIMARY KEY constraint, 14, 738, 741
  compound, 20
  defined, 14
Primary keys
  choosing, 22
  fundamental requirement, 62
  uniqueness, 751
Procedural loops, 326
Procedures, 53
Pseudo-random number generators, 609–10
  defined, 609
  linear congruence, 609–10

Q
QNaN (Quiet NaN), 103
Quantified predicates, 309–15
Quantifiers
  defined, 309
  EXISTS() predicate and, 304–5
  forms, 309
  as logical quantity, 309
  missing data and, 311–13
Queries
  ad hoc, 742
  audit log, 160–64
  bound, 557
  current, 144
  date arithmetic, 128–29
  derived tables inside, 370
  extra join information, 738–40
  GROUP BY, 252–53
  JOIN, 290
  leaf nodes, 625
  nested, 745–46
  nonsequenced, 139, 144, 162, 163
  partitioning data in, 401–23
  procedural traversal, 627–28
  recursive, 705
  relational division, 408
  runs and sequence, 557–62
  scalar, 297
  sequenced, 138, 144, 162, 163
  sequenced JOINs, 141
  temporal, 641–80
  unnested, 733–38
  VIEWs in, 370
Quintiles, 537–38
  defined, 537–38
  example, 538
  See also Statistics

R
RANDOM() function, 250, 608–10, 611
Random numbers
  calculating, 611–12
  generators, 45, 609
Random-order keys, 45
Random order values, 45–48
  additive congruential method, 45–46
  defined, 45
  four-bit generator, 46
  tap positions, 47
  See also Values
Range auxiliary tables, 489–90
Ranges
  counters, 40
  holes in, 572
  partitioning by, 402–3
  single-column tables, 402–3
RANK function, 711
Rankings, 533–37
  defined, 533
  defining, 556
  query, 534
  versions, 534–36
  See also Statistics
Rational numbers, 209
Reachable nodes, 683–84
READ COMMITTED isolation level, 725
Read-only VIEWs, 371–73
READ UNCOMMITTED isolation level, 726
Reconvergent graphs, 681
Recursive queries, 705
Redundancy removal, 622
Redundant duplicates, 217–19
  defined, 217
  removal with ROWID, 218–19
  rows, 617
  in tables, 217–18
REFERENCES clause, 15–17
  actions, 15–17
  defined, 15
  lookup tables and, 296
REFERENCES constraint, 36
<references specification>, 15
Referential actions, 16–17
  CASCADE option, 16
  deleting multiple tables without, 220
  NO ACTION option, 16–17
  SET DEFAULT option, 16
  SET NULL option, 16
Referential constraints
  EXISTS() and, 305–6
  IN() predicate and, 295–96
  See also Constraints
Referential integrity
  declarative, 31
  redundant duplicate removal and, 217–18
Regions
  defined, 550
  finding, of maximum size, 552–56
  numbering, 551–52
Relational database management system (RDBMS), 418–19
Relational division, 406–8
  CROSS JOIN, 408
  defined, 406
  exact, 409
  example, 406–8
  HAVING clause, 416
  with JOINs, 412–13
  query, 408
  with remainder, 408–9
  Romley’s, 414–18
  Todd’s, 410–12
Relations, 61–63
Relationships
  first rule, 549
  many-to-many, 87
  one-to-many, 375
REPEATABLE READ isolation level, 725
Repeating groups, 66–69
  columns, 67–68
  parsing lists, 68–69
  See also Groups
RIGHT OUTER JOINs, 337, 351
ROLLUP group, 713
Romley’s division, 414–18
ROUND() function, 117
Rounding, 105–7, 612
  conventions, 106
  implementation, 106
  types of, 444
Row comparisons, 238–40
  defined, 238
  NULLs and, 240
  rules, 239–40
ROWID
  physical addresses and, 39
  redundant duplicate removal with, 218–19
ROW_NUMBER function, 711–12
Rows
  attribute split, 33–34
  candidate, 622
  constructing, 397
  deleting, 38
  duplicate, 48–50
  equality, 614
  inserting, 38
  numbers, 606
  random, picking, 607–12
  redundant duplicate, 617
  in self-join, 33
  sorting, 93–99
  subqueries, 257, 310
  subset, removing, 213
  updating, 38
  value-equivalent, prevention, 132–33
  See also Tables
Rule-based optimizers, 731
Running differences, 530–31
Running statistics. See Cumulative statistics
Running totals, 529–30, 562
Runs
  construction, 557
  defined, 550
  queries, 557–62

S
Scalar queries
  IN() predicate and, 297
  use, 433
Scalar subqueries, 257
  comparisons, 310–11
  with MAX() function, 454
  placing, 310
  running, 310
Schema-level constraints, 25–29
Schemas
  bad design, 1–2
  creating, 1, 3–5
  default character set, 3
  defined, 3
  name, 3
Schema tables, 50–51
  information, 50
  querying, 50
Second Normal Form (2NF), 70–71
Seed values, 609
SELECT statement, 58, 143–44, 317–68
  correlated subqueries, 324–26
  DISTINCT option, 745, 746, 747
  INNER JOINs and, 327
  JOINs and, 317–36
  one-level, 317–24
  ORDER BY clause, 328–36
  syntax, 326–28
Self joins, 606
  in nested set model, 634
  quintuple, 571
Self OUTER JOINs, 345–46
Sequenced duplicates, 132, 134, 135
Sequence deletions, 150–52
  illustrated, 151–52
  period of applicability (PA), 150
  period of validity (PV), 150
  physical modifications, 151
Sequenced JOINs, 140, 141
Sequenced modifications, 150–55
  deletion, 150–52
  update, 152–55
Sequenced queries, 138, 144, 162, 163
Sequence generators, 36–37, 42
Sequences, 554
  columns, 549
  mapping, into cycles, 481–83
  missing values, finding, 554
  numbers, filling in, 560–62
  partition by, 404–6
  queries, 557–62
  resetting, 479
  restrictions, 559
  start/finish values, 553
Sequence tables, 477–85, 572
  Cartesian product and, 485
  constructor syntax, 479
  defined, 477–78
  general declaration, 478
Sequential access, 732
Sequential numbers
  generating as keys, 36–48
  in pure SQL, 39–41
SERIALIZABLE isolation level, 725
Sessions, 719–20
SET clause
  execution, 225
  row change, 225
  UPDATE statement, 224, 225–26
Set functions. See Aggregate functions
Set operators, 591–603
  ALL option, 601–2
  defined, 591
  division with, 413–14
  EXCEPT, 596–601
  INTERSECT, 596–601
  UNION, 592–96
  UNION ALL, 592–96
Sets, 591
  equality, testing for, 589
  nested, 695–97
SET TRANSACTION statement, 724–25
SIGN() function, 471, 475
SIMILAR TO predicate, 267–69
Single-column range tables, 402–3
Sink nodes, 685
SMALLINT numeric type, 102
Smisteru rule, 254, 256
SNaN (Signalling NaN), 103, 104
Snapshot isolation, 727–29
Social Security numbers (SSNs), 206–9
  Area portion, 206
  Group portion, 208–9
  parts, 206
  Serial portion, 209
Sorting
  avoiding, 746–50
  Bose-Nelson, 94, 95, 98
  columns, 329
  direction, controlling, 334
  GROUP BY clause and, 437–38
  grouped query results, 437
  networks, 94
  orders, 331
  rows, 93–99
  stable, 438
  values, 333
  by weekday names, 669–70
Soundex, 403
  algorithm, 176–77
  algorithm alternative, 177
  defined, 175
  drawback, 177
  English pronunciation, 176
  functions, 175–76
  original, 176–77
  See also Phonetic matching
Source nodes, 685
SQL
  arrays, 575–89
  graphs, 681–707
  learning, 2
  model for, 2
  names, 5–6
  numeric data in, 101–18
  OLAP in, 709–18
  optimizing, 731–60
  static, 756–57
  statistics, 509–48
  temporal support, 167
  trees/hierarchies, 623–40
  working with, 4
SQRT() function, 116, 547
Stable sorts, 438
Standard deviation, 527–28
Star Schema, 710–11
Statements
  ALTER TABLE, 5, 7–8
  CLOSE, 56
  CONNECT TO, 719–20
  CREATE ASSERTION, 26
  CREATE DOMAIN, 51–52
  CREATE INDEX, 752
  CREATE PROCEDURE, 53
  CREATE SCHEMA, 3–5
  CREATE TABLE, 5, 8–9
  CREATE TEMP TABLE, 390–91
  CREATE TRIGGER, 52–53
  CREATE VIEW, 380
  DEALLOCATE, 56
  DECLARE CURSOR, 53–55, 58
  DELETE, 58–59, 214
  DELETE FROM, 211–20, 694
  DROP ASSERTION, 25
  DROP TABLE, 5, 6–7, 391
  DROP VIEW, 389
  FETCH, 55–56, 194
  INSERT INTO, 221–23, 404
  MERGE, 232–34
  OPEN, 55
