SQL VISUAL QUICKSTART GUIDE - P44

Listing 15.4 defines the sequence shown in Figure 15.4. You can use a sequence generator in a few ways. The SQL standard provides the built-in function NEXT VALUE FOR to increment a sequence value, as in:

    INSERT INTO shipment(
        part_num,
        descr,
        quantity)
      VALUES(
        NEXT VALUE FOR part_seq,
        'motherboard',
        5);

If you’re creating a column of unique values, you can use the keyword IDENTITY to define a sequence right in the CREATE TABLE statement:

    CREATE TABLE parts (
      part_num INTEGER GENERATED ALWAYS AS IDENTITY
        (START WITH 1
         INCREMENT BY 1
         MINVALUE 1
         MAXVALUE 10000
         NO CYCLE),
      descr    VARCHAR(100),
      quantity INTEGER
      );

This table definition lets you omit NEXT VALUE FOR when you insert a row:

    INSERT INTO parts(
        descr,
        quantity)
      VALUES(
        'motherboard',
        5);

SQL also provides ALTER SEQUENCE and DROP SEQUENCE to change and remove sequence generators.

Listing 15.4 Create a sequence generator for the consecutive integers 1 to 10,000. See Figure 15.4 for the result.

    CREATE SEQUENCE part_seq
      INCREMENT BY 1
      MINVALUE 1
      MAXVALUE 10000
      START WITH 1
      NO CYCLE;

Figure 15.4 The sequence that Listing 15.4 generates.

    1
    2
    3
    ...
    9998
    9999
    10000

✔ Tip

■ Oracle, DB2, and PostgreSQL support CREATE SEQUENCE, ALTER SEQUENCE, and DROP SEQUENCE. In Oracle, use NOCYCLE instead of NO CYCLE. Check your DBMS documentation to see how sequences are used in your system. Most DBMSs don’t support IDENTITY columns because they have other (pre-SQL:2003) ways that define columns with unique values. See Table 3.18 in “Unique Identifiers” in Chapter 3. PostgreSQL’s generate_series() function offers a quick way to generate numbered rows.

A one-column table containing a sequence of consecutive integers makes it easy to solve problems that would otherwise be difficult with SQL’s limited computational power. Sequence tables aren’t really part of the data model; they’re auxiliary tables that are adjuncts to queries and other “real” tables.
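The options in Listing 15.4 (START WITH, INCREMENT BY, MINVALUE, MAXVALUE, and NO CYCLE) can also be modeled outside a DBMS. The following is a minimal Python sketch of what those options describe, not any DBMS’s implementation; the function name sequence_generator and its parameter names are hypothetical, and the wrap-around rule for CYCLE (restart at MINVALUE when ascending, at MAXVALUE when descending) is an assumption about typical behavior.

```python
from typing import Iterator


def sequence_generator(start: int, increment: int, minvalue: int,
                       maxvalue: int, cycle: bool = False) -> Iterator[int]:
    """Yield values the way the CREATE SEQUENCE options describe them."""
    if increment == 0:
        raise ValueError("INCREMENT BY must be nonzero")
    value = start
    while True:
        if not minvalue <= value <= maxvalue:
            if not cycle:
                # NO CYCLE: the sequence is exhausted once it leaves its bounds.
                raise RuntimeError("sequence exhausted (NO CYCLE)")
            # CYCLE (assumed rule): an ascending sequence restarts at
            # MINVALUE, a descending one at MAXVALUE.
            value = minvalue if increment > 0 else maxvalue
        yield value
        value += increment


# Same settings as part_seq in Listing 15.4.
part_seq = sequence_generator(start=1, increment=1, minvalue=1, maxvalue=10000)
first_three = [next(part_seq) for _ in range(3)]
print(first_three)  # [1, 2, 3]
```

Each call to next() plays the role of NEXT VALUE FOR part_seq; after the value 10000 has been handed out, the next call raises an error instead of wrapping around.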
You can create a sequence table by using one of the methods just described. Alternatively, you can create one by using Listing 15.5, which creates the sequence table seq by cross-joining four instances of the intermediate table temp09. The CAST expression concatenates digit characters into sequential numbers and then casts them as integers. You can drop temp09 after seq is created. Figure 15.5 shows the result. The table seq contains the integer sequence 0, 1, 2, ..., 9999. You can shrink or grow this sequence by changing the SELECT and FROM expressions in the INSERT INTO seq statement; three instances of temp09, for example, yield 0 through 999.

Listing 15.5 Create a one-column table that contains consecutive integers. See Figure 15.5 for the result.

    CREATE TABLE temp09 (
      i CHAR(1) NOT NULL PRIMARY KEY
      );

    INSERT INTO temp09 VALUES('0');
    INSERT INTO temp09 VALUES('1');
    INSERT INTO temp09 VALUES('2');
    INSERT INTO temp09 VALUES('3');
    INSERT INTO temp09 VALUES('4');
    INSERT INTO temp09 VALUES('5');
    INSERT INTO temp09 VALUES('6');
    INSERT INTO temp09 VALUES('7');
    INSERT INTO temp09 VALUES('8');
    INSERT INTO temp09 VALUES('9');

    CREATE TABLE seq (
      i INTEGER NOT NULL PRIMARY KEY
      );

    INSERT INTO seq
      SELECT CAST(t1.i || t2.i || t3.i || t4.i AS INTEGER)
        FROM temp09 t1, temp09 t2, temp09 t3, temp09 t4;

    DROP TABLE temp09;

Figure 15.5 Result of Listing 15.5.

    i
    ----
    0
    1
    2
    3
    4
    ...
    9996
    9997
    9998
    9999

A sequence table is especially useful for enumerative and datetime functions. Listing 15.6 lists the 95 printable characters in the ASCII character set (if that’s the character set in use). See Figure 15.6 for the result.

Listing 15.7 adds monthly intervals to today’s date (7-March-2005) for the next six months. See Figure 15.7 for the result. This example works on Microsoft SQL Server; the other DBMSs have similar functions that increment dates.

Sequence tables are handy for normalizing data that you’ve imported from a non-relational environment such as a spreadsheet.
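The cross-join construction of Listing 15.5 can be tried end to end without a database server. The following is a minimal sketch, assuming SQLite through Python’s standard sqlite3 module (SQLite is not one of the DBMSs the text discusses). The month-increment query at the end mirrors Listing 15.7 but uses SQLite’s date() function and a fixed start date in place of DATEADD and CURRENT_TIMESTAMP, so that the output is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Intermediate table holding the digit characters '0' to '9'.
conn.execute("CREATE TABLE temp09 (i CHAR(1) NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO temp09 VALUES (?)",
                 [(str(d),) for d in range(10)])

# Cross-join four instances of temp09; each row concatenates four digits,
# and CAST turns the digit string into an integer from 0 to 9999.
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
conn.execute("""
    INSERT INTO seq
      SELECT CAST(t1.i || t2.i || t3.i || t4.i AS INTEGER)
        FROM temp09 t1, temp09 t2, temp09 t3, temp09 t4
""")
conn.execute("DROP TABLE temp09")

summary = conn.execute("SELECT COUNT(*), MIN(i), MAX(i) FROM seq").fetchone()
print(summary)  # (10000, 0, 9999)

# SQLite counterpart of Listing 15.7: add 1 to 6 months to 7 March 2005.
months_ahead = conn.execute("""
    SELECT i, date('2005-03-07', '+' || i || ' months')
      FROM seq
      WHERE i BETWEEN 1 AND 6
      ORDER BY i
""").fetchall()
for row in months_ahead:
    print(row)
```

Dropping one instance of temp09 from the FROM clause (and the matching term from the concatenation) shrinks the table to 0 through 999.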
Suppose that you have the following non-normalized table, named au_orders, showing the order of the authors’ names on each book’s cover:

    title_id  author1  author2  author3
    --------  -------  -------  -------
    T01       A01      NULL     NULL
    T02       A01      NULL     NULL
    T03       A05      NULL     NULL
    T04       A03      A04      NULL
    T05       A04      NULL     NULL
    T06       A02      NULL     NULL
    T07       A02      A04      NULL
    T08       A06      NULL     NULL
    T09       A06      NULL     NULL
    T10       A02      NULL     NULL
    T11       A06      A03      A04
    T12       A02      NULL     NULL
    T13       A01      NULL     NULL

Listing 15.8 cross-joins au_orders with seq to produce Figure 15.8. You can DELETE the result rows with nulls in the column au_id, leaving the result set looking like the table title_authors in the sample database. Note that Listing 15.8 does the reverse of Listing 8.18 in Chapter 8.

Listing 15.6 List the characters associated with a set of character codes. See Figure 15.6 for the result.

    SELECT
        i AS CharCode,
        CHR(i) AS Ch
      FROM seq
      WHERE i BETWEEN 32 AND 126;

Figure 15.6 Result of Listing 15.6.

    CharCode  Ch
    --------  --
    32
    33        !
    34        "
    35        #
    36        $
    37        %
    38        &
    39        '
    40        (
    41        )
    42        *
    43        +
    44        ,
    45        -
    46        .
    47        /
    48        0
    49        1
    50        2
    51        3
    52        4
    ...

Listing 15.7 Increment today’s date to six months hence, in one-month intervals. See Figure 15.7 for the result.

    SELECT
        i AS MonthsAhead,
        DATEADD("m", i, CURRENT_TIMESTAMP) AS FutureDate
      FROM seq
      WHERE i BETWEEN 1 AND 6;

Figure 15.7 Result of Listing 15.7.

    MonthsAhead  FutureDate
    -----------  ----------
    1            2005-04-07
    2            2005-05-07
    3            2005-06-07
    4            2005-07-07
    5            2005-08-07
    6            2005-09-07

Listing 15.8 Normalize the table au_orders. See Figure 15.8 for the result.
    SELECT title_id,
        (CASE WHEN i=1 THEN '1'
              WHEN i=2 THEN '2'
              WHEN i=3 THEN '3'
         END) AS au_order,
        (CASE WHEN i=1 THEN author1
              WHEN i=2 THEN author2
              WHEN i=3 THEN author3
         END) AS au_id
      FROM au_orders, seq
      WHERE i BETWEEN 1 AND 3
      ORDER BY title_id, i;

Figure 15.8 Result of Listing 15.8.

    title_id  au_order  au_id
    --------  --------  -----
    T01       1         A01
    T01       2         NULL
    T01       3         NULL
    T02       1         A01
    T02       2         NULL
    T02       3         NULL
    T03       1         A05
    T03       2         NULL
    T03       3         NULL
    T04       1         A03
    T04       2         A04
    T04       3         NULL
    T05       1         A04
    T05       2         NULL
    T05       3         NULL
    T06       1         A02
    T06       2         NULL
    T06       3         NULL
    T07       1         A02
    T07       2         A04
    T07       3         NULL
    T08       1         A06
    T08       2         NULL
    T08       3         NULL
    T09       1         A06
    T09       2         NULL
    T09       3         NULL
    T10       1         A02
    T10       2         NULL
    T10       3         NULL
    T11       1         A06
    T11       2         A03
    T11       3         A04
    T12       1         A02
    T12       2         NULL
    T12       3         NULL
    T13       1         A01
    T13       2         NULL
    T13       3         NULL

✔ Tips

■ If you have a column of sequential integers that’s missing some numbers, you can list the missing numbers by EXCEPTing the column from a sequence column (that is, the sequence values EXCEPT the column’s values). See “Finding Different Rows with EXCEPT” earlier in this chapter.

■ To run Listing 15.5 in Microsoft Access and Microsoft SQL Server, change the CAST expression to:

    t1.i + t2.i + t3.i + t4.i

To run Listing 15.5 in MySQL, change the CAST expression to:

    CONCAT(t1.i, t2.i, t3.i, t4.i)

To run Listing 15.6 in Microsoft SQL Server and MySQL, change CHR(i) to CHAR(i).

To run Listing 15.8 in Microsoft Access, change the CASE expressions to Switch() function calls (see the DBMS Tip in “Evaluating Conditional Values with CASE” in Chapter 5):

    (Switch(i=1, '1',
            i=2, '2',
            i=3, '3')) AS au_order,
    (Switch(i=1, author1,
            i=2, author2,
            i=3, author3)) AS au_id

Calendar Tables

Another useful auxiliary table is a calendar table. One type of calendar table has one row for each calendar date (past and future), with the date as the primary key and other columns that indicate the date’s attributes: business day, holiday, international holiday, fiscal-month end, fiscal-year end, Julian date, business-day offsets, and so on.
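A date-per-row calendar table of this kind can be populated from a sequence table. The following is a minimal sketch, assuming SQLite through Python’s standard sqlite3 module; the table name calendar and its columns cal_date, day_of_week, and is_business_day are hypothetical, and the business-day flag simply marks Monday through Friday without accounting for holidays.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A small sequence table (0 to 13), enough for two weeks of dates.
conn.execute("CREATE TABLE seq (i INTEGER NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO seq VALUES (?)", [(n,) for n in range(14)])

conn.execute("""
    CREATE TABLE calendar (
      cal_date        TEXT    NOT NULL PRIMARY KEY,  -- ISO date, one row per day
      day_of_week     INTEGER NOT NULL,              -- 0 = Sunday ... 6 = Saturday
      is_business_day INTEGER NOT NULL               -- 1 = Monday to Friday
    )
""")

# Each seq value is a day offset from the start date; strftime('%w', ...)
# returns the weekday as a digit string with Sunday as '0'.
conn.execute("""
    INSERT INTO calendar
      SELECT d,
             CAST(strftime('%w', d) AS INTEGER),
             CASE WHEN strftime('%w', d) IN ('0', '6') THEN 0 ELSE 1 END
        FROM (SELECT date('2005-03-07', '+' || i || ' days') AS d FROM seq)
""")

# Lookups replace date arithmetic: count business days in a date range.
business_days = conn.execute("""
    SELECT COUNT(*)
      FROM calendar
      WHERE cal_date BETWEEN '2005-03-07' AND '2005-03-20'
        AND is_business_day = 1
""").fetchone()[0]
print(business_days)  # 10
```

Holiday and fiscal-period columns would be filled in the same way, typically from imported data rather than from computed expressions.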
Another type of calendar table stores the starting and ending dates of events (in the columns event_id, start_date, and end_date, for example). Spreadsheets have more date-arithmetic functions than DBMSs, so it might be easier to build a calendar table in a spreadsheet and then import it as a database table. Even if your DBMS has plenty of date-arithmetic functions, it might be faster to look up data in a calendar table than to call these functions in a query.

Finding Sequences, Runs, and Regions

A sequence is a series of consecutive values without gaps. A run is like a sequence, but the values don’t have to be consecutive, just increasing (that is, gaps are allowed). A region is an unbroken series of values that all are equal.

Finding these series requires a table that has at least two columns: a primary-key column that holds a sequence of consecutive integers and a column that holds the values of interest. The table temps (Listing 15.9 and Figure 15.9) shows a series of high temperatures over 15 days.

As a set-oriented language, SQL isn’t a good choice for finding series of values. The following queries won’t run very fast, so if you have a lot of data to analyze, you might consider exporting it to a statistical package or using a procedural host language.

✔ Tip

■ These queries are based on the ideas in David Rozenshtein, Anatoly Abramovich, and Eugene Birger’s Optimizing Transact-SQL: Advanced Programming Techniques (SQL Forum Press). You can use the queries’ common framework to create similar queries that find other series of values.

Listing 15.9 List all the columns in the table temps. See Figure 15.9 for the result.

    SELECT *
      FROM temps;

Figure 15.9 Result of Listing 15.9.

    id  hi_temp
    --  -------
     1       49
     2       46
     3       48
     4       50
     5       50
     6       50
     7       51
     8       52
     9       53
    10       50
    11       50
    12       47
    13       50
    14       51
    15       52

Listing 15.10 finds all the sequences in temps and lists each sequence’s start position, end position, and length.
See Figure 15.10 for the result. This query is a lot to take in at first glance, but it’s easier to understand if you look at it piecemeal. Then you’ll be able to understand the rest of the queries in this section.

The subquery’s WHERE clause subtracts id from hi_temp, yielding (internally):

    id  hi_temp  diff
    --  -------  ----
     1       49    48
     2       46    44
     3       48    45
     4       50    46
     5       50    45
     6       50    44
     7       51    44
     8       52    44
     9       53    44
    10       50    40
    11       50    39
    12       47    35
    13       50    37
    14       51    37
    15       52    37

In the column diff, note that successive differences are constant for sequences (50 - 6 = 44, 51 - 7 = 44, and so on). To find neighboring rows, the outer query cross-joins two instances of the same table (t1 and t2), as described in “Calculating Running Statistics” earlier in this chapter. The condition WHERE (t1.id < t2.id) guarantees that any t1 row represents an element with an index (id) lower than the corresponding t2 row.

Listing 15.10 List the starting point, ending point, and length of each sequence in the table temps. See Figure 15.10 for the result.

    SELECT
        t1.id AS StartSeq,
        t2.id AS EndSeq,
        t2.id - t1.id + 1 AS SeqLen
      FROM temps t1, temps t2
      WHERE (t1.id < t2.id)
        AND NOT EXISTS(
          SELECT *
            FROM temps t3
            WHERE (t3.hi_temp - t3.id <>
                   t1.hi_temp - t1.id
              AND t3.id BETWEEN
                  t1.id AND t2.id)
               OR (t3.id = t1.id - 1
              AND t3.hi_temp - t3.id =
                  t1.hi_temp - t1.id)
               OR (t3.id = t2.id + 1
              AND t3.hi_temp - t3.id =
                  t1.hi_temp - t1.id)
        );

Figure 15.10 Result of Listing 15.10.

    StartSeq  EndSeq  SeqLen
    --------  ------  ------
           6       9       4
          13      15       3

The subquery detects sequence breaks with the condition

    t3.hi_temp - t3.id <> t1.hi_temp - t1.id

The third instance of temps (t3) in the subquery is used to check whether each row (t3) in a candidate sequence has the same difference as the sequence’s first row (t1). If so, it’s a sequence member. If not, the candidate pair (t1 and t2) is rejected.

The last two OR conditions determine whether the candidate sequence’s borders can expand.
A row that satisfies either of these conditions means that the current candidate sequence can be extended, so the candidate is rejected in favor of a longer one.

✔ Tip

■ To find only sequences of n or more rows, add the WHERE condition

    AND (t2.id - t1.id) >= n - 1

To change Listing 15.10 to find all sequences of four or more rows, for example, replace

    WHERE (t1.id < t2.id)

with

    WHERE (t1.id < t2.id)
      AND (t2.id - t1.id) >= 3

The result is:

    StartSeq  EndSeq  SeqLen
    --------  ------  ------
           6       9       4

Listing 15.11 finds all the runs in temps and lists each run’s start position, end position, and length. See Figure 15.11 for the result.

The logic of this query is similar to that of the preceding one but accounts for run values needing only to increase, not (necessarily) be consecutive. The fourth instance of temps (t4) is needed because there doesn’t have to be a constant difference between id and hi_temp values. The subquery cross-joins t3 and t4 to check rows in the middle of a candidate run, whose borders are t1 and t2. For every element between t1 and t2 (limited by BETWEEN), t3 and its predecessor t4 are compared to see whether their values are increasing.

Listing 15.11 List the starting point, ending point, and length of each run in the table temps. See Figure 15.11 for the result.

    SELECT
        t1.id AS StartRun,
        t2.id AS EndRun,
        t2.id - t1.id + 1 AS RunLen
      FROM temps t1, temps t2
      WHERE (t1.id < t2.id)
        AND NOT EXISTS(
          SELECT *
            FROM temps t3, temps t4
            WHERE (t3.hi_temp <= t4.hi_temp
              AND t4.id = t3.id - 1
              AND t3.id BETWEEN
                  t1.id + 1 AND t2.id)
               OR (t3.id = t1.id - 1
              AND t3.hi_temp < t1.hi_temp)
               OR (t3.id = t2.id + 1
              AND t3.hi_temp > t2.hi_temp)
        );

Figure 15.11 Result of Listing 15.11.

    StartRun  EndRun  RunLen
    --------  ------  ------
           2       4       3
           6       9       4
          12      15       4

Listing 15.12 finds all regions in temps with a high temperature of 50 and lists each region’s start position, end position, and length.
See Figure 15.12 for the result.

✔ Tips

■ To rank regions by length, add an ORDER BY clause to the outer query:

    ORDER BY t2.id - t1.id DESC

■ To list the individual ids that fall in a region (with value 50), type:

    SELECT DISTINCT t1.id
      FROM temps t1, temps t2
      WHERE t1.hi_temp = 50
        AND t2.hi_temp = 50
        AND ABS(t1.id - t2.id) = 1;

The standard function ABS(), which all DBMSs support, returns the absolute value of its argument. The result is:

    id
    --
     4
     5
     6
    10
    11

Listing 15.12 List the starting point, ending point, and length of each region (with value 50) in the table temps. See Figure 15.12 for the result.

    SELECT
        t1.id AS StartReg,
        t2.id AS EndReg,
        t2.id - t1.id + 1 AS RegLen
      FROM temps t1, temps t2
      WHERE (t1.id < t2.id)
        AND NOT EXISTS(
          SELECT *
            FROM temps t3
            WHERE (t3.hi_temp <> 50
              AND t3.id BETWEEN
                  t1.id AND t2.id)
               OR (t3.id = t1.id - 1
              AND t3.hi_temp = 50)
               OR (t3.id = t2.id + 1
              AND t3.hi_temp = 50)
        );

Figure 15.12 Result of Listing 15.12.

    StartReg  EndReg  RegLen
    --------  ------  ------
           4       6       3
          10      11       2
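Listings 15.10 through 15.12 share one outer query and differ only in the NOT EXISTS subquery. The sketch below makes that common framework explicit; it assumes SQLite through Python’s standard sqlite3 module, loads the data from Figure 15.9, and runs all three queries. The names HI_TEMPS, FRAME, and find are hypothetical, and the ORDER BY clause is an addition that keeps the output order stable.

```python
import sqlite3

# High temperatures from Figure 15.9, for ids 1 to 15.
HI_TEMPS = [49, 46, 48, 50, 50, 50, 51, 52, 53, 50, 50, 47, 50, 51, 52]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (id INTEGER NOT NULL PRIMARY KEY,"
             " hi_temp INTEGER NOT NULL)")
conn.executemany("INSERT INTO temps VALUES (?, ?)",
                 list(enumerate(HI_TEMPS, start=1)))

# Shared outer query: every candidate pair (t1, t2) survives only if no row
# breaks the series or shows that the series could be extended.
FRAME = """
    SELECT t1.id, t2.id, t2.id - t1.id + 1
      FROM temps t1, temps t2
      WHERE (t1.id < t2.id)
        AND NOT EXISTS({subquery})
      ORDER BY t1.id
"""

SEQUENCE_SUBQUERY = """
    SELECT * FROM temps t3
      WHERE (t3.hi_temp - t3.id <> t1.hi_temp - t1.id
             AND t3.id BETWEEN t1.id AND t2.id)
         OR (t3.id = t1.id - 1 AND t3.hi_temp - t3.id = t1.hi_temp - t1.id)
         OR (t3.id = t2.id + 1 AND t3.hi_temp - t3.id = t1.hi_temp - t1.id)
"""

RUN_SUBQUERY = """
    SELECT * FROM temps t3, temps t4
      WHERE (t3.hi_temp <= t4.hi_temp AND t4.id = t3.id - 1
             AND t3.id BETWEEN t1.id + 1 AND t2.id)
         OR (t3.id = t1.id - 1 AND t3.hi_temp < t1.hi_temp)
         OR (t3.id = t2.id + 1 AND t3.hi_temp > t2.hi_temp)
"""

REGION_SUBQUERY = """
    SELECT * FROM temps t3
      WHERE (t3.hi_temp <> 50 AND t3.id BETWEEN t1.id AND t2.id)
         OR (t3.id = t1.id - 1 AND t3.hi_temp = 50)
         OR (t3.id = t2.id + 1 AND t3.hi_temp = 50)
"""


def find(subquery: str) -> list:
    """Return (start, end, length) rows for one kind of series."""
    return conn.execute(FRAME.format(subquery=subquery)).fetchall()


sequences = find(SEQUENCE_SUBQUERY)
runs = find(RUN_SUBQUERY)
regions = find(REGION_SUBQUERY)
print(sequences)  # [(6, 9, 4), (13, 15, 3)]
print(runs)       # [(2, 4, 3), (6, 9, 4), (12, 15, 4)]
print(regions)    # [(4, 6, 3), (10, 11, 2)]
```

To look for another kind of series, only the subquery needs to change: its first condition defines what breaks the series, and the last two define when a candidate could be extended to the left or right.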
