CHAPTER 7: MULTIPLE COLUMN DATA ELEMENTS

CREATE FUNCTION Distance
(IN latitude1 REAL, IN longitude1 REAL,
 IN latitude2 REAL, IN longitude2 REAL)
RETURNS REAL
BEGIN
  DECLARE r REAL;
  DECLARE lat REAL;
  DECLARE lon REAL;
  DECLARE a REAL;
  DECLARE c REAL;
  -- Earth radius: 6367 km converted to miles
  SET r = 6367.00 * 0.6214;
  -- calculate the deltas (the arguments are assumed to be in radians)
  SET lon = longitude2 - longitude1;
  SET lat = latitude2 - latitude1;
  -- intermediate value; the haversine formula squares both half-angle sines
  SET a = POWER(SIN(lat / 2), 2)
        + COS(latitude1) * COS(latitude2) * POWER(SIN(lon / 2), 2);
  -- intermediate result c is the great circle distance in radians
  SET c = 2 * ARCSIN(LEAST(1.00, SQRT(a)));
  -- multiply the radians by the radius to get the distance
  RETURN (r * c);
END;

The LEAST() function protects against possible round-off errors that could sabotage computation of the ARCSIN() if the two points are very nearly antipodal. It exists as a vendor extension in Oracle and MySQL, but can be written with a CASE expression in Standard SQL.

7.2 Storing an IP Address in SQL

While not exactly a data type, IP addresses are being used as unique identifiers for people or companies. If you need to verify them, you can send an e-mail or ping them. There are three popular ways to store an IP address: a string, an integer, and a set of four octets.

In a test conducted in SQL Server, all three methods required about the same amount of time, work, and I/O to return data as a string. The latter two have some additional computations, but the overhead was not enough to affect performance very much.

The conclusion was that the octet model with four TINYINT columns had these advantages: simpler programming, indexes on individual octets, and human readability. But you should look at what happens in your own environment. TINYINT is a one-byte integer data type found in SQL Server and other products; SMALLINT is the closest thing to it in Standard SQL.
7.2.1 A Single VARCHAR(15) Column

The most obvious way to store IP addresses (for example, ‘63.246.173.210’) is a VARCHAR(15) column, with a CHECK() constraint that uses a SIMILAR TO predicate to be sure that it has the “dots and digits” in the right positions. You have to decide the meaning of leading zeros in an octet and trim them to do string comparisons.

The good points are that programming this is reasonably simple and it is immediately readable by a human. The bad points are that this solution has higher storage costs and requires pattern-matching string functions in searches. It also is harder to pass to some host programs that expect to see the octets to make their IP connections.

To convert the string into octets, you need to use a string procedure. You can write one based on the code given for parsing a comma-separated string into individual integers in Section 22.1, on using the sequence auxiliary table.

7.2.2 One INTEGER Column

This solution has the lowest storage requirements of all the methods, and it keeps the address in one column. Searching and indexing costs are also minimal. The bad side is that programming for this solution is much more complex, and you need to write user functions to break it apart into octets. It also has poor human readability. Can you tell on sight that an INTEGER value like ‘2130706433’ represents ‘127.0.0.1’?
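The arithmetic behind that question is easy to check outside the database. The following is a minimal Python sketch, not part of the original text: the function names ip_to_int and int_to_ip are hypothetical, and it assumes the address is treated as an unsigned 32-bit value, so it does not deal with the negative numbers that appear when the value is kept in a signed SQL INTEGER.

```python
def ip_to_int(dotted: str) -> int:
    """Pack a dotted-quad IPv4 string into one unsigned 32-bit integer."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    value = 0
    for octet in octets:
        # Shift the running total left by one octet, then add the next one.
        value = value * 256 + octet
    return value


def int_to_ip(value: int) -> str:
    """Unpack an unsigned 32-bit integer into a dotted-quad string."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"out of range for IPv4: {value}")
    octets = []
    for _ in range(4):
        # Peel off the lowest octet each time; they come out in reverse order.
        value, octet = divmod(value, 256)
        octets.append(octet)
    return ".".join(str(o) for o in reversed(octets))


if __name__ == "__main__":
    print(ip_to_int("127.0.0.1"))   # 127 * 16777216 + 1
    print(int_to_ip(2130706433))
```

Each octet is one base-256 digit, which is why the constants 16777216 (256 cubed), 65536 (256 squared), and 256 show up in any integer-to-string conversion.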
CREATE FUNCTION IPIntegerToString (IN ip INTEGER)
RETURNS VARCHAR(15)
LANGUAGE SQL
DETERMINISTIC
BEGIN
  DECLARE o1 INTEGER;
  DECLARE o2 INTEGER;
  DECLARE o3 INTEGER;
  DECLARE o4 INTEGER;
  IF ABS(ip) > 2147483647
  THEN RETURN '255.255.255.255';
  END IF;
  SET o1 = ip / 16777216;
  IF o1 = 0
  THEN SET o1 = 255;
       SET ip = ip + 16777216;
  ELSE IF o1 < 0
       THEN IF MOD(ip, 16777216) = 0
            THEN SET o1 = o1 + 256;
            ELSE SET o1 = o1 + 255;
                 IF o1 = 128
                 THEN SET ip = ip + 2147483648;
                 ELSE SET ip = ip + (16777216 * (256 - o1));
                 END IF;
            END IF;
       ELSE SET ip = ip - (16777216 * o1);
       END IF;
  END IF;
  SET ip = MOD(ip, 16777216);
  SET o2 = ip / 65536;
  SET ip = MOD(ip, 65536);
  SET o3 = ip / 256;
  SET ip = MOD(ip, 256);
  SET o4 = ip;
  -- return the string
  RETURN
    CAST(o1 AS VARCHAR(3)) || '.' ||
    CAST(o2 AS VARCHAR(3)) || '.' ||
    CAST(o3 AS VARCHAR(3)) || '.' ||
    CAST(o4 AS VARCHAR(3));
END;

7.2.3 Four SMALLINT Columns

The good points of this solution are that it has a lower storage cost than VARCHAR(15), searching is easy and relatively fast, and you can index on each octet of the address. If your SQL product has a TINYINT (usually one byte) data type, then you can save even more space. The bad point is that programming is slightly more complex.

CREATE TABLE FourColumnIP
(octet1 SMALLINT NOT NULL
   CHECK (octet1 BETWEEN 0 AND 255),
 octet2 SMALLINT NOT NULL
   CHECK (octet2 BETWEEN 0 AND 255),
 octet3 SMALLINT NOT NULL
   CHECK (octet3 BETWEEN 0 AND 255),
 octet4 SMALLINT NOT NULL
   CHECK (octet4 BETWEEN 0 AND 255));

You will need a view for display, but that is straightforward:

CREATE VIEW DisplayIP (IP_address_display)
AS
SELECT CAST(octet1 AS VARCHAR(3)) || '.' ||
       CAST(octet2 AS VARCHAR(3)) || '.' ||
       CAST(octet3 AS VARCHAR(3)) || '.' ||
       CAST(octet4 AS VARCHAR(3))
  FROM FourColumnIP;

7.3 Currency and Other Unit Conversions

Currency has to be expressed in both an amount and a unit of currency.
The ISO 4217 currency code gives you a standard way of identifying the unit. There are no nondecimal currency systems left on earth, but you will need to talk to the accounting department about the number of decimal places to use in computations. The rules for euros are established by the European Union, and those for U.S. dollars are part of the GAAP (Generally Accepted Accounting Principles).

CREATE TABLE InternationalMoney
(currency_code CHAR(3) NOT NULL,
 currency_amt DECIMAL (12,4) NOT NULL);

This mixed table is not easy to work with, so it is best to create VIEWs with a single currency for each group of users. This will entail maintaining an exchange rate table to use in the VIEWs.

CREATE VIEW EuroMoney (euro_amt)
AS
SELECT (M1.currency_amt * E1.conversion_factor)
  FROM InternationalMoney AS M1, ExchangeRate AS E1
 WHERE E1.to_currency_code = 'EUR'
   AND E1.from_currency_code = M1.currency_code;

But there is a gimmick. There are specific rules about precision and rounding that are mandatory in currency conversion to, from, and through the euro. Conversion between two national currencies must be triangulated; this means that you convert the first currency to euros, then convert the euros to the second currency. Six-figure conversion rates are mandatory, but you should check the status of “Article 235 Regulation” to be sure that nothing has changed since this writing.

7.4 Social Security Numbers

Social Security numbers (SSNs) are so important in the United States that they deserve a separate mention. You can look up death records by Social Security number, and the first five digits of the number indicate the location and approximate year of issue. The Ancestry.com Web site has a Social Security death search that gives the full nine-digit number of the deceased individual. It does not supply the years or location of issue.
Commercial firms such as Security Software Solutions (Box 30125; Tucson, AZ 85751-0125; phone 800-681-8933; www.veris-ssn.com) will verify Social Security Numbers for living and deceased persons.

The Social Security number is composed of three parts, XXX-XX-XXXX, called the Area, Group, and Serial. For the most part (there are a few exceptions), the Area is determined by where the individual applied for the Social Security Number (before 1972) or resided at time of application (after 1972). The areas are assigned as follows:

000       unused
001-003   NH
004-007   ME
008-009   VT
010-034   MA
035-039   RI
040-049   CT
050-134   NY
135-158   NJ
159-211   PA
212-220   MD
221-222   DE
223-231   VA
232-236   WV
237-246   NC
247-251   SC
252-260   GA
261-267   FL
268-302   OH
303-317   IN
318-361   IL
362-386   MI
387-399   WI
400-407   KY
408-415   TN
416-424   AL
425-428   MS
429-432   AR
433-439   LA
440-448   OK
449-467   TX
468-477   MN
478-485   IA
486-500   MO
501-502   ND
503-504   SD
505-508   NE
509-515   KS
516-517   MT
518-519   ID
520       WY
521-524   CO
525       NM
526-527   AZ
528-529   UT
530       NV
531-539   WA
540-544   OR
545-573   CA
574       AK
575-576   HI
577-579   DC
580       VI Virgin Islands
581-584   PR Puerto Rico
585       NM
586       PI Pacific Islands (Northern Mariana Islands, Guam, American Samoa, Philippine Islands)
587-588   MS
589-595   FL
596-599   PR Puerto Rico
600-601   AZ
602-626   CA
627-699   unassigned, for future use
700-728   Railroad workers through 1963, then discontinued
729-899   unassigned, for future use
900-999   not valid Social Security Numbers, but were used for program purposes when state aid to the aged, blind, and disabled was converted to a federal program administered by the Social Security Administration

As the Areas assigned to a locality are exhausted, new areas from the pool are assigned. This is why some states have noncontiguous groups of Areas.

The Group portion of the Social Security number has no meaning other than to determine whether or not a number has been assigned.
The Social Security Administration publishes a list every month of the highest Group assigned for each Area. The order of assignment for the Groups is: odd numbers under 10, even numbers over 9, even numbers under 10 except for 00, which is never used, and odd numbers over 10. For example, if the highest group assigned for area 999 is 72, then we know that the number 999-04-1234 is an invalid number because even Groups under 10 have not yet been assigned.

The Serial portion of the Social Security number has no meaning. The Serial is not assigned in strictly numerical order. The Serial 0000 is never assigned.

Before 1973, Social Security cards with preprinted numbers were issued to each local Social Security Administration office. The local office assigned the numbers. In 1973, Social Security number assignment was automated, and outstanding stocks of preprinted cards were destroyed. Computers at headquarters now assign all Social Security numbers. There are rare cases in which the computer system can be forced to accept a manual assignment, such as a person refusing a number with 666 in it.

A pamphlet entitled “The Social Security Number” (Pub. No. 05-10633) provides an explanation of the Social Security number’s structure and the method of assigning and validating Social Security numbers. You can also verify a number with software packages; look at www.searchbug.com/peoplefinder/ssn.aspx.

7.5 Rational Numbers

A rational number is defined as a fraction (a/b) where a and b are both integers and b is not zero. In contrast, an irrational number cannot be defined that way. The classic example of an irrational number is the square root of two. Technically, a binary computer can only represent a subset of the rational numbers. But for some purposes, it is handy to actually model them as (numerator, denominator) pairs.
For example, Vadim Tropashko uses rational numbers in the nested interval model for hierarchies in SQL (see Joe Celko’s Trees and Hierarchies in SQL for Smarties). This means that you need a set of user-defined functions to do basic four-function math and to reduce the fractions.

Elementary school students, when asked for the sum of 1/2 and 1/4, will often add the numerators and the denominators like this: 1/2 + 1/4 = (1+1)/(2+4) = 2/6 = 1/3. This operation is called the mediant, and it returns the simplest number between the two fractions, if we use smallness of denominator as a measure of simplicity. Indeed, the average of 1/4 and 1/2 has denominator 8, while the mediant has 3.

CHAPTER 8
Table Operations

There are only four things you can do with a set of rows in an SQL table: insert them into a table, delete them from a table, update the values in them, or query them. The unit of work is a set of whole rows inside a base table.

When you worked with file systems, access was one record at a time, then one field within a record. Since you had repeated groups and other forms of variant records, you could change the structure of each record in the file.

The mental model in SQL is that you grab a subset as a unit, all at once, in a base table and insert, update, or delete as a unit, all at once. Imagine that you have enough computer power that you can allocate one processor to every row in a table. When you blow your whistle, all the processors do their work in parallel.

8.1 DELETE FROM Statement

The DELETE FROM statement in SQL removes zero or more rows of one table. Interactive SQL tools will tell the user how many rows were affected by an update operation, and Standard SQL requires the database engine to raise a completion condition of “no data” if there are zero rows. There are two forms of DELETE FROM in SQL: positioned and searched. The positioned deletion is done with cursors;