< previous page page_156 next page > Page 156 Types of Output Files Generally speaking, there are four types of output files you can produce when extracting data with SQL*Plus: Delimited columns Fixed-width columns DML (Data Manipulation Language) DDL (Data Definition Language) There may be variations on these typesdelimited files, for example, could be either tab-delimited or comma-delimited, and you may be able to dream up some novel format, but, generally speaking, these are the most useful. Delimited Files Delimited files use a special text character to separate each data value in a record. Typically the delimiter is either a tab or a comma, but any character may be used. Here's an example of a comma-delimited file containing employee information (ID, rate, hire date, and name): 1,135, 15-Nov-1961, Jonathan Gennick 2,95, 10-Jan-1991, Bohdan Khmelnytsky 3,200, 17-Jul-1984, Ivan Mazepa This example illustrates a very commonly used format called the CSV (Comma Separated Values) format. CSV- formatted files use commas to delimit the values, and they enclose text fields within quotes. The CSV format is recognized by most spreadsheets and desktop databases, and is the format for the examples in this chapter. Fixed-Width Files A fixed-width file contains data in columns, where each column is a certain width, and all values in that column are the same width. Here's an example of the same employee data shown earlier, but formatted into fixed-width columns: 00113515-Nov-1961Jonathan Gennick 00209510-Jan-1991Bohdan Khmelnytsky 00320017-Jul-1984Ivan Mazepa In the above example, the columns abut each other with no space in between. If you aren't trying to match an existing file layout, you may prefer to allow at least one space between columns to aid readability. < previous page page_156 next page > < previous page page_157 next page > Page 157 DML Files. A DML file contains Data Manipulation Language statements, such as INSERT, DELETE, UPDATE, and SELECT. This type of file can be used as a quick and dirty way of extracting data from one database for insertion into another. Here, for example, is a file of INSERT statements that recreate some employee data: INSERT INTO employee (employee_ID,employee_billing_rate,employee_hire_date,employe e_name) VALUES (1,135,TO_DATE(15-Nov-1961, DD-MON-YYYY), Jonathan Gennick); INSERT INTO employee (employee_ID,employee_billing_rate,employee_hire_date,employee_name) VALUES (2,95, TO_DATE (10-Jan-1991, DD-MON-YYYY), Bohdan Khmelnytsky); INSERT INTO employee (employee_ID, employee_billing_rate, employee_hire_date, employee_name) VALUES (3,200,TO_DATE(17-Jul-1984,DD-MON-YYYY),Ivan Mazepa); You can generate these INSERT statements, based on existing data, using SQL*Plus and SQL. Then you can easily apply those inserts to another database. This may not seem to be the most efficient way of moving data around, but if you have a low volume of data, such as a few dozen records that you want to send off to a client, it works very well. DDL Files A DDL file contains Data Definition Language statements. It's not much different from a DML file, except that the goal is to modify your database rather than to extract data for another application. Say, for example, that you need to create public synonyms for all your tables. You could use a SQL query to generate the needed CREATE PUBLIC SYNONYM statements, spool those to a file, and then execute that file. You will find a brief example showing how to do this later in this chapter. Chapter 7, Advanced Scripting, explores this subject in greater depth. Limitations of SQL*Plus When using SQL*Plus to extract data, there are some limitations to keep in mind. Because SQL*Plus was designed as a reporting tool and not a data extraction tool, the output must be text. If you need to write a file containing packed-decimal data or binary data, SQL*Plus is not the tool to use. A second SQL*Plus limitation you may encounter when extracting data is the linesize. This maximum output line supported by SQL*Plus varies from one platform to the next, but under Windows NT/95 it is 32,767 characters. The final thing to keep in mind are the datatypes you can extract. SQL*Plus was really designed to work with the traditional scalar datatypes, such as NUMBER, < previous page page_157 next page > < previous page page_158 next page > Page 158 VARCHAR2, etc. It can't be used to extract large object types, such as BLOBs and CLOBs, nor can it handle other large, complex data types, such as VARRAYs and Oracle8 objects. Here is a list of the datatypes you can easily extract using SQL*Plus: VARCHAR2 NVARCHAR2 NUMBER DATE ROWID CHAR NCHAR These types can be converted to character strings, and you can easily concatenate those strings together to produce whatever output format you want, whether that be comma-delimited, tab-delimited, or fixed-width. Text is the key to extracting data with SQL*Plus. Think text. If you have nonalphanumeric data to extract, you should use a different tool. Extracting the Data To write a script to extract data from Oracle and place it in a flat file, follow these steps: 1. Formulate the query. 2. Format the data. 3. Spool the extract to a file. 4. Make the script user-friendly. The last step, making the script user-friendly, isn't really necessary for a one-off effort. However, if it's an extraction you are going to perform often, it's worth taking a bit of time to make it easy and convenient to use. Formulate the Query The very first step in extracting data is to figure out just what data you need to extract. You need to develop a SQL query that will return the data you need. For the example in this chapter, we will just extract the employee data, so the query will look like this: SELECT employee_id, employee_name, employee_hire_date, employee_termination_date, employee_billing_rate FROM employee; < previous page page_158 next page > < previous page page_159 next page > Page 159 Keep in mind that you can write queries that are much more complicated than those shown here. If necessary, you can join several tables together, or you can UNION several queries together. Format the Data The next step, once you have your query worked out, is to format the data to be extracted. The way to do this is to modify your query so it returns a single, long expression that combines the columns together in the format that you want in your output file. It's often necessary to include text literals in the SELECT statement as part of this calculation. For example, if you want to produce a comma-delimited file, you will need to include those commas in your SELECT statement. Be sure to keep in mind the ultimate destination of the data. If your purpose is to pull data for someone to load into a spreadsheet, you will probably want to use a comma-delimited format. If you are passing data to another application, you may find it easier to format the data in fixed-width columns. Dates require some extra thought. With Oracle's built- in TO_CHAR function, you can format a date any way you want. Be sure, though, to use a format easily recognized by the application that needs to read that date. Comma-delimited To produce a comma-delimited text file, you need to do two things. First, you need to add commas between each field. Second, you need to enclose text fields within quotes. The final query looks like this: SELECT TO_CHAR(employee_id) ¦¦ , ¦¦ ¦¦ employee_name ¦¦ , ¦¦TO_CHAR(employee_hire_date,MM/DD/YYYY) ¦¦ , ¦¦ TO_CHAR(employee_termination_date, MM/DD/YYYY) ¦¦ , ¦¦ TO_CHAR(employee_billing_rate) FROM employee; Oracle's TO_CHAR function has been used to explicitly convert numeric fields to text strings. TO_CHAR is also used to convert date fields to text, and a date format string is included to get the dates into MM/DD/YYYY format. SQL's concatenation operator, ¦¦, is used to concatenate all the fields together into one long string, and you can see that commas are included between fields. The output from this query will look like this: 107, Bohdan Khmelnytsky,01/02/1998,,45 108, Pavlo Chubynsky,03/01/1994,11/15/1998,220 110, Ivan Mazepa,04/04/1998,09/30/1998,84 111, Taras Shevchenko,08/23/1976,,100 < previous page page_159 next page > < previous page page_160 next page > Page 160 Notice that in addition to the commas, the employee_name field has been enclosed in quotes. This is done to accommodate the possibility that someone's name will contain a comma. Most commercial programs that load comma- delimited data will allow text strings to be optionally enclosed in quotes. You can use the same technique to generate tab-delimited data. Instead of a comma, use CHR(9) to put a tab character between each field. CHR() is an Oracle SQL function that converts an ASCII code into a character. The ASCII code uses the value 9 to represent a tab character. Fixed-width The easiest way to produce an output file with fixed-width columns is to use the SQL*Plus COLUMN command to format the output from a standard SQL query. The following example shows one way to dump the employee data in a fixed-width column format: COLUMN employee_id FORMAT 09999 HEADING COLUMN employee_name FORMAT A20 HEADING TRUNCATED COLUMN employee_hire_date FORMAT A10 HEADING COLUMN employee_termination_date FORMAT A10 HEADING COLUMN employee_billing_rate FORMAT S099.99 HEADING SELECT employee_id, employee_name, TO_CHAR(employee_hire_date,MM/DD/YYYY) employee_hire_date, TO_CHAR(employee_termination_date, MM/DD/YYYY) employee_termination_date, employee_billing_rate FROM employee; Notice some things about the example: The heading for each column has explicitly been set to a null string. That's because, in the case of a numeric column, SQL*Plus will make the column wide enough to accommodate the heading. You don't want that behavior. You want the format specification to control the column width. Both numeric fields have been formatted to show leading zeros. Most programs that read fixed-width data, such as COBOL programs, will expect this. The employee_billing_rate column will always display a leading sign character, either a + or a The employee_id field, on the other hand, displays a leading sign only for negative values. Positive employee_id values are output with a leading space. Neither field would ever be negative, and these two approaches were taken just for illustrative purposes. Finally, notice that the TRUNCATED option was used to format the employee_name field. That's because the employee_name can be up to 40 characters long in the database. If TRUNCATED were not specified, any names that hap- < previous page page_160 next page > < previous page page_161 next page > Page 161 pened to be longer than 20 characters would wrap to a second linedefinitely not what you want to happen in a flat file. Here's how the output from the example will look: 00107 Bohdan Khmelnytsky 01/02/1998 +045.00 00108 Pavlo Chubynsky 03/01/1994 11/15/1998 +220.00 00110 Ivan Mazepa 04/04/1998 09/30/1998 +084.00 00111 Taras Shevchenko 08/23/1976 +100.00 Each column in the output is separated by one space, because that's the SQL*Plus default. If you like, you can use the SET COLSEP command to change the number of spaces or eliminate them entirely. To run the columns together, you could eliminate the space between columns by setting the column separator to a null string. For example: SET COLSEP Now the output will look like this: 00107Bohdan Khmelnytsky 01/02/1998 +045.00 00108Pavlo Chubynsky 03/01/199411/15/1998+220.00 00110Ivan Mazepa 04/04/199809/30/1998+084.00 00111Taras Shevchenko 08/23/1976 +100.00 You can use any column separation string that you like, and you aren't limited to just one character. The command SET SPACE 0 has the same effect as SET COLSEP it eliminates the space between columns. Oracle considers SET SPACE to be an obsolete command, but some people still use it. Using SQL*Plus to format fixed-width output works best when you have some control over the format expected by the destination of that data. If you are also writing the program to load the data somewhere else, of course you can code it to match what you can easily produce with SQL*Plus. Sometimes though, you need to match an existing format required by the destination; one you cannot change. Depending on your exact requirements, it may be easier to code one large expression in your SQL statement, and use Oracle's built-in functions to gain more control over the output. The following example produces the same output as before, but without a leading sign character in the numeric fields: SELECT LTRIM(TO_CHAR(employee_id,09999)) ¦¦ SUBSTR(RPAD(employee_name,20, ),1,20) < previous page page_161 next page > < previous page page_162 next page > Page 162 ¦¦ TO_CHAR(employee_hire_date, MM/DD/YYYY) ¦¦ NVL(TO_CHAR(employee_termination_date, MM/DD/YYYY), ) ¦¦ LTRIM(TO_CHAR(employee_billing_rate, 099.99)) FROM employee; The output from this example looks like this: 00107Bohdan Khmelnytsky 01/02/1998 045.00 00108Pavlo Chubynsky 03/01/199411/15/1998220.00 00110Ivan Mazepa 04/04/199809/30/1998084.00 00111Taras Shevchenko 08/23/1976 100.00 Look carefully, and you will see that neither the employee_id field nor the employee_billing_rate field has a leading sign. Not only do they not have a leading sign, there isn't even space left for one. The LTRIM function was used to remove it. There is a wide variety of built-in Oracle functions available. Add to that the ability to write your own, and you should be able to generate output in any conceivable format. DML If you are extracting data from Oracle in order to move it to another database, and if the volume of data isn't too high, you can use SQL*Plus to generate a file of INSERT statements. Here is a query to generate INSERT statements for each employee record: SELECT INSERT INTO employee ¦¦ chr(10) ¦¦ (employee_id, employee_name, employee_hire_date ¦¦ chr(10) ¦¦ ,employee_termination_date, employee_billing_rate) ¦¦ chr(10) ¦¦ VALUES ( ¦¦ TO_CHAR(employee_id) ¦¦ , ¦¦ chr(10) ¦¦ ¦¦ employee_name ¦¦ , ¦¦ chr(10) ¦¦ TO_DATE( ¦¦ TO_CHAR(employee_hire_date,MM/DD/YYYY) ¦¦ ,MM/DD/YYYY), ¦¦ chr(10) ¦¦ TO_DATE ( ¦¦ TO_CHAR(employee_termination_date,MM/DD/YYYY) ¦¦ ,MM/DD/YYYY ), ¦¦ chr(10) ¦¦ ¦¦ TO_CHAR(employee_billing_rate) ¦¦ ); FROM employee; As you can see, this type of query can get a bit hairy. You have to deal with nested, quoted strings; you have to concatenate everything together; and you have to place line breaks so the output at least looks decent. The doubled-up quotes you see in the above statement are there because single quotes are required in the final output. So, for example, the string MM/DD/YYYY ) will resolve to MM/DD/YYYY) when the SELECT statement is executed. The SELECT statement just shown will produce the following INSERT statement for each employee record: < previous page page_162 next page > < previous page page_163 next page > Page 163 INSERT INTO employee (employee_id, employee_name, employee_hire_date, ,employee_termination_date, employee_billing_rate) VALUES (111, Taras Shevcheriko, TO_DATE( 08/23/1976, MM/DD/YYYY), TO_DATE(, MM/DD/YYYY), 100); This is not a technique I use very often, because it can be frustrating to get the SQL just right. I use it most often on code tables and other small tables with only two or three columns. I also use it sometimes when I'm sending data to a client. That way I just send one file, and my client doesn't have to mess with SQL*Loader or Oracle's Import utility. DDL Another twist on using SQL to write SQL is to generate commands that help you maintain your database. Such commands are referred to as Data Definition Language, or DDL, commands. Using SQL*Plus to generate DDL scripts can help in automating many database administration tasks, and is often well worth the effort. The following command, for example, generates CREATE PUBLIC SYNONYM commands for each table you own: SELECT CREATE PUBLIC SYNONYM ¦¦ table_name ¦¦ for ¦¦ user ¦¦ , ¦¦ table_name ¦¦ ; FROM USER_TABLES; It's not unusual to need public synonyms, and if you have a large number of tables, you can save yourself a lot of typing by letting SQL*Plus do the work for you. The output from the above command looks like this: CREATE PUBLIC SYNONYM EMPLOYEE for JONATHAN. EMPLOYEE; CREATE PUBLIC SYNONYM PROJECT for JONATHAN. PROJECT; CREATE PUBLIC SYNONYM PROJECT_HOURS for JONATHAN. PROJECT_HOURS; Once you have spooled the above commands to a file, you can then execute that file to create the synonyms. In addition to one-off tasks like creating synonyms, you can use SQL*Plus to generate DDL commands for use by ongoing maintenance tasks. Going beyond that, you can even use SQL*Plus to generate operating-system script files. Spool the Extract to a File Once you have your query worked out and the data formatted the way you need it, it's time to spool your output to a file. In order to get a clean file, there are four things you must do. < previous page page_163 next page > < previous page page_164 next page > Page 164 1. Set the linesize large enough to accommodate the longest possible line. Pay close attention to this if you are generating comma-delimited data. You need to allow for the case where each field is at its maximum size. Use the SET LINESIZE command for this. 2. Turn off all pagination features. You can use SET PAGESIZE 0 for this purpose. It turns off all column headings, page headings, page footers, page breaks, etc. 3. Turn feedback off with the SET FEEDBACK OFF command. 4. Use the SET TRIMSPOOL ON command to eliminate trailing spaces in the output data file. Use this for comma- delimited output and when you generate a file of SQL statements. Do not use this command if you are generating a file with fixed-width columns. The following script generates a clean, comma-delimited file containing employee information: This script extracts data from the employee table and writes it to a text file in a comma-delimited format. Set the linesize to accommodate the longest possible line. SET LINESIZE 136 Turn off all page headings, column headings, etc. SET PAGESIZE 0 Turn off feedback SET FEEDBACK OFF Eliminate trailing blanks at the end of a line. SET TRIMSPOOL ON SET TERMOUT OFF SPOOL C:\A\EMP_DATA.CSV SELECT TO_CHAR(employee_id) ¦¦ , ¦¦ ¦¦ employee_name ¦¦ , ¦¦ TO_CHAR(employee_hire_date, MM/DD/YYYY) ¦¦ , ¦¦ TO_CHAR(employee_termination_date, MM/DD/YYYY) ¦¦ , ¦¦ TO_CHAR (employee_billing_rate) FROM employee; SPOOL OFF Restore the default settings. SET LIMESIZE 80 SET PAGESIZE 24 SET FEEDBACK ON SET TERMOUT ON < previous page page_164 next page > < previous page page_165 next page > Page 165 The SPOOL command is used to send the output to C:\A\EMP_DATA.CSV, and SET TERMOUT OFF is used to disable the display while the data is being written to the file. Here's how you run this script: SQL> @C\EMPLOYEE_EXTRACT SQL> The resulting output file looks like this: 101,Jonathan Gennick,11/15/1961,,169 102,Jenny Gennick,09/16/1964,05/05/1998,135 104,Jeff Gennick,12/29/1987,04/01/1998,99 105,Horace Walker,06/15/1998,,121 107,Bohdan Khmelnytsky,01/02/1998,,45 108,Pavlo Chubynsky,03/01/1994,11/15/1998,220 110,Ivan Mazepa,04/04/1998,09/30/1998,84 111,Taras Shevchenko,08/23/1976,,100 112,Hermon Goche,11/15/1961,04/04/1998,70 113,Jacob Marley,03/03/1998,10/31/1998,300 211,Taras Shevchenko,08/23/1976,,100 You can see that the output contains no page or column headings of any kind. It's just plain, comma-delimited data, with one line for each record in the table. Make Your Extract Script User-Friendly There are at least two things you can do to improve on the extract script just shown in the previous section. First, it might be nice to display a brief message to remind the user of what the script does. This will serve to give the user confidence that he or she has indeed started the correct script. The following PROMPT commands, added at the beginning of the script, should serve the purpose: PROMPT PROMPT This script creates a comma-delimited text file containing PROMPT employee data. All records from the employee table will PROMPT be output to this file. PROMPT It may not be much, but it always makes me feel better to have some indication that the script I am executing really does what I thought it did. For scripts that change data or do something difficult to reverse, I will often include a PAUSE command such as this: PAUSE Press ENTER to continue, or ctrl-C to abort. Another nice thing to do would be to prompt for the output filename. The following ACCEPT command will prompt the user for a filename, and put the user's response into the variable named output_file: ACCEPT output_file CHAR PROMPT Enter the output filename > < previous page page_165 next page > . depth. Limitations of SQL* Plus When using SQL* Plus to extract data, there are some limitations to keep in mind. Because SQL* Plus was designed as a reporting tool and not a data extraction tool, the output. or binary data, SQL* Plus is not the tool to use. A second SQL* Plus limitation you may encounter when extracting data is the linesize. This maximum output line supported by SQL* Plus varies from. from Oracle and place it in a flat file, follow these steps: 1. Formulate the query. 2. Format the data. 3. Spool the extract to a file. 4. Make the script user-friendly. The last step, making the