
Mastering Unix Shell Scripting, Part 2 (PDF)


    then
        # This error would be a programming error
        print "ERROR: $(basename $0) requires one argument"
        return 1
    fi

    # Assign arg1 to the variable STRING
    STRING=$1

    # This is where the string test begins
    case $STRING in
    +([0-9]).+([0-9]).+([0-9]).+([0-9]))
        # Testing for an IP address - valid and invalid
        INVALID=FALSE
        # Separate the integer portions of the "IP" address
        # and test to ensure that nothing is greater than 255
        # or it is an invalid IP address.
        for i in $(echo $STRING | awk -F . '{print $1, $2, $3, $4}')
        do
            if (( i > 255 ))
            then
                INVALID=TRUE
            fi
        done
        case $INVALID in
        TRUE)  print 'INVALID_IP_ADDRESS'
               ;;
        FALSE) print 'VALID_IP_ADDRESS'
               ;;
        esac
        ;;
    +([0-1]))
        # Testing for 0-1 only
        print 'BINARY_OR_POSITIVE_INTEGER'
        ;;
    +([0-7]))
        # Testing for 0-7 only
        print 'OCTAL_OR_POSITIVE_INTEGER'
        ;;
    +([0-9]))
        # Check for an integer
        print 'INTEGER'
        ;;
    +([-0-9]))
        # Check for a negative whole number
        print 'NEGATIVE_WHOLE_NUMBER'
        ;;
    +([0-9]|[.][0-9]))
        # Check for a positive floating point number
        print 'POSITIVE_FLOATING_POINT'
        ;;
    +(+[0-9][.][0-9]))
        # Check for a positive floating point number
        # with a + prefix
        print 'POSITIVE_FLOATING_POINT'
        ;;
    +(-[0-9][.][0-9]))
        # Check for a negative floating point number
        print 'NEGATIVE_FLOATING_POINT'
        ;;
    +([-.0-9]))
        # Check for a negative floating point number
        print 'NEGATIVE_FLOATING_POINT'
        ;;
    +([+.0-9]))
        # Check for a positive floating point number
        print 'POSITIVE_FLOATING_POINT'
        ;;
    +([a-f]))
        # Test for hexadecimal or all lowercase characters
        print 'HEXIDECIMAL_OR_ALL_LOWERCASE'
        ;;
    +([a-f]|[0-9]))
        # Test for hexadecimal or all lowercase characters
        print 'HEXIDECIMAL_OR_ALL_LOWERCASE_ALPHANUMERIC'
        ;;
    +([A-F]))
        # Test for hexadecimal or all uppercase characters
        print 'HEXIDECIMAL_OR_ALL_UPPERCASE'
        ;;
    +([A-F]|[0-9]))
        # Test for hexadecimal or all uppercase characters
        print 'HEXIDECIMAL_OR_ALL_UPPERCASE_ALPHANUMERIC'
        ;;
    +([a-f]|[A-F]))
        # Testing for hexadecimal or mixed-case characters
        print 'HEXIDECIMAL_OR_MIXED_CASE'
        ;;
    +([a-f]|[A-F]|[0-9]))
        # Testing for hexadecimal/alphanumeric strings only
        print 'HEXIDECIMAL_OR_MIXED_CASE_ALPHANUMERIC'
        ;;
    +([a-z]|[A-Z]|[0-9]))
        # Testing for any alphanumeric string only
        print 'ALPHA-NUMERIC'
        ;;
    +([a-z]))
        # Testing for all lowercase characters only
        print 'ALL_LOWERCASE'
        ;;
    +([A-Z]))
        # Testing for all uppercase characters only
        print 'ALL_UPPERCASE'
        ;;
    +([a-z]|[A-Z]))
        # Testing for mixed case alpha strings only
        print 'MIXED_CASE'
        ;;
    *)
        # None of the tests matched the string composition
        print 'INVALID_STRING_COMPOSITION'
        ;;
    esac
    }

    ####################################################
    usage ()
    {
    echo "\nERROR: Please supply one character string or variable\n"
    echo "USAGE: $THIS_SCRIPT {character string or variable}\n"
    }
    ####################################################
    ############# BEGINNING OF MAIN ####################
    ####################################################

    # Query the system for the name of this shell script.
    # This is used for the "usage" function.
    THIS_SCRIPT=$(basename $0)

    # Check for exactly one command-line argument
    if (( $# != 1 ))
    then
        usage
        exit 1
    fi

    # Everything looks okay if we got here. Assign the
    # single command-line argument to the variable "STRING"
    STRING=$1

    # Call the "test_string" function to test the composition
    # of the character string stored in the $STRING variable.
    test_string $STRING

    # End of script

This is a good start, but this shell script does not cover everything. Play around with it and see if you can make some improvements.

Summary

This chapter is just a primer to get you started with a quick review and some little tricks and tips. In the next 24 chapters we are going to write a lot of shell scripts to solve some real-world problems. Sit back and get ready to take on the Unix world! The first thing that we are going to study is the 12 ways to process a file line by line.
I have seen a lot of good and bad techniques for processing a file line by line over the last 10 years, and some have been rather inventive. The next chapter presents the 12 techniques that I have seen the most; at the end of the chapter there is a shell script that times each technique to find the fastest. Read on, and find out which one wins the race. See you in the next chapter!

CHAPTER 2

Twelve Ways to Process a File Line by Line

Have you ever created a really slick shell script to process file data and found that you have to wait until after lunch to get the results? The script may be running so slowly because of how you are processing the file. I have come up with 12 ways to process a file line by line. Some techniques are very fast, and some make you wait for half a day. The techniques used in this chapter are measurable, and I created a shell script that will time each method so that you can see which technique suits your needs.

When processing an ASCII text/data file, we are normally inside a loop of some kind. Then, as we go through the file from the top to the bottom, we process each line of text. A Korn shell script is really not meant to work on text character by character, but you can do it using various techniques. The task for this chapter is to show the line-by-line parsing techniques. We are also going to look at using file descriptors as a processing technique.

Command Syntax

First, as always, we need to go over the command syntax that we are going to use. The commands that we want to concentrate on in this chapter have to deal with while loops. When parsing a file in a while loop, we need a method to read in the entire line to a variable. The most prevalent command is read. The read command is flexible in that you can extract individual strings as well as the entire line.

Speaking of line, the line command is another alternative to grab a full line of text.
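As a quick illustration of the flexibility of read described above, here is a minimal sketch. The sample string and the function names read_whole_line and read_fields are assumptions made up for this example, and printf is used in place of echo only so that the sketch behaves the same in ksh, bash, and sh.

```shell
# Hypothetical sample data; the field layout is an assumption for illustration.
SAMPLE='alpha beta gamma delta'

# Read the entire line into a single variable.
read_whole_line ()
{
printf '%s\n' "$1" | {
    IFS= read -r LINE
    printf '%s\n' "$LINE"
}
}

# Split the same line into the first word, the second word, and the remainder.
read_fields ()
{
printf '%s\n' "$1" | {
    read -r FIRST SECOND REST
    printf '%s|%s|%s\n' "$FIRST" "$SECOND" "$REST"
}
}

read_whole_line "$SAMPLE"
read_fields "$SAMPLE"
```

With one variable name, read stores the whole line; with several names, it assigns one word per variable and leaves whatever remains in the last one.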
Some operating systems do not support the line command. I did not find the line command on Linux or Solaris; however, the line command may have been added in subsequent OS releases.

In addition to the read and line commands, we need to look at the different ways you can use the while loop, which is the major cause of fast or slow execution times. A while loop can be used as a standalone loop in a predefined configuration; it can be used in a command pipe or with file descriptors. Each method has its own set of rules. The use of the while loop is critical to get the quickest execution times. I have seen many renditions of the proper use of a while loop, and some techniques I have seen are unique.

Using File Descriptors

Under the covers of the Unix operating system, files are referenced, copied, and moved by unique numbers known as file descriptors. You already know about three of these file descriptors:

    0 - stdin
    1 - stdout
    2 - stderr

We have redirected output using the stdout (standard output) and stderr (standard error) in other scripts in this book. This is the first time we are going to use the stdin (standard input) file descriptor. For a short definition of each of these we can talk about the devices on the computer. Standard input usually comes into the computer from the keyboard or mouse. Standard output usually goes to the screen or to a file. Standard error is where error messages are routed by commands, programs, and scripts. We have used stderr before to send the error messages to the bit bucket, or /dev/null, and also more commonly to combine the stdout and stderr outputs together. You should remember a command like the following one:

    some_command 2>&1

The previous command sends all of the error messages to the same output device that standard output goes to, which is normally the terminal. We can also use other file descriptors. Valid descriptor values range from 0 to 19 on most operating systems.
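The descriptor numbers above can be put to work with exec. The following is a minimal sketch, assuming descriptor 3 is free to use; the function name read_with_fd and the temporary sample file are made up for this illustration and are not one of the book's listings.

```shell
# Sketch: feed a while loop through file descriptors instead of a pipe.
# read_with_fd is a hypothetical name; descriptor 3 is assumed to be unused.
read_with_fd ()
{
FILENAME=$1
exec 3<&0             # Save the current stdin as file descriptor 3
exec 0< "$FILENAME"   # Redirect stdin so that it comes from the file
while IFS= read -r LINE
do
    printf '%s\n' "$LINE"
done
exec 0<&3             # Restore stdin from file descriptor 3
exec 3<&-             # Close file descriptor 3
}

# Small demonstration with a throwaway three-line file.
DEMO_FILE=$(mktemp)
printf '%s\n' 'first line' 'second line' 'third line' > "$DEMO_FILE"
read_with_fd "$DEMO_FILE"
rm -f "$DEMO_FILE"
```

Saving stdin before redirecting it, and restoring it afterward, keeps the rest of the script able to read from the keyboard once the loop is finished.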
You have to do a lot of testing when you use the upper values to ensure that they are not reserved by the system for some reason. We will see more on using file descriptors in some of the following code listings.

Creating a Large File to Use in the Timing Test

Before I get into each method of parsing the file, I want to show you a little script you can use to create a file that has the exact number of lines that you want to process. The number of characters to create on each line can be changed by modifying the LINE_LENGTH variable in the shell script, but the default value is 80. This script also uses a while loop, but this time to build a file. To create a file that has 7,500 lines, you add the number of lines as a parameter to the shell script name. Using the shell script in Listing 2.1, you create a 7,500-line file with the following syntax:

    # mk_large_file.ksh 7500

The full shell script is shown in Listing 2.1.

    #!/bin/ksh
    #
    # SCRIPT: mk_large_file.ksh
    # AUTHOR: Randy Michael
    # DATE: 03/15/2002
    # REV: 1.2.P
    #
    # PURPOSE: This script is used to create a text file that
    #          has a specified number of lines that is specified
    #          on the command line.
    #
    # set -n # Uncomment to check syntax without any execution
    # set -x # Uncomment to debug this shell script
    #
    ################################################
    # Define functions here
    ################################################

    function usage
    {
    echo "\n USAGE ERROR \n"
    echo "\nUSAGE: $SCRIPT_NAME <number_of_lines_to_create>\n"
    }

    ################################################
    # Check for the correct number of parameters
    ################################################

    SCRIPT_NAME=$(basename $0) # Extract the name of the script
                               # (set before usage can be called)

    if (( $# != 1 )) # Looking for exactly one parameter
    then
        usage   # Usage error was made
        exit 1  # Exit on a usage error
    fi

    ################################################
    # Define files and variables here
    ################################################

    LINE_LENGTH=80              # Number of characters per line
    OUT_FILE=/scripts/bigfile   # New file to create
    >$OUT_FILE                  # Initialize to a zero-sized file
    TOTAL_LINES=$1              # Total number of lines to create
    LINE_COUNT=0                # Line counter
    CHAR=X                      # Character to write to the file

    ################################################
    # BEGINNING of MAIN
    ################################################

    while ((LINE_COUNT < TOTAL_LINES)) # Specified by $1
    do
        CHAR_COUNT=0 # Initialize the CHAR_COUNT to zero on every new line
        while ((CHAR_COUNT < LINE_LENGTH)) # Each line is fixed length
        do
            echo "${CHAR}\c" >> $OUT_FILE   # Echo a single character
            ((CHAR_COUNT = CHAR_COUNT + 1)) # Increment the character counter
        done
        ((LINE_COUNT = LINE_COUNT + 1)) # Increment the line counter
        echo >> $OUT_FILE               # Give a newline character
    done

Listing 2.1 mk_large_file.ksh shell script listing.

Each line produced by the mk_large_file.ksh script is the same length. The user specifies the total number of lines to create as a parameter to the shell script.
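Because every line is identical, the same file can also be produced by building one line and then writing it repeatedly. The following is only a sketch of that variation, not the author's script: the function name make_lines and its two arguments are hypothetical, and the function writes to standard output so that the caller chooses the file.

```shell
# Sketch: build one fixed-length line, then print it TOTAL_LINES times.
# make_lines and its argument order are assumptions for illustration.
make_lines ()
{
TOTAL_LINES=$1   # Number of lines to create
LINE_LENGTH=$2   # Number of characters per line
CHAR=X           # Character to write
LINE=""
CHAR_COUNT=0
while [ "$CHAR_COUNT" -lt "$LINE_LENGTH" ]   # Build the line once
do
    LINE="${LINE}${CHAR}"
    CHAR_COUNT=$((CHAR_COUNT + 1))
done
LINE_COUNT=0
while [ "$LINE_COUNT" -lt "$TOTAL_LINES" ]   # Write it once per line
do
    printf '%s\n' "$LINE"
    LINE_COUNT=$((LINE_COUNT + 1))
done
}

# Show three lines of ten characters each.
make_lines 3 10
```

To build the timing file this way, the output could be redirected, for example make_lines 7500 80 > /scripts/bigfile. Each line is 80 characters plus a newline, so 7,500 lines come to 607,500 bytes.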
Twelve Methods to Parse a File Line by Line

The following paragraphs describe 12 of the parsing techniques I have commonly seen over the years. I have put them all together in one shell script separated as functions. After the functions are defined, I execute each method, or function, while timing the execution using the time command. To get accurate timing results I use a file that has 7,500 lines, where each line is the same length (we built this file using the mk_large_file.ksh shell script). A 7,500-line file is an extremely large file to be parsing line by line in a shell script, about 600 KB (7,500 lines of 80 characters plus a newline), but my Linux machine is so fast that I needed a large file to get the timing data greater than zero!

Now it is time to look at the 12 methods to parse a file line by line. Each method uses a while statement to create a loop. The only two commands within the loop are echo "$LINE", to output each line as it is read, and a no-op, specified by the : (colon) character. The thing that makes each method different is how the while loop is used.

Method 1: cat $FILENAME | while read LINE

Let's start with the most common method that I see, which is catting a file and piping the file output to a while read loop. On each loop iteration a single line of text is read into a variable named LINE. This continuous loop will run until all of the lines in the file have been processed one at a time.

The pipe is the key to the popularity of this method. It is intuitively obvious that the output from the previous command in the pipe is used as input to the next command in the pipe. As an example, if I execute the df command to list filesystem statistics and it scrolls across the screen out of view, I can use a pipe to send the output to the more command, as in the following command:

    df | more

When the df command is executed, the pipe stores the output in a temporary system file.
Then this temporary system file is used as input to the more command, allowing me to view the df command output one page/line at a time. Our use of piping output to a while loop works the same way; the output of the cat command is used as input to the while loop and is read into the LINE variable on each loop iteration. Look at the complete function in Listing 2.2.

    function while_read_LINE
    {
    cat $FILENAME | while read LINE
    do
        echo "$LINE"
        :
    done
    }

Listing 2.2 while_read_LINE function listing.

Each of these test loops is created as a function so that we can time each method using the shell script. You could also use the C-style () function definition if you wanted, as shown in Listing 2.3.

    while_read_LINE ()
    {
    cat $FILENAME | while read LINE
    do
        echo "$LINE"
        :
    done
    }

Listing 2.3 Using the () declaration method function listing.

[...]

...

    echo "\nMethod 11:"
    echo "\nfunction while_LINE_line_cmdsub2_FD\n" >> $TIMEFILE
    echo "function while_LINE_line_cmdsub2_FD"
    time while_LINE_line_cmdsub2_FD >> $TIMEFILE

    echo "\nMethod 12:"
    echo "\nfunction while_line_LINE_FD\n" >> $TIMEFILE
    echo "function while_line_LINE_FD"
    time while_line_LINE_FD >> $TIMEFILE

Listing 2.15 12_ways_to_parse.ksh shell script listing. (continued)

The shell script in Listing 2.15 first defines all of the functions that ...

    ... while_LINE_line_cmdsub2

    real    7m18.04s
    user    0m52.01s
    sys     6m10.94s

    Method 8:
    function while_LINE_line_bottom_cmdsub2

    real    7m20.34s
    user    0m50.82s
    sys     6m14.26s

    Method 9:
    function while_read_LINE_FD

    real    0m5.89s
    user    0m5.53s
    sys     0m0.28s

    Method 10:
    function while_LINE_line_FD

    real    8m25.35s
    user    0m50.68s
    sys     7m15.33s

...

... 7 minutes to over 8 minutes and 25.35 seconds. The sorted timing output for the real time is shown in Listing 2.18.

    real    0m5.89s     Method 2
    real    0m5.89s     Method 9
    real    1m30.34s    Method 1
    real    6m50.79s    Method ...
    real    6m53.71s    Method ...
    real    7m16.87s    Method ...
    real    7m18.04s    Method ...
    real    7m20.34s    Method ...
    real    7m20.48s    Method ...
    real    7m54.57s    Method ...
    real    8m24.58s    Method ...
    real    8m25.35s    Method ...

Listing 2.18 Sorted timing data by method.

... others

    #
    # REV LIST:
    #
    # 02/19/2002 - Randy Michael
    # Set each of the while loops up as functions and the timing
    # of each function to see which one is the fastest
    #
    #######################################################################
    #
    # NOTE: To output the timing to a file use the following syntax:
    #
    # 12_ways_to_parse.ksh ...

Listing 2.15 12_ways_to_parse.ksh shell script listing. (continues)

... Listing 2.21 shows the second-place loop method.

    function while_read_LINE
    {
    cat $FILENAME | while read LINE
    do
        echo "$LINE"
        :
    done
    }

Listing 2.21 Method 1: Made second place in timing tests.

The method in Listing 2.21 is the most popular way to process a file line by line. I see this technique in almost every shell script that does file parsing. Method 1 is 1,433 percent slower than either Method 2 or 9 ...

    ...     7m15.33s

    Method 11:
    function while_LINE_line_cmdsub2_FD

    real    8m24.58s
    user    0m50.04s
    sys     7m16.07s

    Method 12:
    function while_line_LINE_FD

    real    7m54.57s
    user    0m35.88s
    sys     7m2.26s

Listing 2.17 Timing data for each loop method. (continued)

As you can see, all file processing loops are not created equal. Two of the methods are tied for first place. Methods 2 and 9 produce the exact same real execution time ...

    ... while_LINE_line_bottom >> $TIMEFILE

    echo "\nMethod 7:"
    echo "\nfunction while_LINE_line_cmdsub2\n" >> $TIMEFILE
    echo "function while_LINE_line_cmdsub2"
    time while_LINE_line_cmdsub2 >> $TIMEFILE

    echo "\nMethod 8:"
    echo "\nfunction while_LINE_line_bottom_cmdsub2\n" >> $TIMEFILE
    echo "function while_LINE_line_bottom_cmdsub2"
    time while_LINE_line_bottom_cmdsub2 >> $TIMEFILE

    echo "\nMethod 9:"
    echo "\nfunction while_read_LINE_FD\n" >> $TIMEFILE
    echo "function ...
... large file to parse through to get accurate timing results. When we do our timing tests we may see a difference between the two command substitution techniques. Study the function in Listing 2.9, and we will cover the function at the end.

    function while_LINE_line_cmdsub2
    {
    cat $FILENAME | while LINE=$(line)
    do
        echo "$LINE"
        :
    done
    }

Listing 2.9 while_LINE_line_cmdsub2 function listing.

The ...

    Method 2:
    function while_read_LINE_bottom

    real    0m5.89s
    user    0m5.62s
    sys     0m0.16s

    Method 3:
    function while_line_LINE_bottom

    real    6m53.71s
    user    0m36.62s
    sys     6m2.03s

    Method 4:
    function cat_while_LINE_line

    real    7m16.87s
    user    0m51.87s
    sys     6m8.54s

    Method 5:
    function while_line_LINE

    real    6m50.79s
    user    0m36.65s
    sys     5m59.66s

    Method 6:
    function while_LINE_line_bottom

    real    7m20.48s
    user    0m51.01s
    ...

... our while loop. We use the same technique described in Method 9. Study the function in Listing 2.12.

    function while_LINE_line_FD
    {
    exec 3<&0
    exec 0< $FILENAME
    while LINE=`line`
    do
        echo ...

Listing 2.12 while_LINE_line_FD function listing.

... strategy replaces the read command from Listings 2.2 and 2.4 with the line command in a slightly different command structure. Look at the function in Listing 2.6, and we will see how it works at the ...
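The timing data above puts Method 2, while_read_LINE_bottom, far ahead of the piped loop from Listing 2.2, but the body of that function is not part of this excerpt. The sketch below assumes that "bottom" means feeding the loop by redirecting the file after done; the function names while_read_pipe and while_read_bottom and the small sample file are made up for this comparison.

```shell
# Method 1 as in Listing 2.2: cat the file and pipe it into the loop.
while_read_pipe ()
{
cat "$1" | while IFS= read -r LINE
do
    printf '%s\n' "$LINE"
done
}

# Assumed form of a "bottom" redirect: the file feeds the loop after done,
# so no cat process and no pipe are needed.
while_read_bottom ()
{
while IFS= read -r LINE
do
    printf '%s\n' "$LINE"
done < "$1"
}

# Both loops should print the same three lines.
SAMPLE_FILE=$(mktemp)
printf 'line %s\n' 1 2 3 > "$SAMPLE_FILE"
while_read_pipe "$SAMPLE_FILE"
while_read_bottom "$SAMPLE_FILE"
rm -f "$SAMPLE_FILE"
```

In ksh or bash, each call can be prefixed with time to compare the two forms on a larger file, in the same spirit as the timing script described above.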

Posted: 09/08/2014, 16:20
