Unix Shell Programming, Third Edition (part 3)


    5 27
    $ sort -n data                    Sort arithmetically
    -5 11
    2 12
    3 33
    5 27
    14 -9
    15 6
    23 2
    $

Skipping Fields

If you had to sort your data file by the y value (that is, the second number in each line), you could tell sort to skip past the first number on the line by using the option +1n instead of -n. The +1 says to skip the first field. Similarly, +5n would mean to skip the first five fields on each line and then sort the data numerically. Fields are delimited by space or tab characters by default. If a different delimiter is to be used, the -t option must be used. (The +N notation is the historical form; current POSIX versions of sort write the same key as -k 2n.)

    $ sort +1n data                   Skip the first field in the sort
    14 -9
    23 2
    15 6
    -5 11
    2 12
    5 27
    3 33
    $

The -t Option

As mentioned, if you skip over fields, sort assumes that the fields being skipped are delimited by space or tab characters. The -t option says otherwise. In this case, the character that follows the -t is taken as the delimiter character.

Look at our sample password file again:

    $ cat /etc/passwd
    root:*:0:0:The Super User:/:/usr/bin/ksh
    steve:*:203:100::/users/steve:/usr/bin/ksh
    bin:*:3:3:The owner of system files:/:
    cron:*:1:1:Cron Daemon for periodic tasks:/:
    george:*:75:75::/users/george:/usr/lib/rsh
    pat:*:300:300::/users/pat:/usr/bin/ksh
    uucp:*:5:5::/usr/spool/uucppublic:/usr/lib/uucp/uucico
    asg:*:6:6:The Owner of Assignable Devices:/:
    sysinfo:*:10:10:Access to System Information:/:/usr/bin/sh
    mail:*:301:301::/usr/mail:
    $

If you wanted to sort this file by username (the first field on each line), you could just issue the command

    sort /etc/passwd

To sort the file instead by the third colon-delimited field (which contains what is known as your user id), you would want an arithmetic sort, skipping the first two fields (+2n), specifying the colon character as the field delimiter (-t:):

    $ sort +2n -t: /etc/passwd        Sort by user id
    root:*:0:0:The Super User:/:/usr/bin/ksh
    cron:*:1:1:Cron Daemon for periodic tasks:/:
    bin:*:3:3:The owner of system files:/:
    uucp:*:5:5::/usr/spool/uucppublic:/usr/lib/uucp/uucico
    asg:*:6:6:The Owner of Assignable Devices:/:
    sysinfo:*:10:10:Access to System Information:/:/usr/bin/sh
    george:*:75:75::/users/george:/usr/lib/rsh
    steve:*:203:100::/users/steve:/usr/bin/ksh
    pat:*:300:300::/users/pat:/usr/bin/ksh
    mail:*:301:301::/usr/mail:
    $

Reading down the third field of each line, you can verify that the file was sorted correctly by user id.

Other Options

Other options to sort enable you to skip characters within a field, specify the field to end the sort on, merge sorted input files, and sort in "dictionary order" (only letters, numbers, and spaces are used for the comparison). For more details on these options, look under sort in your Unix User's Manual.

uniq

The uniq command is useful when you need to find duplicate lines in a file. The basic format of the command is

    uniq in_file out_file

In this format, uniq copies in_file to out_file, removing any duplicate lines in the process. uniq defines duplicate lines as consecutive lines that match exactly.

If out_file is not specified, the results will be written to standard output. If in_file is also not specified, uniq acts as a filter and reads its input from standard input.

Here are some examples to see how uniq works. Suppose that you have a file called names with contents as shown:

    $ cat names
    Charlie
    Tony
    Emanuel
    Lucy
    Ralph
    Fred
    Tony
    $

You can see that the name Tony appears twice in the file. You can use uniq to "remove" such duplicate entries:

    $ uniq names                      Print unique lines
    Charlie
    Tony
    Emanuel
    Lucy
    Ralph
    Fred
    Tony
    $

Tony still appears twice in the preceding output because the multiple occurrences are not consecutive in the file, and thus uniq's definition of duplicate is not satisfied. To remedy this situation, sort is often used to get the duplicate lines adjacent to each other.
The result of the sort is then run through uniq:

    $ sort names | uniq
    Charlie
    Emanuel
    Fred
    Lucy
    Ralph
    Tony
    $

So the sort moves the two Tony lines together, and then uniq filters out the duplicate line (recall that sort with the -u option performs precisely this function).

The -d Option

Frequently, you'll be interested in finding the duplicate entries in a file. The -d option to uniq should be used for such purposes: It tells uniq to write only the duplicated lines to out_file (or standard output). Such lines are written just once, no matter how many consecutive occurrences there are.

    $ sort names | uniq -d            List duplicate lines
    Tony
    $

As a more practical example, let's return to our /etc/passwd file. This file contains information about each user on the system. It's conceivable that, over the course of adding and removing users from this file, the same username has been inadvertently entered more than once. You can easily find such duplicate entries by first sorting /etc/passwd and piping the results into uniq -d as done previously:

    $ sort /etc/passwd | uniq -d      Find duplicate entries in /etc/passwd
    $

So there are no duplicate lines. What you really want, though, is to find duplicate entries for the same username. This means that you want to look at just the first field from each line of /etc/passwd (recall that the leading characters of each line of /etc/passwd up to the colon are the username). This can't be done directly through an option to uniq, but can be accomplished indirectly by using cut to extract the username from each line of the password file before sending it to uniq.

    $ sort /etc/passwd | cut -f1 -d: | uniq -d      Find duplicates
    cem
    harry
    $

So there are multiple entries in /etc/passwd for cem and harry.
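The duplicate-username pipeline can be tried safely without touching the real password file. The following is a minimal sketch, assuming a POSIX shell and the widely available (but non-POSIX) mktemp command; the sample entries and the variable names pw, dup_names, and dup_uids are made up for illustration. It extracts the field first and sorts afterward, which gives the same result as sorting whole lines, and applies the same idea to the user id field.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /etc/passwd.
pw=$(mktemp)
cat > "$pw" <<'EOF'
root:*:0:0:The Super User:/:/usr/bin/ksh
cem:*:91:91::/users/cem:
harry:*:103:103:Harry Johnson:/users/harry:
steve:*:203:100::/users/steve:/usr/bin/ksh
harry:*:90:90:Harry Johnson:/users/harry:
cem:*:91:91::/users/cem:
EOF

# Field 1 is the username: extract it, sort so repeats become
# adjacent, then let uniq -d print each repeated name once.
dup_names=$(cut -d: -f1 "$pw" | sort | uniq -d)
echo "$dup_names"

# Same technique on field 3 (the user id), sorted numerically.
dup_uids=$(cut -d: -f3 "$pw" | sort -n | uniq -d)
echo "$dup_uids"

rm -f "$pw"
```

With this sample, harry appears twice by name but with two different user ids, so only 91 is reported as a repeated id.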
If you wanted more information on the particular entries, you could grep them from /etc/passwd:

    $ grep -n 'cem' /etc/passwd
    20:cem:*:91:91::/users/cem:
    166:cem:*:91:91::/users/cem:
    $ grep -n 'harry' /etc/passwd
    29:harry:*:103:103:Harry Johnson:/users/harry:
    79:harry:*:90:90:Harry Johnson:/users/harry:
    $

The -n option was used to find out where the duplicate entries occur. In the case of cem, there are two entries on lines 20 and 166; in harry's case, the two entries are on lines 29 and 79.

If you now want to remove the second cem entry, you could use sed:

    $ sed '166d' /etc/passwd > /tmp/passwd      Remove duplicate
    $ mv /tmp/passwd /etc/passwd
    mv: /etc/passwd: 444 mode y
    mv: cannot unlink /etc/passwd
    $

Naturally, /etc/passwd is one of the most important files on a Unix system. As such, only the superuser is allowed to write to the file. That's why the mv command failed.

Other Options

The -c option to uniq behaves like uniq with no options (that is, duplicate lines are removed), except that each output line gets preceded by a count of the number of times the line occurred in the input.

    $ sort names | uniq -c            Count line occurrences
       1 Charlie
       1 Emanuel
       1 Fred
       1 Lucy
       1 Ralph
       2 Tony
    $

Two other options that won't be described enable you to tell uniq to ignore leading characters/fields on a line. For more information, consult your Unix User's Manual.

We would be remiss if we neglected to mention the programs awk and perl that can be useful when writing shell programs. However, to do justice to these programs requires more space than we can provide in this text. We'll refer you to the document Awk: A Pattern Scanning and Processing Language, by Aho, et al., in the Unix Programmer's Manual, Volume II for a description of awk. Kernighan and Pike's The Unix Programming Environment (Prentice Hall, 1984) contains a detailed discussion of awk. Learning Perl and Programming Perl, both from O'Reilly and Associates, present a good tutorial and reference on the language, respectively.
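The sort examples in this chapter skip fields with the +N notation, which many current sort implementations no longer accept. The following is a minimal sketch of the same two sorts written with the POSIX -k option, assuming a POSIX shell plus mktemp; the temporary files and the variable names data, pw, by_y, and by_uid are illustrative only. Where the text uses +1n and +2n, the literal translations are -k 2n and -k 3n; the sketch adds an end position (-k 2,2n and -k 3,3n) so that the key stops at the end of that field.

```shell
#!/bin/sh
# Sample x/y data, one pair per line (same values as the data file).
data=$(mktemp)
printf '%s\n' '5 27' '2 12' '3 33' '23 2' '-5 11' '15 6' '14 -9' > "$data"

# Sort numerically on the second field only (was: sort +1n data).
by_y=$(sort -k 2,2n "$data")
echo "$by_y"

# A shortened password file for illustration.
pw=$(mktemp)
cat > "$pw" <<'EOF'
root:*:0:0:The Super User:/:/usr/bin/ksh
steve:*:203:100::/users/steve:/usr/bin/ksh
bin:*:3:3:The owner of system files:/:
cron:*:1:1:Cron Daemon for periodic tasks:/:
george:*:75:75::/users/george:/usr/lib/rsh
EOF

# Sort on the third colon-delimited field (was: sort +2n -t:),
# then keep only the usernames to make the order easy to read.
by_uid=$(sort -t: -k 3,3n "$pw" | cut -d: -f1)
echo "$by_uid"

rm -f "$data" "$pw"
```

The first sort prints the pairs in the same order as the +1n listing, and the second lists the usernames in increasing order of user id.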
Exercises

1: What will be matched by the following regular expressions?

    x*                   [0-9]\{3\}
    xx*                  [0-9]\{3,5\}
    x\{1,5\}             [0-9]\{1,3\},[0-9]\{3\}
    x\{5,\}              ^\
    x\{10\}              [A-Za-z_][A-Za-z_0-9]*
    [0-9]                \([A-Za-z0-9]\{1,\}\)\1
    [0-9]*               ^Begin$
    [0-9][0-9][0-9]      ^\(.\).*\1$

2: What will be the effect of the following commands?

    who | grep 'mary'
    who | grep '^mary'
    grep '[Uu]nix' ch?/*
    ls -l | sort +4n
    sed '/^$/d' text > text.out
    sed 's/\([Uu]nix\)/\1(TM)/g' text > text.out
    date | cut -c12-16
    date | cut -c5-11,25- | sed 's/\([0-9]\{1,2\}\)/\1,/'

3: Write the commands to

a. Find all logged-in users with usernames of at least four characters.
b. Find all users on your system whose user ids are greater than 99.
c. Find the number of users on your system whose user ids are greater than 99.
d. List all the files in your directory in decreasing order of file size.

[...]

... scanned the line, substituting * as the value of x.

2. The shell rescanned the line, encountered the *, and then substituted the names of all files in the current directory.

3. The shell initiated execution of echo, passing it the file list as arguments (see Figure 5.3).

Figure 5.3. echo $x

This order of evaluation is important. Remember, first the shell does variable substitution, then does filename substitution, ...

... in terms of program readability. They're simply ignored by the shell.

Variables

Like virtually all programming languages, the shell allows you to store values into variables. A shell variable begins with an alphabetic or underscore (_) character and is followed by zero or more alphanumeric or underscore characters. To store a value inside a shell variable, you simply write the name of the variable, followed ...
... language, you can't put those spaces in. Second, unlike most other programming languages, the shell has no concept whatsoever of data types. Whenever you assign a value to a shell variable, no matter what it is, the shell simply interprets that value as a string of characters. So when you assigned 1 to the variable count previously, the shell simply stored the character 1 inside the variable count, making ...

... We Go

IN THIS CHAPTER

Command Files
Variables
Built-in Integer Arithmetic
Exercises

Based on the discussions in Chapter 3, "What Is the Shell?," you should realize that whenever you type something like

    who | wc -l

you are actually programming in the shell! That's because the shell is interpreting the command line, recognizing the pipe symbol, connecting the output of the first command to the input ...

... Built-in Integer Arithmetic

The POSIX standard shell provides a mechanism for performing integer arithmetic on shell variables called arithmetic expansion. Note that some older shells do not support this feature. The format for arithmetic expansion is

    $((expression))

where expression is an arithmetic expression using shell variables and operators. Valid shell variables are those that contain numeric ...

... The $ character is a special character to the shell. If a valid variable name follows the $, the shell takes this as an indication that the value stored inside that variable is to be substituted at that point. So, when you type

    echo $count

the shell replaces $count with the value stored there; then it executes the echo command:

    $ echo $count
    1
    $

Remember, the shell performs variable substitution before ...

... the shell do the substitution when echo $x was executed?
The answer is that the shell does not perform filename substitution when assigning values to variables. Therefore,

    x=*

assigns the single character * to x. This means that the shell did the filename substitution when executing the echo command. In fact, the precise sequence of steps that occurred when echo $x was executed is as follows:

1. The shell ...

... the value 1 to the shell variable count, you simply write

    count=1

and to assign the value /users/steve/bin to the shell variable my_bin, you simply write

    my_bin=/users/steve/bin

A few important points here. First, spaces are not permitted on either side of the equals sign. Keep that in mind, especially if you're in the good programming habit of inserting spaces around operators. In the shell language, you ...

... variable. If you're used to programming in a language such as C or Pascal, where all variables must be declared, you're in for another readjustment. Because the shell has no concept of data types, variables are not declared before they're used; they're simply assigned values when you want to use them. As you'll see later in this chapter, the shell does support integer operations on shell variables that contain ...

... allowed). Valid operators are taken from the C programming language and are listed in Appendix A, "Shell Summary." The result of computing expression is substituted on the command line. For example,

    echo $((i+1))

adds one to the value in the shell variable i and prints the result. Notice that the variable i doesn't have to be preceded by a dollar sign. That's because the shell knows that the only valid items ...
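The points made in these fragments (assignment without spaces, values stored as strings, arithmetic expansion, and filename substitution happening at echo time rather than at assignment time) can be checked in one short script. This is a minimal sketch, assuming a POSIX shell and mktemp -d; the scratch directory, the three file names, and the variables next, quoted, and expanded are invented for the demonstration.

```shell
#!/bin/sh
# Assignment: no spaces are allowed around the equals sign.
count=1
my_bin=/users/steve/bin
echo "$count" "$my_bin"

# The value is stored as a string, but arithmetic expansion
# evaluates it as an integer. No $ is needed before i inside $(( )).
i=5
next=$((i + 1))
echo "$next"

# Work in an empty scratch directory with three known files.
work=$(mktemp -d)
cd "$work" || exit 1
touch addresses intro names

# No filename substitution happens on assignment: x holds a literal *.
x=*
quoted="$x"

# Unquoted $x: variable substitution first, then the shell rescans
# the line and replaces * with the file names before running echo.
expanded=$(echo $x)

echo "$quoted"
echo "$expanded"

cd / && rm -rf "$work"
```

Quoting "$x" suppresses the rescan for file names, which is why the quoted copy still holds the single character * while the unquoted echo sees the three file names.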

Date posted: 13/08/2014, 15:21
