CHAPTER 9 LIST VARIABLES

As shown in the table’s last row, the Shell uses the special @ index to retrieve all values, preserve any whitespace within them, and separate them from each other by a space. As usual, double quotes are also required if further processing of the extracted values isn’t desired.

With Perl, on the other hand, all values are retrieved by using the array name without an index. The only effect of the double quotes is to separate the values on output with the contents of the ‘$"’ variable; they’re not needed to suppress further processing of the extracted values, because that doesn’t happen anyway. [2]

Next, we’ll look at different ways to initialize arrays.

Table 9.2 Syntax for using arrays in the Shell and Perl

  Assigning a value
    Shell:   n[0]=13
    Perl:    $n[0]=13;   [a]
    Remarks: In Perl, the $ symbol is always used with the variable name when referring to a scalar value. With the Shell, it’s only used when retrieving a value.

  Retrieving and displaying a value
    Shell:   echo ${n[0]}
    Perl:    print $n[0];
    Remarks: The Shell requires the array name and index to be enclosed in curly braces.

  Deleting a value
    Shell:   unset n[0]
    Perl:    delete $n[0];
    Remarks: The Shell deletes the designated element, but Perl maintains the element’s slot after marking its value as undefined.

  Assigning multiple values
    Shell:   n=(13 42)
    Perl:    @n=(13, 42);
             @n=qw/13 42/;
             @n=qw!\ | /!;
    Remarks: The Shell recognizes whitespace as separators in the parenthesized list of initializers. By default, Perl requires a comma, and allows additional whitespace. With the qwX syntax, only whitespace separators are recognized between paired occurrences of the X delimiter. [b]

  Retrieving and displaying all values
    Shell:   echo "${n[@]}"
    Perl:    print "@n";
    Remarks: See text for explanation.

  a. The examples using print assume the use of Perl’s -l invocation option.
  b. Examples of the qwX quoting syntax are shown in chapter 12.

[2] See http://TeachMePerl.com/DQs_in_shell_vs_perl.html for details on the comparative use of double quotes in the two languages.
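The Perl column of the table can be tried out in a few lines. What follows is only a sketch with made-up values (the array contents and variable names are assumptions, not taken from the book's scripts). It shows that delete on a middle element leaves the slot in place with an undefined value, and that ‘$"’ controls the separator used when an array is interpolated in double quotes. Two caveats: deleting the last element of an array shrinks the array, and current Perl documentation discourages calling delete on array elements at all.

```perl
use strict;
use warnings;

# Hypothetical data, chosen only for illustration.
my @n = (13, 42, 99);

delete $n[1];                       # middle slot stays; its value becomes undefined
my $count         = scalar @n;      # number of slots, still 3
my $middle_is_set = defined $n[1] ? 'yes' : 'no';

my @stooges = qw/Moe Larry Curly/;  # qw splits on whitespace only
my $default = "@stooges";           # $" defaults to a single space
my $custom;
{
    local $" = ', ';                # change the separator for this block only
    $custom = "@stooges";
}

print "$count slots; middle defined? $middle_is_set\n";
print "$default\n$custom\n";
```

Using local keeps the change to ‘$"’ from leaking into the rest of the program, which matters more in a longer script than in a one-liner.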
USING ARRAY VARIABLES

9.1.1 Initializing arrays with piecemeal assignments and push

As shown in the top row of table 9.2, you can initialize arrays in piecemeal fashion:

    $stooges[2]='Curly'; $stooges[0]='Moe'; $stooges[1]='Larry';

Alternatively, you can use explicit lists on both sides of the assignment operator:

    ($stooges[2], $stooges[0], $stooges[1])=('Curly', 'Moe', 'Larry');

When it’s acceptable to add new elements to the end of an array, you can avoid managing an array index by using push @arrayname, 'new value'. This technique is used in the shell_types script, which categorizes Unix accounts into those having human-usable shells (such as /usr/bin/ksh) or “inhuman” shells (such as /sbin/shutdown):

    $ shell_types | fmt -68   # format to fit on screen
    THESE ACCOUNTS USE HUMAN SHELLS:
    root, bin, daemon, lp, games, wwwrun, named, nobody, ftp, man, news,
    uucp, at, tim, yeshe, info, contix, linux, spug, mailman, snort,
    stu01
    THESE ACCOUNTS USE INHUMAN SHELLS:
    mail, sshd, postfix, ntp, vscan

Because the listing of “human” account names produces a very long line, the Unix fmt command is used to reformat the text to fit within the width of the screen.

The script’s shebang line (see listing 9.1) arranges for input lines to be automatically split into fields on the basis of individual colons, because that’s the field separator used in the /etc/passwd file, which associates shells with user accounts. The matching operator on Line 8 checks the last field of each line for the pattern characteristic of “human” shells and stores the associated account names in @human using push. Alternatively, Line 12 arranges for the names of the accounts that fail the test to be stored in @inhuman.

Listing 9.1 The shell_types script

     1  #! /usr/bin/perl -wnlaF':'
     2
     3  BEGIN {
     4      @ARGV=( '/etc/passwd' );   # Specify input file
     5  }
     6
     7  # Separate users of "human" oriented shells from others
     8  if ( $F[-1] =~ /sh$/ ) {
     9      push @human, $F[0];
    10  }
    11  else {
    12      push @inhuman, $F[0];
    13  }
    14  END {
    15      $"=', ';
    16      print "\UThese accounts use human shells: \E\n@human\n";
    17      print "\UThese accounts use inhuman shells:\E\n@inhuman";
    18  }

To make the output more presentable, Line 15 sets the ‘$"’ variable to a comma-space sequence, and \U is used to convert the output headings to uppercase.

In programs like this, where you don’t care what position a data element is allocated in the array, it’s more convenient to push it onto the array’s end than to manage an index. In other cases, it may be more appropriate to do piecemeal array initializations using indexing (see, e.g., section 10.5), to maintain control over where an element is stored.

Next, we’ll look at the syntax and rules for using more advanced indexing techniques.

9.1.2 Understanding advanced array indexing

Table 9.3 shows the association between array values and indices of both the positive and negative varieties, both of which are usable in Perl. Negative indexing counts backward from the end of the array and is most commonly used to access an array’s last element. Another way to do that is by using an array’s maximum index variable, whose name is $#arrayname.
Table 9.3 Syntax for advanced array indexing

    Initialization        @X=('A', 'B', 'C', 'D');

    Stored value          A         B         C         D
    Ordinal position      1         2         3         4
    Positive indexing     $X[ 0]    $X[ 1]    $X[ 2]    $X[ 3]
    Negative indexing     $X[-4]    $X[-3]    $X[-2]    $X[-1]
    Indexing with
    maximum-index
    variable                                            $X[$#X]
    Result                A         B         C         D

    Slice indexing        @X[2,3]   "@X[2,0..1]"   @X[3,0..2]
    Result                CD        C A B          DABC

As an alternative to repeatedly indexing into an array to access several values, Perl allows a collection of values, called an array slice [3], to be addressed in one indexing expression (as shown in the table’s bottom panel). You do this by arranging the comma-separated indices within square brackets in the order desired for retrieval (or assignment) and putting @arrayname before that expression. The @ symbol is used rather than $, because multiple indices extract a list of values, not a single scalar value.

You can also specify a range of consecutive indices by placing the range operator (..) between the end points, allowing the use of 3..5, for instance, as a shortcut for 3, 4, 5.

The following Perl command retrieves multiple values from an array using a slice:

    $ cat newage_contacts   # field number exceeds index by 1
    (510)  246-7890  sadhu3@nirvana.org
    (225)  424-4242  guru@enlighten.com
    (928)  312-5789  shaman@healing.net
     1/0     2/1            3/2            (field numbers / indices)

    $ perl -wnla -e 'print "@F[2,0,1]";' newage_contacts
    sadhu3@nirvana.org (510) 246-7890
    guru@enlighten.com (225) 424-4242
    shaman@healing.net (928) 312-5789

We could have written the Perl command without using an array slice, by typing print "$F[2] $F[0] $F[1]" in place of print "@F[2,0,1]". But that involves a lot of extra typing, so it’s not Lazy enough!
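The indexing forms discussed in this section can be checked with a short stand-alone script. This is a minimal sketch using the same four-element array as the table; the variable names other than @X are assumptions made for the example, and the final swap is an added illustration of using a slice on the left side of an assignment.

```perl
use strict;
use warnings;

my @X = ('A', 'B', 'C', 'D');

my $last_neg = $X[-1];              # negative index counts from the end
my $last_max = $X[$#X];             # $#X is the maximum index (3 here)

my @pair    = @X[2, 3];             # slice: note the @ sigil
my $quoted  = "@X[2, 0..1]";        # elements joined by $" (a space by default)
my $rotated = join '', @X[3, 0..2]; # range operator inside a slice

# Slices also work on the left side of an assignment.
my @copy = @X;
@copy[0, 1] = @copy[1, 0];          # swap the first two elements

print "$last_neg $last_max | @pair | $quoted | $rotated | @copy\n";
```

Run as written, this should print `D D | C D | C A B | DABC | B A C D`.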
Because each array slice is itself a list, you can set the ‘$"’ formatting variable to insert a custom separator between the list elements:

    $ perl -wnla -e '$"=":\t"; print "@F[0,2]";' newage_contacts
    (510):  sadhu3@nirvana.org
    (225):  guru@enlighten.com
    (928):  shaman@healing.net

We’ll continue with this theme of finding friendlier ways to write array-indexing expressions in the next section, where you’ll see how a script that lets the user think like a human makes access to fields a lot easier.

[3] The indexed elements needn’t be adjacent, and subsequent slices needn’t run parallel to earlier ones (as with bread slices), so a better name for this feature might be an index group.

9.1.3 Extracting fields in a friendlier fashion

Sooner or later, every Perl programmer makes the mistake of attempting to use 1 as the index to extract the first value from an array, rather than 0, because humans naturally count from 1. But with a little creative coding, you can indulge this tendency.

As a case in point, the show_fields2 script allows the user to select fields for display using human-oriented numbers, which start from 1:

    $ cat zappa_floyd
    Weasels Ripped my Flesh Frank Zappa
    Dark Side of the Moon Pink Floyd

    $ show_fields2 '1' zappa_floyd   # 1 means first field
    Weasels
    Dark

It works by using unshift (introduced in table 8.2) to prepend a new value to the array, which shifts the existing values rightward. As a result, the value originally stored under the index of N gets moved to N+1. As depicted in Figure 9.1, if 0 was the original index for the value A, after unshift prepends one new item, A would then be found under 1.
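The index shift caused by unshift is easy to confirm in isolation. The sketch below is not part of show_fields2; it splits one sample line by hand (an assumption standing in for the -a option's automatic field splitting) and uses a placeholder string where the real script stores its warning message.

```perl
use strict;
use warnings;

# One input line, split on whitespace much as the -a option would do.
my @F = split ' ', 'Weasels Ripped my Flesh Frank Zappa';

my $before = $F[0];                 # 'Weasels' lives at index 0
unshift @F, '(placeholder)';        # prepend; every value moves up by one
my $after  = $F[1];                 # 'Weasels' now lives at index 1
my $last   = $#F;                   # maximum index grew from 5 to 6

print "$before / $after / last index is now $last\n";
```

After the unshift, a human-style field number can be used directly as the array index, and index 0 is free to hold whatever should be shown when someone asks for "field 0".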
[Figure 9.1 Effect of unshift]

The show_fields2 script also supports index ranges and array slices:

    $ cat zappa_floyd   # field numbers added
    Weasels Ripped my Flesh Frank Zappa
       1      2     3   4     5     6
    Dark Side of the Moon Pink Floyd
     1    2   3   4   5    6    7

    $ cat zappa_floyd | show_fields2 '2..4,1'   # indices 1..3,0
    Ripped my Flesh Weasels
    Side of the Dark

It even issues a warning if the user attempts to access (the illegitimate) field 0:

    $ show_fields2 '0' zappa_floyd   # WRONG!
    Usage: show_fields2 '2,1,4..7, etc.' [ file1 ]
            There's no field #0! The first is #1.

The show_fields2 script, which uses several advanced array-handling techniques, is shown in listing 9.2.

Line 7 pulls the argument containing the field specifications out of @ARGV and saves it in the $fields variable. Then, a matching operator is used to ask whether $fields contains only the permitted characters: digits, commas, and periods. [4] If the answer is “no,” the program terminates on Line 11 after showing the usage message.

Listing 9.2 The show_fields2 script

     1  #! /usr/bin/perl -wnla
     2
     3  BEGIN {
     4      $Usage="Usage: $0 '2,1,4..7, etc.' [ file1 ]";
     5      # Order of field numbers dictates print order;
     6      # the first field is specified as 1
     7      $fields=shift;
     8
     9      # Proofread field specifications
    10      defined $fields and $fields =~ /^[\d,.]+$/g or
    11          warn "$Usage\n" and exit 1;
    12
    13      # Convert 5,2..4 => 5,2,3,4
    14      # and load those index numbers into @fields
    15      @fields=eval " ( $fields ) ";
    16  }
    17
    18  if (@F > 0) {   # only process lines that have fields
    19      # Load warning message into 0th slot, to flag errors
    20      unshift @F,
    21          "$Usage\n\tThere's no field #0! The first is #1.\n";
    22      print "@F[ @fields ]";   # DQs yield space-separated values
    23  }

The next step is to turn the user’s field specification into one that Perl can understand, which requires some special processing. The easy part is arranging for the request for field 1 to produce the value for the element having index 0.
This is accomplished (on Line 20) by using unshift to shift the original values one position rightward within the array. A combined usage and warning message is then placed in the freshly vacated first position so that the program automatically prints a warning if the user requests the (illegitimate) field #0.

Now for the tricky part. In Line 15, the user’s field specification (for instance, 1,3..5) needs to be converted into the corresponding list, in this case (1,3,4,5). You may think that placing $fields into an explicit list and assigning the result to an array would do the trick, using @fields=( $fields ), but it doesn’t. The reason is that commas and double-dots arising from variable interpolation are treated as literal characters, rather than being recognized as the comma operator and the range operator.

[4] The “.” becomes a literal character within square brackets, like most metacharacters (see chapter 3).

Accordingly, after the variable interpolation permitted by the surrounding double quotes in Line 15 yields the contents of $fields, the expression ( 1,3..5 ) must be processed by eval, to allow recognition of “..” as the range operator and the comma as the list-element separator. [5] The end result is exactly as if @fields=( 1,3..5 ) had appeared on Line 15 in the first place, [6] resulting in the assignment of the desired index numbers to the @fields array.

Line 18 checks the field count, to exempt empty lines from the processing that follows. As mentioned earlier, unshift loads a special message into the now illegitimate 0th position of the array; then, the contents of the @fields array are inserted into the subscripting expression for use as indices, to pull out the desired values for printing.
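The difference between interpolating a specification and evaluating it can be seen side by side. This is a minimal sketch, not the script itself: $spec is a hypothetical stand-in for the user's argument, and the check of $@ after eval is an added precaution that the listing does not include. The character whitelist only restricts which characters appear; a string that passes it may still fail to compile as a Perl list, which is why the sketch tests $@.

```perl
use strict;
use warnings;

my $spec = '1,3..5';                # hypothetical user input

my @literal = ( $spec );            # one element: the string '1,3..5'

my @fields;
if ( $spec =~ /^[\d,.]+$/ ) {       # digits, commas, and periods only
    @fields = eval "( $spec )";     # compiled as Perl source: ( 1,3..5 )
    die "Could not parse '$spec': $@" if $@;
}

print scalar(@literal), " element vs. ", scalar(@fields), ": @fields\n";
```

Run as written, this should print `1 element vs. 4: 1 3 4 5`.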
Having just seen a demonstration of how to carefully control indexing so that the wrong number can produce the right result, we’ll next throw caution to the wind, forsake all control over indexing, and see what fortune has in store for those who randomly access arrays.

[5] eval evaluates code starting with the compilation phase, allowing it to detect special tokens that cannot otherwise be recognized during a program’s execution (see section 8.7).

[6] Allowing a user to effectively paste source code into an eval’d statement could lead to abuses, although the argument validation performed on Line 10 of show_fields2 is a good initial safeguard. For more on Perl security, including Perl’s remarkable taint-checking mode, see man perlsec.

9.1.4 Telling fortunes: The fcookie script

In the early days of UNIX we were easily entertained, which was good because the multi-media capabilities of the computers of that era were quite rudimentary. As a case in point, I remember being called over one December day by a beaming system administrator to gawk in amazement with my colleagues at a long sheet of paper taped to the wall. It was a fan-fold printout of a Christmas tree, with slashes and backslashes representing the needles and pound signs representing ornaments.

Shortly after this milestone in the development of ASCII art was achieved, “comedy” arrived on our computers in the form of a command called fortune, which displayed a humorous message like you might find in a verbose fortune cookie.

We’ll pay homage to that comedic technological breakthrough by seeing how Perl scripts can be used not only to emulate the behavior of the fortune program, but also to do its job even better.

But before we can use them for our script, we need to understand how fortunes are stored in their data files. Let’s examine a file devoted to Star Trek quips:

    $ head -7 /usr/share/fortune/startrek
    A father doesn't destroy his children.
        Lt. Carolyn Palamas, "Who Mourns for Adonais?",
        stardate 3468.1.
    %
    A little suffering is good for the soul.
        Kirk, "The Corbomite Maneuver", stardate 1514.0
    %

As you can see, each fortune’s record is terminated by a line containing only a % symbol. Armed with this knowledge, it’s easy to write a script that loads each fortune into an array and then displays a randomly selected one on the screen (see listing 9.3).

Using the implicit loop, the script reads one record ending in % at a time, as instructed by the setting of the $/ variable, and installs it in the @fortunes array. A suitable array index for each record could be derived from the record number variable ($.), as shown in the commented-out Line 8, but it’s easier to use push (Line 9) to build up the array. Then, a random array element is selected for printing, using the standard technique of providing rand with the array’s number of elements as its argument (see table 7.7), and using its returned value as the index.

Listing 9.3 The fcookie script

     1  #! /usr/bin/perl -wnl
     2
     3  BEGIN {
     4      @ARGV=( '/usr/share/fortune/startrek' );
     5      $/='%';   # set input record separator for "fortune" files
     6  }
     7
     8  # $fortunes[$. -1]=$_;   # store fortune in (record-number -1)
     9  push @fortunes, $_;      # easier way
    10
    11  END {
    12      print $fortunes[ rand @fortunes ];   # print random fortune
    13  }

Here are some test runs:

    $ fcookie
    A man will tell his bartender things he'll never tell his doctor.
        Dr. Phillip Boyce, "The Menagerie", stardate unknown

    $ fcookie
    It is a human characteristic to love little animals, especially if
    they're attractive in some way.
        McCoy, "The Trouble with Tribbles", stardate 4525.6

Yep, that’s space-grade profundity all right. But I crave more! And I don’t want to reissue the command every time I want to see another fortune, nor do I want to see any reruns. These problems will be solved in the next episode.
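The two ideas behind fcookie (reading %-terminated records and picking one at random) can be exercised without a fortune file. The following is a sketch under stated assumptions: the three quips are invented, the records are read from an in-memory filehandle rather than /usr/share/fortune/startrek, and an explicit while loop with chomp stands in for the -n and -l options. The whitespace trimming is an extra step, not something the listing does.

```perl
use strict;
use warnings;

# Hypothetical fortune data in the "%"-terminated format.
my $data = "First quip.\n%\nSecond quip.\n%\nThird quip.\n%\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @fortunes;
{
    local $/ = '%';                    # input record separator for this block
    while ( my $record = <$fh> ) {
        chomp $record;                 # strips the trailing '%', the current $/
        $record =~ s/^\s+|\s+$//g;     # trim surrounding whitespace and newlines
        push @fortunes, $record if length $record;
    }
}
close $fh;

# rand returns a fraction below the element count; the array
# subscript truncates it to an integer index.
my $pick = $fortunes[ rand @fortunes ];
print "$pick\n";
```

The length test discards the empty record left over after the final %, so the array ends up with exactly one entry per quip.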
fcookie2: The sequel

fcookie2 is an enhancement that responds to the newfound needs of the increasingly demanding user community (consisting of me, at least). It illustrates the use of a dual input-mode technique that first reads fortunes from a file and stores them in an array, and then takes each <ENTER> from the keyboard as a request to print another randomly selected fortune.

Here’s a test run that uses the Unix yes command to feed the script lots of y<ENTER> inputs, simulating the key presses of an inexhaustible fortune seeker:

    $ yes | fcookie2
    Press <ENTER> for a fortune, or <^D>:
    There is a multi-legged creature crawling on your shoulder.
        Spock, "A Taste of Armageddon", stardate 3193.9

    Vulcans never bluff.
        Spock, "The Doomsday Machine", stardate 4202.1

    fcookie2: How unfortunate; out of fortunes!

You can do a “full sensor scan” of the script in listing 9.4.

Listing 9.4 The fcookie2 script

     1  #! /usr/bin/perl -wnl
     2  # Interactive fortune-cookie displayer, with no repeats
     3
     4  BEGIN {
     5      @ARGV or   # provide default fortune file
     6          @ARGV=( '/usr/share/fortune/startrek' );
     7      push @ARGV, '-';   # Read STDIN next, for interactive mode
     8      $/='%';            # Set input record separator for fortunes
     9      $initializing=1;   # Start in "initializing the array" mode
    10  }
    11  ############# Load Fortunes into Array #############
    12  if ($initializing) {
    13      push @fortunes, $_;   # add next fortune to list
    14      if (eof) {   # on end-of-file, switch to input from STDIN
    15          $initializing=0;   # signify end of initializing mode
    16          $/="\n";           # set input record separator for keyboard
    17          printf 'Press <ENTER> for a fortune, or <^D>: ';
    18      }
    19  }
    20  ############# Present Fortunes to User #############
    21  else {
    22      # Use random sampling without replacement. After a fortune is
    23      # displayed, mark its array element as "undefined" using
    24      # "delete", then prune it from array using "grep"
    25
    26      $index=rand @fortunes;        # select random index
    27      printf $fortunes[ $index ];   # print random fortune
    28      delete $fortunes[ $index ];   # mark fortune undefined
    29      @fortunes=grep { defined } @fortunes;   # remove used ones
    30      @fortunes or   # terminate after all used
    31          die "\n$0: How unfortunate; out of fortunes!\n";
    32  }

The BEGIN block starts by assigning the pathname of the startrek file to @ARGV if that array is empty, to establish a default data source. Next, it adds “-” as the final argument, so the program will read from STDIN after reading (and storing) all the fortunes.

Lines 8–9 set the input record separator to % and the $initializing variable to the (True) value of 1, so the script begins by loading fortunes into the array (Lines 12–13). As with all scripts of this type, it’s necessary to detect the end of the initialization phase by sensing “end of file” (using eof, Line 14) and then to reset $initializing to a False value, set the appropriate input record separator for the user-interaction phase, and prompt the user for input.

Line 26 obtains a random index for the array and saves it in a variable, which is used in the next statement to extract and print the selected fortune. printf is used for the printing rather than print, because the fortune already has a trailing newline, [7] and print (in conjunction with the -l option) would add another one.

Line 28 then runs delete (see table 9.2) on the array element, which isn’t quite as lethal as it sounds: all it does is mark its value as undefined. [8] The actual removal of that element is accomplished by using grep to filter it out of @fortunes and reinitialize the array (see section 7.3.3), using

    @fortunes=grep { defined } @fortunes;

That’s all the coding it takes, because defined operates on $_ by default, and grep stores the list element that it’s currently processing in that same variable.

If the user has sufficient stamina, he’ll eventually see all the fortunes, so Line 30 checks the remaining size of the array and calls die when it’s depleted. Alternatively, because the implicit loop is reading the user’s input, the program can be terminated by responding to the prompt with <^D>.

One final word of caution, for you to file away under “debugging tips”: Any dual input-mode script will behave strangely if you neglect to reset the ‘$/’ variable to [...]

[7] Why does it have a trailing newline? The input record separator ($/) was defined as %, so that’s what the -l option stripped from each input record, leaving the newline that came before it untouched. An alternative approach would be to set ‘$/’ to “\n%” to get them both stripped off and to use print to replace the newline on output.

[8] In contrast, delete removes both the index and its value when used on a hash element, as we’ll discuss shortly.

[...]

... data types:

• man perldata   # discusses scalar, array, and hash variables

The standard List::Util module provides several useful utility functions for lists of all kinds: explicit lists, arrays, and hashes. For example, it provides functions that shuffle (randomly reorder) a list’s values, and that return their minimum and maximum values. Run the following command for additional details:

• man List::Util ...
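Sampling without replacement, as fcookie2 does with delete followed by grep, can also be sketched with splice, which removes an element and returns it in one step. This is an alternative technique, not the book's method, and the three strings are hypothetical stand-ins for fortunes. The second half shows the List::Util functions mentioned above (shuffle, min, and max), applied to invented numbers.

```perl
use strict;
use warnings;
use List::Util qw(shuffle min max);

my @fortunes = ('first', 'second', 'third');   # hypothetical stand-ins
my @shown;

# Sampling without replacement: splice removes the chosen element
# and returns it, so no separate delete-and-grep pass is needed.
while (@fortunes) {
    my $index = int rand @fortunes;
    push @shown, splice( @fortunes, $index, 1 );
}

# Alternative: shuffle once, then walk the list in order.
my @deck = shuffle( 'first', 'second', 'third' );

my @sizes = ( 13, 42, 7 );
my ( $smallest, $largest ) = ( min(@sizes), max(@sizes) );

print "shown: @shown\ndeck: @deck\nmin $smallest, max $largest\n";
```

One practical difference from the grep approach is that splice never discards an element merely because its value is undefined; only the chosen slot is removed.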
logged-in users on a Linux system, with its input provided by who’s output:

    forrest  :0     Dec 6 09:07 (console)
    forrest  pts/0  Dec 6 09:08
    forrest  tty1   Dec 6 09:37
    willy    tty2   Dec 6 09:43
    willy    tty3   Dec 6 09:48
    gloria   pts/1  Dec 6 17:03
    gloria   pts/5  Dec 8 09:36

But first, that output will be reduced by an awk command [17] to its first column, to isolate the user names:

[17] Although Perl has many advantages ...

... The Unix shells offer three kinds of loops:

• The functionally similar foreach (from the C shell) and for (from the other Shells), which are top-tested list-handling loops
• The closely related while and until, which are condition-evaluating loops that support both top and bottom tests
• The select loop, which is a uniquely useful hybrid that provides menu-oriented list-handling ...

... Perl equivalents

[19] This is also called IFS (for Internal Field Separator) processing, after the Shell variable of the same name.

You can use this table to select the Perl counterparts for the Shell expressions you already know, without the burden of working out the Perl equivalents yourself.

Table 9.9 Common list generators in the Shell and their Perl ...

... man List::Util   # describes utility functions for lists

CHAPTER 10 Looping facilities

10.1 Looping facilities in the Shell and Perl
10.2 Looping with while/until
10.3 Looping with do while/until
10.4 Looping with foreach
10.5 Looping with for
10.6 Using loop-control directives
10.7 The CPAN’s select loop for Perl
10.8 Summary

Recapitulation
Repetition!
Redundancy!! ...
... looping in Perl relates to looping in the Shell, let’s consider a typical kind of activity that looping makes a lot easier. As you know, most Unix commands can handle multiple arguments. That’s why you can get a long listing of every file that ends in txt for the current directory by using this command:

    ls -l *.txt

There’s no need to run the command separately for each of the files, because the command processes ...

... accident in Perl, as they may in the Shell.

Directions for further study

To obtain information about specific Perl functions covered in this chapter, such as keys, values, shift, unshift, delete, exists, or printf, you can use that function’s name in a command of this form:

• perldoc -f function-name   # coverage of "function-name"

The following document provides additional information on Perl’s data ...

... condition-evaluating loop

Perl provides four loops:

• foreach, which is like the Shell’s for
• while/until and do while/until, which together cover the same ground as the Shell’s while/until loop
• for, inherited from the C language, which is a top-tested, condition-evaluating loop that’s especially useful for handling arrays in certain ways

This chapter compares each Shell loop to its Perl counterpart and shows translations ...

... True/False value provided by the command before do (called the controlling condition, or condition for short). The loops differ only in while iterating while the condition remains True, and until iterating until it becomes True.

The while/until loops of both languages are shown in table 10.2. The expanded forms are shown in the top panel, and the compressed forms, suitable for loops that will fit on one line, ...
... numbers for phone owners

    @phone_numbers
    Index   Value
    0       789-9834
    1       897-7164

[10] Hashes use curly braces rather than square brackets around their subscripts, as shown in table 9.7.

USING HASH VARIABLES

The processing steps would therefore involve first looking up the entries of interest in the specified arrays and then retrieving the desired values:

    @phone_owners:   Joe -> 0
    @phone_numbers:  0   -> 789-9834