
Minimal Perl For UNIX and Linux People 5 pot




CHAPTER 5 PERL AS A (BETTER) awk COMMAND

However, some of the apparent similarities between the languages mask significant differences. For example, some AWK functions have namesakes that take different arguments in Perl, and certain other functions, such as AWK’s sub and match, correspond to operators represented by symbols in Perl, rather than to named functions.

To help AWKiologists migrate to Perlistan, table 5.14 shows the Perl counterparts to the most commonly used (non-mathematical) functions found in popular versions of AWK. Some general differences are that Perl functions are normally invoked without any parentheses around their arguments,33 and all occurrences of the $0 variable in the AWK examples must be converted to $_ for Perl (assuming use of the n or p option).

Notice in particular that the “offset” argument (#2) of AWK’s substr (“substring”) function needs to be a 1 to grab characters from the very beginning of the string, whereas in Perl, the value 0 has that meaning.

Table 5.13 Popular built-in functions of AWK and Perl

    Type(a): String
      NAWK  gsub, index, match, split, sprintf, sub, substr, tolower, toupper
      GAWK  asort, gensub, gsub, index, length, match, split, strtonum, sub,
            substr, tolower, toupper
      Perl  chomp, chop, chr, crypt, hex, index, lc, lcfirst, length, oct, ord,
            pack, q/STRING/, qq/STRING/, reverse, rindex, sprintf, substr,
            tr///, uc, ucfirst, y///

    Type(a): Arithmetic
      NAWK  cos, exp, int, log, sin, sqrt, srand
      GAWK  cos, exp, int, log, sin, sqrt
      Perl  abs, atan2, cos, exp, hex, int, log, oct, rand, sin, sqrt, srand

    Type(a): Input/Output
      NAWK  close, getline, print, printf
      GAWK  close, getline, print, printf, fflush
      Perl  binmode, close, closedir, dbmclose, dbmopen, die, eof, fileno,
            flock, format, getc, print, printf, warn

    Type(a): Miscellaneous
      NAWK  system
      GAWK  bindtextdomain, compl, dcgettext, dcngettext, extension, lshift,
            mktime, rshift, strftime, system
      Perl  defined, dump, eval, formline, gmtime, local, localtime, my, our,
            pos, reset, scalar, system, time, undef, wantarray

    a. The standard Perl installation provides hundreds of additional functions not listed here, including ones that fall into these categories: Unix system calls, array handling, file handling, fixed-length record manipulation, hash handling, list processing, module management, network information retrieval, pattern matching, process control, socket control, user/group information retrieval, and variable scoping.

33 But if you’ve been cruelly rebuked by other languages whenever you’ve forgotten to use parentheses around your function arguments, and you consequently feel your Perl programs look shockingly defective without them, feel free to put them in! Perl won’t mind.

USING BUILT-IN FUNCTIONS

Another difference is that GAWK’s case-conversion functions, toupper and tolower, have two corresponding resources in Perl: the functions called uc and lc, and the \U and \L string modifiers (see table 4.5).

Perl’s voluminous collection of built-in functions makes it easy to write commands that do various types of data processing, as you’ll see next.

5.7.1 One-liners that use functions

The following command prints up to 80 characters from each input line:

    perl -wnl -e 'print substr $_, 0, 80;' lines

It uses the substr function and specifies an offset of zero from the beginning of $_ as the starting point, along with a selection length of 80 characters. Because the call to substr appears in the argument list of print, substr’s output is delivered into print’s argument list and subsequently printed.

The following command reads lines consisting of numbers, and prints their square roots:

    perl -wnl -e 'print "The square root of $_ is ", sqrt $_;' numbers

The addition of syntactically unnecessary but cosmetically beneficial parentheses changes the previous commands into these variations:

    perl -wnl -e 'print "The square root of $_ is ", sqrt($_);' numbers
    perl -wnl -e 'print substr ($_, 0, 80);' lines

Perl won’t mind the unnecessary parentheses (see section 7.6, and appendix B), but after you become more acculturated to Perlistan, you’ll no longer feel the need to type them in such cases.

Table 5.14 Perl counterparts to popular AWK functions

    AWK (or GAWK)                    Perl
    sub("RE","replacement")          s/RE/replacement/;
    gsub("RE","replacement")         s/RE/replacement/g;
    match(string_var,"RE")           $string_var =~ /RE/;
    substr($0, 1, 3)                 substr $_, 0, 3;
    $0=tolower($0)                   $_="\L$_";  Or  $_=lc;
    $0=toupper($0)                   $_="\U$_";  Or  $_=uc;
    getline                          $_=<>;
    split($0, array_var)             @array_var=split;
    index, length, print, printf,    Same function names, but Perl
      sprintf, system                  doesn’t require parentheses.

Commands like those just reviewed are great for applying the same processing regimen to each input record. But what if you only want to perform a single numeric calculation, such as the square root of 42 or the remainder of 365 divided by 12?

You could write a custom program to generate each of those results. But wouldn’t it be even better to write a generic script that could calculate and print the result of any basic mathematical problem?

This valuable technique will be demonstrated next, using a command of legendary significance.

5.7.2 The legend of nexpr

We’ll begin this section with a discussion of the role played by a certain command in UNIX’s early years and how AWK improved on it, and then you’ll see how Perl’s version is even better.
Along the way, you’ll learn not only some UNIX history, but also how to win barroom bets by writing one-liners on napkins that can compute transcendental numbers!34

But first, you need to understand that in the early days of UNIX, C was considered the language of choice for all serious computing tasks, such as performing mathematical calculations. In contrast, the early shells were viewed as simple tools for packaging command sequences in scripts and processing interactively issued commands.

For this reason, the utility program that was used to perform calculations in shell programming, expr, was only endowed with the most rudimentary mathematical capabilities:

    $ expr 22 / 7     # Gimme pi! And I won't take 3 for an answer!
    3

Moreover, using expr was horrendously inefficient. For instance, reading 100 numbers from a file and totaling them required 100 separate expr processes, compared to a single process on modern systems, using AWK or Perl.

Therefore, even though it was the mathematical mainstay of Bourne shell programming during the late 1970s and 1980s, the expr approach to arithmetic still left a lot to be desired.35 Given this situation, it’s no wonder there was so much interest in improving expr.

Without further ado, I’ll now relate to you the Legend of Nexpr (for new expr), which was initially told to me by my Bell System boss, then extensively embellished by yours truly through hundreds of retellings to my students.

34 Well, at least approximations thereof.
35 We actually had a great alternative for doing arithmetic starting in 1977: AWK! But most programmers didn’t understand its capabilities until the 1988 book came out.

Born in a barroom wager: nexpr

One day after work in the early 1980s, three Bell System software engineers stop in a popular New Jersey watering hole. The bearded veteran orders his usual (a pint of Guinness) while the rookies each order a can of the local lager.

“Man,” the veteran mumbles, apparently to himself, “the UNIX shell is really awesome for math!”

The first rookie says to the other, “Grandpa over there thinks the shell is good at math! That black sludge he’s imbibing must have fouled up his logic circuits.”

Fixing his beady eyes intensely on the impudent rookie, the veteran says:

    I’ll bet you $100 each I can write a one-line shell script that calculates the square root of pi!36

The second rookie exclaims, “Impossible! The expr command used in Bourne shell programming can’t even do floating-point calculations, let alone mathematical functions. We accept the bet.”

While hastily writing the following script on a napkin, using a nacho chip dipped in salsa, the veteran says, “I call the script nexpr, for new expr”:

    #! /bin/sh
    awk "BEGIN{ print $* ; exit }"

“Read it and weep, and hand over $200!”

If laptop computers running UNIX had been available in those days, the Chumps would surely have typed in the script and tested it on the spot, using this command:37

    $ nexpr 'sqrt(22/7)'    # Becomes: awk 'BEGIN {print sqrt(22/7); exit}'
    1.77281

(The comment attached to that command shows the awk command that is composed and run by nexpr, as explained in section 5.7.3.)

The rookies are first shocked, then flabbergasted, and finally angry. They cry foul, arguing that awk isn’t part of the shell, and therefore what he has written isn’t a shell script after all.

The veteran mounts a quick defense by pointing to the script’s unequivocally shellish shebang line and reminding them that it’s normal for a shell script to use external UNIX commands like sort, grep, and yes, even awk, not to mention the expr command they assumed he’d use.

The rookies grudgingly relent and remit payment, admitting they’ve been outfoxed by the wily vet.

36 You know, the transcendental number that expresses the ratio of the circumference of a circle to its diameter that’s represented by the sixteenth letter of the Greek alphabet, π.
37 expr can do more than arithmetic, so the nexpr* scripts aren’t full-fledged replacements for it.

Okay, I hear you. You’re wondering, “What does all this have to do with Perl?” Quite a bit, actually, because Perl can do just about anything AWK can do, including generating revenues from barroom wagers.

The nexpr_p script (Perl)

A script like nexpr is a great asset to those employing a command-line interface to Unix. But the Perl version, which I call nexpr_p (for perl), is even better than the original nexpr:

    $ cat nexpr_p
    #! /bin/sh
    # This script uses the Shell to create and run a custom Perl
    # program that evaluates and prints its arguments.
    # Sample transformation: nexpr_p '2 * 21' > perl print 2 * 21;
    perl -wl -e "print $*;"

Perl is smart enough to exit automatically once it runs out of things to do, so there’s no need for an explicit exit statement in this script as there was with the classic AWK of nexpr’s era. Nor is there any need for a BEGIN block, which the AWK version requires to position its statements outside the (obligatory) implicit input-reading loop. That’s because that (unnecessary) loop can be omitted from the Perl version through use of the -wl cluster instead of -wnl.

Like nexpr, nexpr_p is capable of performing any calculation that is supported by its built-in operators (such as / for division; see table 5.12) or its functions (such as sqrt; see table 5.13). But the Perl version is even more capable than nexpr, because it has access to a richer collection of built-in functions, along with Perl’s other advantages over AWK (especially its module mechanism).

Next, we’ll discuss how these nexpr* scripts manage to make the requested computations.

5.7.3 How the nexpr* programs work

The nexpr_p Shell script works the same way nexpr does: by exploiting the Shell’s willingness to substitute the script’s own arguments (see tables 2.4, 10.1) for the “$*” variable in a double-quoted string, thereby creating a custom print statement to handle the user’s request. So when the user issues this command:

    $ nexpr_p 'sqrt(22/7)'

nexpr_p’s Shell transforms the Perl source code template in the script from

    perl -wl -e "print $*;"

into

    perl -wl -e "print sqrt(22/7);"

and executes that command.

Next, we’ll examine some additional programs that employ techniques presented in this chapter.

5.8 ADDITIONAL EXAMPLES

This section features Perl programs that analyze Linux log files, perform compound interest calculations, and inflect nouns in print statements to make them singular or plural as needed. I think you’ll find these examples interesting, but feel free to proceed to the next chapter at this point if you prefer.

5.8.1 Computing compound interest: compound_interest

Consider the following script called compound_interest, which reports the growth of an investment over time:

    $ compound_interest -amount=100 -rate=18
    Press <ENTER> to see $100 compound at 18%.<ENTER>
    $118 after 1 year(s)<ENTER>
    $139.24 after 2 year(s)<ENTER>
    $164.3032 after 3 year(s)<ENTER>
    $193.877776 after 4 year(s)<^D>

Although the script uses the n option, it’s meant to be invoked without any filename arguments, so it will default to reading input from the user’s terminal. This allows each press of <ENTER> to be taken as a request to show an additional year’s worth of growth.38 What’s more, when given certain command-line switches, the script will calculate the growth of an arbitrary initial investment at an arbitrary annual rate of interest. I’m sure your interest in examining the script is rapidly compounding, so have a look at listing 5.4.

38 The results demonstrate the Rule of 72, according to which an investment of $X at Y% interest will approximately double in value every 72/Y years. In this case, Y is 18, yielding 4 years for each doubling.

Listing 5.4 The compound_interest script

     1  #! /usr/bin/perl -s -wn
     2
     3  BEGIN {
     4      $Usage="Usage: $0 -amount=dollars -rate=percent";
     5
     6      # Check for proper invocation
     7      $amount and $rate or warn "$Usage\n" and exit 255;
     8
     9      $pct_rate=$rate/100;        # convert interest to decimal
    10      $multiplier=1 + $pct_rate;  # .05 becomes 1.05
    11      # Instruct user
    12      print "Press <ENTER> to see \$$amount compound at $rate%.";
    13  }
    14
    15  $amount=$amount * $multiplier;  # accumulate growth
    16
    17  # $. counts input lines, which represent years here
    18  print "\$$amount after $. year(s)";
    19
    20  END { print "\n"; } # start shell prompt on fresh line after <^D>

The first thing to notice is that all the operations that can be done in advance of input processing are collected together in the BEGIN block. For example, an informational message is loaded into the $Usage variable on Line 4, which will be printed by the warn function if the user neglects to provide the required switches. The nominal percentage rate is then converted to a decimal number on Line 9, and the multiplier that will be used to add each additional year’s worth of interest to the previous balance is prepared on Line 10. Then a message is printed to inform the user how to interact with the program.

Next, the program waits for a line of input (via <ENTER>) before executing the first line after the BEGIN block, Line 15, which calculates the new balance figure. The result is then reported to the user on Line 18.

Fortunately, although we think of “$.” as counting records, in cases where records represent the passage of additional years of investment growth, as they do here, that variable conveniently doubles as a year counter.

Notice the need to backslash certain $ symbols in the double-quoted strings of Lines 12 and 18 to make them literal dollar signs, and the absence of that treatment for the $ symbols attached to scalar variable names, which allows variable interpolation for $amount and “$.” to occur.

Although this is a useful program, it doesn’t do anything that AWK couldn’t do on its own (at least, not yet). But we’ll teach it how to improve its grammar next, using a valuable programmer’s aid that AWK lacks.

5.8.2 Conditionally pluralizing nouns: compound_interest2

As useful as it is, there’s something that bothers me about the compound_interest program. Specifically, it’s the output statement that hedges its bets on the singular/plural nature of the year-count, using the phrasing “1 year(s)” and “2 year(s)”. Like any literate person striving for grammatical correctness,39 I’d prefer to see the output presented as “1 year” and “2 years” instead.

39 More candidly, as a survivor of a Catholic grade-school education, something deep inside me still fears the wrath of the hickory ruler on my throbbing knuckles when I contemplate such flagrant examples of grammatical incorrectness.

Although programmers using other languages, including AWK, may have to settle for such compromises, we certainly don’t in the world of Perl! The easy and entirely general solution to this problem is to use a function from the Lingua::EN::Inflect module to automatically inflect the word as “year” or “years”, so it will match the numeric value before it.

To effect this enhancement, you first download and install the required module from the CPAN (as discussed in chapter 12) and then add the following line at the top of the script:

    use Lingua::EN::Inflect 'PL_N';

That statement loads the module and the needed function, which in this case is one that knows how to conditionally pluralize (“PL”) a noun (“N”).
Then, the statement that prints the investment’s growth is modified to call PL_N with arguments consisting of the noun and its associated count. For comparison, here are the original and PL_N-enhanced print statements:

    print "\$$amount after $. year(s)";             # 1 year(s), 2 year(s)
    print "\$$amount after $. ", PL_N 'year', $.;   # 1 year, 2 years

Notice that the quoted string is terminated after the first “$.” in the second version, because the function name PL_N would be treated as literal text if it appeared within those quotes.

How does the automatic inflection work? The function PL_N returns its first argument as “year” or “years”, according to the singular/plural nature of the number in “$.”, its second argument. Then, the word returned by PL_N becomes the final argument to print, providing the grammatically correct output that’s desired.40

40 As detailed in section 7.6, adding optional parentheses may make it clearer to the reader that the final “$.” is an argument to PL_N, not to print:

    print "\$$amount after $. ", PL_N('year', $.);

Here’s a sample run of the enhanced script:

    $ compound_interest2 -amount=100 -rate=10
    Press <ENTER> to see $100 compound at 10%.<ENTER>
    $110 after 1 year<ENTER>
    $121 after 2 years

Listing 5.5 shows the enhanced script in its entirety.

An alternative to using a module-based function to conditionally print “year” or “years” would be to employ Perl’s if/else construct (covered in part 2) to print the appropriate word. But using the PL_N function is just as easy as rolling your own solution, and learning how to do such things using Perl’s modules is more empowering. For this reason, we’ll discuss functions and modules more fully in part 2.

Listing 5.5 The compound_interest2 script

    #! /usr/bin/perl -s -wn

    use Lingua::EN::Inflect 'PL_N'; # import noun pluralizer

    BEGIN {
        $Usage="Usage: $0 -amount=dollars -rate=percent";

        # Check for proper invocation
        $amount and $rate or warn "$Usage\n" and exit 255;

        $pct_rate=$rate/100;        # 5 becomes .05
        $multiplier=1 + $pct_rate;  # .05 becomes 1.05
        # Instruct user
        print "Press <ENTER> to see \$$amount compound at $rate%.";
    }

    $amount=$amount * $multiplier;  # accumulate growth

    # $. counts input lines, which represent years
    print "\$$amount after $. ", PL_N 'year', $. ;

    END { print "\n"; } # start shell prompt on fresh line after <^D>

5.8.3 Analyzing log files: scan4oops

Felix has been a happy Linux user since his company installed it on all their notebook computers a few years back. But ever since that clumsy security agent dropped Felix’s notebook at the airport, while Felix was frantically trying to grab his freshly X-rayed shoes, his notebook has been crashing periodically. Of course, he did load some experimental device drivers into the kernel during that flight, which could also be the source of the problem.

In any case, he needs to diagnose the problem and get his notebook fixed. He already ran its hardware diagnostic tests several times, and it passed them all with flying colors. So, he needs to try another approach.

The nice people at the local Linux users group suggested he should check the /var/log/messages file for “Oops” reports, because they might indicate why his machine is crashing. When his boss, Murray, heard about this, he requested that Felix formalize his solution in the form of a Perl script so that others in the company (and the users group) could benefit from his efforts.

Felix examines that file and indeed finds an “Oops” report within it. To help the report fit on the page, the timestamp at the beginning of every line, “Aug 17 04:15:14 floss kernel: ” has been removed:

Isn’t that a lovely format? 8-(

Scanning onward in the file, he notices many other “Oops” reports, varying slightly in their details. Realizing he’d probably need to examine them all eventually, he resolves to write a script to extract them.

His first step in attaining that goal is to identify what it is about the “Oops” reports that distinguishes them from the many other reports in the same file, including ones like these:

    Aug floss insmod: Using usb-storage.o
    Aug floss sshd[1079]: Received signal 15; terminating.
    Aug floss cardmgr[807]: executing: './network check eth0'

He finds an easy answer. Apart from the “Oops” reports all having multiple lines, the first line is always of this form:

    Aug 17 04:15:14 floss kernel: Oops: 0001

And the last line always ends with a sequence of 20 two-digit hex numbers:

    Apr 17 00:38:52 floss kernel: Code: 89 50 24 89 02 c7 43 24

Having found the distinctive markers that encase each “Oops” report, Felix’s next step is to construct regexes to match them.

Constructing a regex to match “Oops” reports

On further scrutiny, Felix notices that the timestamps on the individual reports differ, and that the hostname “floss” that appears within them is unique to his system. So he allows for variations in those fields in the regex he designs to match the initial line of an “Oops” report:

    ^[A-Z]\w+ +\d+ \d+:\d+:\d+ \w+ kernel: Oops: \d+
    A B C D

This regex says, starting from position A, “Find records that start with a capital letter, followed by one or more ‘word’ characters” (that’s for the Month-abbreviation). Position markers [...]

... Perl commands, having forms such as these, could be added as the filtering stage in the pipeline:

    perl -wnl -e '-A and print;'                   # Example 1
    perl -wnl -e '-A and -B and print;'            # Example 2
    perl -wnl -e '-A and ! -B and print;'          # Example 3
    perl -wnl -e '-A and -B and -C and print;'     # Example 4
    perl -wnl -e '( -A or -B ) and print;'         # Example 5
    perl -wnl -e '( -A or -B or -C ) and print;'   # Example 6
    perl ...

... (viz., those in the second and third panels). For example, Perl’s unique offering of six read/write/execute

5 For more information about Unix file types and permissions, consult man ls and man chmod.

FILE TESTING CAPABILITIES OF find VS PERL

Table 6.2 Comparison of supported file attributes in versions of the find command and Perl

    File attribute(a)   Classic find(b)   GNU find(c)   Perl   Perl operator
    Regular/plain...

...
       5: c7 43 24 00 00 00 00     movl $0x0,0x24(%ebx)
    Code; c012850b <remove_inode_page+5b/90>
       c: 89 1c 24                 mov %ebx,(%esp,1)
    Code; c012850e <remove_inode_page+5e/90>
       f: c7 44 24 04 ff 00 00     movl $0xff,0x4(%esp,1)
    Code; c0128515 <remove_inode_page+65/90>
      16: ...

... but truly, any IT manager would be fortunate to have the combined talents of an Oscar and a Felix on hand. In your own career, I’d advise you to develop an appreciation and an aptitude for both the quick-and-dirty and elegant-and-formal styles of programming, and to cultivate the ability to produce either kind on demand, as circumstances warrant.

5.9 USING THE AWK-TO-PERL TRANSLATOR: a2p

As discussed...

... by the data at hand, inflect a noun into its singular or plural form.

AWKiologists migrating to Perlistan should keep in mind that tables 5.6, 5.7, and 5.13 provide a succinct summary of the major differences in syntax between the languages, and that the a2p command is available to help convert legacy AWK programs into Perl scripts.

41 As you’ll see in chapter 9, Perl even provides for the aggregate...
... to perform with both versions of find as well as Perl. On the other hand, the second panel shows that all permission-related tests that are easy with Perl are impossible to perform with find. The table also shows that the text-file and binary-file tests provided by Perl (-T, -B) are impossible with find, and the three other tests in the third panel are easier with Perl. For example, Perl’s test for a...

...
  • man perlop                                      # operators, and operator precedence
  • man Lingua::EN::Inflect                         # conditional pluralization, and more42
  • man perlpod                                     # Perl's Plain Old Documentation system
  • man perldoc                                     # Perl's documentation-retrieval utility
  • man a2p                                         # AWK to Perl source-code converter
  • http://perldoc.perl.org/index-functions.html    # the function list

TIP The range operator is documented in excruciating detail on the perlop...

... you’ll see how to use a special kind of find | perl pipeline for filtering out undesirable arguments for Unix utilities, and how to use Unix utilities for validating arguments for Perl programs.

6.4 PROCESSING FILENAME ARGUMENTS

Have you ever run the grep command, only to find yourself suddenly staring at a screen full of blinking graphics characters? Most Unix users should witness this phenomenon sooner...

... make it easy for programmers using other Unix tools to migrate to Perl, which is why Perl comes with a sed-to-perl translator. Guess what: Perl comes with an awk-to-perl translator too, called a2p! It converts inline AWK programs, such as the quoted portion of awk '{print $1}', as well as stand-alone AWK scripts, such as the one in the file munge referenced in the command awk -f munge, into Perl scripts...

... Unix users from these grepological calamities? And what does this have to do with Perl?” Of course there’s hope; and, as usual, our salvation is achieved by Perl coming to the rescue.

6.4.1 Defending against grep’s messes

A valuable feature provided by the Shell is its ability to replace a command in backward quotes with that command’s own output. This facility, called command substitution, and its Perl...

Date posted: 06/08/2014, 03:20
