CHAPTER 1 INTRODUCING MINIMAL PERL

WRITING ONE-LINE PROGRAMS

For example:

    $ cat exotic_fruits exotic_jerkies
    fig
    kiwi
    camel
    python

Now we’ll examine some Perl programs that act as cat-like filters. Why? Because the simplicity of cat (called a null filter, since it doesn’t change its input) makes it an ideal starting point for our explorations of Perl’s data-processing facilities.

Here’s an example of the hard way to emulate cat with Perl, using a script that takes an unnecessarily complex approach:

    #! /usr/bin/perl -w

    @ARGV or @ARGV = '-';    # with no arguments, read STDIN
    foreach my $file (@ARGV) {
        open IN, "< $file" or
            die "$0: Open of $file failed, code $!\n";
        while ( defined ($_=<IN>) ) {
            print $_;        # each line still carries its own newline
        }
        close IN or
            die "$0: Close of $file failed, code $!\n";
    }

Only masochists, paranoiacs, or programmers abused in their early years by the C language (e.g., squared JAPHs) would write a Perl program this way. [7] That’s because Perl provides facilities to automatically create the filtering infrastructure for you; all you have to do is ask for it!

[7] There are cases where it makes sense to write your own loops in Perl, as shown in chapter 10, but this isn’t one of them.

An equivalent yet considerably simpler approach is shown next. In this case, Perl’s input operator (<>) is used to automatically acquire data from filename arguments or STDIN (as detailed in chapter 10). Unlike the previous solution, this cat-like program is small enough to implement as a one-liner:

    perl -w -e 'while (<>) { print; }' file file2

But even this is too much coding! You’re busy, and typing is tiresome, error-prone, and likely to give you carpal tunnel syndrome, so you should try to minimize it (within reason). Accordingly, the ideal solution to writing a basic filter program in Perl is the following, which uses the n option:

    perl -wnl -e 'print;' file file2    # OPTIMALLY simple!

The beauty of this version is that it lets you focus on the filtering being implemented in the program, which in this case is no filtering at all: the program just prints every line it reads. That’s easy to see when you aren’t distracted by a dozen lines of boilerplate input-reading code, as you were with the scripted equivalent shown earlier.

Where did the while loop go? It’s still there, but it’s invisible, because the n option tells Perl, “Insert the usual input-reading loop for this Lazy programmer, with no automatic printing of the input lines.”

A fuller explanation of how the n option works is given in chapter 10. For the time being, just remember that it lets you forget about the mundane details of input processing so you can concentrate on the task at hand.

Believe it or not, there’s a way to write a cat-like program in Perl that involves even less typing:

    perl -wpl -e '' file file2    # OVERLY simple!

By now, you’re probably thinking that Perl’s reputation in some circles as a write-only language (i.e., one nobody can read) may be well deserved. That’s understandable, and we’ll return to this matter in a moment. But first, let’s discuss how this program works, which certainly isn’t obvious.

The p option requests the usual input-reading loop, but with automatic printing of each input line after it has been processed. In this case, no processing is specified, because there’s no program between those quotes. Yet it still works, because the p option provides the essential cat-like behavior of printing each input line.

This bizarrely cryptic solution is, frankly, a case of taking a good thing too far. It’s the kind of coding that may lead IT managers to wonder whether Larry has a screw loose somewhere, and to hope their competitors will hire as many Perl programmers as they can find.

Of course, it’s unwise to drive your colleagues crazy, and tarnish your reputation, by writing programs that appear to be grossly defective, even if they work!
For this reason, the optimally simple form shown previously with the n option and the explicit print statement is the approach used for most filter programs in Minimal Perl.

1.7 SUMMARY

As illustrated by the Traveler’s tale at the beginning of this chapter, and the cat-like filter programs we examined later, the Perl programmer often has the choice of writing a complex or a simple program to handle a particular task. You can use this flexibility to create programs that range from minor masterpieces of inscrutability (because they’re so tiny and mysterious) to major masterpieces of verbosity (because they’re so voluminous and long-winded).

The Perl subset I call Minimal Perl avoids programs at both ends of that spectrum, because they can’t be readily understood or maintained, and there are always concise yet readable alternatives that are more prudent choices.

To make Perl easier for Unix people to learn, Minimal Perl favors simple and compact approaches based on familiar features of Unix, including the use of invocation options to duplicate the input-processing behavior of Unix filter programs.

Minimal Perl exploits the power of Perl to indulge the programmer’s Laziness, which allows energy to be redirected from the mundane aspects of programming toward more productive uses of its capabilities. For instance, the n and p invocation options allow Lazy Perl programmers (those who strive to work efficiently) to avoid retyping the generic input-reading loop in every filter program they write for the rest of their Perl programming careers. As an additional benefit, using these options also lets them write many useful programs as one-line commands rather than as larger scripts.

In the next chapter, we’ll discuss several of Perl’s other invocation options.
Learning about them will give you a better understanding of the inner workings of the simple programs you’ve seen thus far and will prepare you for the many useful and interesting programs coming up in subsequent chapters.

CHAPTER 2
Perl essentials

2.1 Perl’s invocation options
2.2 Using variables
2.3 Loading modules: -M
2.4 Writing simple scripts
2.5 Additional special variables
2.6 Standard option clusters
2.7 Constructing programs
2.8 Summary

This chapter introduces the most essential features of Perl, to pave your way for the programming examples you’ll see in the following chapters. Among the topics we’ll cover here are the use of Perl’s special variables, how to write Perl one-line commands and scripts, and the fundamentals of using Perl modules.

But we don’t discuss everything you need to know about Perl in this chapter. Further details on this chapter’s topics (and more specialized ones not discussed here) are presented in later chapters, in the context of illustrative programming examples.

Some of the language features discussed here won’t be used until part 2 of the book, so it’s not necessary for you to read this chapter in its entirety right now. If you haven’t learned a computer programming language before, or if you have but you’re eager to get started with Perl, you should read the most important sections [1] now (2.1, 2.4.5, 2.5.1, and 2.6, including subsections) and then proceed to the next chapter.

[1] To help you spot them, the headings for these sections are marked with a special symbol.

This chapter will serve you well as a reference document, so you should revisit it when you need to brush up on any of its topics. To make this easy for you, when “essential” features are used in programs in the following chapters, cross-references will refer you back to the relevant sections in this chapter. Forward references are also provided, to help you easily find more detailed coverage in later chapters on topics introduced here.

We’ll begin our coverage of Perl with a discussion of its invocation options, because you’ve got to invoke Perl before you can do anything else with it. [2]

[2] A few of these options were discussed in chapter 1’s comparisons of easy and hard ways to write cat-like commands. To enhance the reference value of this chapter, these options are also included here.

2.1 PERL’S INVOCATION OPTIONS

An invocation option is a character (usually a letter) that’s preceded by a hyphen and presented as one of the initial arguments to a perl command. Its purpose is to enable special features for the execution of a program. Table 2.1 lists the most important invocation options.

Table 2.1 Effects of Perl’s most essential invocation options

Option      Provides                           Explanation
-e 'code'   Execution of code                  Causes Perl to execute code as a program. Used
                                               to avoid the overhead of a script’s file with
                                               tiny programs.
-w          Warnings                           Enables warning messages, which is generally
                                               advisable.
-n          Reading but no printing            Requests an implicit input-reading loop that
                                               stores records in $_.
-p          Reading and printing               Requests an implicit input-reading loop that
                                               stores records in $_ and automatically prints
                                               that variable after optional processing of its
                                               contents.
-l          Line-end processing                Automatically inserts an output record separator
                                               at the end of print’s output. When used with -n
                                               or -p, additionally does automatic chomping
                                               (removal of the input record separator from
                                               input records).
-0digits    Setting of input record separator  Defines the character that marks the end of an
                                               input record, using octal digits. The special
                                               case of -00 enables paragraph mode, in which
                                               empty lines mark ends of input records; -0777
                                               enables file mode, in which each file
                                               constitutes a single record.

Although each of the invocation options shown in table 2.1 is described under its own heading in the sections that follow, it’s not necessary to memorize what each one does, because they’re commonly used in only a handful of combinations. These combinations, which we call option clusters, consist of a hyphen followed by one or more options (e.g., -wnl).

Toward the end of this chapter, you’ll learn how to select the appropriate options for your programs using a procedure for selecting standard option clusters that takes the guesswork out of this important task.

First, we’ll describe what the individual options do.

2.1.1 One-line programming: -e

The purpose of Perl’s e invocation option is to identify the next argument as the program to be executed. This allows a simple program to be conveyed to perl as an interactively typed command rather than as a specially prepared file called a script.

As an example, here’s a one-line command that calculates and prints the result of dividing 42 by 3:

    $ perl -wl -e 'print 42/3;'
    14

The division of 42 by 3 is processed first, and then the print function receives 14 as its argument, which it writes to the output.

We’ll discuss the w option used in that command’s invocation next and the l option shortly thereafter.

2.1.2 Enabling warnings: -w

Wouldn’t it be great if you could have Larry and his colleagues discreetly critique your Perl programs for you? That would give you an opportunity to learn from the masters with every execution of every program. That’s effectively what happens when you use the w option to enable Perl’s extensive warning system. In fact, Perl’s warnings are generally so insightful, helpful, and educational that most programmers use the w option all the time.

As a practical example, consider which of the following runs of this program provides the more useful output:

    $ perl -l -e 'print $HOME;'     # Is Shell variable known to Perl?
    (no output)
    $ perl -wl -e 'print $HOME;'    # Apparently not!
    Name "main::HOME" used only once: possible typo at -e line 1.
    Use of uninitialized value in print at -e line 1.

The messages indicate that Perl was unable to print the value of the variable $HOME (because it was neither inherited from the Shell nor set in the program). Because there’s usually one appearance of a variable when its value is assigned and another when the value is retrieved, it’s unusual for a variable name to appear only once in a program. As a convenience to the programmer, Perl detects this condition and warns that the variable’s name may have been mistyped (“possible typo”). [3]

[3] You could say the variable’s name was grossly mistyped, because in Perl this Shell variable is accessed as a member of an associative array (a.k.a. a hash) using $ENV{HOME}, as detailed in chapter 9.

You’d be wise to follow the example of professional Perl programmers. They use the w option routinely, so they hear about their coding problems in the privacy of their own cubicles, rather than having them flare up during high-pressure customer demos instead!

The option we’ll cover next is also extremely valuable.

2.1.3 Processing input: -n

Many Unix utilities (grep, sed, sort, and so on) are typically used as filter programs: they read input and then write some variation on it to the output. Here’s an example of the Unix sed command inserting spaces at the beginning of each input line using its substitution facility, which typically appears in the form s/search-string/replacement-string/g: [4]

    $ cat seattleites
    Torbin Ulrich 98107
    Yeshe Dolma 98117
    $ sed 's/^/    /g' seattleites
        Torbin Ulrich 98107
        Yeshe Dolma 98117

The ^ symbol in the search-string field represents the beginning of the line, causing the spaces in the replacement-string to be inserted there before the modified line is sent to the output.

Here’s the Perl counterpart to that sed command, which uses a sed-like substitution operator (described in chapter 4).
Notice the need for an explicit request to print the resulting line, which isn’t needed with sed:

    $ perl -wnl -e 's/^/    /g; print;' seattleites
        Torbin Ulrich 98107
        Yeshe Dolma 98117

This command works like the sed command does: by processing one line at a time, taken from files named as arguments or from STDIN, using an implicit loop (provided by the n option). (For a more detailed explanation, see section 10.2.4.)

This example also provides an opportunity to review an important component of Perl syntax. The semicolons at the ends of the sed-like substitution operator and the print function identify each of them as constituting a complete statement, and that’s important! If the semicolon preceding print were missing, for example, that word would be associated with the substitution operator rather than being recognized as an invocation of the print function, and a fatal syntax error would result.

Because sed-like processing is so commonly needed in Perl, there’s a provision for obtaining it more easily, as shown next.

[4] Are you wondering about the use of the “global” replacement modifier (/g)? Because it’s needed much more often than not, it’s used routinely in Minimal Perl and removed only in the rare cases where it spoils the results. It’s shown here for both the sed and perl commands for uniformity.

2.1.4 Processing input with automatic printing: -p

You request input processing with automatic printing after (optional) processing by using the p option in place of n:

    $ perl -wpl -e 's/^/    /g;' seattleites    # "p" does the printing
        Torbin Ulrich 98107
        Yeshe Dolma 98117

This coding style makes it easier to concentrate on the primary activity of the command (the editing operation), and it’s no coincidence that it makes the command look more like the equivalent sed command shown earlier. That’s because Larry modeled the substitution operator on the syntax of sed (and vi) to make Perl easier for UNIX users to learn.
Like the Shell’s echo command, Perl’s print can automatically generate newlines, as you’ll see next.

2.1.5 Processing line-endings: -l

Before discussing how automatic processing of record separators works in Perl, we first need to define some terms.

A record is a collection of characters that’s read or written as a unit, and a file is a collection of records. When you’re dealing with text files, each individual line is considered to be a separate record by default. The particular character, or sequence of characters, that marks the end of the record being read is called the input record separator. On Unix systems, that’s the linefeed character by default; but for portability and convenience, Perl lets you refer to the OS-specific default input record separator (whatever it may be) as \n, which is called newline.

Perl normally retains the input record separator as part of each record that’s read, so it’s still there if that record is printed later. However, with certain kinds of programs, it’s a great convenience to have the separators automatically stripped off as input is read, and then to have them automatically replaced when output is written by print. This effect is enabled by adding the l option to n or p with perl’s invocation.

To see what difference that option makes, we’ll compare the outputs of the following two commands, which print the number of each input line (but not the input lines themselves). The numbers are provided by the special variable “$.” (covered in table 2.2), which automatically counts records as they’re processed.

First, here’s a command that omits the l option and features a special Shell prompt (go$) for added clarity:

    go$ perl -wn -e 'print $.;' three_line_file    # file has three lines
    123go$

The output lines are scrunched together, because the “$.” variable doesn’t contain a newline, and nothing else in the program causes one to be issued after each print.
Notice also that the Shell’s prompt for the next command isn’t where it should be: at the beginning of a fresh line. That’s about as unnerving as a minor earthquake to the average Unix user!

In contrast, when the l option is used, a newline is automatically added at the end of print’s output:

    go$ perl -wnl -e 'print $.;' three_line_file
    1
    2
    3
    go$

For comparison, here’s how you’d achieve the same result without using l:

    $ perl -wn -e 'print $. , "\n";' three_line_file

This approach requires an explicit request to print the newline, which you make by adding the "\n" argument to print. Doing that doesn’t require a substantial amount of extra typing in this tiny program, but the extra work would be considerable in programs having many print statements. To avoid the effort that would be wasted in routinely typing "\n" arguments for print statements, Minimal Perl normally uses the l option.

Of course, in some situations it’s desirable to omit the newline from the end of an output line, as we’ll discuss next.

2.1.6 Printing without newlines: printf

In most programs that read input, using the l option offers a significant benefit, and in the others, it usually doesn’t hurt. However, there is a situation where the processing provided by this option is undesirable. Specifically, any program that outputs a string using print that should not be terminated with a newline will be affected adversely, because the l option will dutifully ensure that it gets one.

This situation occurs most commonly in programs that prompt for input. Here’s an example, based on the (unshown) script called order, which writes its output using print but doesn’t use the l option:

    $ order
    How many robotic tooth flossers? [1-200]: 42
    We'll ship 42 tomorrow.

Here’s another version of that script, which uses the l option:

    $ order2    # using -l option
    How many robotic tooth flossers? [1-200]:
    42
    We'll ship 42 tomorrow.
As computer users know, it’s conventional for input to be accepted at the end of a prompt, not on the line below it, as in the case of order2. This can be accomplished by using the printf function rather than print, because printf is immune to the effects of the l option. [5]

[5] The name printf refers to this function’s ability to do formatted printing when multiple arguments are provided (as opposed to the single-argument case we’re using for prompting).

Accordingly, the order2 script can be adjusted to suppress the prompt’s newline by adding one letter to its output statement, as shown in the second line that follows:

    print  "How many robotic flossers? [1-200]:";    # -l INcompatible
    printf "How many robotic flossers? [1-200]:";    # -l compatible

In summary, the l option is routinely used in Minimal Perl, and printf is used in place of print when an automatic newline isn’t desired at the output’s end (as in the case of prompts).

Tip on using printf for prompting

In its typical (non-prompting) usage, printf’s first argument contains % symbols that are interpreted in a special way. For this reason, if your prompt string contains any % symbols, you must double each one (%%) to get them to print correctly.

2.1.7 Changing the input record separator: -0digits

When the n or p option is used, Perl reads one line at a time by default, using an OS-appropriate definition of the input record separator to find each line’s end. But it’s not always desirable to use an individual line as the definition of a record, so Perl (unlike most of its UNIX predecessors) has provisions for changing that behavior.

The most common alternate record definitions are for units of paragraphs and files, with a paragraph being defined as a chunk of text separated by one or more empty lines from the next chunk.

The input record separator of choice is specified using special sequences of digits with the -0digits option, shown earlier in table 2.1.
Look how the behavior of the indenting program you saw earlier is altered as the record definition is changed from a line, to a paragraph, to a file (F represents the space character):

    $ perl -wpl -e 's/^/FFFF/g;' memo          # default is line mode
    FFFFThis is the file "memo", which has these
    FFFFlines spread out like so.

    FFFFAnd then it continues to a
    FFFFsecond paragraph.

    $ perl -00 -wpl -e 's/^/FFFF/g;' memo      # paragraph mode
    FFFFThis is the file "memo", which has these
    lines spread out like so.

    FFFFAnd then it continues to a
    second paragraph.

    $ perl -0777 -wpl -e 's/^/FFFF/g;' memo    # file mode
    FFFFThis is the file "memo", which has these
    lines spread out like so.

    And then it continues to a
    second paragraph.

In all cases, indentation is inserted at the beginning of each record (signified by ^ in the search-string field), although it may not look that way at first glance.

[...]