Web Server Programming, part 3 (PDF)

(Sorted alphabetically)

    Sorted: -11 100 26 3 3001 49 78

The default alphabetic sort order can be changed by supplying your own sort helper subroutine. Helper functions for sorting are a little atypical of user-defined routines, but they are not hard to write. Your routine is called to return the result of a comparison operation on two elements from the array; these elements will have been placed in the global variables $a and $b prior to the call to your subroutine. (This use of specific global variables is what makes sort subroutines different from other programmer-defined routines.) The following code illustrates the definition and use of a sort helper subroutine, numeric_sort.

    #!/share/bin/perl -w
    sub numeric_sort {
        if($a < $b) { return -1; }
        elsif($a == $b) { return 0; }
        else { return 1; }
    }

    @list2 = ( 100, 26, 3, 49, -11, 3001, 78);
    @slist2 = sort @list2;
    print "List2 @list2\n";
    print "Sorted List2 (default sort) @slist2\n";
    @nlist2 = sort numeric_sort @list2;
    print "Sorted List2 (numeric sort) @nlist2\n";

Perl has a special <=> operator for numeric comparisons; using this operator, the numeric sort function can be simplified:

    sub numeric_sort { $a <=> $b }

Perl permits in-line definition of sort helper functions, allowing constructs such as:

    @nlist2 = sort { $a <=> $b } @list2;

5.6.2 Two simple list examples

Many simple databases and spreadsheets have options that let you get a listing of their contents as a text file. Such a file will contain one line for each record; fields in the record will be separated in the file by some delimiter character (usually the tab or colon character).
For example, a database that recorded the names, roles, departments, rooms and phone numbers of employees might be dumped to a file in a format like the following:

    J.Smith:Painter:Buildings & Grounds::3456
    T.Smythe:Audit clerk:Administration:15.205:3383
    A.Solly:Help line:Sales:8.177:4222

Perl programs can be very effective for processing such data. The input lines can be broken into lists of elements. The simplest way is to use Perl's split() function, as illustrated in this example, but there are alternative ways involving more complex uses of regular expression matchers. Once the data are in lists, Perl can easily manipulate the records and so produce reports such as reverse telephone directories (mapping phone numbers to people), listings of employees with no specified room number, and so forth.

The following little program (which employs a few Perl 'tricks') generates a report that identifies those employees who have no assigned room:

    while(<STDIN>) {
        @line = split /:/ ;
        $room = $line[3];
        if(!$room) { print $line[0], "\n" ; }
    }

The main 'trick' here is the use of Perl's 'anonymous' variable. The statement while(<STDIN>) clearly reads in the next line of input and tests for the end of the input, but it is not explicit as to where that input line is stored. In many places like this, Perl allows the programmer to omit reference to an explicit variable; if the context requires a variable, Perl automatically substitutes the 'anonymous variable' $_. (This feature is a part of the high whipitupitude level of the Perl language: you don't have to define variables whose role is simply to hold data temporarily.) The while statement is really equivalent to while(defined($_ = <STDIN>)) { }.

The split function is then used to break the input line into separate elements. This function is documented, in the perlfunc section, as one of the regular expression and pattern matching functions.
The split function has the following usages:

    split /PATTERN/,EXPR,LIMIT
    split /PATTERN/,EXPR
    split /PATTERN/

It splits the string given by EXPR. The PATTERN element is a regular expression specifying the characters that form the element separators; here it is particularly simple: the pattern specifies the colon character used in the example data. The LIMIT element is optional: it restricts the split to at most n elements, with any unsplit remainder of the string left in the last element.

The example code uses the simplest form of split, with merely the specification of the separator pattern. Here split is implicitly operating on the anonymous variable $_, which has just been assigned a string representing the next line of input. The list resulting from the splitting operation is assigned to the list variable @line.

The room was the fourth field of the lines in the dump file from the database. Array indexing style operations allow this scalar value to be extracted from the list/array @line. If this is empty (an empty string, or 'undef', that is undefined, in Perl), the employee's name is printed.

In this example, only one element of the list was required; array-style subscripting is the appropriate way to extract the data. If more of the data were to be processed, then rather than code like the following:

    $name = $line[0];
    $role = $line[1];
    $department = $line[2];

one can use a list literal as an lvalue:

    ($name, $role, $department) = @line;

This statement copies the first three elements from the list @line into the named scalar variables. It is also possible to select a few elements into scalars, and keep the remaining elements in another array:

    ($name, $role, $department, @rest) = @line;

Use of list literals would allow the first example program to be simplified to:

    while(<STDIN>) {
        ($name, $role, $department, $room, $phone) = split /:/ ;
        if(!$room) { print $name, "\n" ; }
    }

The second example is a program to produce a 'keyword in context' index for a set of film titles.
The input data for this program are the film titles, one title per line, with keywords capitalized. Example data could be:

    The Matrix
    The Empire Strikes Back
    The Return of the Jedi
    Moulin Rouge
    Picnic at Hanging Rock
    Gone with the Wind
    The Vertical Ray of the Sun
    Sabrina
    The Sound of Music
    Captain Corelli's Mandolin
    The African Queen
    Casablanca

From these data, the program is to produce a permuted keyword in context index of the titles:

                        The   African Queen
         The Empire Strikes   Back
                              Captain Corelli's Mandolin
                              Casablanca
                    Captain   Corelli's Mandolin
                        The   Empire Strikes Back
                              Gone with the Wind
                  Picnic at   Hanging Rock
          The Return of the   Jedi
          Captain Corelli's   Mandolin
                        The   Matrix
                              Moulin Rouge
               The Sound of   Music
                              Picnic at Hanging Rock
                The African   Queen
               The Vertical   Ray of the Sun
                        The   Return of the Jedi
          Picnic at Hanging   Rock
                     Moulin   Rouge
                              Sabrina
                        The   Sound of Music
                 The Empire   Strikes Back
    The Vertical Ray of the   Sun
                              The African Queen
                              The Empire Strikes Back
                              The Matrix
                              The Return of the Jedi
                              The Sound of Music
                              The Vertical Ray of the Sun
                        The   Vertical Ray of the Sun
              Gone with the   Wind

The program has to loop, reading and processing each line of input (film title). Given a line, the program must find the keywords; these are the words that start with a capital letter. For each keyword, the program must generate a string with the context, separating the words before the keyword from the keyword and the remainder of the words in the line. This generated string must be added to a collection. When all data have been read, the collection has to be sorted using a specialized sort helper routine. Finally, the sorted list is printed. (The actual coding could be made more efficient; the mechanisms used have been selected to illustrate a few more of Perl's standard features.)
The code (given in full later) has the general structure:

    @collection = ();
    # read loop
    while($title = <STDIN>) {
        chomp($title);
        @Title = split / /,$title;
        foreach $i (0..$#Title) {
            $Word = $Title[$i];
            # if keyword, then generate another output line
            # and add to collection
        }
    }
    # sort collection using special helper function
    @sortcollection = sort by_keystr @collection;
    # print the sorted data
    foreach $entry (@sortcollection) { print $entry; }

Each output line consists in effect of a list of words (the words before the keyword) printed right justified in a fixed width field, a gap of a few spaces, and then the keyword and remaining words printed left justified. These lines have to be sorted using an alphabetic ordering that uses the sub-string starting at the keyword. The keyword starts after column 50, so we require a special sort helper routine that picks out these sub-strings.

The sort routine is similar to the numeric_sort illustrated earlier. It relies on the convention that, before the routine is called, the global variables $a and $b will have been assigned the two data elements (in this case report lines) that must be compared.

    sub by_keystr {
        my $str1 = substr($a,50);
        my $str2 = substr($b,50);
        if($str1 lt $str2) { return -1; }
        elsif($str1 eq $str2) { return 0; }
        else { return 1; }
    }

This subroutine requires local variables to store the two sub-strings. Perl permits the declaration of variables whose scope is limited to the body of a function (or to an inner block in which they are declared). These variables are declared with the keyword my; here the sort helper function has two local variables, $str1 and $str2. These contain the sub-strings starting at position 50 from the two generated lines. The lt and eq comparisons done on these strings could be simplified using Perl's cmp operator (it is a string version of the <=> operator mentioned in the context of the numeric sort helper function).
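As a rough sketch of the cmp simplification mentioned above, the helper below replaces the lt/eq chain with a single comparison. The $KEY_COLUMN variable and the three sample report lines are illustrative assumptions, not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed name for the column where the keyword field starts (50 in the text).
my $KEY_COLUMN = 50;

# Sort helper: compare the sub-strings that start at the keyword column.
# cmp yields -1, 0 or 1, so no if/elsif/else chain is needed.
sub by_keystr {
    return substr($a, $KEY_COLUMN) cmp substr($b, $KEY_COLUMN);
}

# Three sample report lines (made up here) in the layout the text describes:
# words before the keyword right justified, keyword and the rest left justified.
my @collection = map { sprintf "%50s %-50s\n", @$_ } (
    [ "The ",         "Matrix "     ],
    [ "",             "Casablanca " ],
    [ "The African ", "Queen "      ],
);

my @sorted = sort by_keystr @collection;
print @sorted;
```

Because every line carries the same separator character at the key column, ordering is decided by the keyword and the words that follow it.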
The body of the main while loop works by splitting the input line into a list of words and then processing this list.

    while($title = <STDIN>) {
        chomp($title);
        @Title = split / /,$title;
        foreach $i (0..$#Title) {
            $Word = $Title[$i];
        }
    }

Each word must be tested to determine whether it is a keyword. This can be done using a simple regular expression match. The pattern in this regular expression specifies that there must be an upper-case letter at the beginning of the string held in $Word:

    if($Word =~ /^[A-Z]/) { }

The =~ operator is Perl's regular expression matching operator; this is used to invoke the comparison of the value of $Word and the /^[A-Z]/ pattern. (Regular expressions are covered in more detail in Section 5.11. Here the ^ symbol signifies that the pattern must be found at the start of the string; the [A-Z] construct specifies the requirement for a single letter taken from the set of all capital letters.)

If the current word is classified as a keyword, then the words before it are combined to form the start string, and the keyword and remaining words are combined to form an end string. These strings can then be combined to produce a line for the final output. This is achieved using the sprintf function (the same as that in C's stdio library). The sprintf function creates a string in memory, returning this string as its result. Like printf, sprintf takes a format string and a list of arguments.
The output lines shown can be produced using the statement:

    $line = sprintf "%50s %-50s\n", $start, $end;

The complete program is:

    #!/usr/bin/perl
    sub by_keystr {
        my $str1 = substr($a,50);
        my $str2 = substr($b,50);
        if($str1 lt $str2) { return -1; }
        elsif($str1 eq $str2) { return 0; }
        else { return 1; }
    }

    @collection = ();
    while($title = <STDIN>) {
        chomp($title);
        @Title = split / /,$title;
        $start = "";
        foreach $i (0..$#Title) {
            $Word = $Title[$i];
            if($Word =~ /^[A-Z]/) {
                $end = "";
                for($j=$i;$j<=$#Title;$j++) { $end .= $Title[$j]." "; }
                $line = sprintf "%50s %-50s\n", $start, $end;
                push(@collection, $line);
            }
            $start .= $Word." ";
        }
    }
    @sortcollection = sort by_keystr @collection;
    foreach $entry (@sortcollection) { print $entry; }

In Perl, there is always another way! Another way of building the $end string would use Perl's join function and an array slice:

    $end = join ' ', @Title[$i..$#Title];

Perl's join function (documented in perlfunc) has two arguments: an expression and a list. It builds a string by joining the separate strings of the list, and the value of the expression is used as a separator element.

5.7 Subroutines

Perl comes with libraries of several thousand subroutines; often the majority of your work can be done using existing routines. However, you will need to define your own subroutines, if simply to tidy up your code and avoid excessively large main-line programs. Perl routines are defined as:

    sub name block

A routine has a return value; this is either the value of the last statement executed or a value specified in an explicit return statement. Arguments passed to a routine are combined into a single list, @_. Individual arguments may be isolated by indexing into this list, or by using a list literal as an lvalue. As illustrated with the sort helper function in the last section, subroutines can define their own local scope variables.
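The two ways of isolating arguments from @_, and the two ways of producing a return value, can be seen side by side in a minimal sketch. The routine names largest and first_plus_second are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical routine: returns the largest of its numeric arguments.
sub largest {
    my ($max, @rest) = @_;          # list literal as an lvalue splits up @_
    foreach my $value (@rest) {
        $max = $value if $value > $max;
    }
    return $max;                    # explicit return statement
}

# Hypothetical routine: isolates its arguments by indexing into @_.
sub first_plus_second {
    $_[0] + $_[1];                  # value of the last statement is returned
}

print largest(100, 26, 3, 49, -11, 3001, 78), "\n";
print first_plus_second(2, 3), "\n";
```

The my declarations keep $max, @rest and $value local to the routine, in the same way as $str1 and $str2 in the sort helper.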
Many more details of subroutines are given in the perlsub section of the documentation.

Parentheses are optional in calls to subroutines that have already been declared:

    Process_data($arg1, $arg2, $arg3);

is the same as

    Process_data $arg1, $arg2, $arg3;

The 'ls -l' example in Section 5.5.2 had to convert a string such as 'drwxr-x---' into the equivalent octal code; a subroutine to perform this task would simplify the main line code. A definition for such a routine is:

    sub octal {
        my $str = $_[0];
        my $code = 0;
        for(my $i=1;$i<10;$i++) {
            $code *= 2;
            $code++ if("-" ne substr($str,$i,1));
        }
        return $code;
    }

This subroutine could be invoked:

    $str = "-rwxr-x---";
    $accesscode = octal $str;

For a second example, consider a subroutine to determine whether a particular string is present in a list:

    member(item,list);

As noted above, the arguments for a routine are combined into a single list; they have to be split apart in the routine. The processing involves a foreach loop that checks whether the next list member equals the desired string:

    sub member {
        my($entry,@list) = @_;   # separate the arguments
        foreach $memb (@list) {
            if($memb eq $entry) { return 1; }
        }
        return 0;
    }

Actually, there is another way. There is no need to invent a member subroutine because Perl already possesses a generalized version in its grep routine.

    grep match_criterion datalist

When used in a list context, grep produces a sub-list of those members of datalist that satisfy the test. When used in a scalar context, grep returns the number of members of datalist that satisfy the requirements.

5.8 Hashes

Perl's third main data type is a 'hash'. A hash is essentially an associative array that relates keys to values. An example would be a hash structure that relates the names of suburbs to their postcodes. A reference to a hash uses the % type qualifier on a name, so one could have a hash %postcode. Hashes are dynamic, just like lists: you can start with an empty hash and add (key/value) pairs.
Typically, most of your code will reference individual elements of a hash rather than the hash structure as a whole. The hash structure itself might be referenced in iterative constructs that loop through all key/value pairs. References to elements appear in scalar contexts, with a key being used like an 'array subscript' to index into the hash. A hash for a suburb/postcode mapping could be constructed as follows:

    $postcode{"Wollongong"} = 2500;
    $postcode{"Unanderra"} = 2526;
    $postcode{"Dapto"} = 2530;
    $postcode{"Figtree"} = 2525;

The {} characters are used when indexing into a hash. The first statement would have implicitly created the hash %postcode; the subsequent statements add key/value pairs. The contents of the hash could then be printed:

    while(($suburb,$code) = each(%postcode)) {
        printf "%-20s %s\n" , $suburb, $code;
    }

Every hash has an implicit iterator associated with it; this can be used via the each function. The each function will return a two-element list with the next key/value pair; after the last pair has been returned, the next call to each will return an empty list; if each is again called, it restarts the iteration at the beginning of the hash. In the example code, each is used to control a loop printing data from the hash. Naturally, given that it is a hash, the elements are returned in an essentially arbitrary order.

Another way of iterating through a hash is to get a list with all the keys by applying the keys function to the hash and using a foreach loop:

    @keylist = keys(%postcode);
    foreach $key (@keylist) {
        print $key, ":\t", $postcode{$key}, "\n";
    }

If you need only the values from the hash, then you can obtain these by applying the values function to the hash. The delete function can be used to remove an element: delete $postcode{"Dapto"}.

Hashes and lists can be directly inter-converted: @data = %postcode; the resulting list is made up of a sequence of key/value pairs.
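The operations just described (keys, values, delete, and conversion of a hash to a list) can be tried together in a short sketch. It reuses the suburb/postcode data from the text; the variable names @codes and @data and the choice to sort the output are assumptions made here for a reproducible listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %postcode;
$postcode{"Wollongong"} = 2500;
$postcode{"Unanderra"}  = 2526;
$postcode{"Dapto"}      = 2530;
$postcode{"Figtree"}    = 2525;

# Remove one element; the hash now holds three key/value pairs.
delete $postcode{"Dapto"};

# values returns the values in arbitrary order, so sort them numerically.
my @codes = sort { $a <=> $b } values(%postcode);

# Converting the hash to a list flattens it into key, value, key, value, ...
my @data = %postcode;

# Sorting the keys gives a stable report despite the arbitrary hash order.
foreach my $suburb (sort keys(%postcode)) {
    printf "%-20s %s\n", $suburb, $postcode{$suburb};
}
print "Codes: @codes\n";
print "Flattened list has ", scalar(@data), " elements\n";
```

Sorting the keys before printing is a common way of working around the arbitrary order in which a hash returns its elements.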
A list with an even number of elements can similarly be converted directly to a hash; the first element is a key, the second is the corresponding value, the third list element is the next key, and so forth.

If the reverse function is applied to a hash, you get a hash with the roles of the keys and values interchanged:

    %pc = reverse %postcode;
    while(($k,$v) = each(%pc)) {
        printf "%-20s %s\n" , $k, $v;
    }

(You can 'lose' elements when reversing a hash; for example, if the original hash listed two suburbs that shared the same postcode, $postcode{"Wollongong"}=2500; $postcode{"Mangerton"}=2500; then only one record would appear in the reversed hash, mapping key 2500 to one or other of the suburbs.)

There are a number of ways to initialize a hash. First, you could explicitly assign values to the elements of the hash:

    # Amateur Drama's Macbeth production
    # cast list
    $cast{"First witch"} = "Angie";
    $cast{"Second witch"} = "Karen";
    $cast{"Third witch"} = "Sonia";
    $cast{"Duncan"} = "Peter";
    $cast{"Macbeth"} = "Phillip";
    $cast{"Lady Macbeth"} = "Joan";
    $cast{"Gentlewoman 3"} = "Holly";

Alternatively, you could create the hash from a list:

    @cast = ("First witch", "Angie","Second witch", "Karen","Third witch", "Sonia",
        "Duncan", "Peter", "Macbeth", "Phillip", "Banquo", "John","Lady Macduff", "Lois",
        "Porter", "Neil", "Lennox",
        [...]
        "Ian","Seyton", "Jeffrey","Fleance", "Will", "Donaldbain",
        "Gentlewoman 3", "Holly");
    %cast = @cast;

Lists like that get unreadable, and you are likely to mess up the pairings of keys and values. Hence a third mechanism is available:

    %cast = ("First witch" => "Angie", "Donaldbain" => "Willy",
        "Menteith" => "Tim", "Gentlewoman 3" => "Holly");

It is also possible to obtain slices of hashes; one use...

[...]

    ... string : This program cost $0
    Dollars 0 and cents 0
    Enter string : This program should cost $34.99
    Dollars 34 and cents 99
    Enter string : qUIT

Often, you need a pattern like:

    - Some fixed text;
    - A string whose value is arbitrary, but is needed for processing;
    - Some more fixed text.

You use * to match an arbitrary string; so if you were seeking to extract the sub-string...

[...]

    ... ;
    if($str =~ /Quit/i) { last; }
    if($str A FAIRLY COMPLEX MATCH PATTERN!)
    {
        # Replace x:=x+1 by x++, similarly x--
        if(($3==1) && ($2 eq "+")) { print "\t$1++;\n"; }
        elsif(($3==1) && ($2 eq "-")) { print "\t$1--;\n"; }
        # Replace x:=x+y by x+=y, similarly for -, *, /
        else { print "\t$1 $2= $3;\n"; }
    }
    else { print "$str\n"; }
    }

The pattern needed here is:

    /\s*([A-Za-z]\w*) *:= *\1 *(\+|\*|\/|-) *(([0-9]+)|([A-Za-z]\w*))...

[...]

... know a few letters: 'starts with ab, has three more unknown letters, and ends with either t or f depending on the right answer for 13-across'. How to solve this? Easy: search a dictionary for all the words that match the pattern. Most Unix systems contain a small 'dictionary' (about 20 000 words) in the file /usr/dict/words; the words are held one per line and there are...

[...]

... is to know is whether input text matched a pattern. More commonly, you want to further process the specific data that were matched. For example, you hope that data from your web form contain a valid credit card number, a sequence of 13 to 16 digits. You would not simply want to verify the occurrence of this pattern; what you would want to do is to extract the digit sequence that was matched, so that you...

[...]

... global variables defined in the Perl core. The groups of pattern elements, whose matches in the string are required, are placed in parentheses. So, a pattern for extracting a 13 to 16 digit sub-string from some longer string could be /\D(\d{13,16})\D/; if a string matches this pattern, the variable $1 will hold the digit string. The following example illustrates the extraction of two fields from an input line...

[...]

For these, you need a pattern that:

    - Matches a name (Lvalue); this is to be matched sub-string $1.
    - Matches Pascal's := assignment operator.
    - Matches another name that is identical to the first thing matched, so you need back reference \1 in the pattern.
    - Matches a Pascal +, -, *, / operator; this is to be matched sub-string $2.
    - Matches either a number or another name; match sub-string $3...

[...]

    ... pattern to occur 1 or more times
    {n,} {n,m}    Pattern to occur n times, or more, or the range n to m times

Examples of patterns with quantifiers are:

    / +/                       Requires a span of space characters
    [0-9]{13,16}               Requires 13 to 16 decimal digits (as in a credit card number)
    (\+|-)?[0-9]+\.?[0-9]*     An optional + or - sign, one or more digits, an optional
                               decimal point, optionally more digits, i.e. a signed number with...

[...]

... core includes essentially all the Unix system calls that are documented in Unix's man 2 documentation, and also has equivalents for the functions in many of the C libraries documented in man 3. Perl's functions are documented in the perlfunc section of the documentation. These functions make it easy for Perl programs to search directories, rename and copy files, launch sub-processes...

[...]

... the key $dept in the main %machines hash structure. The final report generated will be in the desired format:

    accounting
        red 209.208.207.1
        blue 209.208.207.2
    sales
        jabberwok 209.208.207.46

5.12.3 A 'systems programming' example

This slightly larger example illustrates the kind of task that can be automated using Perl scripting for file and process manipulation. The task could also be solved with a shell...
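The capture-group pattern quoted in the excerpts above, /\D(\d{13,16})\D/, can be exercised with a small sketch. The helper name extract_card_number and the sample strings are assumptions introduced here; the digit strings are made-up test data, not real card numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the 13 to 16 digit run found in $text,
# or undef when no such run is present.
sub extract_card_number {
    my ($text) = @_;
    # Parentheses mark the group whose match is wanted; after a successful
    # match, Perl places that sub-string in $1.
    if ($text =~ /\D(\d{13,16})\D/) {
        return $1;
    }
    return;
}

foreach my $input ("card: 1234567890123456 exp 12/05",
                   "phone: 123456789012 home") {
    my $number = extract_card_number($input);
    if (defined $number) {
        print "Found digit string $number\n";
    } else {
        print "No 13 to 16 digit sequence in: $input\n";
    }
}
```

Note that the pattern as quoted demands a non-digit character on each side of the digits, so a digit string at the very start or end of the input would not match; anchoring alternatives would be needed to cover those cases.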
