There are also variants in how the print function can be used. It is possible to use the print operator conditionally in the following way. The following code is included in the file MatchAlternativeChomp2.pl in the code download:

    print "Enter a string. It will be matched against the pattern '/Star/i'.\n\n";
    chomp (my $myTestString = <STDIN>);

The if statement is included in the same line as the print operator after the string to be printed:

    print "There is a match for '$myTestString'." if ($myTestString =~ m/Star/i);

The !~ operator in the test for the if statement means “There is not a match”:

    print "There is no match for '$myTestString'." if ($myTestString !~ m/Star/i);

It isn’t necessary to write the pattern as a literal inside the m// operator. You have the option to match against a variable. Matching against a variable is useful when you want to match against the same pattern more than once in your code.

Try It Out Matching Against a Variable

1. Type the following code in your chosen editor, and save the code as MatchUsingVariable.pl:

    #!/usr/bin/perl -w
    use strict;
    my $myPattern = "^\\d{5}(-\\d{4})?\$";
    print "Enter a US Zip Code: ";
    my $myTestString = <STDIN>;
    chomp ($myTestString);
    print "You entered a Zip code.\n\n" if ($myTestString =~ m/$myPattern/);
    print "The value you entered wasn't recognized as a US Zip code." if ($myTestString !~ m/$myPattern/);

2. Run the code in Komodo or at the command line. When prompted, enter the test string 12345, and inspect the displayed result.

3. Run the code again (F3 if you are using the Windows command line). When prompted, enter the test string 12345-6789, and inspect the displayed result.

4. Run the code again. When prompted, enter the test string Hello world! and inspect the result, as shown in Figure 26-14.

Figure 26-14

674 Chapter 26 29_574892 ch26.qxd 1/7/05 11:07 PM Page 674

How It Works

First, the variable $myPattern is declared and assigned the pattern ^\d{5}(-\d{4})?$.
Notice that because the pattern is written inside a double-quoted string, you must precede the \d metacharacter and the $ metacharacter with an extra backslash character. The pattern uses the positional metacharacters ^ and $ to indicate that the pattern must match all of the test string. The pattern matches either a test string of five numeric digits, as indicated by \d{5}, which is the abbreviated form of a U.S. Zip code, or a sequence of five numeric digits followed by a hyphen and four numeric digits, which is the extended version of a U.S. Zip code. The optional part is indicated by (-\d{4})?. The -\d{4} is grouped inside paired parentheses, so the ? quantifier indicates that all of -\d{4} is optional:

    my $myPattern = "^\\d{5}(-\\d{4})?\$";

Next, the user is invited to enter a Zip code. The input is captured from the standard input using <STDIN>, and chomp() is used to remove the newline character at the end of $myTestString:

    print "Enter a US Zip Code: ";
    my $myTestString = <STDIN>;
    chomp ($myTestString);

Then two print statements are used, each with an if statement and corresponding test that determines whether or not anything is displayed. The if statement on the first of the following lines means that the message is output if there is a match. The if statement on the last line causes the text to be displayed if there is no match:

    print "You entered a Zip code.\n\n" if ($myTestString =~ m/$myPattern/);
    print "The value you entered wasn't recognized as a US Zip code." if ($myTestString !~ m/$myPattern/);

Using Other Regular Expression Delimiters

The flexibility of Perl also includes a syntax to specify alternative characters to delimit a regular expression pattern. The default regular expression delimiters in Perl are paired forward slashes, as in the following:

    my $myTestString = "Hello world!";
    $myTestString =~ /world/;

However, Perl allows developers to use other characters as regular expression delimiters, if the m is specified.
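One situation where alternative delimiters help is a pattern that itself contains forward slashes. The following is a minimal sketch of that idea, not code from the chapter's download: the variable names and the test URL are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test string, chosen because it contains forward slashes.
my $url = "http://www.example.com/index.html";

# With the default delimiters, every literal / in the pattern must be escaped.
my $matchedWithSlashes = ($url =~ m/^http:\/\//) ? 1 : 0;

# With paired curly braces as delimiters, the slashes need no escaping.
my $matchedWithBraces = ($url =~ m{^http://}) ? 1 : 0;

print "Default delimiters matched: $matchedWithSlashes\n";
print "Curly brace delimiters matched: $matchedWithBraces\n";
```

Both tests apply the same pattern; only the amount of escaping differs, which is the usual reason for choosing a nondefault delimiter.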
Personally, I find it easiest to stick with the paired forward slashes almost all the time. However, because Perl provides the flexibility to use other characters, matches or substitutions that use other delimiters can be confusing to read if you aren’t aware that Perl allows this.

The following example shows how matched curly braces, paired exclamation marks, and paired period (dot) characters can be used as regular expression delimiters.

Try It Out Using Nondefault Delimiters

1. Type the following code into your chosen text editor, and save the code as NonDefaultDelimiters.pl:

    #!/usr/bin/perl -w
    use strict;
    print "This example uses delimiters other than the default /somePattern/.\n\n";
    my $myTestString = "Hello world!";
    print "It worked using paired { and }\n\n" if $myTestString =~ m{world};
    print "It worked using paired ! and !\n\n" if $myTestString =~ m!world!;
    print "It worked using paired . and .\n\n" if $myTestString =~ m.world.;

2. Run the code inside Komodo or at the command line by typing perl NonDefaultDelimiters.pl.

3. Inspect the displayed results, as shown in Figure 26-15. Notice that matched { and }, paired ! and !, and paired period characters have all worked, in the sense that they have been used to achieve a successful match.

Figure 26-15

How It Works

After a brief informational message, the string Hello world! is assigned to the variable $myTestString:

    my $myTestString = "Hello world!";

Then the print operator is used three times, once for each choice of delimiters. Each message is printed only if the test of its if statement is satisfied, which it is in all three cases.

Matching Using Variable Substitution

If you are new to Perl programming, it may have been surprising that you can include variables inside paired double quotes.
You may be even more surprised to learn that you can also include variables inside regular expression patterns.

There are two ways variables can be included in patterns, depending on what follows the variable name. If the variable comes at the end of the pattern, so that nothing after it can be mistaken for part of its name, you can write the following:

    /some characters$myPattern/

However, if the variable is followed by further characters that could be read as part of its name, you need to write something like this:

    /${myPattern}some other characters/

Try It Out Matching Using Variable Substitution

1. Type the following code in a text editor:

    #!/usr/bin/perl -w
    use strict;
    my $myTestString = "shells";
    my $myPattern = "she";
    print "$myPattern is found in $myTestString.\n\n"
      if ($myTestString =~ m/${myPattern}ll/);
    $myTestString = "scar";
    $myPattern = "car";
    print "$myPattern is found in $myTestString.\n\n"
      if ($myTestString =~ m/s$myPattern/);

2. Save the code as MatchingVariableSubstitution.pl.

3. Run the code and inspect the results, as shown in Figure 26-16.

Figure 26-16

How It Works

First, look at the variable substitution syntax that can be placed anywhere inside a pattern. You assign values to the $myTestString and $myPattern variables:

    my $myTestString = "shells";
    my $myPattern = "she";

The following line is split only for reasons of presentation on the page. Notice the syntax used in the pattern in the test for the if statement. The $myPattern variable is used inside the pattern and is written as ${myPattern}. The paired curly braces allow the name of the variable to be unambiguously delineated from the characters ll that follow it:

    print "$myPattern is found in $myTestString.\n\n"
      if ($myTestString =~ m/${myPattern}ll/);

The second part of this example uses the plain syntax, which is safe here because the variable comes at the end of the pattern. The $myPattern variable is written exactly like that: $myPattern.
Because the only use of the second of the paired forward slashes is to delimit the end of the pattern, the meaning is clear:

    $myTestString = "scar";
    $myPattern = "car";
    print "$myPattern is found in $myTestString.\n\n"
      if ($myTestString =~ m/s$myPattern/);

As you have seen in this section on matching, there is enormous flexibility in the syntax you can use to achieve matching in Perl.

Using the s/// Operator

The s/// operator is used when a match in the test string is to be replaced by (or substituted with) a replacement string. Search-and-replace syntax takes the following general form:

    s/pattern/replacementText/modifiers

If there is a match, s/// returns the number of substitutions made. How many substitutions are attempted depends on whether or not the s/// operator is modified by the g (global) modifier. If the g modifier is present, the regular expression engine attempts to find and replace all matches in the test string.

In the following example, the literal pattern Star is replaced by the replacement (substitution) string Moon.

Try It Out Using the s/// Operator

1. Type the following code in Komodo or another text editor:

    #!/usr/bin/perl -w
    use strict;
    my $myString = "I attended a Star Training Company training course.";
    my $oldString = $myString;
    $myString =~ s/Star/Moon/;
    print "The original string was: \n'$oldString'\n\n";
    print "After replacement the string is: \n'$myString'\n\n";
    if ($oldString =~ m/Star/)
    {
      print "The string 'Star' was matched and replaced in the old string";
    }

2. Save the code as SimpleReplace.pl.

3. Either run the code inside Komodo 3.0 or type perl SimpleReplace.pl at the command line, assuming that the file is saved in the current directory or a directory on your machine’s PATH. Inspect the displayed results, as shown in Figure 26-17.
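The return value of s/// described earlier in this section can be checked with a short sketch. This is an illustration rather than code from the chapter's download; the variable names are hypothetical, and the test string is borrowed from the Star Training Company examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $original = "Star Training Company courses are great. Choose Star for your training needs.";

# Without the g modifier, at most one substitution is made.
my $once = $original;
my $countOnce = ($once =~ s/Star/Moon/);

# With the g modifier, every match is replaced and counted.
my $all = $original;
my $countAll = ($all =~ s/Star/Moon/g);

# When nothing matches, s/// returns a false value and the string is unchanged.
my $none = $original;
my $countNone = ($none =~ s/Sun/Moon/g);

print "Without g: $countOnce\n";               # prints "Without g: 1"
print "With g: $countAll\n";                   # prints "With g: 2"
print "No match: ", ($countNone || 0), "\n";   # prints "No match: 0"
```

Because the return value is false when nothing was replaced, s/// can be used directly as the test of an if statement.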
Figure 26-17

How It Works

The test string is assigned to the variable $myString:

    my $myString = "I attended a Star Training Company training course.";

The variable $oldString is used to hold the original value for later display:

    my $oldString = $myString;

The first occurrence of the character sequence Star in the test string is replaced by the character sequence Moon:

    $myString =~ s/Star/Moon/;

The user is informed of the original and replaced strings:

    print "The original string was: \n'$oldString'\n\n";
    print "After replacement the string is: \n'$myString'\n\n";
    if ($oldString =~ m/Star/)
    {
      print "The string 'Star' was matched and replaced in the old string";
    }

Using s/// with the Global Modifier

Often, you will want to replace all occurrences of a character sequence in the test string. The example of the Star Training Company earlier in this book is a case in point. To specify that all occurrences of a pattern are replaced, the global modifier, g, is used.

To achieve global replacement, you write the following:

    $myTestString =~ s/pattern/replacementString/g

The g modifier after the third forward slash indicates that global replacement is to take place.

Try It Out Using s/// with the Global Modifier

1. Type the following code in a text editor:

    #!/usr/bin/perl -w
    use strict;
    print "This example uses the global modifier, 'g'\n\n";
    my $myTestString = "Star Training Company courses are great. Choose Star for your training needs.";
    my $myOnceString = $myTestString;
    my $myGlobalString = $myTestString;
    my $myPattern = "Star";
    my $myReplacementString = "Moon";
    $myOnceString =~ s/$myPattern/$myReplacementString/;
    $myGlobalString =~ s/$myPattern/$myReplacementString/g;
    print "The original string was '$myTestString'.\n\n";
    print "After a single replacement it became '$myOnceString'.\n\n";
    print "After global replacement it became '$myGlobalString'.\n\n";

2. Save the code as GlobalReplace.pl.

3. Run the code and inspect the results, as shown in Figure 26-18. Notice that without the g modifier, only one occurrence of the character sequence Star has been replaced. With the g modifier present, all occurrences (in this case, there are two) are replaced.

Figure 26-18

How It Works

The test string is assigned to the variable $myTestString:

    my $myTestString = "Star Training Company courses are great. Choose Star for your training needs.";

The value of the original test string is copied to the variables $myOnceString and $myGlobalString:

    my $myOnceString = $myTestString;
    my $myGlobalString = $myTestString;

The pattern Star is assigned to the variable $myPattern:

    my $myPattern = "Star";

The replacement string, Moon, is assigned to the variable $myReplacementString:

    my $myReplacementString = "Moon";

One match is replaced in $myOnceString:

    $myOnceString =~ s/$myPattern/$myReplacementString/;

All matches (two, in this example) are replaced in $myGlobalString, because the g modifier is specified:

    $myGlobalString =~ s/$myPattern/$myReplacementString/g;

Then the original string, the string after a single replacement, and the string after global replacement are displayed:

    print "The original string was '$myTestString'.\n\n";
    print "After a single replacement it became '$myOnceString'.\n\n";
    print "After global replacement it became '$myGlobalString'.\n\n";

Using s/// with the Default Variable

The default variable, $_, can be used with s/// to search and replace the value held in the default variable. Two forms of syntax can be used. You can use the normal s/// syntax, with the variable name, the =~ operator and the pattern and replacement text:

    $_ =~ s/pattern/replacementText/modifiers;

The alternative, more succinct, syntax allows the name of the default variable and =~ operator to be omitted. So you can simply write the following:

    s/pattern/replacementText/modifiers

Try It Out Using s/// with the Default Variable

1. Type the following code in a text editor:

    #!/usr/bin/perl -w
    use strict;
    $_ = "I went to a training course from Star Training Company.";
    print "The default string, \$_, contains '$_'.\n\n";
    if (s/Star/Moon/)
    {
      print "A replacement has taken place using the default variable.\n";
      print "The replaced string in \$_ is now '$_'.";
    }

2. Save the code as ReplaceDefaultVariable.pl.

3. Run the code, and inspect the displayed result, as shown in Figure 26-19.

Figure 26-19

How It Works

The test string is assigned to the default variable, $_:

    $_ = "I went to a training course from Star Training Company.";

The value contained in the default variable is displayed:

    print "The default string, \$_, contains '$_'.\n\n";

The test of the if statement uses the abbreviated syntax for carrying out a replacement on the default variable:

    if (s/Star/Moon/)

You might prefer to use the full syntax:

    if ($_ =~ s/Star/Moon/)

Whichever syntax you use, the user is then informed that a replacement operation has taken place and is informed of the value of the string after the replacement operation:

    print "A replacement has taken place using the default variable.\n";
    print "The replaced string in \$_ is now '$_'.";

Using the split Operator

The split operator is used to split a test string according to the match for a regular expression.
The following example shows how you can separate a comma-separated sequence of values into its component parts.

Try It Out Using the split Operator

1. Type the following code into a text editor:

    #!/usr/bin/perl -w
    use strict;
    my $myTestString = "A, B, C, D";
    print "The original string was '$myTestString'.\n";
    my @myArray = split/,\s?/, $myTestString;
    print "The string has been split into four array elements:\n";
    print "$myArray[0]\n";
    print "$myArray[1]\n";
    print "$myArray[2]\n";
    print "$myArray[3]\n";
    print "Displaying array elements using the 'foreach' statement:\n";
    foreach my $mySplit (split/,\s?/, $myTestString)
    {
      print "$mySplit\n";
    }

2. Save the code as SplitDemo.pl.

3. Run the code, and inspect the displayed results, as shown in Figure 26-20.

Figure 26-20

How It Works

A sequence of values, each separated by a comma and a space character, is assigned to the variable $myTestString:

    my $myTestString = "A, B, C, D";

The value of the original string is displayed:

    print "The original string was '$myTestString'.\n";

The @myArray array is assigned the result of using the split operator. The pattern that is matched against is a comma optionally followed by a whitespace character. The target of the split operator is the variable $myTestString:

    my @myArray = split/,\s?/, $myTestString;

Then you can use array indices to display the components into which the string has been split:

    print "The string has been split into four array elements:\n";
    print "$myArray[0]\n";
    print "$myArray[1]\n";
    print "$myArray[2]\n";
    print "$myArray[3]\n";

[...]
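The separator pattern ,\s? used in SplitDemo.pl consumes at most one whitespace character after each comma. The sketch below compares it with a looser pattern on unevenly spaced input. It rests on assumptions: the input string, the variable names, and the ,\s* variant are hypothetical and do not come from the chapter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input with uneven spacing after the commas.
my $list = "A, B,C,  D";

# The pattern from SplitDemo.pl: a comma and at most one whitespace character.
my @strict = split /,\s?/, $list;

# Assumed variant: a comma followed by any run of whitespace.
my @loose = split /,\s*/, $list;

print join("|", @strict), "\n";   # prints "A|B|C| D"
print join("|", @loose), "\n";    # prints "A|B|C|D"
print "Number of elements: ", scalar(@loose), "\n";   # prints "Number of elements: 4"
```

Using scalar(@loose) or a foreach loop avoids hard-coding the number of array indices when the number of values is not known in advance.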
    ... character is still paired '!' characters.\n"; ... a match.\n\n" if ($myTestString =~ m!http:\/\/!);

A Simple Perl Regex Tester

You have seen a range of techniques used to explore some of the ways Perl regular expressions can be used. You may find it useful to have a simple Perl tool to test regular expressions against test strings. RegexTester.pl is intended to provide you with straightforward ...

... (Star followed by a space character) precedes the character sequence Training.

Using the Regular Expression Matching Modes in Perl

The regular expression matching modes allow developers to control useful aspects of how regular expression patterns are applied. The following table summarizes the regular expression matching modes in Perl.

    Mode    Description
    i       Matching is case insensitive
    x       Allows ...

... is used to specify what is being looked for after the other component of the regular expression pattern matches. The character(s) inside the lookahead are not captured. The negative lookahead syntax, (?! ), is used to specify what must not come after another component if the regular expression pattern is matched.

Try It Out Using Positive Lookahead

1. Type the following ...

Lookbehind

Lookbehind works similarly to lookahead, except that a character sequence that precedes another component of the regular expression pattern is the focus of interest. Positive lookbehind is signified by the syntax (?<= ) ...

Try It Out Using Lookbehind

1. Type the following code in a text editor, and save it as LookBehind.pl: ...

    ... tests for matches for the regular expression pattern: " + myRegExp + ".\nType in a string and click on the OK button.", "Type your text here.");
    if (Validate(entry)){
      alert("There is a match!\nThe regular expression pattern is: " + myRegExp + ".\n The string that you entered was: '" + entry + "'.");
    } // end if
    else{
      alert("There is no match in the string you entered.\n" + "The regular expression pattern ...

    ... print "There was a match: '$&'.\n";
    }
    else
    {
      print "There was no match.";
    }

2. Save the code as NegatedCharacterClass.pl.

3. Run the code, and inspect the displayed results, as shown in Figure 26-26.

Figure 26-26

How It Works

The pattern assigned to the $myPattern variable is [^A-D]\d{2}. Remember, it is necessary to double the backslash to ensure that the \d metacharacter is ...

    Metacharacter    Description
    ...              ... preceding character or group occurs a minimum of n times and a maximum of m times
    ( )              Capturing parentheses
    $1 etc           Variables that allow access to captured groups
    |                Alternation character
    \b               Matches a word boundary, in other words, the position between a word character ([A-Za-z0-9_]) and a nonword character
    [ ]              Character class. It matches ...

    ... '$myPattern'.\n\n" if ($myTestString =~ m/$myPattern/);

2. Save the code as PositionalMetacharacters.pl.

3. Run the code, and inspect the displayed results, as shown in Figure 26-21.

Figure 26-21

How It Works

First, a simple informational message is displayed to the user:

    print "\nThis example demonstrates the use of the ^ and \$ positional metacharacters.\n\n";

Then the ...

... variable:

    $myTestString =~ m/$myPattern/;

The values of the test string and pattern are displayed to the user:

    print "The pattern is '$myPattern'.\n";
    print "The test string is '$myTestString'.\n";

The $& variable is used to display the whole match, in this case, the character sequence B9:

    print "The whole match is '$&', contained in the \$& variable.\n";

The group captured ...

... string, and press the Return key. Inspect the displayed result. Run the code again. Enter 12345-6789 as a test string, and press the Return key. Inspect the displayed result, as shown in Figure 26-31.

Figure 26-31

How It Works

The key part of xModifier.pl is how the content of the m// operator is laid out in the code. Notice in the last of the following lines that the x modifier ...

... opening parenthesis of a pair. Captured groups can be accessed from outside the regular expression using the numbered ...
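The listing for xModifier.pl is not included in this extract, so the following is only a sketch of how the x modifier can be used, assuming the U.S. Zip code pattern from MatchUsingVariable.pl. The variable names and the use of qr// are hypothetical choices for illustration. With the x modifier, whitespace and # comments inside the pattern are ignored, so each component can sit on its own line. Because qr// is not a double-quoted string, the backslashes do not need to be doubled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed pattern: the Zip code pattern from earlier in the chapter,
# laid out over several lines with the x modifier.
my $zipPattern = qr/
    ^            # start of the test string
    \d{5}        # five numeric digits
    (-\d{4})?    # optionally, a hyphen and four more digits
    $            # end of the test string
/x;

# The three test strings used in the earlier Try It Out.
foreach my $candidate ("12345", "12345-6789", "Hello world!") {
    my $verdict = ($candidate =~ $zipPattern) ? "matches" : "does not match";
    printf "%-12s %s\n", $candidate, $verdict;
}
```

The first two test strings match and the third does not, which is the same behavior as the single-line pattern.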