Chapter 17 • Regular Expressions and Pattern Matching

EXPLANATION

1   The variable called myString is assigned a string of lowercase letters, exactly as in the previous example.
2   The regular expression reads: Search for one or more lowercase letters. After the + sign, however, there is a question mark, which turns off the greed factor. Instead of taking as many lowercase letters as it can, the search stops after it finds the first lowercase character and replaces that character with XXX. See Figure 17.27.

Figure 17.27 This is not greedy: Output from Example 17.25.

17.4.5 Anchoring Metacharacters

Often it is necessary to anchor a pattern so that it matches only if it is found at the beginning or end of a line, word, or string. The anchoring metacharacters are based on a position just to the left or to the right of the character that is being matched. Anchors are technically called zero-width assertions because they correspond to positions, not actual characters in a string; for example, /^abc/ will search for abc at the beginning of the line, where the ^ represents a position, not an actual character. See Table 17.10 for a list of anchoring metacharacters.

Table 17.10 Anchors (Assertions)

Metacharacter    What It Matches
^                Matches at the beginning of a line or the beginning of a string.
$                Matches at the end of a line or the end of a string.
\b               Matches a word boundary (when not inside [ ]).
\B               Matches a nonword boundary.
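To make the four anchors in Table 17.10 concrete, the following is a minimal sketch that tests a few sample strings against one pattern per anchor. The sample strings, the pattern names, and the helper matchingNames() are assumptions made up for this illustration; they are not part of the book's examples.

```javascript
// Hypothetical patterns, one for each anchor listed in Table 17.10.
const patterns = {
  startsWithWill: /^Will/,    // ^  : position at the beginning of the string
  endsWithYou: /you$/,        // $  : position at the end of the string
  wholeWordLove: /\blove\b/,  // \b : position between a word and a nonword character
  loveInsideWord: /love\B/    // \B : "love" directly followed by another word character
};

// Return the names of all patterns that match the given text.
function matchingNames(text) {
  return Object.keys(patterns).filter(function (name) {
    return patterns[name].test(text);
  });
}

// Hypothetical sample input.
const samples = ["Willie Wonker", "I know Willie", "lover", "I love you"];
for (const text of samples) {
  console.log(JSON.stringify(text), "->", matchingNames(text).join(", ") || "no match");
}
```

Because an anchor matches a position rather than a character, /\blove\b/ rejects lover even though the letters l-o-v-e are present, while /love\B/ accepts it.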
EXAMPLE 17.26

    <html>
    <head><title></title></head>
    <body>
    <script type="text/javascript">
1       var reg_expression = /^Will/;   // Beginning of line anchor
2       var textString=prompt("Type a string of text","");
3       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            document.write("<b>The regular expression /^Will/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The variable is assigned a regular expression containing the beginning of line anchor metacharacter, the caret, followed by Will.
2   The variable textString is assigned user input; in this example, Willie Wonker was entered.
3   The regular expression test() method will return true because the string Willie Wonker begins with Will. See Figure 17.28.

Figure 17.28 The user entered Willie Wonker. Will is at the beginning of the line, so this tests true (top); if the user enters I know Willie, Will is not at the beginning of the line, so the input tests false (bottom).

EXAMPLE 17.27

    <html>
    <head><title>Beginning of Line Anchor</title></head>
    <body>
    <script type="text/javascript">
1       var reg_expression = /^[JK]/;
2       var textString=prompt("Type a string of text","");
3       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            document.write("<b>The regular expression /^[JK]/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The regular expression contains a beginning of line anchor, the caret. The regular expression reads: Find either an uppercase J or an uppercase K at the beginning of the line or string.
2   The variable textString is assigned user input; in this example, Jack and Jill.
3   The regular expression test() method will return true because the string Jack and Jill begins with an uppercase letter J. See Figure 17.29.

Figure 17.29 The string must begin with either a J or K. The user entered Jack and Jill (top) and this returns true; the user entered Karen Evich (bottom) and this also returns true.

EXAMPLE 17.28

    <html>
    <head><title>End of Line Anchor</title></head>
    <body>
    <script type="text/javascript">
1       var reg_expression = /50$/;
2       var textString=prompt("Type a string of text","");
3       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            document.write("<b>The regular expression /50$/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The regular expression /50$/ is assigned to the variable. The pattern contains the dollar sign ($) metacharacter, which represents the end of line anchor only when the $ is the last character in the pattern. The expression reads: Find a 5 followed by a 0 at the end of the line or string.
2   The user is prompted for a string of text.
3   If the string ends in 50, the regex test() method returns true; otherwise it returns false.

EXAMPLE 17.29

    <html>
    <head><title>Anchors</title></head>
    <body>
    <script type="text/javascript">
1       var reg_expression = /^[A-Z][a-z]+\s\d$/;
        // At the beginning of the string, find one uppercase
        // letter, followed by one or more lowercase letters,
        // a space, and one digit.
2       var string=prompt("Enter a name and a number","");
3       if ( reg_expression.test(string)){
            alert("It Matched!!");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The regular expression reads: Look at the beginning of the line, ^, find an uppercase letter, [A-Z], followed by one or more lowercase letters, [a-z]+, a single whitespace character, \s, and a digit at the end of the line, \d$.
2   The user is prompted for input.
3   The regular expression test() method tests to see if there was a match and returns true if so, and false if not. See Figures 17.30 and 17.31.

Figure 17.30 The string begins with a capital letter, followed by one or more lowercase letters, a space, and ends with one digit (left); the input sequence matched, so this message is displayed (right).

Figure 17.31 The regular expression does not match because the string ends in more than one digit (left); the input sequence did not match, so this message is displayed (right).

17.4.6 Alternation

Alternation allows the regular expression to contain alternative patterns to be matched; for example, the regular expression /John|Karen|Steve/ will match a line containing John or Karen or Steve. If Karen, John, and Steve each appear on different lines, all of those lines are matched.
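As a small illustration of the line-by-line behavior described above, the sketch below splits some text into lines and keeps every line that contains one of the alternatives. The sample text and the helper linesMatching() are assumptions invented for this illustration.

```javascript
// The alternation pattern from the paragraph above.
const names = /John|Karen|Steve/;

// Return every line of the input that contains one of the alternatives.
function linesMatching(regex, input) {
  return input.split("\n").filter(function (line) {
    return regex.test(line);
  });
}

// Hypothetical input: three lines with a listed name and one line without.
const sampleText = "Karen Evich\nJohn Smith\nNobody here\nSteve Blenheim";
console.log(linesMatching(names, sampleText));
```

Each line is tested separately, so a line matches as soon as any one of the three names appears in it.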
Each of the alternative expressions is separated by a vertical bar (the pipe symbol, |), and the expressions can consist of any number of characters, unlike the character class, which matches only one character; thus, /a|b|c/ is the same as [abc], whereas /ab|de/ cannot be represented as [abde]. The pattern /ab|de/ is either ab or de, whereas the class [abde] represents only one character in the set a, b, d, or e.

EXAMPLE 17.30

    <html>
    <head><title>The Word Boundary</title></head>
    <body>
    <script type="text/javascript">
        // Anchoring a word with \b
1       var reg_expression = /\blove\b/;
        var textString=prompt("Type a string of text","");
2       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            // Inside a string literal the backslash must be doubled to display \b
            document.write("<b>The regular expression /\\blove\\b/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The regular expression contains the \b metacharacter, representing a word boundary, not a specific character. The expression reads: Find love as a whole word, with a word boundary on each side. This means that gloves, lover, clover, and so on, will not be found.
2   The regular expression test() method will return true when the word love appears in the input between word boundaries (\b). See Figure 17.32.

Figure 17.32 The user entered I love you!. The word love is between word boundaries (\b). The match was successful.

Figure 17.33 The user entered Do you know Tommy?. Pattern Tom was matched in the string.
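The difference between alternation and a character class, as discussed in this section, can be checked with a short sketch. The test strings and the helper compare() are assumptions chosen for this illustration only.

```javascript
// Compare an alternation of two-character sequences with a character class
// built from the same four letters.
function compare(text) {
  return {
    alternation: /ab|de/.test(text),    // needs the sequence "ab" or the sequence "de"
    characterClass: /[abde]/.test(text) // needs any single one of a, b, d, e
  };
}

// For single characters, /a|b|c/ and /[abc]/ agree on every input tried here.
const singles = ["a", "b", "c", "d"];
const singlesAgree = singles.every(function (ch) {
  return /a|b|c/.test(ch) === /[abc]/.test(ch);
});

console.log(singlesAgree);    // true
console.log(compare("ad"));   // { alternation: false, characterClass: true }
console.log(compare("code")); // { alternation: true, characterClass: true }
```

The string ad contains characters from the class but neither complete sequence, which is why only the character class matches it.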
EXAMPLE 17.31

    <html>
    <head><title>Alternation</title></head>
    <body>
    <script type="text/javascript">
        // Alternation: this or that or whatever
1       var reg_expression = /Steve|Dan|Tom/;
        var textString=prompt("Type a string of text","");
2       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            document.write("<b>The regular expression /Steve|Dan|Tom/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   The pipe symbol, |, is used in the regular expression to match on a set of alternative patterns. If any of the patterns, Steve, Dan, or Tom, is found, the match is successful.
2   The test() method will return true if the user's input contains Steve, Dan, or Tom. See Figure 17.33.

Grouping or Clustering. Grouping occurs when a set of characters is enclosed in parentheses, such as /(ma)/ or /(John|Joe) Brown/. If the regular expression pattern is enclosed in parentheses, a group or subpattern is created. Then instead of the greedy metacharacters matching on zero, one, or more of the previous single characters, they can match on the previous subpattern. For example, /(ma)+/ means search for "ma" or "mama" or "mamama," and so forth; that is, one or more occurrences of the pattern "ma". If the parentheses are removed and the regular expression is /ma+/, we would be searching for an "m" followed by one or more occurrences of an "a," such as "ma" or "maaaaaa." Alternation can also be controlled if the patterns are enclosed in parentheses. In the example /(John|Joe) Brown/, the regular expression reads: search for "John Brown" or "Joe Brown". The grouping creates a subpattern of either "John" or "Joe" followed by the pattern "Brown". Without grouping (i.e., /John|Joe Brown/), the regular expression reads: search for "John" or "Joe Brown".
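The effect of the parentheses described above can be seen by comparing what each pattern actually matches. This is a minimal sketch; the helper firstMatch() and the sample strings are assumptions made for this illustration.

```javascript
// Return the text matched by the regular expression, or null if there is no match.
function firstMatch(regex, text) {
  const result = text.match(regex);
  return result ? result[0] : null;
}

// The + applies to the whole group "ma" when it is in parentheses...
console.log(firstMatch(/(ma)+/, "mamama mia")); // "mamama"
// ...but only to the single character "a" when the parentheses are removed.
console.log(firstMatch(/ma+/, "mamama mia"));   // "ma"
console.log(firstMatch(/ma+/, "maaaaaa"));      // "maaaaaa"

// Grouping limits the alternation to the first name.
console.log(firstMatch(/(John|Joe) Brown/, "John Smith")); // null
// Without grouping, "John" alone is one complete alternative.
console.log(firstMatch(/John|Joe Brown/, "John Smith"));   // "John"
```

The last two calls use the same input; only the placement of the parentheses decides whether Brown is required after John.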
This process of grouping characters together is also called clustering.

EXAMPLE 17.32

    <html>
    <head><title>Grouping or Clustering</title></head>
    <body>
    <script type="text/javascript">
        // Grouping with parentheses
1       var reg_expression = /^(Sam|Dan|Tom) Robbins/;
2       var textString=prompt("Type a string of text","");
3       var result=reg_expression.test(textString);   // Returns true or false
        document.write(result+"<br />");
        if (result){
            document.write("<b>The regular expression /^(Sam|Dan|Tom) Robbins/ matched the string \"" +
                textString + "\".</b><br />");
        }
        else{
            alert("No Match!");
        }
    </script>
    </body>
    </html>

EXPLANATION

1   By enclosing Sam, Dan, and Tom in parentheses, the alternatives become Sam Robbins, Dan Robbins, or Tom Robbins. Without the parentheses, the regular expression would match Sam at the beginning of the line, or Dan, or Tom Robbins anywhere in the line. With the parentheses, the caret metacharacter ^ anchors all of the alternatives to the beginning of the line.
2   The user input is assigned to the variable called textString.
3   The test() method checks to see if the string begins with one of the alternatives: Sam Robbins or Dan Robbins or Tom Robbins. If it does, true is returned; otherwise, false is returned. See Figure 17.34.

Remembering or Capturing. Besides grouping, when the regular expression pattern is enclosed in parentheses, the subpattern created is captured, meaning the subpattern is saved in special numbered class properties, starting with $1, then $2, and so on. For example, in the grouping example where we created the regular expression /(ma)/, capturing will assign "ma" to $1 if "ma" is matched. We can say that "ma" is remembered in $1. If we have the expression /(John) (Doe)/, "John" will be captured in $1 and "Doe" in $2 if the pattern "John Doe" is matched. For each subpattern in the expression, the number of the property will be incremented by one: $1, $2, $3, and so on.
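The numbering of captured subpatterns can be sketched with the /(John) (Doe)/ pattern mentioned above. This sketch reads the captures from the array returned by the String object's match() method rather than from the dollar sign properties, which is a choice made for this illustration; the helper captureNames() and the sample sentence are likewise assumptions.

```javascript
// Return the whole match and the two captured subpatterns, or null if
// the pattern "John Doe" is not found in the text.
function captureNames(text) {
  const result = text.match(/(John) (Doe)/);
  if (result === null) {
    return null;
  }
  return {
    whole: result[0],  // the complete matched text
    first: result[1],  // first pair of parentheses, the value described as $1
    second: result[2]  // second pair of parentheses, the value described as $2
  };
}

console.log(captureNames("Ask John Doe about it"));
console.log(captureNames("Ask Jane Roe about it")); // null
```

The array indexes follow the same numbering as the properties: index 1 holds the first subpattern and index 2 the second, counted by opening parenthesis from left to right.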
The dollar sign properties belong to the RegExp object itself, not to an instance of the object, and can be used later in the program, as shown in Example 17.33. They persist until another successful pattern match occurs, at which time they are all reset. Even if the intention was to control the greedy metacharacter or the behavior of alternation, as shown in the previous grouping examples, the subpatterns are automatically captured and saved as a side effect.²

For more information on this, go to http://developer.netscape.com/docs/manuals/communicator/jsguide/reobjud.hmt#1007373.

2. It is possible to prevent a subpattern from being saved.

Figure 17.34 The user entered Dan Robbins as one of the alternatives. Sam Robbins or Tom Robbins would also be okay.

EXAMPLE 17.33

    <html>
    <head><title>Capturing</title></head>
    <body>
    <h3>
    <script type="text/javascript">
1       var textString = "Everyone likes William Rogers and his friends.";
2       var reg_expression = /(William)\s(Rogers)/;
3       var myArray=textString.match(reg_expression);
4       document.write(myArray);   // Three element array
5       document.write("<br>"+RegExp.$1 + " "+RegExp.$2 +"<br>");
        /* alert(myArray[1] + " "+ myArray[2]);
           match and exec create an array consisting of the matched
           string and the captured patterns.
           myArray[0] is "William Rogers"
           myArray[1] is "William"
           myArray[2] is "Rogers". */
    </script>
    </h3>
    </body>
    </html>

EXPLANATION

1   The string called textString is created.
2   The regular expression contains two subpatterns, William and Rogers, both enclosed in parentheses.
3   When either the String object's match() method or the RegExp object's exec() method is applied to a regular expression containing subpatterns, an array is returned, where the first element of the array is the matched string, and the next elements are the values of the subpatterns.
4   The array elements are displayed, separated by commas.
5   The subpatterns are class properties of the RegExp object. $1 represents the first captured subpattern, William, and $2 represents the second captured subpattern, Rogers. See Figure 17.35.

Figure 17.35 Capturing portions of a regular expression using the RegExp object.

EXAMPLE 17.34

    <html>
    <head><title>Capture and Replace</title></head>
    <body>
    <big>
    <script type = "text/javascript">
1       var string="Tommy Savage:203-123-4444:12 Main St.";
2       var newString=string.replace(/(Tommy) (Savage)/, "$2, $1");
3       document.write(newString +"<br />");
    </script>
    </big>
    </body>
    </html>

[...]

... been reversed. The new string is displayed. See Figure 17.36.

Figure 17.36 Output from Example 17.34.

EXAMPLE 17.35

    Capture and Replace
1       var string="Tommy Savage:203-123-4444:12 Main St.";
2       var newString=string.replace(/(\w+)\s(\w+)/, "$2, $1");
3       document.write(newString +"");

EXPLANATION ...

... five tries to get it right because you didn't complete the form exactly the way you were asked. A message will appear and you won't be allowed to submit the form until you get it right. Behind the scenes a JavaScript program is validating the form.

17.5.1 Checking for Empty Fields

There's a form waiting to be filled out. Some of the fields are optional, and some are mandatory. The question is this: Did the ... form can't be processed properly. Checking for empty or null fields is one of the first things you might want to do.

EXAMPLE 17.36

    Checking for Empty Fields
1       function validate_text(form1) {
2           if ( form1.user_name.value == "" || form1.user_name.value == null){
                alert("You must enter your name.");
                return false;
            }
3           if ( form1.user_phone.value == ...