Chapter 17
Regular Expressions and Pattern Matching

17.1 What Is a Regular Expression?

A user is asked to fill out an HTML form and provide his or her name, address, and birth date. Before sending the form off to a server for further processing, a JavaScript program checks the form to make sure the user actually entered something, and that the information is in the requested format. We saw in Chapter 11, “Working with Forms and Input Devices,” some basic ways that JavaScript can check form information, but with the addition of regular expressions, form validation can be much more sophisticated and precise. Regular expressions are also useful for searching for patterns in input data, and for replacing the data with something else or splitting it up into substrings.

This chapter is divided into two main parts: (1) how to create regular expressions and regular expression metacharacters, and (2) how to validate form input data with regular expressions. If you are savvy with Perl regular expressions (or the UNIX utilities grep, sed, and awk), you can move rapidly through the first section, because JavaScript regular expressions, for the most part, are identical to those found in Perl.

A regular expression is really just a sequence of characters that specifies a pattern to be matched against a string of text when performing searches and replacements. A simple regular expression consists of a character or set of characters that matches itself. The regular expression is normally delimited by forward slashes; for example, /abc/.

Like Perl, JavaScript¹ provides a large variety of regular expression metacharacters to control the way a pattern is found. A metacharacter is a special character that represents something other than itself, such as ^, $, *, and so on. Metacharacters are placed within the regular expression to control the search pattern; for example, /^abc/ means look for the pattern abc at the beginning of the line.
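As a minimal sketch of the difference between a pattern that matches itself and one that uses the ^ metacharacter, the following compares /abc/ with /^abc/. Only the two patterns come from the text; the sample strings and variable names are made up for illustration.

```javascript
// Hypothetical sample strings; only /abc/ and /^abc/ come from the text.
var plain = /abc/;      // matches "abc" anywhere in the string
var anchored = /^abc/;  // matches "abc" only at the beginning

var results = [
  plain.test("xyzabc"),     // "abc" appears at the end of the string
  anchored.test("xyzabc"),  // "abc" is present, but not at the beginning
  anchored.test("abcxyz")   // "abc" is at the beginning
];

console.log(results); // [ true, false, true ]
```

The test() method used here is described in Section 17.2.3.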
With the help of metacharacters, you can look for strings containing only digits, only alphas, a digit at the beginning of the line followed by any number of alphas, a line ending with a digit, and so on. When searching for a pattern of characters, the possibilities of fine-tuning your search are endless.

¹ JavaScript 1.2, NES 3.0. JavaScript 1.3 added the toSource() method. JavaScript 1.5, NES 6.0 added the m flag, nongreedy modifier, noncapturing parentheses, and look-ahead assertions. ECMA 262, Edition 3.

Again, JavaScript regular expressions are used primarily to verify data input on the client side. When a user fills out a form and presses the submit button, the form is sent to a server, and then often to a server script such as PHP, ASP.NET, or a Java servlet for further processing. Although forms can be validated by a server program, it is more efficient to take care of the validation before sending the form to the server. This is an important function of JavaScript. The user fills out the form and JavaScript checks to see if all the boxes have been filled out correctly, and if not, the user is told to reenter the data before the form is submitted to the server. Checking the form on the client side allows for instant feedback, and less traveling back and forth between the browser and server. The server-side program might do its own validation anyway, but if JavaScript has already done the job, it still saves the user time and inconvenience. With the power provided by regular expressions, checking any type of input, such as e-mail addresses, passwords, Social Security numbers, and birthdates, is greatly simplified.

This chapter will teach you how regular expressions and their metacharacters are used so that you will be able to read expressions even as complicated as the one shown in Figure 17.1.
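To give a rough idea of what such a client-side check can look like, here is a small sketch that tests whether a string has the shape of a Social Security number (three digits, two digits, four digits, separated by hyphens). The function name, the pattern, and the sample values are assumptions made for illustration, and the \d and {n} metacharacters it relies on have not been introduced at this point in the chapter.

```javascript
// Hypothetical helper, not from the text: checks only the NNN-NN-NNNN shape,
// not whether the number was ever actually issued.
function looksLikeSsn(input) {
  var ssnPattern = /^\d{3}-\d{2}-\d{4}$/;  // digits and hyphens, nothing else
  return ssnPattern.test(input);
}

console.log(looksLikeSsn("123-45-6789")); // true
console.log(looksLikeSsn("123456789"));   // false: hyphens are missing
```

A script could run a check like this when the form is submitted and ask the user to reenter the value when it returns false.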
There are a number of regular expression validators and libraries on the Web. An excellent source is at http://www.regexlib.com.

Figure 17.1 A regular expression library. The user types “email” in the Search box. See Figure 17.2 for results.

Figure 17.2 The result of searching for the email regular expression considered to be the best.

17.2 Creating a Regular Expression

A regular expression is a pattern of characters. It should come as no surprise by now that JavaScript regular expressions are objects. When you create a regular expression, you test the regular expression against a string. For example, the regular expression /green/ might be matched against the string “The green grass grows”. If green is contained in the string, then there is a successful match.

Building a regular expression is like building a JavaScript string. If you recall, you can create a String object the literal way or you can use the String() constructor method. To build a regular expression object, you can assign a literal regular expression to a variable, or you can use the RegExp constructor to create and return a regular expression object.

17.2.1 The Literal Way

To create a regular expression object with the literal notation, you assign the regular expression to a variable. The regular expression is a pattern of characters enclosed in forward slashes. After the closing forward slash, options may be provided to modify the search pattern. The options are i, g, and m. See Table 17.1.

If you are not going to change the regular expression, say, if it is hard-coded right into your script, then this literal notation is faster, because the regular expression is compiled when the script is loaded.

Table 17.1 Options Used for Modifying Search Patterns

Option    Purpose
i         Used to ignore case.
g         Used to match all occurrences of the pattern in the string.
m         Used to match over multiple lines.

FORMAT
    var variable_name = /regular expression/options;

EXAMPLE
    var myreg = /love/;
    var reobj = /san jose/ig;

17.2.2 The Constructor Method

The constructor method, called RegExp(), creates a RegExp object. The RegExp() constructor takes one or two arguments. The first argument is a string representing the regular expression; for example, “green” represents the literal regular expression /green/. The optional second argument is a flag, such as i for case insensitivity or g for global matching. The constructor method is used when the regular expression is being provided from some other place, such as from user input, and can change throughout the run of the program. With this method, the regular expression is compiled at runtime.

FORMAT
    var variable_name = new RegExp("regular expression", "options");

EXAMPLE
    var myreg = new RegExp("love");
    var reobj = new RegExp("san jose", "ig");

17.2.3 Testing the Expression

The RegExp object has two methods that can be used to test for a match in a string, the test() method and the exec() method, which are quite similar. The test() method searches for a regular expression in a string and returns true if it matched and false if it didn’t. The exec() method also searches for a regular expression in a string. If the exec() method succeeds, it returns an array of information including the search string and the parts of the string that matched. If it fails, it returns null. This is similar to the match() method of the String object. Table 17.2 summarizes the methods of the RegExp object.

The test() Method. The RegExp object’s test() method is used to see if a string contains the pattern represented in the regular expression. It returns a Boolean value, true or false. After the search, the lastIndex property of the RegExp object contains the position in the string where the next search would start.
(A string starts at character position 0.) If a global search is done, then the lastIndex property contains the starting position after the last pattern was matched. (See Example 17.4 to see how the lastIndex property is used.)

Steps to test for a match:
1. Assign a regular expression to a variable.
2. Use the regular expression test() method to see if there is a match. If there is a match, the test() method returns true; otherwise, it returns false.

There are also four string methods that can be used with regular expressions. (See section “String Methods Using Regular Expressions” on page 727.)

Table 17.2 Methods of the RegExp Object

Method    What It Does
exec      Executes a search for a match in a string and returns an array.
test      Tests for a match in a string and returns either true or false.

FORMAT
    var string = "String to be tested goes here";
    var regex = /regular expression/;               // Literal way
    var regex = new RegExp("regular expression");   // Constructor way
    regex.test(string);                             // Returns either true or false
  or
    /regular expression/.test("string");

EXAMPLE
    var myString = "She wants attention now!";
    var regex = /ten/;                  // Literal way
    var regex = new RegExp("ten");      // Constructor way
    regex.test(myString);               // Looking for "ten" in myString
  or
    /ten/.test("She wants attention now!");

EXAMPLE 17.1
    <html>
    <head><title>Regular Expression Objects the Literal Way</title>
    <script type="text/javascript">
1       var myString="My gloves are worn for wear.";
2       var regex = /love/;   // Create a regular expression object
3       if (regex.test(myString)){
4           alert("Found pattern!");
        }
        else{
5           alert("No match.");
        }
    </script>
    </head>
    <body></body>
    </html>

EXPLANATION
1   “My gloves are worn for wear.” is assigned to a variable called myString.
2   The regular expression /love/ is assigned to the variable called regex. This is the literal way of creating a regular expression object.
3   The test() method for the regular expression object tests to see if myString contains the pattern love. Because love is found within gloves, the test() method returns true.
4   The alert dialog box will display Found pattern! if the test() method returned true.
5   If the pattern /love/ is not found in myString, the test() method returns false, and the alert dialog box will display its message, No match.

EXAMPLE 17.2
    <html>
    <head>
    <title>Regular Expression Objects with the Constructor</title>
    <script type="text/javascript">
1       var myString="My gloves are worn for wear.";
2       var regex = new RegExp("love");   // Creating a regular
                                          // expression object
3       if ( regex.test(myString)){
4           alert("Found pattern love!");
        }
        else{
5           alert("No match.");
        }
    </script>
    </head>
    <body></body>
    </html>

EXPLANATION
1   The variable called myString is assigned “My gloves are worn for wear.”
2   The RegExp() constructor creates a new regular expression object, called regex. This is the constructor way of creating a regular expression object. It is passed the string “love”, the regular expression.
3   The test() method for the regular expression object tests to see if myString contains the pattern love. Because it finds love within gloves, it returns true.
4, 5 The alert dialog box will display Found pattern love! if the test() method returned true, or No match. if it returned false. See Figure 17.3.

Figure 17.3 “My gloves are worn for wear.” contains the pattern love.

The exec() Method. The exec() method executes a search to find a match for a specified pattern in a string. If it doesn’t find a match, exec() returns null; otherwise it returns an array containing the string that matched the regular expression.
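As a rough sketch of what exec() hands back, the following reuses the string from Examples 17.1 and 17.2. The variable names and the second pattern, /hate/, are made up for illustration. The index and input properties of the returned array are standard features of the exec() result.

```javascript
var myString = "My gloves are worn for wear.";

// A successful search returns an array; element 0 is the matched text.
var hit = /love/.exec(myString);

// A failed search returns null rather than an empty array.
var miss = /hate/.exec(myString);  // hypothetical pattern that is not in the string

console.log(hit[0]);      // "love"
console.log(hit.index);   // 4, the position of "love" inside "gloves"
console.log(hit.input === myString); // true: the string that was searched
console.log(miss);        // null
```

Because a failed search yields null, the return value of exec() can be used directly as the condition of an if statement.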
FORMAT
    array = regular_expression.exec(string);

EXAMPLE
    list = /ring/.exec("Don't string me along, just bring me the goods.");

EXAMPLE 17.3
    <html>
    <head><title>The exec() method</title>
    <script type="text/javascript">
1       var myString="My lovely gloves are worn for wear, Love.";
2       var regex = /love/i;   // Create a regular expression object
3       var array=regex.exec(myString);
4       if (array){
            alert("Matched! " + array);
        }
        else{
            alert("No match.");
        }
    </script>
    </head>
    <body></body>
    </html>

EXPLANATION
1   The string “My lovely gloves are worn for wear, Love.” is assigned to myString.
2   The regular expression /love/i is assigned to the variable regex. The i option makes the match case insensitive.
3   The exec() method returns an array of values that were found.
4   If the exec() method didn’t return null, then there was a match. See Figure 17.4.

Figure 17.4 The array returned by exec() contains love.

17.2.4 Properties of the RegExp Object

There are two types of properties that can be applied to a RegExp object. The first type is called a class property (see Table 17.3) and applies to the RegExp object as a whole, not to a single instance of a regular expression object. The input property is an example of a class property. It contains the string that the pattern was last matched against, and is applied directly to the RegExp object as RegExp.input.

The other type of property is called an instance property and is applied to an instance of the object (see Table 17.4); for example, mypattern.lastIndex refers to the position within the string where the next search will start for this instance of the regular expression object, called mypattern. These properties will be explained in examples throughout this chapter.

Table 17.3 Class Properties of the RegExp Object

Property            What It Describes
input               Represents the input string being matched.
lastMatch           Represents the last matched characters.
lastParen           Represents the last parenthesized substring pattern match.
leftContext         Represents the substring preceding the most recent pattern match.
RegExp.$*           Boolean value that specifies whether strings should be searched over multiple lines; same as the multiline property.
RegExp.$&           Represents the last matched characters.
RegExp.$_           Represents the string input that is being matched.
RegExp.$`           Represents the substring preceding the most recent pattern match (see the leftContext property).
RegExp.$'           Represents the substring following the most recent pattern match (see the rightContext property).
RegExp.$+           Represents the last parenthesized substring pattern match (see the lastParen property).
RegExp.$1,$2,$3     Used to capture substrings of matches.
rightContext        Represents the substring following the most recent pattern match.

Table 17.4 Instance Properties of the RegExp Object

Property            What It Describes
global              Boolean to specify if the g option was used to check the expression against all possible matches in the string.
ignoreCase          Boolean to specify if the i option was used to ignore case during a string search.
lastIndex           If the g option was used, specifies the character position immediately following the last match found by exec() or test().
multiline           Boolean to test if the m option was used to search across multiple lines.
source              The text of the regular expression.
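The instance properties in Table 17.4 can be read straight off a regular expression object. The sketch below builds the /san jose/ig expression from Section 17.2.2 with the constructor and collects its properties. The summary object is a made-up name used only to gather the values. The class properties in Table 17.3 are left out on purpose, because they are legacy features that newer code generally avoids.

```javascript
// Same expression as the constructor example in Section 17.2.2.
var reobj = new RegExp("san jose", "ig");

// "summary" is a hypothetical container for the instance properties.
var summary = {
  source: reobj.source,          // the pattern text, without slashes or options
  global: reobj.global,          // true, because the g option was given
  ignoreCase: reobj.ignoreCase,  // true, because the i option was given
  multiline: reobj.multiline,    // false, because the m option was not given
  lastIndex: reobj.lastIndex     // 0 before any search has been run
};

console.log(summary);
```

Reading the properties this way is a quick check that the options passed to the constructor were the ones intended.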
EXAMPLE 17.4
    <html>
    <head>
    <title>The test() method</title>
    </head>
    <body bgcolor="silver">
    <font face="arial" size="+1">
    <script type = "text/javascript">
1       var myString="I love my new gloves!";
2       var regex = /love/g;   // Create a regular expression object
3       var booleanResult = regex.test(myString);
        if ( booleanResult != false ){
4           document.write("Tested regular expression <em>"+
                regex.source + ".</em> The result is <em>"
                + booleanResult + "</em>");
            document.write(".<br>Starts searching again at position " +
5               regex.lastIndex + " in string<em> \"" +
6               RegExp.input + "\"</em><br />");
            document.write("The last matched characters were: "+
7               RegExp.lastMatch+"<br />");
            document.write("The substring preceding the last match is: "+
8               RegExp.leftContext+"<br />");
            document.write("The substring following the last match is: "+
9               RegExp.rightContext+"<br />");
        }
        else{
            alert("No match!");
        }
    </script>
    </font>
    </body>
    </html>

EXPLANATION
1   The string object to be tested is created.
2   A regular expression object, called regex, is created.
3   The test() method returns true if the regular expression is matched in the string, and false otherwise.
4   The source property is applied to regex, an instance of a RegExp object. It contains the text of the regular expression, love.
5   The lastIndex property is applied to an instance of a RegExp object. It represents the character position right after the last matched string.
6   The input class property represents the input string on which the pattern matching (regular expression) is performed.
7   lastMatch is a class property that represents the characters that were last matched.
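To illustrate how lastIndex moves when the g option is set, here is a small sketch that calls test() three times on the string from Example 17.4. The loop and the log array are assumptions added for illustration, not part of the original example.

```javascript
var myString = "I love my new gloves!";
var regex = /love/g;   // the g option makes test() resume from lastIndex

// "log" is a hypothetical array recording each call's outcome.
var log = [];
for (var i = 0; i < 3; i++) {
  var found = regex.test(myString);
  log.push({ found: found, lastIndex: regex.lastIndex });
}

console.log(log);
// Call 1 matches "love" and leaves lastIndex at 6.
// Call 2 matches the "love" in "gloves" and leaves lastIndex at 19.
// Call 3 finds nothing further, returns false, and resets lastIndex to 0.
```

Without the g option, every call would start again at position 0 and lastIndex would stay at 0.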