Professional Information Technology-Programming Book part 110

Caution
As noted previously, you will need to modify the backreference designator based on the implementation used. JavaScript users will need to use $ instead of the previously used \. ColdFusion users should use \ for both find and replace operations.

Tip
As seen in this example, a subexpression may be referred to multiple times simply by referring to the backreference as needed.

Let's look at one more example. User information is stored in a database, and phone numbers are stored in the format 313-555-1234. However, you need to reformat the phone numbers as (313) 555-1234. Here is the example:

    313-555-1234
    248-555-9999
    810-555-9000

    (\d{3})(-)(\d{3})(-)(\d{4})

    ($1) $3-$5

    (313) 555-1234
    (248) 555-9999
    (810) 555-9000

Again, two regular expression patterns are used here: one to find and one to replace. The first looks far more complicated than it is, so let's walk through it. (\d{3})(-)(\d{3})(-)(\d{4}) matches a phone number, but breaks it into five subexpressions (so as to isolate its parts). (\d{3}) matches the first three digits as the first subexpression, (-) matches - as the second subexpression, and so on. The end result is that the phone number is broken into five parts (each part its own subexpression): the area code, a hyphen, the first three digits of the number, another hyphen, and then the final four digits. These five parts can be used individually and as needed, and so ($1) $3-$5 simply reformats the number using only three of the subexpressions and ignoring the other two, thereby turning 313-555-1234 into (313) 555-1234.

Tip
When manipulating text for reformatting, it is often useful to break the text into lots of little subexpressions so as to have greater control over that text.

Converting Case

Some regex implementations support the use of conversion operations via the metacharacters listed in Table 8.1.

Table 8.1.
Case Conversion Metacharacters

    Metacharacter   Description
    \E              Terminate \L or \U conversion
    \l              Convert next character to lowercase
    \L              Convert all characters up to \E to lowercase
    \u              Convert next character to uppercase
    \U              Convert all characters up to \E to uppercase

\l and \u are placed before a character (or expression) so as to convert the case of the next character. \L and \U convert the case of all characters until a terminating \E is reached.

Following is a simple example, converting the text within an <H1> tag pair to uppercase:

    <BODY>
    <H1>Welcome to my Homepage</H1>
    Content is divided into two sections:<BR>
    <H2>ColdFusion</H2>
    Information about Macromedia ColdFusion.
    <H2>Wireless</H2>
    Information about Bluetooth, 802.11, and more.
    <H2>This is not valid HTML</H3>
    </BODY>

    (<[Hh]1>)(.*?)(</[Hh]1>)

    $1\U$2\E$3

    <BODY>
    <H1>WELCOME TO MY HOMEPAGE</H1>
    Content is divided into two sections:<BR>
    <H2>ColdFusion</H2>
    Information about Macromedia ColdFusion.
    <H2>Wireless</H2>
    Information about Bluetooth, 802.11, and more.
    <H2>This is not valid HTML</H3>
    </BODY>

The pattern (<[Hh]1>)(.*?)(</[Hh]1>) breaks the header into three subexpressions: the opening tag, the text, and the closing tag. The second pattern then puts the text back together: $1 contains the start tag, \U$2\E converts the second subexpression (the header text) to uppercase, and $3 contains the end tag.

Summary

Subexpressions are used to define sets of characters or expressions. In addition to being used for repeating matches (as seen in the previous lesson), subexpressions can be referred to within patterns. This type of reference is called a backreference (and unfortunately, there are implementation differences in backreference syntax). Backreferences are useful in text matching and in replace operations.
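The two replace operations from this lesson can be sketched in Python as a rough illustration. Python is not one of the implementations the text discusses, so two differences are assumptions of this sketch rather than part of the original examples: Python's re module writes replacement backreferences as \1 (or \g<1>) rather than $1, and it has no \U ... \E case conversion, so the uppercase step is done in a replacement function instead. The variable and function names are made up for the example.

```python
import re

# Phone numbers in the stored format used in the example above.
phones = "313-555-1234\n248-555-9999\n810-555-9000"

# The same five subexpressions as the pattern in the text.
PHONE = re.compile(r"(\d{3})(-)(\d{3})(-)(\d{4})")

# Python spells replacement backreferences \1, \3, \5 instead of $1, $3, $5.
# Subexpressions 2 and 4 (the hyphens) are simply not used.
reformatted = PHONE.sub(r"(\1) \3-\5", phones)
print(reformatted)

# A shortened version of the HTML sample from the case-conversion example.
html = "<H1>Welcome to my Homepage</H1>\n<H2>ColdFusion</H2>"

# Opening tag, header text, closing tag: three subexpressions.
HEADER = re.compile(r"(<[Hh]1>)(.*?)(</[Hh]1>)")


def upper_header(match):
    """Stand-in for $1\\U$2\\E$3: uppercase only the second subexpression."""
    return match.group(1) + match.group(2).upper() + match.group(3)


converted = HEADER.sub(upper_header, html)
print(converted)
```

Implementations that support the metacharacters in Table 8.1 can express the second replacement directly as a pattern. The function-based form here is only a workaround for an engine that lacks them.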
Lesson 9. Looking Ahead and Behind

All the expressions used thus far have matched text, but sometimes you may want to use expressions to mark the position of text to be matched (in contrast to the matched text itself). This involves the use of lookaround (the capability to look ahead and behind), which will be explained in this lesson.

Introducing Lookaround

Again, we'll start with an example. You need to extract the title of a Web page; HTML page titles are placed between <TITLE> and </TITLE> tags in the <HEAD> section of HTML code. Here's the example:

    <HEAD>
    <TITLE>Ben Forta's Homepage</TITLE>
    </HEAD>

    <[tT][iI][tT][lL][eE]>.*</[tT][iI][tT][lL][eE]>

    <HEAD>
    <TITLE>Ben Forta's Homepage</TITLE>
    </HEAD>

<[tT][iI][tT][lL][eE]>.*</[tT][iI][tT][lL][eE]> matches the opening <TITLE> tag (in upper, lower, or mixed case), the closing </TITLE> tag, and whatever text is between them. That worked.

Or did it? What you needed was the title text, but what you got also contained the opening and closing <TITLE> tags. Is it possible to return just the title text?

One solution could be to use subexpressions (as seen in Lesson 7, "Using Subexpressions"). This would allow you to retrieve the matched text in three parts: the opening tag, the text, and the closing tag. With the matched text broken into parts, it would not be too difficult to extract just the part you want. But it makes little sense to make the effort to retrieve something that you actually don't want, only to have to manually remove it. What you really need here is a way ...
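The subexpression approach described above can be sketched in Python to show why it feels like extra work. This is only an illustration under assumptions: the text does not name a language, and the variable names are made up for the example. The pattern is the one from the text with each of its three parts wrapped in parentheses.

```python
import re

# The sample page from the example above.
html = "<HEAD>\n<TITLE>Ben Forta's Homepage</TITLE>\n</HEAD>"

# Opening tag, title text, and closing tag as three subexpressions.
TITLE = re.compile(r"(<[tT][iI][tT][lL][eE]>)(.*)(</[tT][iI][tT][lL][eE]>)")

match = TITLE.search(html)

# The full match still contains the tags, as the text points out.
whole = match.group(0)

# The second subexpression holds just the title text.
title = match.group(2)

print(whole)
print(title)
```

The tags are matched and captured only to be thrown away, which is the inefficiency the passage objects to.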
