]
// Init
var windowName='spellWindow';
var spellCheckURL='spell.cfm?formname=comment&fieldname='+field.name;
// Done
return false;
}
</SCRIPT>

^\s*//.*$ matches the start of a string, followed by any whitespace, followed by // (used to define JavaScript comments), followed by any text, and then an end of string. But that pattern would match only the first comment (and only if it were the only text in the page). The (?m) modifier in (?m)^\s*//.*$ forces the pattern to treat line breaks as string separators, and so all comments were matched.

Caution
(?m) is not supported by many regular expression implementations.

Note
Some regular expression implementations also support the use of \A to mark the start of a string and \Z to mark the end of a string. If supported, these metacharacters function much like ^ and $, respectively, but unlike ^ and $, they are not modified by (?m) and will therefore not operate in multiline mode.

Summary

Regular expressions can match any blocks of text or text at specific locations within a string. \b is used to specify a word boundary (and \B does the exact opposite). ^ and $ mark string boundaries (start of string and end of string, respectively), although when used with the (?m) modifier, ^ and $ will also match strings that start or end at a line break.

Lesson 7. Using Subexpressions

Metacharacters and character matching provide the basic power behind regular expressions, as has been demonstrated in the lessons thus far. In this lesson you'll learn how to group expressions together using subexpressions.

Understanding Subexpressions

Matching multiple occurrences of a character was introduced in Lesson 5, "Repeating Matches." As discussed in that lesson, \d+ matches one or more digits, and https?:// matches http:// or https://. In both of these examples (and indeed, in all the examples thus far) the repetition metacharacters (? or * or {2}, for example) apply to the previous character or metacharacter.
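The point that a repetition metacharacter binds only to the single item before it can be checked with a short sketch. This uses Python's re module purely for illustration; the sample strings and the ab{2} pattern are hypothetical and do not come from the lesson.

```python
import re

# \d+ : the + applies to \d, so a run of one or more digits matches.
digits = re.findall(r"\d+", "Order 66 shipped in 2024")

# https?:// : the ? applies only to the "s" directly before it,
# so the "s" is optional and every other character is required.
candidates = ["http://", "https://", "htt://", "httpss://"]
urls = [c for c in candidates if re.fullmatch(r"https?://", c)]

# ab{2} (hypothetical pattern): the {2} applies only to "b",
# so this means "abb", not "abab".
matches_abb = re.fullmatch(r"ab{2}", "abb") is not None
matches_abab = re.fullmatch(r"ab{2}", "abab") is not None

print(digits)                     # ['66', '2024']
print(urls)                       # ['http://', 'https://']
print(matches_abb, matches_abab)  # True False
```

In each case the repetition affects one character or metacharacter only, which is exactly the limitation that subexpressions address.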
For example, HTML developers often place nonbreaking spaces between words to ensure that text does not wrap between those words. Suppose you needed to locate all repeating HTML nonbreaking spaces (to replace them with something else). Here's the example:

Hello, my name is Ben&nbsp;Forta, and I am the author of books on SQL, ColdFusion, WAP, Windows&nbsp;&nbsp;2000, and other subjects.

&nbsp;{2,}

Hello, my name is Ben&nbsp;Forta, and I am the author of books on SQL, ColdFusion, WAP, Windows&nbsp;&nbsp;2000, and other subjects.

&nbsp; is the entity reference for the HTML nonbreaking space. The pattern &nbsp;{2,} should have matched 2 or more instances of &nbsp;. But it didn't. Why not? Because the {2,} is specifying the number of repetitions of whatever is directly preceding it, in this case a semicolon. &nbsp;;;; would have matched, but &nbsp;&nbsp; will not.

Grouping with Subexpressions

This brings us to the topic of subexpressions. Subexpressions are parts of a bigger expression; the parts are grouped together so that they are treated as a single entity. Subexpressions are enclosed between ( and ) characters.

Tip
( and ) are metacharacters. To match the actual characters ( and ), you must escape them as \( and \), respectively.

To demonstrate the use of subexpressions, let's revisit the previous example:

Hello, my name is Ben&nbsp;Forta, and I am the author of books on SQL, ColdFusion, WAP, Windows&nbsp;&nbsp;2000, and other subjects.

(&nbsp;){2,}

Hello, my name is Ben&nbsp;Forta, and I am the author of books on SQL, ColdFusion, WAP, Windows&nbsp;&nbsp;2000, and other subjects.

(&nbsp;) is a subexpression and is treated as a single entity. As such, the {2,} that follows it applies to the entire subexpression (not just the semicolon). That did the trick.

Here is another example; this time a regular expression is used to locate IP addresses. IP addresses are formatted as four sets of numbers separated by periods, such as 12.159.46.200. Because each of the numbers can be one, two, or three digits, the pattern to match each number could be expressed as \d{1,3}.
This is shown in the following example:

Pinging hog.forta.com [12.159.46.200] with 32 bytes of data:

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

Pinging hog.forta.com [12.159.46.200] with 32 bytes of data:

Each instance of \d{1,3} matches one of the numbers in an IP address. The four numbers are separated by ., which is escaped as \.

The pattern \d{1,3}\. (up to 3 digits followed by .) is repeated three times and can thus be expressed as a repetition as well. Following is an alternative version of the same example:

Pinging hog.forta.com [12.159.46.200] with 32 bytes of data:

(\d{1,3}\.){3}\d{1,3}

Pinging hog.forta.com [12.159.46.200] with 32 bytes of data:

This pattern worked just as well as the previous one, but the syntax is different. The expression \d{1,3}\. has been enclosed within ( and ) to make it a subexpression. (\d{1,3}\.){3} repeats the subexpression 3 times (for the first three numbers in the IP address), and then \d{1,3} matches the final number.

Note
(\d{1,3}\.){4} is not a viable alternative to the pattern just used. Can you work out why it would have failed in this example?

Tip
Some users like to enclose parts of expressions as subexpressions to improve readability; the previous pattern would be expressed as (\d{1,3}\.){3}(\d{1,3}). This practice is perfectly legal, and using it has no effect on the actual behavior of the expression (although there may be performance implications, depending on the regular expression implementation being used).
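The two subexpression examples in this lesson can be tried out with a minimal sketch in Python's re module. The choice of Python is only for illustration, and the sample string with literal &nbsp; entities is an assumption about the example text. re.finditer and group(0) are used so that the whole match is returned rather than only the contents of the parentheses.

```python
import re

# Assumed sample text containing literal &nbsp; entities.
html = ("Hello, my name is Ben&nbsp;Forta, and I am the author of books on "
        "SQL, ColdFusion, WAP, Windows&nbsp;&nbsp;2000, and other subjects.")

# Without grouping, {2,} repeats only the ";" that precedes it.
ungrouped = [m.group(0) for m in re.finditer(r"&nbsp;{2,}", html)]

# With grouping, {2,} repeats the whole (&nbsp;) subexpression.
grouped = [m.group(0) for m in re.finditer(r"(&nbsp;){2,}", html)]

ping = "Pinging hog.forta.com [12.159.46.200] with 32 bytes of data:"

# Long form: each number and each period is spelled out.
long_form = re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", ping).group(0)

# Short form: the "digits plus period" subexpression is repeated 3 times.
short_form = re.search(r"(\d{1,3}\.){3}\d{1,3}", ping).group(0)

# {4} would demand a period after the last number; here "200" is
# followed by "]", so nothing matches.
four_times = re.search(r"(\d{1,3}\.){4}", ping)

print(ungrouped)   # []
print(grouped)     # ['&nbsp;&nbsp;']
print(long_form)   # 12.159.46.200
print(short_form)  # 12.159.46.200
print(four_times)  # None
```

Both IP address patterns find the same text. Grouping changes what the repetition applies to, not what a single \d{1,3} matches.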