to construct a pattern so that it contains matches that are not returned: matches that are used to find the correct match location, but are not part of the core match. In other words, you need to look around.

Note
This lesson discusses both lookahead and lookbehind. The former is supported in all major regular expression implementations, but the latter is not supported as extensively. Java, .NET, PHP, and Perl all support lookbehind (some with restrictions). JavaScript (before ECMAScript 2018) and ColdFusion, however, do not.

Looking Ahead

Lookahead specifies a pattern to be matched but not returned. A lookahead is actually a subexpression and is formatted as such. The syntax for a lookahead pattern is a subexpression preceded by ?=, and the text to match follows the = sign.

Tip
Some regular expression documentation uses the term consume to refer to what is matched and returned; lookahead matches are said to not consume.

Here is an example. The following text contains a list of URLs, and you need to extract the protocol portion of each (possibly so as to know how to process them):

    http://www.forta.com/
    https://mail.forta.com/
    ftp://ftp.forta.com/

    .+(?=:)

    http://www.forta.com/
    https://mail.forta.com/
    ftp://ftp.forta.com/

In the URLs listed, the protocol is separated from the hostname by a :. Pattern .+ matches any text (http in the first match), and subexpression (?=:) matches :. But notice that the : was not included in the returned match; ?= tells the regular expression engine to match : but only to look ahead (and not consume it).

To better understand what ?= is doing, here is the same example, this time without the lookahead metacharacters:

    http://www.forta.com/
    https://mail.forta.com/
    ftp://ftp.forta.com/

    .+(:)

    http://www.forta.com/
    https://mail.forta.com/
    ftp://ftp.forta.com/

The subexpression (:) correctly matches :, but the matched text is consumed and is returned as part of the match.
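The contrast between the two patterns can be checked with a short script. The following is a minimal sketch using Python's re module (chosen here only for illustration; the lesson itself is not tied to a language). The helper name full_matches is hypothetical, and the URL list is assumed to be one URL per line.

```python
import re

# The URL list from the example, assumed to be one URL per line.
URLS = """http://www.forta.com/
https://mail.forta.com/
ftp://ftp.forta.com/"""


def full_matches(pattern, text):
    """Return the full text of every match (group 0), ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# Lookahead: a colon must follow the match, but it is not consumed.
with_lookahead = full_matches(r".+(?=:)", URLS)

# Plain subexpression: the colon is consumed and returned with the match.
without_lookahead = full_matches(r".+(:)", URLS)

print(with_lookahead)     # ['http', 'https', 'ftp']
print(without_lookahead)  # ['http:', 'https:', 'ftp:']
```

The helper reads group 0 on purpose: with re.findall, a pattern that contains a capturing group such as (:) would return only the captured colon rather than the whole match.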
The difference between the two examples is that the former used pattern (?=:) to match the :, and the latter used (:). Both of these patterns matched the same thing; they both matched the : after the protocol. The difference is in whether the matched : was actually included in the matched text. When using lookahead, the regular expression parser looks ahead to process the : match, but does not process it as part of the primary search. .+(:) finds the text up to and including the :. .+(?=:) finds the text up to, but not including, the :.

Note
Lookahead (and lookbehind) matches actually do return results, but the results are always 0 characters in length. As such, you will sometimes find the lookaround operations referred to as being zero-width.

Tip
Any subexpression can be turned into a lookahead expression by simply prefacing the text with ?=. Multiple lookahead expressions may be used in a search pattern, and they may appear anywhere in the pattern (not just at the end, as shown here).

Looking Behind

As you have just seen, ?= looks ahead (it looks at what comes after the matched text, but does not consume what it finds). ?= is thus referred to as the lookahead operator. In addition to looking ahead, many regular expression implementations support looking behind. Looking at what comes before the text to be returned is called looking behind, and it uses the lookbehind operator ?<=.

Tip
Need help distinguishing ?= and ?<= from each other? Here's a way to remember which is which: The one that contains the arrow pointing behind (the < character) is lookbehind.

?<= is used in the same way as ?=; it is used within a subexpression and is followed by the text to match.

Following is an example. A database search lists products, and you need only the prices.

    ABC01: $23.45
    HGG42: $5.31
    CFMX1: $899.00
    XTC99: $69.96
    Total items found: 4

    \$[0-9.]+

    ABC01: $23.45
    HGG42: $5.31
    CFMX1: $899.00
    XTC99: $69.96
    Total items found: 4

\$ matches the $, and [0-9.]+ matches the price. That worked.
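To see exactly what the price pattern returns, here is a minimal sketch in Python's re module (an illustrative choice, not something the lesson prescribes). The variable names are hypothetical, and the product list is assumed to be one entry per line.

```python
import re

# The product list from the example, assumed to be one entry per line.
PRODUCTS = """ABC01: $23.45
HGG42: $5.31
CFMX1: $899.00
XTC99: $69.96
Total items found: 4"""

# \$ matches a literal dollar sign; [0-9.]+ matches the digits and the decimal point.
prices_with_dollar = re.findall(r"\$[0-9.]+", PRODUCTS)

print(prices_with_dollar)  # ['$23.45', '$5.31', '$899.00', '$69.96']
```

Every returned price still carries its leading $, because \$ is an ordinary part of the pattern and is therefore consumed.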
But what if you did not want the $ characters in the matched text? Could you simply drop \$ from the pattern?

    ABC01: $23.45
    HGG42: $5.31
    CFMX1: $899.00
    XTC99: $69.96
    Total items found: 4

    [0-9.]+

    ABC01: $23.45
    HGG42: $5.31
    CFMX1: $899.00
    XTC99: $69.96
    Total items found: 4

That obviously did not work: without the \$, the pattern also matches the digits in the product codes and in the item count. You do need the \$ to determine which text to match, but you do not want the $ to be returned. The solution? A lookbehind match, as follows:

    ABC01: $23.45
    HGG42: $5.31
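The lookbehind approach can be sketched in Python's re module, which supports fixed-width lookbehind. The pattern (?<=\$)[0-9.]+ below is an assumption built from the ?<= syntax described in this lesson, since the lesson's own pattern is not shown in this excerpt; the variable names are likewise hypothetical.

```python
import re

# The product list from the example, assumed to be one entry per line.
PRODUCTS = """ABC01: $23.45
HGG42: $5.31
CFMX1: $899.00
XTC99: $69.96
Total items found: 4"""

# Without \$, the pattern also picks up digits in product codes and the item count.
without_dollar = re.findall(r"[0-9.]+", PRODUCTS)

# Assumed lookbehind pattern: a $ must precede the match, but it is not consumed.
with_lookbehind = re.findall(r"(?<=\$)[0-9.]+", PRODUCTS)

print(without_dollar)
# ['01', '23.45', '42', '5.31', '1', '899.00', '99', '69.96', '4']
print(with_lookbehind)
# ['23.45', '5.31', '899.00', '69.96']
```

The lookbehind keeps the $ as a condition on where a match may start while leaving it out of the returned text.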