Regular Expressions: The Complete Tutorial
Table of Contents
Introduction
Complete Regular Expression Tutorial
Applications & Languages That Support Regexes
Not Only for Programmers
1. Regular Expression Tutorial
2. Literal Characters
3. First Look at How a Regex Engine Works Internally
4. Character Classes or Character Sets
Useful Applications
Negated Character Classes
Metacharacters Inside Character Classes
Shorthand Character Classes
Negated Shorthand Character Classes
Repeating Character Classes
Looking Inside The Regex Engine
5. The Dot Matches (Almost) Any Character
6. Start of String and End of String Anchors
Useful Applications
Using ^ and $ as Start of Line and End of Line Anchors
Permanent Start of String and End of String Anchors
Zero-Length Matches
Strings Ending with a Line Break
Looking Inside the Regex Engine
Another Inside Look
Caution for Programmers
7. Word Boundaries
8. Alternation with The Vertical Bar or Pipe Symbol
9. Optional Items
10. Repetition with Star and Plus
Limiting Repetition
Watch Out for The Greediness!
Looking Inside The Regex Engine
Laziness Instead of Greediness
An Alternative to Laziness
Repeating \Q...\E Escape Sequences
11. Use Round Brackets for Grouping
Round Brackets Create a Backreference
How to Use Backreferences
The Entire Regex Match As Backreference Zero
Using Backreferences in The Regular Expression
Looking Inside The Regex Engine
Repetition and Backreferences
Useful Example: Checking for Doubled Words
Parentheses and Backreferences Cannot Be Used Inside Character Classes
12. Named Capturing Groups
Named Capture with Python, PCRE and PHP
Named Capture with .NET’s System.Text.RegularExpressions
Names and Numbers for Capturing Groups
Other Regex Flavors
13. Unicode Regular Expressions
Characters, Code Points and Graphemes or How Unicode Makes a Mess of Things
How to Match a Single Unicode Grapheme
Matching a Specific Code Point
Unicode Character Properties
Unicode Scripts
Unicode Blocks
Alternative Unicode Regex Syntax
Do You Need To Worry About Different Encodings?
14. Regex Matching Modes
15. Possessive Quantifiers
How Possessive Quantifiers Work
When Possessive Quantifiers Matter
Possessive Quantifiers Can Change The Match Result
Using Atomic Grouping Instead of Possessive Quantifiers
16. Atomic Grouping
17. Lookahead and Lookbehind Zero-Width Assertions
Positive and Negative Lookahead
Regex Engine Internals
Positive and Negative Lookbehind
More Regex Engine Internals
Important Notes About Lookbehind
Lookaround Is Atomic
18. Testing The Same Part of a String for More Than One Requirement
Lookaround to The Rescue
Optimizing Our Solution
A More Complex Problem
19. Continuing at The End of The Previous Match
20. If-Then-Else Conditionals in Regular Expressions
21. XML Schema Character Classes
Character Class Subtraction
Nested Character Class Subtraction
Notational Compatibility with Other Regex Flavors
22. POSIX Bracket Expressions
Character Classes
Collating Sequences
Character Equivalents
23. Adding Comments to Regular Expressions
24. Free-Spacing Regular Expressions
1. Sample Regular Expressions
Grabbing HTML Tags
Trimming Whitespace
IP Addresses
More Detailed Examples
Common Pitfalls
2. Matching Floating Point Numbers with a Regular Expression
3. How to Find or Validate an Email Address
4. Matching a Valid Date
5. Matching Whole Lines of Text
6. Deleting Duplicate Lines From a File
8. Find Two Words Near Each Other
9. Runaway Regular Expressions: Catastrophic Backtracking
Possessive Quantifiers and Atomic Grouping to The Rescue
A Real Example: Matching CSV Records
Preventing Catastrophic Backtracking
See the Difference with RegexBuddy
Alternative Solution Using Atomic Grouping
Quickly Matching a Complete HTML File
10. Repeating a Capturing Group vs. Capturing a Repeated Group
1. Specialized Tools and Utilities for Working with Regular Expressions
2. Using Regular Expressions with Delphi for .NET and Win32
3. EditPad Pro: Convenient Text Editor with Full Regular Expression Support
EditPad Pro’s Regular Expression Support
Search and Replace Using Regular Expressions
Syntax Coloring or Highlighting Schemes
File Navigation Schemes for Text Folding and Navigation
More Information on EditPad Pro and Free Trial Download
4. What Is grep?
Using grep
Grep’s Regex Engine
Beyond The Command Line
5. Using Regular Expressions in Java
Quick Regex Methods of The String Class
Using The Pattern Class
Using The Matcher Class
Regular Expressions, Literal Strings and Backslashes
Java Demo Application using Regular Expressions
6. Java Demo Application using Regular Expressions
7. Using Regular Expressions with JavaScript and ECMAScript
8. JavaScript RegExp Example: Regular Expression Tester
9. MySQL Regular Expressions with The REGEXP Operator
10. Using Regular Expressions with The Microsoft .NET Framework
System.Text.RegularExpressions Overview (Using VB.NET Syntax)
The System.Text.RegularExpressions.Match Class
Regular Expressions, Literal Strings and Backslashes
.NET Framework Demo Application using Regular Expressions (C# Syntax)
11. C# Demo Application
12. Oracle Database 10g Regular Expressions
13. The PCRE Open Source Regex Library
14. Perl’s Rich Support for Regular Expressions
15. PHP Provides Three Sets of Regular Expression Functions
The ereg Function Set
The mb_ereg Function Set
The preg Function Set
16. POSIX Basic Regular Expressions
17. PostgreSQL Has Three Regular Expression Flavors
18. PowerGREP: Taking grep Beyond The Command Line
The Ultimate Search and Replace
Collecting Information and Statistics
File Sectioning and Extra Processing
More Information on PowerGREP and Free Trial Download
19. Python’s re Module
20. How to Use Regular Expressions in REALbasic
21. RegexBuddy: Your Perfect Companion for Working with Regular Expressions
Interactive Regex Tester and Debugger
Quickly Develop Efficient Software
Collect and Save Regular Expressions
Find out More and Get Your Own Copy of RegexBuddy
22. Using Regular Expressions with Ruby
23. Tcl Has Three Regular Expression Flavors
24. VBScript’s Regular Expression Support
25. VBScript RegExp Example: Regular Expression Tester
26. How to Use Regular Expressions in Visual Basic
27. XML Schema Regular Expressions
1. Basic Syntax Reference
2. Advanced Syntax Reference
Grouping and Backreferences
Modifiers
Atomic Grouping and Possessive Quantifiers
Lookaround
Continuing from The Previous Match
Conditionals
Comments
3. Unicode Syntax Reference
4. Syntax Reference for Specific Regex Flavors
.NET Syntax for Named Capture and Backreferences
Python Syntax for Named Capture and Backreferences
XML Character Classes
POSIX Bracket Expressions
5. Regular Expression Flavor Comparison
Characters
Character Classes or Character Sets
Dot
Anchors
Word Boundaries
Alternation
Quantifiers
Grouping and Backreferences
Modifiers
Atomic Grouping and Possessive Quantifiers
Lookaround
Continuing from The Previous Match
Conditionals
Comments
Unicode Characters
Unicode Properties, Scripts and Blocks
.NET Syntax for Named Capture and Backreferences
Python Syntax for Named Capture and Backreferences
XML Character Classes
POSIX Bracket Expressions
6. Replacement Text Reference
Syntax Using Backslashes
Syntax Using Dollar Signs
Tokens Without a Backslash or Dollar
General Replacement Text Behavior
Highest-Numbered Capturing Group