CHAPTER 18
The following tables summarize the regex options as a quick-reference cheat sheet.
Table 19: Character Options
Metacharacters | |
^ | Start of string (or line if multiline option used) |
$ | End of string (or line) |
. | Any character (except \n – or all characters if single line option used) |
| (pipe) | Alternation: match the expression on either side |
{n,m} | Match between n and m occurrences of previous pattern |
[…] | Match one character from list |
(…) | Create a group |
* | 0 or more of previous pattern |
+ | 1 or more of previous pattern |
? | 0 or 1 of previous pattern; placed after another quantifier (e.g. *? or +?), makes it lazy instead of greedy |
\ | Changes following metacharacter to literal character |
\<1-99> | Back reference to a group in the pattern, numbered 1-99 |
Character Classes | |
[aeiou] | Match any single character between brackets |
[^aeiou] | Match any single character not in list between brackets |
[0-9] | Match any single character within the range. To match a literal hyphen, place it at the beginning or end of the character set |
\w | Match any word character [a-zA-Z0-9_]. In .NET, also matches Unicode letters and digits unless RegexOptions.ECMAScript is set |
\W | Match any non-word character, i.e. [^a-zA-Z0-9_] |
\s | Matches any whitespace [\f\n\r\t\v ] |
\S | Matches any non-whitespace [^\f\n\r\t\v ] |
\d | Matches any digit [0-9]. In .NET, also matches other Unicode decimal digits unless RegexOptions.ECMAScript is set |
\D | Matches any non-digit [^0-9] |
Character Escapes | |
\t | Matches tab character |
\r | Matches carriage return |
\v | Matches vertical tab |
\f | Matches form feed |
\n | Matches new line |
\e | Matches escape character |
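The rows in the table above can be tried out with a short sketch. It assumes Python's standard `re` module as a stand-in for the .NET engine. The variable names are illustrative only, and behavior differs between the engines in places (for example, Python's `re` does not accept `\e`, and its flag names `re.MULTILINE` and `re.DOTALL` correspond to the multiline and single line options).

```python
import re

# Assumption: Python's re module stands in for the .NET regex engine.

# ^ and $ anchor to the whole string unless the multiline option is set.
text = "cat\ndog"
anchored_default = re.findall(r"^\w+$", text)                  # []
anchored_multiline = re.findall(r"^\w+$", text, re.MULTILINE)  # ['cat', 'dog']

# . skips \n unless the single line option (DOTALL in Python) is set.
dot_default = re.findall(r"c.t", "cat c\nt")                   # ['cat']
dot_singleline = re.findall(r"c.t", "cat c\nt", re.DOTALL)     # ['cat', 'c\nt']

# {n,m} matches between n and m occurrences of the previous pattern.
counted = re.findall(r"\d{2,3}", "1 22 333 4444")              # ['22', '333', '444']

# ? after a quantifier makes it lazy instead of greedy.
greedy = re.search(r"<.+>", "<a><b>").group(0)                 # '<a><b>'
lazy = re.search(r"<.+?>", "<a><b>").group(0)                  # '<a>'

# (...) creates a group; \1 refers back to what group 1 matched.
doubled = re.search(r"(\w)\1", "hello").group(0)               # 'll'

# | is alternation; \ turns a metacharacter into a literal.
alternation = re.findall(r"cat|dog", "cat, dog, bird")         # ['cat', 'dog']
literal_dot = re.findall(r"\.", "a.b.c")                       # ['.', '.']

# Character classes: negated list, and a hyphen placed at the end of the set.
negated = re.findall(r"[^aeiou\s]", "a bc e")                  # ['b', 'c']
with_hyphen = re.findall(r"[0-9-]+", "10-20 30")               # ['10-20', '30']

# Character escapes such as \t match the control character itself.
tabs = re.findall(r"\t", "a\tb")                               # ['\t']
```

The same patterns should carry over to .NET largely unchanged, but the option names and some escapes are engine-specific, so treat the results as a guide rather than a guarantee.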
Table 20: Unicode Options
Unicode | |
\u9999 | Matches Unicode code point, must be exactly 4 hexadecimal digits after \u |
\p{L} | Matches any Unicode “letter” |
\P{L} | Matches any character that is not a Unicode letter |
\p{Z} | Matches any whitespace or invisible separator character |
\P{Z} | Matches any Unicode character that is not whitespace or a separator |
\p{N} | Matches any Unicode number |
\P{N} | Matches any non-number Unicode character |
\p{P} | Matches any punctuation character |
\P{P} | Matches any non-punctuation Unicode character |
\p{C} | Matches control characters and unused code points |
\P{C} | Matches non-control characters in Unicode |
\p{S} | Matches any symbol character |
\P{S} | Matches any non-symbol character in Unicode |
Named Unicode blocks | |
\p{block} | Matches any character in the named block, e.g. IsBasicLatin, IsGreek, IsHebrew, IsArabic, IsCurrencySymbols, IsMathematicalOperators. See Unicode Character Blocks for more named blocks in .NET |
\P{block} | Matches any character not in the named block |
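The Unicode categories above can be explored with a rough sketch. Python's standard `re` module does not support `\p{…}`, so this example only approximates the categories by checking the first letter of each character's general category via `unicodedata`. The helper `find_category` and the sample string are hypothetical, introduced purely for illustration. The `\u` escape with exactly four hexadecimal digits is supported by `re` directly.

```python
import re
import unicodedata

# Hypothetical helper: approximates \p{major} (or \P{major} when negate=True)
# by testing the first letter of each character's Unicode general category.
def find_category(text, major, negate=False):
    return [ch for ch in text
            if unicodedata.category(ch).startswith(major) != negate]

# Sample text: "Año 5: €9!" written with escapes so the code points are explicit.
sample = "A\u00f1o 5: \u20ac9!"

letters = find_category(sample, "L")        # like \p{L}: ['A', 'ñ', 'o']
numbers = find_category(sample, "N")        # like \p{N}: ['5', '9']
punctuation = find_category(sample, "P")    # like \p{P}: [':', '!']
symbols = find_category(sample, "S")        # like \p{S}: ['€']
separators = find_category(sample, "Z")     # like \p{Z}: [' ', ' ']
control = find_category("a\tb", "C")        # like \p{C}: ['\t']

# Like \P{L}: every character that is not a letter.
non_letters = find_category(sample, "L", negate=True)

# \u followed by exactly four hex digits matches that code point.
euro = re.findall(r"\u20ac", sample)        # ['€']
```

In .NET the same checks would be written directly as patterns such as `\p{L}` or `\P{L}`; the helper here exists only because the stand-in engine lacks that syntax.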