Building CLEM Expressions
Characters Matches
\0nn The character with octal value 0nn (0 <= n <= 7)
\0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh The character with hexadecimal value 0xhh
\uhhhh The character with hexadecimal value 0xhhhh
\t The tab character (‘\u0009’)
\n The newline (line feed) character (‘\u000A’)
\r The carriage-return character (‘\u000D’)
\f The form-feed character (‘\u000C’)
\a The alert (bell) character (‘\u0007’)
\e The escape character (‘\u001B’)
\cx The control character corresponding to x
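The escapes in this table coincide with those of Java's java.util.regex.Pattern class. Assuming that correspondence holds, the following minimal sketch shows how a few of them behave. The class name EscapeDemo and the helper matchesAll are hypothetical names used only for illustration. Note that inside a Java string literal each regular-expression backslash must be doubled.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper: true when the whole input matches the expression.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Octal 101 and hexadecimal 41 both denote the letter A.
        System.out.println(matchesAll("\\0101", "A"));      // true
        System.out.println(matchesAll("\\x41", "A"));       // true
        System.out.println(matchesAll("\\u0041", "A"));     // true

        // Named escapes for tab, alert (bell), and escape.
        System.out.println(matchesAll("\\t", "\t"));        // true
        System.out.println(matchesAll("\\a", "\u0007"));    // true
        System.out.println(matchesAll("\\e", "\u001B"));    // true

        // Control-A is the character with code 1.
        System.out.println(matchesAll("\\cA", "\u0001"));   // true
    }
}
```

Each call tests a single-character input, so a result of true means the escape denotes exactly that character.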
Matching Character Classes
Character classes Matches
[abc] a, b, or c (simple class)
[^abc] Any character except a, b, or c (negation)
[a-zA-Z] a through z or A through Z, inclusive (range)
[a-d[m-p]] a through d, or m through p (union). Alternatively this could be specified
as [a-dm-p]
[a-z&&[def]] a through z, and d, e, or f (intersection)
[a-z&&[^bc]] a through z, except for b and c (subtraction). Alternatively this could
be specified as [ad-z]
[a-z&&[^m-p]] a through z, and not m through p (subtraction). Alternatively this could
be specified as [a-lq-z]
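Union, intersection, and subtraction are easiest to see by testing single characters against each class. The sketch below assumes the classes behave as they do in Java's java.util.regex.Pattern, which uses the same notation. The class name CharClassDemo and the helper matchesAll are hypothetical.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true when the whole input matches the expression.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Negation: anything except a, b, or c.
        System.out.println(matchesAll("[^abc]", "d"));          // true
        System.out.println(matchesAll("[^abc]", "a"));          // false

        // Union: a through d, or m through p.
        System.out.println(matchesAll("[a-d[m-p]]", "n"));      // true
        System.out.println(matchesAll("[a-d[m-p]]", "e"));      // false

        // Intersection: must be in a-z and also one of d, e, f.
        System.out.println(matchesAll("[a-z&&[def]]", "e"));    // true
        System.out.println(matchesAll("[a-z&&[def]]", "a"));    // false

        // Subtraction: a-z with b and c removed, same as [ad-z].
        System.out.println(matchesAll("[a-z&&[^bc]]", "b"));    // false
        System.out.println(matchesAll("[ad-z]", "d"));          // true

        // Subtraction: a-z with m-p removed, same as [a-lq-z].
        System.out.println(matchesAll("[a-z&&[^m-p]]", "n"));   // false
        System.out.println(matchesAll("[a-z&&[^m-p]]", "q"));   // true
    }
}
```

The shorter equivalent forms given in the table are usually easier to read; the && notation is most useful when the excluded set is itself complicated.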
Predefined Character Classes
Predefined character classes Matches
. Any character (may or may not match line terminators)
\d Any digit: [0-9]
\D A non-digit: [^0-9]
\s A white space character: [ \t\n\x0B\f\r]
\S A non-white space character: [^\s]
\w A word character: [a-zA-Z_0-9]
\W A non-word character: [^\w]
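The predefined classes are shorthand for the bracketed classes shown beside them. The sketch below again assumes the behavior of Java's java.util.regex.Pattern, and it also illustrates why the dot "may or may not" match a line terminator: in Java this depends on the DOTALL flag. Whether and how that flag can be set from a CLEM expression is not covered here. The class name PredefinedClassDemo and the helper matchesAll are hypothetical.

```java
import java.util.regex.Pattern;

public class PredefinedClassDemo {
    // Hypothetical helper: true when the whole input matches the expression.
    static boolean matchesAll(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAll("\\d", 0, "7"));    // true: a digit
        System.out.println(matchesAll("\\D", 0, "7"));    // false: not a non-digit
        System.out.println(matchesAll("\\s", 0, "\t"));   // true: tab is white space
        System.out.println(matchesAll("\\S", 0, "x"));    // true
        System.out.println(matchesAll("\\w", 0, "_"));    // true: underscore is a word character
        System.out.println(matchesAll("\\W", 0, "-"));    // true: hyphen is not

        // By default the dot does not match a line feed; with DOTALL it does.
        System.out.println(matchesAll(".", 0, "\n"));               // false
        System.out.println(matchesAll(".", Pattern.DOTALL, "\n"));  // true
    }
}
```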
Boundary Matches
Boundary matchers Matches
^ The beginning of a line
$ The end of a line
\b A word boundary
\B A non-word boundary
\A The beginning of the input