Slick V3.3 manual Unicode Category Specifications for Regular Expressions

Models: V3.3



Sample Brief Regular Expression     Description

                                    tice that the backslash must prefix the special character *.
[\t ]                               Matches tab and space characters.
[\d9\d32]                           Matches tab and space characters.
[\x9\x20]                           Matches tab and space characters.
p?t                                 Matches any three-letter string starting with the letter p and ending with the letter t. Two possible matches are pot and pat.
s*t                                 Matches the letter s followed by any number of characters followed by the nearest letter t. Two possible matches are seat and st.
{for}|{while}                       Matches the strings for or while.
^\:p                                Matches lines beginning with a file name.
xy+z                                Matches x followed by one or more occurrences of y followed by z.
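
For readers who want to try the behavior described in these samples outside the editor, the following is a minimal sketch using Python's re module. The Python patterns are assumed translations of the Brief samples, based only on the descriptions above (any single character becomes ".", any number of characters up to the nearest t becomes ".*?t"). They are not produced by SlickEdit, and first_match is a hypothetical helper name.

```python
import re


def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Assumed Python counterparts of the samples described in the table.
TAB_OR_SPACE = r"[\t ]"       # tab and space characters
P_ANY_T = r"p.t"              # p, any one character, t
S_TO_NEAREST_T = r"s.*?t"     # s, then as few characters as possible, then t
FOR_OR_WHILE = r"for|while"   # either of the two strings
X_YS_Z = r"xy+z"              # x, one or more y, z

if __name__ == "__main__":
    for pattern, text in [
        (P_ANY_T, "a pot of tea"),
        (S_TO_NEAREST_T, "seat belt"),
        (FOR_OR_WHILE, "while (x)"),
        (X_YS_Z, "xyyyz"),
    ]:
        print(pattern, "->", first_match(pattern, text))
```

The non-greedy ".*?" is what models "the nearest letter t": a greedy ".*" would run to the last t on the line instead.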

 

 

Unicode Category Specifications for Regular Expressions

The standard Unicode Consortium character categories are supported in regular expressions. The syntax for specifying categories is:

\p{MainCategoryLetter Subcategories}

The above syntax matches the categories specified. The following syntax matches all characters not in the categories specified:

\P{MainCategoryLetter Subcategories}

The \p and \P notations can only be used inside a character set specification. MainCategoryLetter can be L, M, N, P, S, Z, or C. The valid Subcategories depend on the MainCategoryLetter specified. If no Subcategories are specified, all are assumed. For example:

[\p{L}] matches all Unicode letters.

[\p{Lul}] matches all uppercase and lowercase letters.

[\P{L}] matches all characters that are not letters.
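
The category rules above can be illustrated outside the editor. The following is a minimal sketch in Python that uses the standard unicodedata module, whose two-letter codes (such as Lu or Nd) consist of a main category letter plus one subcategory letter. The function name in_categories and its spec argument are hypothetical and only mimic the text between the braces; this is not how SlickEdit implements \p.

```python
import unicodedata


def in_categories(ch, spec):
    """Return True if character ch is in the categories named by spec.

    spec is a main category letter followed by zero or more subcategory
    letters, for example "L" or "Lul".
    """
    main, subs = spec[0], spec[1:]
    cat = unicodedata.category(ch)  # two-letter code, e.g. "Lu" or "Nd"
    if cat[0] != main:
        return False
    # With no subcategories listed, every subcategory is accepted.
    return not subs or cat[1] in subs


if __name__ == "__main__":
    print(in_categories("A", "L"))          # letter, any subcategory
    print(in_categories("a", "Lul"))        # uppercase or lowercase letter
    print(in_categories("\u01c5", "Lul"))   # titlecase letter (Lt)
    print(not in_categories("5", "L"))      # complement, like \P{L}
```

Negating the result, as in the last line, corresponds to the \P form.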
