Slick V3.3 manual Unicode Character Blocks for Regular Expressions


Subcategory   Description
Sc            Symbol, Currency
Sk            Symbol, Modifier
So            Symbol, Other
Zs            Separator, Space
Zl            Separator, Line
Zp            Separator, Paragraph
Cc            Other, Control
Cf            Other, Format
Cs            Other, Surrogate
Co            Other, Private Use
Cn            Other, Not Assigned (no characters in the file have this property)
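The two-letter codes in this table are the standard Unicode general category abbreviations, so they can be checked outside the editor. The following is a minimal sketch using Python's standard unicodedata module, not the editor's own regular expression engine. The sample characters and the names SAMPLES and categories are our own choices for illustration and do not come from the manual.

```python
import unicodedata

# Sample characters chosen for illustration (an assumption, not from the manual).
# Each maps to the subcategory code we expect from the table above.
SAMPLES = {
    "$": "Sc",       # DOLLAR SIGN: Symbol, Currency
    "^": "Sk",       # CIRCUMFLEX ACCENT: Symbol, Modifier
    "\u00a9": "So",  # COPYRIGHT SIGN: Symbol, Other
    " ": "Zs",       # SPACE: Separator, Space
    "\u2028": "Zl",  # LINE SEPARATOR: Separator, Line
    "\u2029": "Zp",  # PARAGRAPH SEPARATOR: Separator, Paragraph
    "\n": "Cc",      # LINE FEED: Other, Control
    "\u200e": "Cf",  # LEFT-TO-RIGHT MARK: Other, Format
    "\ud800": "Cs",  # lone high surrogate: Other, Surrogate
    "\ue000": "Co",  # first private-use code point: Other, Private Use
    "\uffff": "Cn",  # noncharacter: Other, Not Assigned
}

# Look up the general category of every sample character.
categories = {ch: unicodedata.category(ch) for ch in SAMPLES}

for ch, cat in categories.items():
    # Print code points rather than the characters themselves, since
    # several samples are invisible or not encodable on their own.
    print(f"U+{ord(ch):04X} {cat}")
```

Each printed line pairs a code point with its subcategory code, which can be compared against the Description column.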

Unicode Character Blocks for Regular Expressions

The standard regular expression block categories defined by the Unicode Consortium are supported. The syntax for specifying a character block is:

\p{IsBlockName}

The above syntax matches the characters in the block specified. The following syntax matches all characters not in the block specified:

\P{IsBlockName}

The \p and \P notations may only be used inside a character set specification. For example, [\p{isBasicLatin}] matches all characters in the Basic Latin block. [\P{isBasicLatin}] matches all characters that are not in the Basic Latin block.
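To make the difference between \p and \P concrete, here is a minimal sketch in Python. Python's built-in re module does not support the \p{...} block syntax, so the sketch emulates a block with an explicit code point range inside a character set. The ranges are taken from the Unicode block tables (Basic Latin is U+0000 to U+007F, Greek and Coptic is U+0370 to U+03FF). The names BLOCKS and block_class are hypothetical helpers for illustration and are not part of the editor.

```python
import re

# Inclusive code point ranges from the Unicode block tables.
BLOCKS = {
    "BasicLatin": (0x0000, 0x007F),
    "Greek": (0x0370, 0x03FF),  # named "Greek and Coptic" in current Unicode
}

def block_class(name, negate=False):
    # Build a character set that emulates the block test:
    # negate=False stands in for the \p form, negate=True for the \P form.
    lo, hi = BLOCKS[name]
    caret = "^" if negate else ""
    return "[{}\\u{:04X}-\\u{:04X}]".format(caret, lo, hi)

text = "abc \u03b1\u03b2\u03b3 123"  # Latin letters, Greek alpha/beta/gamma, digits

# Characters inside the Basic Latin block.
latin = "".join(re.findall(block_class("BasicLatin"), text))

# Characters outside the Basic Latin block (the negated form).
not_latin = "".join(re.findall(block_class("BasicLatin", negate=True), text))

# Characters inside the Greek block.
greek = "".join(re.findall(block_class("Greek"), text))

print(ascii(latin), ascii(not_latin), ascii(greek))
```

In this sample text the negated Basic Latin set and the Greek set select the same three characters, because everything that is not Greek here (letters, spaces, digits) falls inside Basic Latin.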

The following is a list of the non-standard valid character block names. This list was generated from XML standards found at the World Wide Web Consortium Web site (http://www.w3c.org).

XMLNameStartChar - All characters that are valid for the start of an XML tag name.
