UNIX Regular Expressions

 

 

 

UNIX Regular Expression    Definition

[^char-set]

Matches any character not specified by char-set. A dash (-) character may be used to specify ranges. The expression [^A-Z] matches all characters except the uppercase letters A through Z.

[char-set1 - [char-set2]]

Character set subtraction. Matches all characters in char-set1 except the characters in char-set2. For example, [a-z-[qw]] matches all English lowercase letters except q and w. [\p{L}-[qw]] matches all Unicode letters except q and w.
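The negated set and the subtraction rule above can be tried outside the editor. The following is a minimal sketch in Python, not the editor's own engine: Python's built-in re module has no subtraction syntax, so the subtraction is emulated with a negative lookahead, and the helper name matched_chars is hypothetical.

```python
import re

# Negated set with a range: any character that is not A through Z.
not_upper = re.compile(r"[^A-Z]")

# Python's re module has no [a-z-[qw]] subtraction syntax, so this sketch
# emulates it: the negative lookahead rejects q and w before [a-z] is tried.
lower_except_qw = re.compile(r"(?![qw])[a-z]")

def matched_chars(pattern, text):
    """Return the characters of text that the one-character pattern matches."""
    return "".join(ch for ch in text if pattern.fullmatch(ch))

print(matched_chars(not_upper, "Slick V3"))      # lick 3
print(matched_chars(lower_except_qw, "qwerty"))  # erty
```

The lookahead form gives the same result as subtraction for single characters: a character passes only if it is in the first set and not in the second.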

[char-set1 & [char-set2]]

Character set intersection. Matches all characters in char-set1 that are also in char-set2. For example, [\x{0}-\x{7f}&[\p{L}]] matches all letters with character codes between 0 and 127.
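To illustrate what the intersection example selects, here is a small Python sketch. Python's re module does not support this intersection syntax, so the two sets are tested separately and combined with a logical AND. The function name in_ascii_letters is hypothetical, and the sketch assumes that \p{L} corresponds to Unicode general categories beginning with "L".

```python
import unicodedata

def in_ascii_letters(ch):
    # Emulates the intersection of the range 0..0x7F with the letter category:
    # the character must satisfy both conditions.
    in_range = 0x00 <= ord(ch) <= 0x7F
    is_letter = unicodedata.category(ch).startswith("L")
    return in_range and is_letter

sample = "aZ9\u00e9_"  # 'a', 'Z', '9', e-acute (U+00E9), underscore
print([ch for ch in sample if in_ascii_letters(ch)])  # ['a', 'Z']
```

The accented letter is a letter but lies above 127, and the digit and underscore are in range but are not letters, so only a and Z remain.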

 

 

 

\x{hhhh}

Matches the Unicode character, up to 31 bits, specified by the hexadecimal value hhhh.
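As a rough illustration of matching a character by its hexadecimal value, the sketch below builds an equivalent pattern in Python. It is an approximation under stated assumptions: Python's re module does not accept the braces form, so the hex digits are converted with chr, which only accepts values up to 0x10FFFF. The helper name hex_code_point_pattern is hypothetical.

```python
import re

def hex_code_point_pattern(hex_digits):
    """Build a pattern matching the single character with this hex code point."""
    # chr() accepts values up to 0x10FFFF only; larger values raise ValueError.
    return re.compile(re.escape(chr(int(hex_digits, 16))))

alpha = hex_code_point_pattern("3b1")       # U+03B1 GREEK SMALL LETTER ALPHA
print(bool(alpha.search("\u03b1\u03b2")))   # True
print(bool(alpha.search("ab")))             # False
```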

 

 

 

\p{UnicodeCategorySpec}

(Only valid in character set) Matches characters in UnicodeCategorySpec, where UnicodeCategorySpec uses the standard general categories specified by the Unicode Consortium. For example, [\p{L}] matches all letters. [\p{Lu}] matches all uppercase letters. See Unicode Category Specifications for Regular Expressions.
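The general categories can be inspected with Python's standard unicodedata module, which may help when choosing a category spec. This is only a sketch of the category test, not of the editor's matching engine: the function name in_category is hypothetical, and it assumes a one-letter spec such as L covers every subcategory whose name starts with that letter.

```python
import unicodedata

def in_category(ch, spec):
    # A one-letter spec such as "L" covers every subcategory ("Lu", "Ll", ...);
    # a two-letter spec such as "Lu" has to match the category exactly.
    return unicodedata.category(ch).startswith(spec)

text = "A\u00e95 \u03a9"  # 'A', e-acute, '5', space, GREEK CAPITAL LETTER OMEGA

letters = [ch for ch in text if in_category(ch, "L")]
uppercase = [ch for ch in text if in_category(ch, "Lu")]
non_letters = [ch for ch in text if not in_category(ch, "L")]

print(letters)      # ['A', 'é', 'Ω']
print(uppercase)    # ['A', 'Ω']
print(non_letters)  # ['5', ' ']
```

The non_letters list shows the complement of the letter category, which is what a negated category test selects.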

 

 

 

\P{UnicodeCategorySpec}

(Only valid in character set) Matches characters not in UnicodeCategorySpec. For example, [\P{L}] matches all characters that are not letters. This is equivalent to [^\p{L}]. [\P{Lu}] matches all characters that are not uppercase letters. See Unicode Category Specifications for Regular Expressions.

 

 

 

\p{UnicodeIsBlockSpec}

(Only valid in character set) Matches characters in UnicodeIsBlockSpec, where UnicodeIsBlockSpec is one of the standard character blocks specified by the Unicode Consortium. For example, [\p{isGreek}] matches Unicode characters in the Greek block. See Unicode Character Blocks for Regular Expressions.
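A block is a contiguous range of code points, so a block test can be approximated with an explicit range. The Python sketch below assumes that isGreek refers to the Unicode "Greek and Coptic" block, U+0370 through U+03FF. That mapping is an assumption and is not confirmed by this page.

```python
import re

# Assumption: isGreek corresponds to the "Greek and Coptic" block, U+0370..U+03FF.
greek = re.compile("[\u0370-\u03ff]")
not_greek = re.compile("[^\u0370-\u03ff]")

word = "\u03c0r\u00b2"  # GREEK SMALL LETTER PI, 'r', SUPERSCRIPT TWO

print(greek.findall(word))      # ['π']
print(not_greek.findall(word))  # ['r', '²']
```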

 

 

 

\P{UnicodeIsBlockSpec}

(Only valid in character set) Matches characters not in UnicodeIsBlockSpec. For example, [\P{isGreek}] matches all characters that are not in

 

 

 
