Unicode Category Specifications for Regular Expressions – Slick EDIT V3.3 User Manual



Sample Brief Regular Expression    Description
                                   ... Notice that the backslash must prefix the
                                   special character *.
[\t ]                              Matches tab and space characters.
[\d9\d32]                          Matches tab and space characters.
[\x9\x20]                          Matches tab and space characters.
p?t                                Matches any three-letter string starting with
                                   the letter p and ending with the letter t. Two
                                   possible matches are pot and pat.
s*t                                Matches the letter s followed by any number of
                                   characters followed by the nearest letter t.
                                   Two possible matches are seat and st.
{for}|{while}                      Matches the strings for or while.
^\:p                               Matches lines beginning with a file name.
xy+z                               Matches x followed by one or more occurrences
                                   of y followed by z.
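
To make the sample descriptions easier to experiment with outside the editor, the following is a minimal sketch in Python. It is not part of the product: the BRIEF_TO_PYTHON table and the first_match helper are hypothetical names, and the translations into Python `re` syntax are assumptions derived only from the descriptions in the table (for example, Brief `?` is treated as "any one character" and Brief `*` as "the shortest run of any characters"). The `^\:p` sample is omitted because Python has no built-in file name class.

```python
import re

# Hypothetical mapping from the Brief samples in the table to Python `re`
# patterns. The translations are assumptions based on the table descriptions.
BRIEF_TO_PYTHON = {
    r"[\t ]": r"[\t ]",                # tab or space
    r"[\d9\d32]": r"[\x09\x20]",       # decimal codes 9 and 32: tab, space
    r"[\x9\x20]": r"[\x09\x20]",       # hex codes 9 and 20: tab, space
    r"p?t": r"p.t",                    # p, any one character, t
    r"s*t": r"s.*?t",                  # s, shortest run of characters, t
    r"{for}|{while}": r"(for)|(while)",  # either of two grouped strings
    r"xy+z": r"xy+z",                  # x, one or more y, z
}


def first_match(brief_pattern, text):
    """Return the first substring of text matched by the translated pattern, or None."""
    match = re.search(BRIEF_TO_PYTHON[brief_pattern], text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample, text in [("p?t", "a pot"), ("s*t", "seats"), ("xy+z", "xyyyz")]:
        print(sample, "->", first_match(sample, text))
```

The non-greedy `.*?` is used for `s*t` because the table says the match ends at the nearest letter t.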

Unicode Category Specifications for Regular Expressions

The Unicode Consortium's standard regular expression categories are supported. The syntax for specifying
categories is:

\p{MainCategoryLetter Subcategories}

The above syntax matches the categories specified. The following syntax matches all characters not in
the categories specified:

\P{MainCategoryLetter Subcategories}

The \p and \P notations can only be used inside a character set specification. MainCategoryLetter
can be L, M, N, P, S, Z, or C. The valid Subcategories depend on the MainCategoryLetter specified.
If no Subcategories are specified, all are assumed. For example:

[\p{L}] matches all Unicode letters.

[\p{Lul}] matches all uppercase and lowercase letters.

[\P{L}] matches all characters that are not letters.
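
The category rules above can be checked outside the editor with a small sketch in Python. This is an illustration only, not the editor's implementation: matches_spec is a hypothetical helper, and it assumes that each subcategory letter corresponds to the second letter of the Unicode general category reported by the standard unicodedata module (so Lul covers Lu and Ll).

```python
import unicodedata

# Main category letters listed in the text above.
MAIN_CATEGORIES = set("LMNPSZC")


def matches_spec(ch, spec, negate=False):
    """Emulate a category test for one character.

    spec is a main category letter followed by optional subcategory
    letters, for example "L" or "Lul". With negate=True the result is
    inverted, mirroring the uppercase P form of the notation.
    """
    if not spec or spec[0] not in MAIN_CATEGORIES:
        raise ValueError("unknown main category in spec: %r" % spec)
    main, subs = spec[0], spec[1:]
    category = unicodedata.category(ch)  # two letters, e.g. "Lu" or "Nd"
    # Assumption: no subcategories means every subcategory is accepted.
    hit = category[0] == main and (not subs or category[1] in subs)
    return hit != negate


if __name__ == "__main__":
    for ch in ["A", "é", "中", "5"]:
        print(repr(ch), matches_spec(ch, "L"), matches_spec(ch, "Lul"))
```

Under these assumptions, a letter with no case, such as a CJK ideograph, satisfies L but not Lul, which is consistent with the examples above.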
