Unicode Category Specifications for Regular Expressions – SlickEdit V3.3 User Manual

Page 559: Category specifications for regular expressions


Sample Brief Regular Expression


Notice that the backslash must prefix the special character *.

[\t ]

Matches tab and space characters.


Matches tab and space characters.


Matches tab and space characters.


Matches any three-letter string starting with the letter p and ending with the letter t.
Two possible matches are pot and pat.


Matches the letter s followed by any number of characters followed by the nearest letter t.
Two possible matches are seat and st.


Matches the strings for or while.


Matches lines beginning with a file name.


Matches x followed by one or more occurrences of
y followed by z.

Unicode Category Specifications for Regular Expressions

The standard Unicode Consortium regular expression categories are supported. The syntax for specifying
categories is:

\p{MainCategoryLetter Subcategories}

The above syntax matches the categories specified. The following syntax matches all characters not in
the categories specified:

\P{MainCategoryLetter Subcategories}

The \p and \P notations can only be used inside a character set specification. MainCategoryLetter
can be L, M, N, P, S, Z, or C. The valid Subcategories depend on the MainCategoryLetter
specified. If no Subcategories are specified, all are assumed. For example:

[\p{L}] matches all Unicode letters.

[\p{Lul}] matches all uppercase and lowercase letters.

[\P{L}] matches all characters that are not letters.
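
The category rules above can be illustrated outside the editor. The following is a minimal Python sketch, not the product's implementation: it assumes that each Unicode general category is a two-letter code (main category letter plus one subcategory letter), as reported by Python's standard unicodedata module. The helper names in_categories and matching are hypothetical and exist only for this illustration.

```python
import unicodedata

def in_categories(ch, spec):
    # spec mimics the text inside the braces: a main category letter
    # followed by zero or more subcategory letters, e.g. "L" or "Lul".
    main, subs = spec[0], spec[1:]
    cat = unicodedata.category(ch)  # two-letter code such as "Lu" or "Nd"
    if cat[0] != main:
        return False
    # If no subcategories are listed, all subcategories are assumed.
    return not subs or cat[1] in subs

def matching(text, spec, negate=False):
    # negate=False behaves like the lowercase p form (characters in the categories);
    # negate=True behaves like the uppercase P form (characters not in them).
    return [ch for ch in text if in_categories(ch, spec) != negate]

# A (Lu), e-acute (Ll), 9 (Nd), space (Zs), U+01C5 (Lt, a titlecase letter)
sample = "A\u00e99 \u01c5"

letters = matching(sample, "L")                     # like [\p{L}]
upper_lower = matching(sample, "Lul")               # like [\p{Lul}]
non_letters = matching(sample, "L", negate=True)    # like [\P{L}]
```

With this sample, the titlecase letter U+01C5 is included by the L specification but excluded by Lul, because its subcategory letter is t rather than u or l.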


