Unicode Character Blocks for Regular Expressions – Slick EDIT V3.3 User Manual

Subcategory    Description
Sc             Symbol, Currency
Sk             Symbol, Modifier
So             Symbol, Other
Zs             Separator, Space
Zl             Separator, Line
Zp             Separator, Paragraph
Cc             Other, Control
Cf             Other, Format
Cs             Other, Surrogate
Co             Other, Private Use
Cn             Other, Not Assigned (no characters in the file have this property)
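
As a rough illustration of what these subcategory codes mean, the sketch below looks up one sample character per code using Python's standard unicodedata module. The sample characters are chosen here for illustration and do not come from the manual, and the snippet says nothing about how the editor itself classifies characters.

```python
import unicodedata

# One illustrative character per subcategory (chosen for this sketch).
samples = [
    "$",       # DOLLAR SIGN
    "^",       # CIRCUMFLEX ACCENT
    "\u00a9",  # COPYRIGHT SIGN
    " ",       # SPACE
    "\u2028",  # LINE SEPARATOR
    "\u2029",  # PARAGRAPH SEPARATOR
    "\n",      # LINE FEED (a control character)
    "\u200b",  # ZERO WIDTH SPACE (a format character)
    "\ud800",  # a lone surrogate code point
    "\ue000",  # first code point of the Private Use Area
    "\uffff",  # a noncharacter, which is never assigned
]

# Map each code point (as "U+XXXX") to its two-letter general category.
categories = {f"U+{ord(ch):04X}": unicodedata.category(ch) for ch in samples}

for code_point, category in categories.items():
    print(code_point, category)
```

Only the code point and its category are printed, so the lone surrogate never has to be encoded for output.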

Unicode Character Blocks for Regular Expressions

The standard Unicode Consortium regular expression block categories are supported. The syntax for
specifying a character block is:

\p{IsBlockName}

The above syntax matches the characters in the block specified. The following syntax matches all
characters not in the block specified:

\P{IsBlockName}

The \p and \P notations may only be used inside a character set specification. For example,
[\p{isBasicLatin}] matches all characters in the Basic Latin block. [\P{isBasicLatin}] matches all
characters that are not in the Basic Latin block.
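
The effect of a block test and its negation can be sketched outside the editor. Python's built-in re module does not accept the \p{...} notation, so the minimal sketch below spells out the Basic Latin block (U+0000 through U+007F) as an explicit range; the two patterns stand in for [\p{isBasicLatin}] and [\P{isBasicLatin}]. This only illustrates the matching semantics and is not the editor's implementation.

```python
import re

# The Basic Latin block covers code points U+0000 through U+007F.
BASIC_LATIN = r"\u0000-\u007F"

# Stand-in for [\p{isBasicLatin}]: any character inside the block.
in_block = re.compile(f"[{BASIC_LATIN}]")
# Stand-in for [\P{isBasicLatin}]: any character outside the block.
not_in_block = re.compile(f"[^{BASIC_LATIN}]")

sample = "abc αβγ"
inside = in_block.findall(sample)       # the ASCII letters and the space
outside = not_in_block.findall(sample)  # the three Greek letters

print(inside)
print(outside)
```

Every character in the sample lands in exactly one of the two lists, which is the relationship between the \p and \P forms.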

The following is a list of the valid non-standard character block names. This list was generated from
XML standards found at the World Wide Web Consortium Web site (http://www.w3c.org).

XMLNameStartChar - All characters that are valid for the start of an XML tag name.
