Regular Expressions

Overview

The Guardium implementation of regular expressions conforms to POSIX 1003.2. For more detailed information, see the Open Group web site: www.opengroup.org.

This help topic provides instructions for using the Build Regular Expression Tool, and several tables of commonly used special characters and constructs. It does not provide a comprehensive description of how regular expressions are constructed or used. See the web site referenced above for more detailed information.

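The important point to keep in mind about pattern matching with regular expressions is that the search for a match starts at the beginning of a string and stops when the first sequence matching the expression is found. An expression therefore matches if any part of the string matches it, unless the expression is anchored with ^ or $.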
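The search behavior described above can be sketched in Python. This is only an illustration: Python's re module is not the POSIX engine that Guardium uses, and it is assumed here as a stand-in because it behaves the same way for this simple pattern. The sample text is hypothetical.

```python
import re

# Hypothetical sample text containing two candidate matches for the pattern.
text = "xxCanyyCaanzz"

# re.search scans from the start of the string and returns only the
# first (leftmost) match; the later "Caan" is not reported.
match = re.search(r"Ca+n", text)

first_match = match.group(0)  # the matched substring
first_start = match.start()   # offset where the match begins

print(first_match, first_start)
```

Because the search stops at the first match, the second occurrence is never examined.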

Using the Build Regular Expression Tool

When an input field requires a regular expression, you can use the Build Regular Expression tool to code and test a regular expression.

To open the Build Regular Expression tool, click the (Regex) button beside the field that will contain the regular expression. If you have already entered anything in the field, it will be copied to the Regular Expression box in the Build Regular Expression panel.

  1. Enter or modify the expression in the Regular Expression box.

  2. To test the expression, enter text in the Text To Match Against box, and then click the Test button.

  3. Repeat the previous step with several different inputs to verify that the expression matches the text it should match and rejects the text it should not.

  4. To enter a special character at the end of your expression, you can select it from the Select element list. To enter a special character anywhere else, you will have to type it or copy it there.

  5. When you are done making changes and testing, click the Accept button to close the Build Regular Expression panel and copy the regular expression to the definition panel.
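The test-and-repeat procedure above can also be rehearsed outside the tool. The following is a minimal sketch, assuming Python's re module as a stand-in for the Guardium engine (it is not a POSIX implementation, so results may differ for POSIX-only syntax). The function name check_pattern and the test strings are hypothetical.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return (text, problem) pairs for inputs that behave unexpectedly."""
    compiled = re.compile(pattern)
    failures = []
    for text in should_match:
        if compiled.search(text) is None:
            failures.append((text, "expected a match"))
    for text in should_not_match:
        if compiled.search(text) is not None:
            failures.append((text, "expected no match"))
    return failures

# Hypothetical test data: strings that must and must not match.
failures = check_pattern(r"^Ca+n", ["Can", "Caan"], ["Cn", "can"])
print(failures)
```

An empty list means every input behaved as expected. Listing the non-matching inputs explicitly is the part that is easiest to skip when testing by hand.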

Special Characters and Constructs

The following table provides a summary of the more commonly used special characters and constructs.

Char          Example                     Matches         No Match          How do I ...
------------  --------------------------  --------------  ----------------  ------------
literal       can                         can             Can, cab, caN     Match an exact sequence of characters (case sensitive), except for the special characters described below
. (dot)       ca.                         can, cab        c, cb             Match any single character, including carriage return and newline (\n) characters
*             Ca*n                        Cn, Can, Caan   Cb, Cabn          Match zero or more instances of the preceding character(s)
^             ^C.                         Ca              ca, a             Match a string beginning with the following character(s)
$             C.*n$                       Can, Cn         Cab               Match a string ending with the preceding character(s)
+             ^Ca+n                       Can, Caan       Cn                Match one or more instances of the preceding character(s)
?             Ca?n                        Cn, Can         Caan              Match either zero or one instance of the preceding character(s)
|             Can|cab                     Can, cab        Cab               Match either the preceding or the following pattern
(x…)          ^(Ca)*n                     Can, CaCan      Cn, CCnn, XaCan   Match the sequence enclosed in parentheses
{n}           Ca{3}n                      Caaan           Caan, Caaaan      Match exactly n instances of the preceding character(s)
{n,}          Ca{2,}n                     Caan, Caaaan    Can, Cn           Match n or more instances of the preceding character(s)
{n,m}         Ca{2,3}n                    Caan, Caaan     Can, Caaaan       Match from n to m instances of the preceding character(s)
[a-ce]        [C-FL]an                    Can, Dan, Lan   Ban               Match a single character in the set, where the dash indicates a contiguous sequence; for example, [0-9] matches any digit
[^a-ce]       [^C-FL]an                   aan, Ban        Can, Dan          Match any character that is NOT in the specified set
[[.char.]]    [[.~.]]an or [[.tilde.]]an  ~an             @an               Match the enclosed character or the named character from the Named Characters Table, below
[[:class:]]   ^[[:alpha:]]+$              abc             ab3               Match any character in the specified character class, from the Character Classes Table, below
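The quantifier and bracket examples in the table above can be checked mechanically. The sketch below assumes Python's re module as a stand-in for the Guardium engine; it covers only constructs that Python and POSIX treat alike, and omits [[.char.]] and [[:class:]], which Python does not support. The helper name row_holds is hypothetical.

```python
import re

# (pattern, strings expected to match, strings expected not to match)
rows = [
    (r"Ca*n",      ["Cn", "Can", "Caan"], ["Cb", "Cabn"]),
    (r"Ca?n",      ["Cn", "Can"],         ["Caan"]),
    (r"Ca{2,3}n",  ["Caan", "Caaan"],     ["Can", "Caaaan"]),
    (r"[C-FL]an",  ["Can", "Dan", "Lan"], ["Ban"]),
    (r"[^C-FL]an", ["aan", "Ban"],        ["Can", "Dan"]),
]

def row_holds(pattern, matches, non_matches):
    """True if every listed string behaves as the row claims."""
    hit = all(re.search(pattern, s) is not None for s in matches)
    miss = all(re.search(pattern, s) is None for s in non_matches)
    return hit and miss

results = {pattern: row_holds(pattern, m, n) for pattern, m, n in rows}

# An unanchored pattern matches if any substring matches, so "ab3"
# is accepted by [a-z]+ but rejected once ^ and $ are added.
partial = re.search(r"[a-z]+", "ab3") is not None
whole = re.search(r"^[a-z]+$", "ab3") is not None
print(results, partial, whole)
```

The last two lines show why an expression meant to validate an entire value usually needs both anchors.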

Named Characters Table (English)

The following table describes the standard character names that can be used within regular expression bracket pairs ([[.char.]] - see above). Character names are locale specific, so non-English versions of Guardium may use a different set of character names.

Name                  Value
--------------------  -----
NUL                   \0
SOH                   \001
STX                   \002
ETX                   \003
EOT                   \004
ENQ                   \005
ACK                   \006
BEL                   \007
alert                 \007
BS                    \010
backspace             \b
HT                    \011
tab                   \t
LF                    \012
newline               \n
VT                    \013
vertical-tab          \v
FF                    \014
form-feed             \f
CR                    \015
carriage-return       \r
SO                    \016
SI                    \017
DLE                   \020
DC1                   \021
DC2                   \022
DC3                   \023
DC4                   \024
NAK                   \025
SYN                   \026
ETB                   \027
CAN                   \030
EM                    \031
SUB                   \032
ESC                   \033
IS4                   \034
FS                    \034
IS3                   \035
GS                    \035
IS2                   \036
RS                    \036
IS1                   \037
US                    \037
space                 ' '
exclamation-mark      !
quotation-mark        "
number-sign           #
dollar-sign           $
percent-sign          %
ampersand             &
apostrophe            \'
left-parenthesis      (
right-parenthesis     )
asterisk              *
plus-sign             +
comma                 ,
hyphen                -
hyphen-minus          -
period                .
full-stop             .
slash                 /
solidus               /
zero                  0
one                   1
two                   2
three                 3
four                  4
five                  5
six                   6
seven                 7
eight                 8
nine                  9
colon                 :
semicolon             ;
less-than-sign        <
equals-sign           =
greater-than-sign     >
question-mark         ?
commercial-at         @
left-square-bracket   [
backslash             \\
reverse-solidus       \\
right-square-bracket  ]
circumflex            ^
circumflex-accent     ^
underscore            _
low-line              _
grave-accent          `
left-brace            {
left-curly-bracket    {
vertical-line         |
right-brace           }
right-curly-bracket   }
tilde                 ~
DEL                   \177
NULL                  0
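Engines that are not POSIX compliant generally do not understand the [[.char.]] form. As a rough illustration of what the named characters stand for, the sketch below rewrites each named element as a literal character so the expression can be tried in Python's re module. The function name expand_named_characters and the four-entry dictionary are hypothetical and cover only a small subset of the table; unknown names are not validated.

```python
import re

# Small, illustrative subset of the named characters table.
NAMED_CHARACTERS = {
    "tilde": "~",
    "commercial-at": "@",
    "hyphen": "-",
    "period": ".",
}

def expand_named_characters(pattern):
    """Rewrite each [.name.] or [.c.] element as an escaped literal."""
    def replace(match):
        name = match.group(1)
        # Fall back to the enclosed text itself, as in [[.~.]].
        char = NAMED_CHARACTERS.get(name, name)
        return re.escape(char)
    return re.sub(r"\[\.(.+?)\.\]", replace, pattern)

by_symbol = expand_named_characters("[[.~.]]an")
by_name = expand_named_characters("[[.tilde.]]an")

matches_tilde = re.search(by_name, "~an") is not None
matches_at = re.search(by_name, "@an") is not None
print(by_symbol == by_name, matches_tilde, matches_at)
```

Both spellings reduce to the same bracket expression, which matches ~an and rejects @an.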

Named Character Class Table (English)

The following table describes the standard character classes that you can reference within regular expression bracket pairs ([[:class:]] - see above). Note that character classes are locale specific, so non-English versions of Guardium may use a different set of character classes.

Class   Characters Included
------  -------------------
alnum   Alphanumeric (a-z, A-Z, 0-9)
alpha   Alphabetic (a-z, A-Z)
blank   Space and tab
cntrl   Control characters
digit   Digits (0-9)
graph   Graphic characters (printable characters other than space)
lower   Lowercase alphabetic (a-z)
print   Printable characters (including space)
punct   Punctuation characters
space   Whitespace (space, tab, newline, carriage return, vertical tab, form feed)
upper   Uppercase alphabetic (A-Z)
xdigit  Hexadecimal digits (0-9, a-f, A-F)
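To make the class contents concrete, the sketch below expands each [:class:] name into explicit ASCII ranges so that an expression can be tried in Python's re module, which does not support POSIX class names. The ranges assume the standard POSIX (English, ASCII) locale and are an assumption about other locales; the names POSIX_CLASSES and expand_posix_classes are hypothetical.

```python
import re

# ASCII contents of each class in the POSIX locale (an assumption for
# any other locale). Values are written to sit inside [...] brackets.
POSIX_CLASSES = {
    "alnum": r"a-zA-Z0-9",
    "alpha": r"a-zA-Z",
    "blank": r" \t",
    "cntrl": r"\x00-\x1f\x7f",
    "digit": r"0-9",
    "graph": r"\x21-\x7e",
    "lower": r"a-z",
    "print": r"\x20-\x7e",
    "punct": r"!-/:-@\[-`{-~",
    "space": r" \t\n\r\f\v",
    "upper": r"A-Z",
    "xdigit": r"0-9A-Fa-f",
}

def expand_posix_classes(pattern):
    """Replace each [:name:] element with the equivalent ASCII ranges."""
    return re.sub(r"\[:([a-z]+):\]",
                  lambda m: POSIX_CLASSES[m.group(1)],
                  pattern)

translated = expand_posix_classes("^[[:alpha:]]+$")
letters_only = re.search(translated, "abc") is not None
with_digit = re.search(translated, "ab3") is not None
print(translated, letters_only, with_digit)
```

The expression is anchored with ^ and $ so that the whole string must be alphabetic; without the anchors, ab3 would match on its first two characters.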

Regular Expression Examples

You can copy and paste any of the expressions from the right-hand column to a field requiring a regular expression. When using any of these examples, we strongly suggest that you experiment by using it in the Build Regular Expression tool, entering a variety of matching and non-matching values, so that you understand exactly what is being matched by the expression.

Description                                           Regular Expression
----------------------------------------------------  --------------------------------------------------
Social Security Number (must have hyphens)            [0-9]{3}-[0-9]{2}-[0-9]{4}
Phone Number (North America - Matches 3334445555,     \(?[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}
  333.444.5555, 333-444-5555, 333 444 5555,
  (333) 444 5555, and all combinations thereof)
Postal Code (Canada)                                  [ABCEGHJKLMNPRSTVXY][0-9][A-Z] [0-9][A-Z][0-9]
Postal Code (UK)                                      [A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}
Zip Code (US - 5 digits required, hyphen followed     [0-9]{5}(-[0-9]{4})?
  by four digits optional)
Credit Card Numbers                                   [0-9]{4}[-, ]?[0-9]{4}[-, ]?[0-9]{4}[-, ]?[0-9]{4}
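Before relying on any of these expressions, it can help to run them against sample values. The sketch below assumes Python's re module as a stand-in for the Guardium engine; the ZIP code expression is written with an ordinary parenthesized group, and all sample values are made up for illustration.

```python
import re

examples = {
    "ssn":   r"[0-9]{3}-[0-9]{2}-[0-9]{4}",
    "phone": r"\(?[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}",
    "zip":   r"[0-9]{5}(-[0-9]{4})?",
    "card":  r"[0-9]{4}[-, ]?[0-9]{4}[-, ]?[0-9]{4}[-, ]?[0-9]{4}",
}

# The phone formats listed in the table description.
phone_samples = [
    "3334445555",
    "333.444.5555",
    "333-444-5555",
    "333 444 5555",
    "(333) 444 5555",
]
phone_ok = all(re.search(examples["phone"], s) for s in phone_samples)

# None of the expressions is anchored, so a match can be found inside
# a longer run of digits. Here the SSN pattern matches a substring.
embedded = re.search(examples["ssn"], "9123-45-67890")
embedded_text = embedded.group(0)

print(phone_ok, embedded_text)
```

If an expression must match a complete value rather than any value containing the pattern, add ^ at the start and $ at the end and test again.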