Before you can actually match a regular expression, you must compile it. This is not true compilation: it produces a special data structure, not machine instructions. But it is like ordinary compilation in that its purpose is to enable you to "execute" the pattern fast. (See Matching POSIX Regexps, for how to use the compiled regular expression for matching.)
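There is a special data type for compiled regular expressions:
Data Type: regex_t
This type of object holds a compiled regular expression. It is actually a structure. It has just one field that your programs should look at:
re_nsub
This field holds the number of parenthetical subexpressions in the regular expression that was compiled.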
There are several other fields, but we don't describe them here, because only the functions in the library should use them.
After you create a regex_t object, you can compile a regular expression into it by calling regcomp.
Function: int regcomp (regex_t *restrict compiled, const char *restrict pattern, int cflags)
The function regcomp "compiles" a regular expression into a data structure that you can use with regexec to match against a string. The compiled regular expression format is designed for efficient matching. regcomp stores it into *compiled.
It's up to you to allocate an object of type regex_t and pass its address to regcomp.
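The allocation step can be sketched as follows. This is a minimal illustration, not the only way to do it: the helper names compile_pattern and demo_compile and the sample pattern are hypothetical, and the choice of REG_EXTENDED is an assumption made for the example. regfree, described under Regexp Cleanup, releases the storage that regcomp allocated.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile PATTERN as an extended regular
   expression into *COMPILED, which the caller has allocated.
   Returns the regcomp result: 0 on success, nonzero on error.  */
int
compile_pattern (regex_t *compiled, const char *pattern)
{
  return regcomp (compiled, pattern, REG_EXTENDED);
}

/* Hypothetical usage: the regex_t object lives in the caller's
   stack frame and its address is passed down.  Returns 1 if the
   sample pattern compiled, 0 otherwise.  */
int
demo_compile (void)
{
  regex_t compiled;             /* Storage owned by the caller.  */
  int status = compile_pattern (&compiled, "^[a-z]+[0-9]*$");

  if (status != 0)
    return 0;

  regfree (&compiled);          /* Release what regcomp allocated.  */
  return 1;
}
```

The regex_t object could equally be a static variable or come from malloc; what matters is that it exists before regcomp is called.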
The argument cflags lets you specify various options that control the syntax and semantics of regular expressions. See Flags for POSIX Regexps.
If you use the flag REG_NOSUB, then regcomp omits from the compiled regular expression the information necessary to record how subexpressions actually match. In this case, you might as well pass 0 for the matchptr and nmatch arguments when you call regexec.
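As a sketch of the REG_NOSUB case, the hypothetical helper below only answers whether a string matches. The name matches and its return convention are assumptions made for this example, and so is the use of REG_EXTENDED.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if TEXT matches PATTERN, 0 if it
   does not, and -1 if PATTERN fails to compile.  */
int
matches (const char *pattern, const char *text)
{
  regex_t compiled;
  int status;

  if (regcomp (&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  /* No subexpression information was recorded, so nmatch is 0 and
     the match array is a null pointer.  */
  status = regexec (&compiled, text, 0, NULL, 0);
  regfree (&compiled);

  /* Any nonzero regexec result is treated as "no match" here.  */
  return status == 0 ? 1 : 0;
}
```

For instance, matches ("ab+c", "xabbbcx") reports a match, while matches ("ab+c", "ac") does not.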
If you don't use REG_NOSUB, then the compiled regular expression does have the capacity to record how subexpressions match. Also, regcomp tells you how many subexpressions pattern has, by storing the number in compiled->re_nsub. You can use that value to decide how long an array to allocate to hold information about subexpression matches.
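One way to size the array from re_nsub is sketched below. The helper name group_length is hypothetical. The extra slot (re_nsub + 1) relies on the regexec convention that element 0 of the match array describes the whole match, which is covered in Matching POSIX Regexps rather than in this section.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return the length of the text matched by
   subexpression GROUP of PATTERN in TEXT (GROUP 0 is the whole
   match), or -1 on compile error, allocation failure, no match,
   or a GROUP that does not exist or did not participate.  */
long
group_length (const char *pattern, const char *text, size_t group)
{
  regex_t compiled;
  regmatch_t *matches;
  size_t nmatch;
  long length = -1;

  if (regcomp (&compiled, pattern, REG_EXTENDED) != 0)
    return -1;

  /* One slot for the whole match plus one per subexpression.  */
  nmatch = compiled.re_nsub + 1;
  matches = malloc (nmatch * sizeof *matches);

  if (matches != NULL
      && group < nmatch
      && regexec (&compiled, text, nmatch, matches, 0) == 0
      && matches[group].rm_so != -1)
    length = (long) (matches[group].rm_eo - matches[group].rm_so);

  free (matches);
  regfree (&compiled);
  return length;
}
```

With the pattern "([a-z]+)=([0-9]+)", re_nsub is 2, so the sketch allocates three regmatch_t elements.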
regcomp returns 0 if it succeeds in compiling the regular expression; otherwise, it returns a nonzero error code (see the table below). You can use regerror to produce an error message string describing the reason for a nonzero value; see Regexp Cleanup.
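Checking the return value and turning it into a message might look like the following sketch. The helper name pattern_error, the 128-byte buffer, and the use of a static buffer (which makes the helper unsuitable for concurrent callers) are all assumptions made to keep the example short.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return a null pointer if PATTERN compiles as
   an extended regular expression; otherwise return a pointer to a
   static buffer holding the regerror description of the failure.
   The buffer is overwritten by the next failing call.  */
const char *
pattern_error (const char *pattern)
{
  static char message[128];
  regex_t compiled;
  int status = regcomp (&compiled, pattern, REG_EXTENDED);

  if (status == 0)
    {
      regfree (&compiled);
      return NULL;
    }

  /* regerror writes at most sizeof message bytes, including the
     terminating null character.  */
  regerror (status, &compiled, message, sizeof message);
  return message;
}
```

The exact wording of the message depends on the implementation and locale, so a caller should display it rather than compare it against fixed text.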
Here are the possible nonzero values that regcomp can return:
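REG_BADBR
There was an invalid '\{...\}' construct in the regular expression. A valid '\{...\}' construct must contain either a single number, or two numbers in increasing order separated by a comma.
REG_BADPAT
There was a syntax error in the regular expression.
REG_BADRPT
A repetition operator such as '?' or '*' appeared in a bad position (with no preceding subexpression to act on).
REG_ECOLLATE
The regular expression referred to an invalid collating element (one not defined in the current locale for string collation).
REG_ECTYPE
The regular expression referred to an invalid character class name.
REG_EESCAPE
The regular expression ended with '\'.
REG_ESUBREG
There was an invalid number in the '\digit' construct.
REG_EBRACK
There were unbalanced square brackets in the regular expression.
REG_EPAREN
An extended regular expression had unbalanced parentheses, or a basic regular expression had unbalanced '\(' and '\)'.
REG_EBRACE
The regular expression had unbalanced '\{' and '\}'.
REG_ERANGE
One of the endpoints in a range expression was invalid.
REG_ESPACE
regcomp ran out of memory.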