#include <regex.h>
Public Member Functions | |
RegexMatcher (const UnicodeString &regexp, uint32_t flags, UErrorCode &status) | |
Construct a RegexMatcher for a regular expression. | |
RegexMatcher (const UnicodeString &regexp, const UnicodeString &input, uint32_t flags, UErrorCode &status) | |
Construct a RegexMatcher for a regular expression. | |
virtual | ~RegexMatcher () |
Destructor. | |
virtual UBool | matches (UErrorCode &status) |
Attempts to match the entire input string against the pattern. | |
virtual UBool | matches (int32_t startIndex, UErrorCode &status) |
Attempts to match the input string, beginning at startIndex, against the pattern. | |
virtual UBool | lookingAt (UErrorCode &status) |
Attempts to match the input string, starting from the beginning, against the pattern. | |
virtual UBool | lookingAt (int32_t startIndex, UErrorCode &status) |
Attempts to match the input string, starting from the specified index, against the pattern. | |
virtual UBool | find () |
Find the next pattern match in the input string. | |
virtual UBool | find (int32_t start, UErrorCode &status) |
Resets this RegexMatcher and then attempts to find the next substring of the input string that matches the pattern, starting at the specified index. | |
virtual UnicodeString | group (UErrorCode &status) const |
Returns a string containing the text matched by the previous match. | |
virtual UnicodeString | group (int32_t groupNum, UErrorCode &status) const |
Returns a string containing the text captured by the given group during the previous match operation. | |
virtual int32_t | groupCount () const |
Returns the number of capturing groups in this matcher's pattern. | |
virtual int32_t | start (UErrorCode &status) const |
Returns the index in the input string of the start of the text matched during the previous match operation. | |
virtual int32_t | start (int32_t group, UErrorCode &status) const |
Returns the index in the input string of the start of the text matched by the specified capture group during the previous match operation. | |
virtual int32_t | end (UErrorCode &status) const |
Returns the index in the input string of the first character following the text matched during the previous match operation. | |
virtual int32_t | end (int32_t group, UErrorCode &status) const |
Returns the index in the input string of the character following the text matched by the specified capture group during the previous match operation. | |
virtual RegexMatcher & | reset () |
Resets this matcher. | |
virtual RegexMatcher & | reset (int32_t index, UErrorCode &status) |
Resets this matcher, and sets the current input position. | |
virtual RegexMatcher & | reset (const UnicodeString &input) |
Resets this matcher with a new input string. | |
virtual const UnicodeString & | input () const |
Returns the input string being matched. | |
virtual const RegexPattern & | pattern () const |
Returns the pattern that is interpreted by this matcher. | |
virtual UnicodeString | replaceAll (const UnicodeString &replacement, UErrorCode &status) |
Replaces every substring of the input that matches the pattern with the given replacement string. | |
virtual UnicodeString | replaceFirst (const UnicodeString &replacement, UErrorCode &status) |
Replaces the first substring of the input that matches the pattern with the replacement string. | |
virtual RegexMatcher & | appendReplacement (UnicodeString &dest, const UnicodeString &replacement, UErrorCode &status) |
Implements a replace operation intended to be used as part of an incremental find-and-replace. | |
virtual UnicodeString & | appendTail (UnicodeString &dest) |
As the final step in a find-and-replace operation, append the remainder of the input string, starting at the position following the last appendReplacement(), to the destination string. | |
virtual int32_t | split (const UnicodeString &input, UnicodeString dest[], int32_t destCapacity, UErrorCode &status) |
Split a string into fields. | |
void | setTrace (UBool state) |
Debug function: enables or disables tracing of the matching engine. | |
virtual UClassID | getDynamicClassID () const |
ICU "poor man's RTTI", returns a UClassID for the actual class. | |
Static Public Member Functions | |
static UClassID | getStaticClassID () |
ICU "poor man's RTTI", returns a UClassID for this class. | |
Friends | |
class | RegexPattern |
class | RegexCImpl |
RegexMatcher includes methods for testing for matches, and for find-and-replace operations.
Class RegexMatcher is not intended to be subclassed.
Definition at line 451 of file regex.h.
RegexMatcher::RegexMatcher | ( | const UnicodeString & | regexp, | |
uint32_t | flags, | |||
UErrorCode & | status | |||
) |
Construct a RegexMatcher for a regular expression.
This is a convenience method that avoids the need to explicitly create a RegexPattern object. Note that if several RegexMatchers need to be created for the same expression, it will be more efficient to separately create and cache a RegexPattern object, and use its matcher() method to create the RegexMatcher objects.
regexp | The Regular Expression to be compiled. | |
flags | Regular expression options, such as case insensitive matching. |
status | Any errors are reported by setting this UErrorCode variable. |
RegexMatcher::RegexMatcher | ( | const UnicodeString & | regexp, | |
const UnicodeString & | input, | |||
uint32_t | flags, | |||
UErrorCode & | status | |||
) |
Construct a RegexMatcher for a regular expression.
This is a convenience method that avoids the need to explicitly create a RegexPattern object. Note that if several RegexMatchers need to be created for the same expression, it will be more efficient to separately create and cache a RegexPattern object, and use its matcher() method to create the RegexMatcher objects.
The matcher will retain a reference to the supplied input string, and all regexp pattern matching operations happen directly on the original string. It is critical that the string not be altered or deleted before use by the regular expression operations is complete.
regexp | The Regular Expression to be compiled. | |
input | The string to match. The matcher retains a reference to the caller's string; no copy is made. | |
flags | Regular expression options, such as case insensitive matching. |
status | Any errors are reported by setting this UErrorCode variable. |
virtual RegexMatcher::~RegexMatcher | ( | ) | [virtual] |
virtual UBool RegexMatcher::matches | ( | UErrorCode & | status | ) | [virtual] |
Attempts to match the entire input string against the pattern.
status | A reference to a UErrorCode to receive any errors. |
virtual UBool RegexMatcher::matches | ( | int32_t | startIndex, | |
UErrorCode & | status | |||
) | [virtual] |
Attempts to match the input string, beginning at startIndex, against the pattern.
The match must extend to the end of the input string.
startIndex | The input string index at which to begin matching. | |
status | A reference to a UErrorCode to receive any errors. |
virtual UBool RegexMatcher::lookingAt | ( | UErrorCode & | status | ) | [virtual] |
Attempts to match the input string, starting from the beginning, against the pattern.
Like the matches() method, this function always starts at the beginning of the input string; unlike that function, it does not require that the entire input string be matched.
If the match succeeds then more information can be obtained via the start(), end(), and group() functions.
status | A reference to a UErrorCode to receive any errors. |
virtual UBool RegexMatcher::lookingAt | ( | int32_t | startIndex, | |
UErrorCode & | status | |||
) | [virtual] |
Attempts to match the input string, starting from the specified index, against the pattern.
The match may be of any length, and is not required to extend to the end of the input string. Contrast with matches().
If the match succeeds then more information can be obtained via the start(), end(), and group() functions.
startIndex | The input string index at which to begin matching. | |
status | A reference to a UErrorCode to receive any errors. |
virtual UBool RegexMatcher::find | ( | ) | [virtual] |
Find the next pattern match in the input string.
The find begins searching the input at the location following the end of the previous match, or at the start of the string if there is no previous match. If a match is found, start(), end() and group() will provide more information regarding the match.
Note that if the input string is changed by the application, use find(startPos, status) instead of find(), because the saved starting position may not be valid with the altered input string.
virtual UBool RegexMatcher::find | ( | int32_t | start, | |
UErrorCode & | status | |||
) | [virtual] |
Resets this RegexMatcher and then attempts to find the next substring of the input string that matches the pattern, starting at the specified index.
start | the position in the input string to begin the search | |
status | A reference to a UErrorCode to receive any errors. |
virtual UnicodeString RegexMatcher::group | ( | UErrorCode & | status | ) | const [virtual] |
Returns a string containing the text matched by the previous match.
If the pattern can match an empty string, an empty string may be returned.
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed. |
virtual UnicodeString RegexMatcher::group | ( | int32_t | groupNum, | |
UErrorCode & | status | |||
) | const [virtual] |
Returns a string containing the text captured by the given group during the previous match operation.
Group 0 is the entire match.
groupNum | the capture group number | |
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number. |
virtual int32_t RegexMatcher::groupCount | ( | ) | const [virtual] |
Returns the number of capturing groups in this matcher's pattern.
virtual int32_t RegexMatcher::start | ( | UErrorCode & | status | ) | const [virtual] |
Returns the index in the input string of the start of the text matched during the previous match operation.
status | a reference to a UErrorCode to receive any errors. |
virtual int32_t RegexMatcher::start | ( | int32_t | group, | |
UErrorCode & | status | |||
) | const [virtual] |
Returns the index in the input string of the start of the text matched by the specified capture group during the previous match operation.
Returns -1 if the capture group exists in the pattern but was not part of the last match.
group | the capture group number | |
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed, and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number |
virtual int32_t RegexMatcher::end | ( | UErrorCode & | status | ) | const [virtual] |
Returns the index in the input string of the first character following the text matched during the previous match operation.
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed. |
virtual int32_t RegexMatcher::end | ( | int32_t | group, | |
UErrorCode & | status | |||
) | const [virtual] |
Returns the index in the input string of the character following the text matched by the specified capture group during the previous match operation.
group | the capture group number | |
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number |
virtual RegexMatcher& RegexMatcher::reset | ( | ) | [virtual] |
Resets this matcher.
The effect is to remove any memory of previous matches, and to cause subsequent find() operations to begin at the beginning of the input string.
virtual RegexMatcher& RegexMatcher::reset | ( | int32_t | index, | |
UErrorCode & | status | |||
) | [virtual] |
Resets this matcher, and sets the current input position.
The effect is to remove any memory of previous matches, and to cause subsequent find() operations to begin at the specified position in the input string.
virtual RegexMatcher& RegexMatcher::reset | ( | const UnicodeString & | input | ) | [virtual] |
Resets this matcher with a new input string.
This allows instances of RegexMatcher to be reused, which is more efficient than creating a new RegexMatcher for each input string to be processed.
input | The new string on which subsequent pattern matches will operate. The matcher retains a reference to the caller's string, and operates directly on that. Ownership of the string remains with the caller. Because no copy of the string is made, it is essential that the caller not delete the string until after regexp operations on it are done. |
virtual const UnicodeString& RegexMatcher::input | ( | ) | const [virtual] |
Returns the input string being matched.
The returned string is not a copy, but the live input string. It should not be altered or deleted.
virtual const RegexPattern& RegexMatcher::pattern | ( | ) | const [virtual] |
Returns the pattern that is interpreted by this matcher.
virtual UnicodeString RegexMatcher::replaceAll | ( | const UnicodeString & | replacement, | |
UErrorCode & | status | |||
) | [virtual] |
Replaces every substring of the input that matches the pattern with the given replacement string.
This is a convenience function that provides a complete find-and-replace-all operation.
This method first resets this matcher. It then scans the input string looking for matches of the pattern. Input that is not part of any match is left unchanged; each match is replaced in the result by the replacement string. The replacement string may contain references to capture groups.
replacement | a string containing the replacement text. | |
status | a reference to a UErrorCode to receive any errors. |
virtual UnicodeString RegexMatcher::replaceFirst | ( | const UnicodeString & | replacement, | |
UErrorCode & | status | |||
) | [virtual] |
Replaces the first substring of the input that matches the pattern with the replacement string.
This is a convenience function that provides a complete find-and-replace operation.
This function first resets this RegexMatcher. It then scans the input string looking for a match of the pattern. Input that is not part of the match is appended directly to the result string; the match is replaced in the result by the replacement string. The replacement string may contain references to captured groups.
The state of the matcher (the position at which a subsequent find() would begin) after completing a replaceFirst() is not specified. The RegexMatcher should be reset before doing additional find() operations.
replacement | a string containing the replacement text. | |
status | a reference to a UErrorCode to receive any errors. |
virtual RegexMatcher& RegexMatcher::appendReplacement | ( | UnicodeString & | dest, | |
const UnicodeString & | replacement, | |||
UErrorCode & | status | |||
) | [virtual] |
Implements a replace operation intended to be used as part of an incremental find-and-replace.
The input string, starting from the end of the previous replacement and ending at the start of the current match, is appended to the destination string. Then the replacement string is appended to the output string, including handling any substitutions of captured text.
For simple, prepackaged, non-incremental find-and-replace operations, see replaceFirst() or replaceAll().
dest | A UnicodeString to which the results of the find-and-replace are appended. | |
replacement | A UnicodeString that provides the text to be substituted for the input text that matched the regexp pattern. The replacement text may contain references to captured text from the input. | |
status | A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed, and U_INDEX_OUTOFBOUNDS_ERROR if the replacement text specifies a capture group that does not exist in the pattern. |
virtual UnicodeString& RegexMatcher::appendTail | ( | UnicodeString & | dest | ) | [virtual] |
As the final step in a find-and-replace operation, append the remainder of the input string, starting at the position following the last appendReplacement(), to the destination string.
appendTail() is intended to be invoked after one or more invocations of RegexMatcher::appendReplacement().
dest | A UnicodeString to which the results of the find-and-replace are appended. |
virtual int32_t RegexMatcher::split | ( | const UnicodeString & | input, | |
UnicodeString | dest[], | |||
int32_t | destCapacity, | |||
UErrorCode & | status | |||
) | [virtual] |
Split a string into fields.
Somewhat like split() from Perl. The pattern matches identify delimiters that separate the input into fields. The input data between the matches becomes the fields themselves.
input | The string to be split into fields. The field delimiters match the pattern (in the "this" object). This matcher will be reset to this input string. | |
dest | An array of UnicodeStrings to receive the results of the split. This is an array of actual UnicodeString objects, not an array of pointers to strings. Local (stack based) arrays can work well here. | |
destCapacity | The number of elements in the destination array. If the number of fields found is less than destCapacity, the extra strings in the destination array are not altered. If the number of destination strings is less than the number of fields, the trailing part of the input string, including any field delimiters, is placed in the last destination string. | |
status | A reference to a UErrorCode to receive any errors. |
void RegexMatcher::setTrace | ( | UBool | state | ) |
Debug function: enables or disables tracing of the matching engine.
For internal ICU development use only. Do not use.
static UClassID RegexMatcher::getStaticClassID | ( | ) | [static] |
virtual UClassID RegexMatcher::getDynamicClassID | ( | ) | const [virtual] |