Regular Expressions Succinctly®
by Joseph D. Booth

CHAPTER 9

Regex Objects

In this chapter, we review the objects that the regex match calls return. These objects expose many details about a match, such as where it was located and which groups it contains.

Match object

The match object (which we briefly discussed in Chapter 2) is returned directly by the Regex.Match() method. The Regex.Matches() method instead returns a collection in which each item is a match object. You can use the object’s properties to explore the information collected during the match process.
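
The difference between the two calls can be sketched as follows. This is a minimal illustration, not code from the book: the class name, the method names, and the digit pattern are all assumptions chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class MatchCallsSketch
{
    // Assumed sample pattern: one or more digits.
    private static readonly Regex Digits = new Regex(@"\d+");

    // Regex.Match() returns a single Match object for the first hit.
    public static string FirstNumber(string input)
    {
        Match first = Digits.Match(input);
        return first.Success ? first.Value : string.Empty;
    }

    // Regex.Matches() returns a MatchCollection; each item is a Match object.
    public static List<string> AllNumbers(string input)
    {
        var values = new List<string>();
        foreach (Match m in Digits.Matches(input))
        {
            values.Add(m.Value);
        }
        return values;
    }
}
```

Calling MatchCallsSketch.FirstNumber("Order 42 shipped 7 items") yields only the first number, while AllNumbers returns one entry per match in the string.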

Properties

The following properties return information about the result of the entire expression:

  • Index: The index property returns the zero-based position in the string where the match was found.
  • Length: The length property returns the length of the result captured by the expression.
  • Success: This is a Boolean value indicating whether or not a match was found.
  • Value: The value property is the actual result string that the expression found.
  • Groups collection: Returns the collection of group objects the regular expression engine found. The first item (index 0) always represents the entire match, so the collection is never empty. If you don’t specify any groups, it contains only that item; otherwise, it also contains one group object for each group in the expression.
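
To see these properties side by side, here is a small sketch. The phone-number pattern, the sample input, and the class and method names are assumptions made for the example rather than anything taken from the book.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class MatchPropertiesSketch
{
    // Assumed sample pattern: three digits, a hyphen, four digits,
    // with each run of digits in its own group.
    public static Match FindPhone(string input)
    {
        return Regex.Match(input, @"(\d{3})-(\d{4})");
    }

    // Collects the properties discussed above into one line of text.
    public static string Describe(Match m)
    {
        return $"Success={m.Success}, Index={m.Index}, Length={m.Length}, " +
               $"Value={m.Value}, Groups={m.Groups.Count}";
    }
}
```

For the input "Call 555-1234 today", the match starts at index 5, has a length of 8, and reports three groups: item 0 for the entire match plus one for each pair of parentheses.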

Captures Collection

When a group is repeated by a quantifier, the regular expression engine captures a piece of text each time the group matches. Most regex engines keep only the most recent capture and discard the earlier ones as they continue searching, but the Microsoft engine saves every one of them in a collection called Captures. To illustrate, let’s consider the following regex to get a list of words in a sentence:  \b(\w+\s*)+

Table 12: Example Regex

English rule                        Regex pattern
Begin at a word boundary            \b
Start a group                       (
Within group, find word             \w+
Followed by any number of spaces    \s*
Close the group                     )
Group can occur 1 or more times     +

Each word in the sentence is matched by one pass through the group; however, only the last word in the sentence ends up as the group’s value. Although the earlier words are overwritten rather than kept in the group, you can still find them in the captures collection when using Microsoft’s regex object.

If I wish my friend Sherry happy birthday using the above pattern, I’ll get the following tree structure:

Figure 5: Tree Structure

The first group (<1>) holds the last word in the sentence, Sherry, and its captures (the discarded matches) are each word in the sentence, including the final word that was assigned to the group.
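
The same structure can be inspected in code. The sketch below assumes the sentence is "Happy Birthday Sherry"; that exact wording, along with the class and method names, is an assumption made for illustration. Because the \s* sits inside the group, each capture may carry trailing spaces, so the sketch trims them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class CapturesSketch
{
    // The pattern from Table 12.
    public const string WordPattern = @"\b(\w+\s*)+";

    // The group itself keeps only the text from its last pass.
    public static string LastGroupValue(string sentence)
    {
        return Regex.Match(sentence, WordPattern).Groups[1].Value;
    }

    // The group's Captures collection keeps the text from every pass.
    public static List<string> AllCaptures(string sentence)
    {
        var words = new List<string>();
        Match m = Regex.Match(sentence, WordPattern);
        foreach (Capture c in m.Groups[1].Captures)
        {
            // Trim the trailing whitespace matched by \s* inside the group.
            words.Add(c.Value.TrimEnd());
        }
        return words;
    }
}
```

With the assumed sentence, LastGroupValue returns only Sherry, while AllCaptures returns all three words in order.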

We will explore groups and captures in much greater detail in the next chapter.

Group Object

The group object contains similar properties to the match object, providing details about the sub-expression search results.

Properties

The following properties return information about the result of the sub-expression.

  • Index: Returns the zero-based position in the string where the group’s text was found.
  • Length: Returns the length of the result captured by the sub-expression.
  • Success: This is a Boolean value indicating whether or not the sub-expression found a match.
  • Value: The actual result string that the sub-expression found.
  • Captures collection: Each group also contains a collection of every result the group’s sub-expression captured during the search.
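
A short sketch of reading these properties from individual groups follows. The key=value pattern, the sample inputs, and the class and method names are assumptions made for the example. The second group is optional, which makes it possible to see Success come back false for a group even though the overall match succeeded.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class GroupPropertiesSketch
{
    // Assumed sample pattern: a word, an equals sign, and an optional number.
    public const string PairPattern = @"(\w+)=(\d+)?";

    // Returns the requested group from the first match in the input.
    public static Group GetGroup(string input, int groupNumber)
    {
        Match m = Regex.Match(input, PairPattern);
        return m.Groups[groupNumber];
    }

    // Collects the group properties discussed above into one line of text.
    public static string Describe(Group g)
    {
        return $"Success={g.Success}, Index={g.Index}, Length={g.Length}, " +
               $"Value={g.Value}, Captures={g.Captures.Count}";
    }
}
```

For the input "timeout=30", group 1 starts at index 0 with length 7 and group 2 starts at index 8 with length 2. For the input "timeout=", the match still succeeds, but group 2 reports Success as false and an empty Value.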

Capture Object

The capture object contains basic information about each piece of text captured during the search. It can be obtained from the match object’s captures collection or the group object’s captures collection. It gives you access to every piece of text captured for either the entire expression or the individual group sub-expressions.

Properties

The following properties return information about the search texts that were evaluated during the expression search process. Note that in a group scenario, the last evaluated search text is returned in the group object and also present in the capture collection.

  • Index: Returns the zero-based position in the string where the captured text was found.
  • Length: Returns the length of the captured text.
  • Value: The actual string that was captured.
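
As a closing sketch, the code below collects capture objects from both sources mentioned above: the group’s captures collection and the match’s captures collection. The comma-separated number pattern, the sample input, and the class and method names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for illustration.
public static class CaptureObjectSketch
{
    // Assumed sample pattern: a repeated non-capturing wrapper (?: ... )
    // around a captured number and an optional trailing comma.
    public const string NumberListPattern = @"(?:(\d+),?)+";

    // Capture objects taken from the group's captures collection:
    // one per pass through the group.
    public static List<Capture> GroupCaptures(string input)
    {
        var result = new List<Capture>();
        Match m = Regex.Match(input, NumberListPattern);
        foreach (Capture c in m.Groups[1].Captures)
        {
            result.Add(c);
        }
        return result;
    }

    // Capture objects taken from the match's captures collection:
    // for a successful match, a single capture covering the whole match.
    public static List<Capture> MatchCaptures(string input)
    {
        var result = new List<Capture>();
        Match m = Regex.Match(input, NumberListPattern);
        foreach (Capture c in m.Captures)
        {
            result.Add(c);
        }
        return result;
    }
}
```

For the input "10,20,30", GroupCaptures returns three capture objects at indexes 0, 3, and 6, each with a length of 2, while MatchCaptures returns one capture object whose value is the entire string.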