C# Professional - Processing Text
Regular Expressions - Groups
When building a regular expression pattern, you can define groups that match one value out of several possible values. This is done by enclosing the alternatives in parentheses:
This example will match any text containing
Groups are very handy when a regular expression needs to accept multiple options for a specific word.
As with individual characters, quantifiers can be applied to a group to specify how many times the group must occur.
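As a minimal sketch of alternation and quantified groups, the snippet below uses two hypothetical patterns (gr(a|e)y and ^(ab){2,3}$) and hypothetical helper names chosen only for illustration. It relies on the standard Regex.IsMatch method.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupExamples
{
    // Hypothetical pattern: the group (a|e) accepts either spelling of the word.
    public static bool IsGrey(string text) => Regex.IsMatch(text, "gr(a|e)y");

    // Hypothetical pattern: the group (ab) must occur two or three times in a row.
    public static bool HasRepeatedAb(string text) => Regex.IsMatch(text, "^(ab){2,3}$");

    public static void Main()
    {
        Console.WriteLine(IsGrey("a grey cat"));   // True
        Console.WriteLine(IsGrey("a gray cat"));   // True
        Console.WriteLine(IsGrey("a groy cat"));   // False
        Console.WriteLine(HasRepeatedAb("abab"));  // True
        Console.WriteLine(HasRepeatedAb("ab"));    // False
    }
}
```

The quantifier {2,3} applies to the whole group, so it counts repetitions of "ab" rather than of a single character.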
Capture & Backreference
When using groups in the pattern, by default, the regular expression will capture the value corresponding to that group. This is often used when using regular expressions to extract a specific substring from a larger text.
In .NET, the captured value can be retrieved using the Groups property of a Match returned by a regular expression.
Note: the first element of the Groups collection (index 0) is the whole match; captured groups start at index 1.
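The following sketch shows one way to read a captured value through Match.Groups. The helper name TryExtractUserId and the input string are assumptions made for illustration; the pattern reuses the user_id: (\d+) fragment that appears later in this section.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureExample
{
    // Hypothetical helper: returns true and the captured digits when the text matches.
    public static bool TryExtractUserId(string text, out string userId)
    {
        Match match = Regex.Match(text, @"user_id: (\d+)");
        // Groups[0] is the whole match (e.g. "user_id: 42"); Groups[1] is the first capture.
        userId = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        if (TryExtractUserId("login ok, user_id: 42", out string id))
        {
            Console.WriteLine(id); // 42
        }
    }
}
```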
Values captured by a group can also be used as backreferences within the pattern, which lets you require that a later part of the text repeats exactly what the group captured.
A backreference is written with the \N syntax, where N is the number of the referenced group in the pattern.
user_id: (\d+) - validating email for user \1
This matches only when the user id at the end of the text is the same as the one captured by the first user_id group.
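A small sketch of the pattern above in use. The class and method names are hypothetical, and the sample inputs are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceExample
{
    // Pattern taken from the text above: \1 must repeat what group 1 captured.
    private static readonly Regex Pattern =
        new Regex(@"user_id: (\d+) - validating email for user \1");

    public static bool SameUser(string text) => Pattern.IsMatch(text);

    public static void Main()
    {
        Console.WriteLine(SameUser("user_id: 42 - validating email for user 42")); // True
        Console.WriteLine(SameUser("user_id: 42 - validating email for user 7"));  // False
    }
}
```

Because the pattern is not anchored at the end, a text ending in "user 421" would also match, since \1 only needs to find "42" at that position. Appending \b after \1 is one way to rule that out.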
Groups can be given a name with the (?<name>...) syntax in the pattern.
Here, the capturing group is named username. This name can be used for backreferences with the \k<username> syntax, and to retrieve the group from a Match object in .NET.
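To illustrate both uses of a group name, here is a minimal sketch. The pattern, the helper TryGetUsername, and the sample text are assumptions made for this example, not the original one; only the (?<username>...) and \k<username> constructs and the string indexer on Match.Groups come from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupExample
{
    // Hypothetical pattern: the group named "username" captures word characters,
    // and \k<username> requires the same value to appear again later.
    private static readonly Regex Pattern =
        new Regex(@"user: (?<username>\w+) - welcome back \k<username>");

    public static bool TryGetUsername(string text, out string username)
    {
        Match match = Pattern.Match(text);
        // A named group is read with the string indexer instead of a number.
        username = match.Success ? match.Groups["username"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        if (TryGetUsername("user: alice - welcome back alice", out string name))
        {
            Console.WriteLine(name); // alice
        }
    }
}
```

Names tend to keep patterns readable when groups are added or reordered, since numbered references such as \1 shift with the group positions while named ones do not.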