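How to read regex captures in C#
Keywords: regular expression, read | Updated: 2023-09-27 18:34:02
I want to ask the user for their phone number in the console, check it against a regex, and then capture the digits so that I can format them the way I want. I have everything working except the regex capture part. How do I get the captured values into C# variables?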
static void askPhoneNumber()
{
    String pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";
    System.Console.WriteLine("What is your phone number?");
    String phoneNumber = Console.ReadLine();
    while (!Regex.IsMatch(phoneNumber, pattern))
    {
        Console.WriteLine("Bad Input");
        phoneNumber = Console.ReadLine();
    }
    Match match = Regex.Match(phoneNumber, pattern);
    Capture capture = match.Groups.Captures;
    System.Console.WriteLine(capture[1].Value + "-" + capture[2].Value + "-" + capture[3].Value);
}
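The C# regex API can be quite confusing. There are groups and captures:
- A group represents a capturing group; it is used to extract a substring from the text.
- A group can have several captures if it appears inside a quantifier.
The hierarchy is:
- Match
  - Group
    - Capture
  - Group
(A match can have several groups, and each group can have several captures.)
For example: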
Subject: aabcabbc
Pattern: ^(?:(a+b+)c)+$
In this example there is only one group: (a+b+). This group is inside a quantifier and is matched twice. It produces two captures, aab and abb:
aabcabbc
^^^ ^^^
Cap1 Cap2
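The quantified-group example above can be checked with a short program. This is only a minimal sketch: the class name CaptureDemo and the helper CapturesOfGroup1 are made up for illustration, and only Regex.Match, Match.Groups, Group.Captures and Capture.Value come from the .NET API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every capture recorded for group 1,
    // in the order the regex engine made them.
    public static List<string> CapturesOfGroup1(string subject, string pattern)
    {
        var result = new List<string>();
        Match match = Regex.Match(subject, pattern);
        if (!match.Success)
        {
            return result; // no match, so there are no captures to report
        }
        foreach (Capture capture in match.Groups[1].Captures)
        {
            result.Add(capture.Value);
        }
        return result;
    }

    public static void Main()
    {
        const string pattern = @"^(?:(a+b+)c)+$";

        List<string> captures = CapturesOfGroup1("aabcabbc", pattern);
        Console.WriteLine(string.Join(", ", captures)); // aab, abb

        // Group.Value exposes only the last capture of the group.
        Match match = Regex.Match("aabcabbc", pattern);
        Console.WriteLine(match.Groups[1].Value); // abb
    }
}
```

Note that Groups[1].Value returns only the last capture (abb), so the Captures collection is needed only when every repetition matters.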
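When a group is not inside a quantifier, it produces only one capture. In your case you have 3 groups, and each of them captures once. You can use match.Groups[1].Value, match.Groups[2].Value and match.Groups[3].Value to extract the 3 substrings you are interested in, without resorting to the notion of captures.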
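Match results can be hard to understand. I wrote this code to help me see what was found and where. The intention is that pieces of the output (from the lines marked with //**) can be copied into a program to make use of the values found in the match.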
public static void DisplayMatchResults(Match match)
{
    // Overview: every group with its captures.
    Console.WriteLine("Match has {0} captures", match.Captures.Count);
    int groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        Console.WriteLine(" Group {0,2} has {1,2} captures '{2}'", groupNo, mm.Captures.Count, mm.Value);
        int captureNo = 0;
        foreach (Capture cc in mm.Captures)
        {
            Console.WriteLine(" Capture {0,2} '{1}'", captureNo, cc);
            captureNo++;
        }
        groupNo++;
    }
    // Copy-and-paste expressions for each group value.
    groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        Console.WriteLine(" match.Groups[{0}].Value == \"{1}\"", groupNo, match.Groups[groupNo].Value); //**
        groupNo++;
    }
    // Copy-and-paste expressions for each capture value.
    groupNo = 0;
    foreach (Group mm in match.Groups)
    {
        int captureNo = 0;
        foreach (Capture cc in mm.Captures)
        {
            Console.WriteLine(" match.Groups[{0}].Captures[{1}].Value == \"{2}\"", groupNo, captureNo, match.Groups[groupNo].Captures[captureNo].Value); //**
            captureNo++;
        }
        groupNo++;
    }
}
A simple example of using this method, given this input:
Regex regex = new Regex("/([A-Za-z]+)/(\\d+)/");
String text = "some/directory/Pictures/Houses/12/apple/banana/"
            + "cherry/345/damson/elderberry/fig/678/gooseberry";
Match match = regex.Match(text);
DisplayMatchResults(match);
The output is:
Match has 1 captures
Group 0 has 1 captures '/Houses/12/'
Capture 0 '/Houses/12/'
Group 1 has 1 captures 'Houses'
Capture 0 'Houses'
Group 2 has 1 captures '12'
Capture 0 '12'
match.Groups[0].Value == "/Houses/12/"
match.Groups[1].Value == "Houses"
match.Groups[2].Value == "12"
match.Groups[0].Captures[0].Value == "/Houses/12/"
match.Groups[1].Captures[0].Value == "Houses"
match.Groups[2].Captures[0].Value == "12"
Suppose we want to find all matches of the above regex in the above text. Then we can use a MatchCollection in code such as:
MatchCollection matches = regex.Matches(text);
for (int ii = 0; ii < matches.Count; ii++)
{
    Console.WriteLine("Match[{0}] // of 0..{1}:", ii, matches.Count-1);
    RegexMatchDisplay.DisplayMatchResults(matches[ii]);
}
The resulting output is:
Match[0] // of 0..2:
Match has 1 captures
Group 0 has 1 captures '/Houses/12/'
Capture 0 '/Houses/12/'
Group 1 has 1 captures 'Houses'
Capture 0 'Houses'
Group 2 has 1 captures '12'
Capture 0 '12'
match.Groups[0].Value == "/Houses/12/"
match.Groups[1].Value == "Houses"
match.Groups[2].Value == "12"
match.Groups[0].Captures[0].Value == "/Houses/12/"
match.Groups[1].Captures[0].Value == "Houses"
match.Groups[2].Captures[0].Value == "12"
Match[1] // of 0..2:
Match has 1 captures
Group 0 has 1 captures '/cherry/345/'
Capture 0 '/cherry/345/'
Group 1 has 1 captures 'cherry'
Capture 0 'cherry'
Group 2 has 1 captures '345'
Capture 0 '345'
match.Groups[0].Value == "/cherry/345/"
match.Groups[1].Value == "cherry"
match.Groups[2].Value == "345"
match.Groups[0].Captures[0].Value == "/cherry/345/"
match.Groups[1].Captures[0].Value == "cherry"
match.Groups[2].Captures[0].Value == "345"
Match[2] // of 0..2:
Match has 1 captures
Group 0 has 1 captures '/fig/678/'
Capture 0 '/fig/678/'
Group 1 has 1 captures 'fig'
Capture 0 'fig'
Group 2 has 1 captures '678'
Capture 0 '678'
match.Groups[0].Value == "/fig/678/"
match.Groups[1].Value == "fig"
match.Groups[2].Value == "678"
match.Groups[0].Captures[0].Value == "/fig/678/"
match.Groups[1].Captures[0].Value == "fig"
match.Groups[2].Captures[0].Value == "678"
Hence:
matches[1].Groups[0].Value == "/cherry/345/"
matches[1].Groups[1].Value == "cherry"
matches[1].Groups[2].Value == "345"
matches[1].Groups[0].Captures[0].Value == "/cherry/345/"
matches[1].Groups[1].Captures[0].Value == "cherry"
matches[1].Groups[2].Captures[0].Value == "345"
The same applies to matches[0] and matches[2].
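In everyday code the group values are usually read straight from the MatchCollection rather than printed. The sketch below collects each (name, number) pair from the sample text. It assumes a pattern that ends in a slash, which is what the trailing slashes in the three matches listed above imply; the class name PathPairs and the method Extract are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PathPairs
{
    // Hypothetical helper: one (name, number) pair per match, taken from groups 1 and 2.
    // The trailing "/" in the pattern is an assumption based on the output shown above.
    public static List<(string Name, string Number)> Extract(string text)
    {
        var pairs = new List<(string Name, string Number)>();
        foreach (Match m in Regex.Matches(text, @"/([A-Za-z]+)/(\d+)/"))
        {
            pairs.Add((m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string text = "some/directory/Pictures/Houses/12/apple/banana/"
                    + "cherry/345/damson/elderberry/fig/678/gooseberry";
        foreach (var pair in Extract(text))
        {
            Console.WriteLine("{0} -> {1}", pair.Name, pair.Number);
        }
    }
}
```

Each Match in the collection carries its own Groups, so matches[1].Groups[1].Value and the second tuple's Name refer to the same substring.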
string pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";
System.Console.WriteLine("What is your phone number?");
string phoneNumber = Console.ReadLine();
while (!Regex.IsMatch(phoneNumber, pattern))
{
    Console.WriteLine("Bad Input");
    phoneNumber = Console.ReadLine();
}
var match = Regex.Match(phoneNumber, pattern);
if (match.Groups.Count == 4)
{
    System.Console.WriteLine("Number matched : "+match.Groups[0].Value);
    System.Console.WriteLine(match.Groups[1].Value + "-" + match.Groups[2].Value + "-" + match.Groups[3].Value);
}
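One possible variation on the snippet above runs the regex once per input line and checks Match.Success, instead of calling IsMatch and then Match again. It is a sketch rather than a definitive implementation: the PhoneFormatter class and the TryFormat helper are invented names, and the pattern is the one from the question, unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneFormatter
{
    // Pattern taken from the question; built once and reused for every input line.
    private static readonly Regex PhonePattern =
        new Regex(@"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})");

    // Hypothetical helper: on a match, returns true and the digits as "123-456-7890".
    public static bool TryFormat(string input, out string formatted)
    {
        Match match = PhonePattern.Match(input ?? string.Empty);
        if (!match.Success)
        {
            formatted = null;
            return false;
        }
        formatted = match.Groups[1].Value + "-"
                  + match.Groups[2].Value + "-"
                  + match.Groups[3].Value;
        return true;
    }

    public static void Main()
    {
        Console.WriteLine("What is your phone number?");
        string line;
        // ReadLine returns null at end of input, which ends the loop.
        while ((line = Console.ReadLine()) != null)
        {
            if (TryFormat(line, out string formatted))
            {
                Console.WriteLine(formatted);
                return;
            }
            Console.WriteLine("Bad Input");
        }
    }
}
```

Checking match.Success is the more direct test here: Groups.Count is determined by the pattern, so it says nothing about whether the input matched.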