Regular expression for extracting a number

Keywords: regular expression, number, extraction | Updated: 2023-09-27 18:15:17

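I have a list of input strings such as "PlusContent", "PlusArchieve", "PlusContent1", "PlusArchieve1"

and so on, which means two entries always form one topic {content, archive}. Now I want to get the number at the end of each string (let's call it the postfix). For that I used the following regular expression: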

Regex r = new Regex(@"Plus.+?(\d*)$", RegexOptions.IgnoreCase);

Now I loop over the matches (there should be at most 2: the first being the whole string, the second the actual match, if there is one).

foreach (Match m in r.Matches(tester)) Console.WriteLine(m);

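Here tester is the current string from my list. But as a result I always get the whole string, not its postfix. I tested the same regex on RegexHero and it worked there...

Do you have any idea what is going wrong?

Note: this is the whole code: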

List<string> words = new List<string> { "PlusContent", "PlusArchieve", "PlusContent1", "PlusArchieve1" };
foreach(string tester in words) {
    Regex r1 = new Regex(@"Plus.+?(\d*)$", RegexOptions.IgnoreCase);
    foreach (Match m in r1.Matches(tester)) Console.WriteLine(m);
}

But for some strange reason I only get the original strings from the list.

Try a capturing group and read the matched group at index 1:

\bPlus\D*(\d*)\b

Pattern explanation:

  \b                       the word boundary
  Plus                     'Plus'
  \D*                      non-digits (all but 0-9) (0 or more times, as many as possible)
  (                        group and capture to \1:
    \d*                      digits (0-9) (0 or more times, as many as possible)
  )                        end of \1
  \b                       the word boundary

user3218114's regex is correct. Because the digits are captured with a group, you need to access them through the Groups property, like this:

List<string> words = new List<string> { "PlusContent", "PlusArchieve", "PlusContent1", "PlusArchieve1" };
foreach (string tester in words)
{
    Regex r1 = new Regex(@"\bPlus\D*(\d*)\b", RegexOptions.IgnoreCase);
    foreach (Match m in r1.Matches(tester)) Console.WriteLine(m.Groups[1]);
}

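In this case m.Groups[0] is the whole match (here, the original string), while m.Groups[1] is the group captured by the parentheses in the regular expression.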
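To make the difference between the two indices concrete, here is a minimal sketch that returns both values side by side. The class name GroupIndexDemo and the method Describe are hypothetical names made up for this illustration; only Regex.Match, Match.Success and Match.Groups come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Hypothetical helper: returns "whole match|group 1" for one input string.
    public static string Describe(string input)
    {
        Match m = Regex.Match(input, @"\bPlus\D*(\d*)\b", RegexOptions.IgnoreCase);
        if (!m.Success) return "no match";
        // Groups[0] is the entire match, Groups[1] is the first parenthesized group.
        return m.Groups[0].Value + "|" + m.Groups[1].Value;
    }

    public static void Main()
    {
        foreach (string s in new[] { "PlusContent", "PlusArchieve1" })
            Console.WriteLine(Describe(s));
    }
}
```

For a string without trailing digits, group 1 still exists but is empty, so checking m.Groups[1].Value.Length (or m.Groups[1].Length) is a simple way to tell the two cases apart.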

Perhaps this shorter regex will help; it captures the digits that follow the non-digit part of the string as the postfix.

\bPlus\D+(\d*)

If you use Regex.Replace, you can get the digits through $1 with the regex mentioned above.

List<string> words = new List<string> { "PlusContent", "PlusArchieve", "PlusContent1", "PlusArchieve1" };
foreach (string tester in words)
{
    // Replace the whole match with the contents of group 1 (the digits).
    string postfix = Regex.Replace(tester, @"\bPlus\D+(\d*)", "$1");
    // Strings without a trailing number produce an empty result and are skipped.
    if (!string.IsNullOrEmpty(postfix))
        Console.WriteLine(postfix);
}
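If the postfix is needed as a number rather than as text, one possible variation is to read group 1 from a single Regex.Match call and hand it to int.TryParse. This is only a sketch under that assumption, not part of the answers above; PostfixParser and TryGetPostfix are hypothetical names introduced here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PostfixParser
{
    // Constructed once and reused instead of building a new Regex per input.
    private static readonly Regex Pattern =
        new Regex(@"\bPlus\D+(\d*)", RegexOptions.IgnoreCase);

    // Hypothetical helper: true only when the pattern matches and
    // group 1 holds digits that fit into an int.
    public static bool TryGetPostfix(string input, out int number)
    {
        number = 0;
        Match m = Pattern.Match(input);
        // int.TryParse returns false for the empty string, i.e. no postfix.
        return m.Success && int.TryParse(m.Groups[1].Value, out number);
    }

    public static void Main()
    {
        string[] words = { "PlusContent", "PlusArchieve", "PlusContent1", "PlusArchieve1" };
        foreach (string word in words)
        {
            if (TryGetPostfix(word, out int n))
                Console.WriteLine(word + " has postfix " + n);
        }
    }
}
```

Note that the pattern is not anchored at the end of the input, so digits followed by further text would still be captured; append $ to the pattern if only a true suffix should count.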