Extracting a character from a string (regex)

Keywords: extract character, regex, string | Updated: 2024-09-24 00:52:54

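I want to extract the bold character from the sample strings below, that is, the single character sitting between the two runs of digits (A, B, and C respectively). The pattern is:

ChunkOfAlphabets_ChunkOfDigits_CharIWant_ChunkOfDigits_CharIDontCare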

"ABC12A1234D"

"ABCD34B5678E"

"EF34C9101F"

I came up with the following code. It seems to work fine, but I would like to know whether there is a more efficient way, perhaps using regex?

    char extractString(string test)
    {
        bool isDigit = false;
        foreach(var c in test)
        {
            if (isDigit && !char.IsDigit(c))
                return c;
            isDigit = char.IsDigit(c);
        }
        return '0';
    }


If you are using C#, LINQ would be easier and more performant (regex involves a lot of overhead):

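    using System.Linq;

    static char ExtractString(string test)
    {
        // Skip the leading letters, then the digits that follow;
        // the next character is the one we want ('\0' if none is left).
        return test.SkipWhile(c => char.IsLetter(c))
                   .SkipWhile(c => char.IsDigit(c))
                   .FirstOrDefault();
    }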
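To see how the LINQ approach behaves on the sample strings from the question, here is a minimal, self-contained sketch. The class name `LinqDemo` and the extra input `"ABC12"` are assumptions added purely for illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;

static class LinqDemo
{
    // Same approach as the answer: skip letters, skip digits, take the next char.
    public static char ExtractString(string test)
    {
        return test.SkipWhile(c => char.IsLetter(c))
                   .SkipWhile(c => char.IsDigit(c))
                   .FirstOrDefault();
    }

    static void Main()
    {
        foreach (var s in new[] { "ABC12A1234D", "ABCD34B5678E", "EF34C9101F" })
            Console.WriteLine($"{s} -> {ExtractString(s)}");
        // Prints:
        // ABC12A1234D -> A
        // ABCD34B5678E -> B
        // EF34C9101F -> C

        // Nothing follows the digits here, so FirstOrDefault yields default(char).
        Console.WriteLine(ExtractString("ABC12") == '\0'); // True
    }
}
```

Note that when no character is found this version returns `'\0'`, whereas the loop in the question returns the character `'0'`, so callers need to check for a different sentinel.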

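First of all, a regular expression should not be expected to be faster than a good small algorithm. Still, here is a regex so that you can try it and check which is faster.

The following regex gives you what you want: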

^\D+\d+([A-Za-z])\d+\D+$

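I suggest using https://regex101.com/, which is great for testing this kind of thing.

This C# function should do what you expect using regex, though I doubt it is more efficient than the simple algorithm: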

    using System.Text.RegularExpressions;

    private char extractChar(string test)
    {
        // '\0' is returned when the input does not match the pattern.
        char charOut = '\0';
        var matches = Regex.Matches(test, "^[a-zA-Z]+[0-9]+([a-zA-Z])[0-9]+.+");
        if (matches.Count > 0)
            charOut = matches[0].Groups[1].Value[0]; // group 1 is the wanted character
        return charOut;
    }
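To actually "check which is faster", a rough timing harness along the following lines could be used. This is only a sketch: the class name `ExtractBenchmark`, the iteration count, and the choice of `RegexOptions.Compiled` are assumptions, and a `Stopwatch` loop like this gives a coarse indication rather than a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class ExtractBenchmark
{
    // Loop-based version, as posted in the question.
    public static char ExtractWithLoop(string test)
    {
        bool isDigit = false;
        foreach (var c in test)
        {
            if (isDigit && !char.IsDigit(c))
                return c;
            isDigit = char.IsDigit(c);
        }
        return '0';
    }

    // The regex is built once and reused, so construction cost is not timed.
    static readonly Regex Pattern =
        new Regex("^[a-zA-Z]+[0-9]+([a-zA-Z])[0-9]+.+", RegexOptions.Compiled);

    public static char ExtractWithRegex(string test)
    {
        Match m = Pattern.Match(test);
        return m.Success ? m.Groups[1].Value[0] : '\0';
    }

    static void Main()
    {
        const int iterations = 1_000_000; // arbitrary; adjust as needed
        const string input = "ABC12A1234D";

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) ExtractWithLoop(input);
        sw.Stop();
        Console.WriteLine($"loop:  {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < iterations; i++) ExtractWithRegex(input);
        sw.Stop();
        Console.WriteLine($"regex: {sw.ElapsedMilliseconds} ms");
    }
}
```

The absolute numbers will vary with the machine, runtime, and build configuration, so it is best to compare the two figures from the same run in a Release build.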

Assuming

ChunkOfAlphabets = [A-Za-z] <-- English letters

ChunkOfDigits = [0-9]

CharIWant = any character except a digit [0-9]

Based on the above, the regex should be

^[A-Za-z]+\d+(\D+)\d+.*$

Regex demo

C# code: Ideone demo
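Since the linked demo code is not reproduced here, the following is a minimal sketch of how the regex `^[A-Za-z]+\d+(\D+)\d+.*$` might be applied in C#. The class name `RegexDemo` and method name `ExtractChars` are hypothetical, and returning an empty string on no match is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexDemo
{
    // Returns whatever group 1 captured, or an empty string if there is no match.
    // Because the group is (\D+), it may contain more than one character.
    public static string ExtractChars(string test)
    {
        Match m = Regex.Match(test, @"^[A-Za-z]+\d+(\D+)\d+.*$");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        foreach (var s in new[] { "ABC12A1234D", "ABCD34B5678E", "EF34C9101F" })
            Console.WriteLine($"{s} -> {ExtractChars(s)}");
        // Prints:
        // ABC12A1234D -> A
        // ABCD34B5678E -> B
        // EF34C9101F -> C
    }
}
```

The method returns a `string` rather than a `char` because `\D+` matches one or more non-digit characters; if exactly one character is expected, `(\D)` would be the stricter choice.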