Regex to get square brackets that contain only digits but are not themselves inside square brackets

Keywords: square brackets, get, contain, digits, regular expression | Updated: 2023-09-27 18:36:13

Sample string:

 "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";

The regex should match:

 [1448472995] [1448472995]

It should not match [000112], because that one is enclosed in outer square brackets.

Currently I have this regex, which also matches [000112]:

const string unixTimeStampPattern = @"\[([0-9]+)]";


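Here is a good way to do this using balanced-text matching. Capture group 1 matches a top-level bracket that contains only digits; the second alternative consumes any other balanced bracket pair as a whole, so numbers nested inside it are never visited.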

    ( \[ \d+ \] )                 # (1)
 |                             # or,
    \[                            # Opening bracket
    (?>                           # Then either match (possessively):
         [^\[\]]+                      #  non-brackets
      |                              # or
         \[                            #  [ increase the bracket counter
         (?<Depth> )
      |                              # or
         \]                            #  ] decrease the bracket counter
         (?<-Depth> )
    )*                            # Repeat as needed.
    (?(Depth)                     # Assert that the bracket counter is at zero
         (?!)
    )
    \]                            # Closing bracket

C# example

string sTestSample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
Regex RxBracket = new Regex(@"(\[\d+\])|\[(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))\]");
Match bracketMatch = RxBracket.Match(sTestSample);
while (bracketMatch.Success)
{
    if (bracketMatch.Groups[1].Success)
        Console.WriteLine("{0}", bracketMatch);
    bracketMatch = bracketMatch.NextMatch();
}

Output

[1448472995]
[1448472995]

You need to use balancing groups to handle this. It looks a bit daunting, but it isn't that complicated:

Regex regexObj = new Regex(
    @"\[               # Match opening bracket.
    \d+                # Match a number.
    \]                 # Match closing bracket.
    (?=                # Assert that the following can be matched ahead:
     (?>               # The following group (made atomic to avoid backtracking):
      [^\[\]]+         # One or more characters except brackets
     |                 # or
      \[ (?<Depth>)    # an opening bracket (increase bracket counter)
     |                 # or
      \] (?<-Depth>)   # a closing bracket (decrease bracket counter, can't go below 0).
     )*                # Repeat ad libitum.
     (?(Depth)(?!))    # Assert that the bracket counter is now zero.
     [^\[\]]*          # Match any remaining non-bracket characters
     \z                # until the end of the string.
    )                  # End of lookahead.", 
    RegexOptions.IgnorePatternWhitespace);
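
As a rough sketch of how the lookahead pattern could be applied, the helper below runs it over an input and collects the matches. The class name `TopLevelBracketNumbers` and the method `Find` are made up for illustration; the pattern is the same one written on a single line, so `RegexOptions.IgnorePatternWhitespace` is not needed.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TopLevelBracketNumbers
{
    // [digits] followed by a remainder whose brackets are balanced up to the end of the string.
    private static readonly Regex Pattern = new Regex(
        @"\[\d+\](?=(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))[^\[\]]*\z)");

    public static List<string> Find(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
            results.Add(m.Value); // the whole match, brackets included
        return results;
    }
}
```

Calling `TopLevelBracketNumbers.Find` on the sample string should return the two `[1448472995]` entries. `[000112]` is rejected because the text after it begins with an unmatched `]`, so the lookahead cannot reach the end of the string. This relies on the brackets after the match being balanced; a stray bracket later in the input would suppress matches before it.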

Are you only trying to capture Unix timestamps? If so, you could try a simpler approach that fixes the number of digits the group must match.

\[([0-9]{10})\]

Here I have limited it to 10 digits, because I doubt timestamps will reach 11 digits any time soon. To guard against that anyway:

\[([0-9]{10,11})\]

Of course, this can produce false positives if a 10-digit number appears inside enclosing brackets.
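
A minimal sketch of the length-based approach, to make that trade-off concrete. The class name `TimestampBrackets` and the method `Find` are hypothetical; the method returns the digits captured in group 1.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TimestampBrackets
{
    // Brackets containing exactly 10 or 11 digits; nesting is not checked.
    private static readonly Regex Pattern = new Regex(@"\[([0-9]{10,11})\]");

    public static List<string> Find(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
            results.Add(m.Groups[1].Value); // digits only, without the brackets
        return results;
    }
}
```

On the sample string this skips `[000112]` only because it has six digits. An input such as `[x [1448472995]]` would still yield `1448472995`, which is the false positive described above.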

This matches your expression as expected: http://regexr.com/3csg3 It uses a lookahead.