Multi-line regular expression parsing with C#

Keywords: regular expressions | Updated: 2023-09-27 17:50:40

I am having some trouble with regular expressions.

I have strings of the following form that I need to parse and pull data out of:

'Account'.Search = "[Id] is NULL OR NAME = ""0%"""
'Account'.Sort = "NAME, Address"
'Bank Objectives'.Search = "[Active Flg] = "Y""
'Bank'.Search = "[Id] is NULL"
'Bank'.Sort = "Transit"
'Bank Goals'.Search = "[Active Flg] = 'Y'"
.......

The following expression seems to work:

\s*(?:(?<search>(?:(=['"])?(?:.*)\1|.*)\.Search[^"']*(?:(=['"])?(?:.*)\1|.*))\s*$?(?<sort>(?:(=['"])?(?:.*)\1|.*)\.Sort[^"']*(?:(=['"])?(?:.*)\1|.*))?\s*$?)

and I get output like this:

search: 'Account'.Search = '[Id] is NULL OR NAME = ''0%'''
sort: 'Account'.Sort = "NAME, Address"
search: 'Bank Objectives'.Search = "[Active Flg] = "Y""
search: 'Bank'.Search = "[Id] is NULL"
sort: 'Bank'.Sort = "Transit"
search: 'Bank Goals'.Search = "[Active Flg] = 'Y'"

But if the input string has no line breaks, I start getting:

search: 'Account'.Search = '[Id] is NULL OR NAME = ''0%''''Account'.Sort = "NAME, Address"

What I really want to get from the input string is:

Account
  Search = "[Id] is NULL OR NAME = ""0%"""
  Sort = "NAME, Address"
Bank Objectives 
  Search = "[Id] is NULL OR NAME = ""0%"""
  Sort = ""
.....

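(=['"])?(?:.*)\1|.*) should also match the quotation marks, but that does not seem to work either.

The ultimate goal is to pass the string [Id] IS NULL or Name =""0%""" to an expression editor, let the user make changes, and then update the search portion of the whole string. So if the user wants account names that start with bob, I need to generate: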

'Account'.Search = "Name like 'bob%'"
'Account'.Sort = "NAME, Address"
'Bank Objectives'.Search = "[Active Flg] = "Y""
'Bank'.Search = "[Id] is NULL"
'Bank'.Sort = "Transit"
'Bank Goals'.Search = "[Active Flg] = 'Y'"
.......

I already have that piece built. The problem is parsing this string, which can have multiple entries. The string can also contain just:

"Name like 'bob%'"

so I have to work out that it should really be:

'Account'.Search = "Name like 'bob%'"

So I need to know whether the input string matches the regular expression.

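You are unlikely to get a 100% solution with regular expressions. If your grammar is complex and you may receive input that does not follow the rules, you need to write a lexer and parser. There are plenty of resources on how to do that in .NET: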

  • http://stackoverflow.com/questions/1669/learning-to-write-a-compiler
  • http://blogs.microsoft.co.il/blogs/sasha/archive/2010/10/06/writing-a-compiler-in-c-lexical-analysis.aspx
  • http://www.antlr.org/
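
Update:

If you must use regular expressions, why not break the task into several smaller problems? First, split on line breaks and parse each line separately. Then use a set of simpler regular expressions to break each string apart.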

For example, parse each line into [obj].[verb] = [rule] with:
/'(?<obj>[^']+)'\.(?<verb>\w+)\s*=\s*"(?<rule>.+)"/
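The split-then-parse approach could look roughly like the sketch below. It is only an illustration: the helper name ParseEntries and the nested-dictionary result shape are assumptions, not anything from the question, and the verb group is written as \w+ here. Because .+ is greedy, the rule runs to the last double quote on the line, so embedded or doubled quotes stay inside the rule. An empty rule ("") would not match and is skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: maps each object name to its verbs (Search, Sort) and rules.
static Dictionary<string, Dictionary<string, string>> ParseEntries(string input)
{
    // Matches: 'obj'.verb = "rule"
    // The greedy .+ extends to the last double quote on the line.
    var lineRegex = new Regex(
        @"^\s*'(?<obj>[^']+)'\.(?<verb>\w+)\s*=\s*""(?<rule>.+)""\s*$");

    var result = new Dictionary<string, Dictionary<string, string>>();
    var lines = input.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

    foreach (var line in lines)
    {
        var m = lineRegex.Match(line);
        if (!m.Success) continue; // not an obj.verb = "rule" line, skip it

        var obj = m.Groups["obj"].Value;
        if (!result.TryGetValue(obj, out var verbs))
        {
            verbs = new Dictionary<string, string>();
            result[obj] = verbs;
        }
        verbs[m.Groups["verb"].Value] = m.Groups["rule"].Value;
    }
    return result;
}

// Usage with a few of the lines from the question.
var sampleInput = string.Join("\n",
    "'Account'.Search = \"[Id] is NULL OR NAME = \"\"0%\"\"\"",
    "'Account'.Sort = \"NAME, Address\"",
    "'Bank Objectives'.Search = \"[Active Flg] = \"Y\"\"");

foreach (var entry in ParseEntries(sampleInput))
{
    Console.WriteLine(entry.Key);
    foreach (var verb in entry.Value)
        Console.WriteLine("  " + verb.Key + " = \"" + verb.Value + "\"");
}
```

A line that fails to match is simply ignored in this sketch. That failure could also serve as the signal that the input is a bare expression such as "Name like 'bob%'" rather than a full entry.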

Then parse out the fields with:
/\[[^\]]+\]/
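
As a small illustration of this second step, the sketch below pulls the bracketed field names out of a rule string. The helper name ExtractFields is hypothetical, and the capture group around the field name is an addition to the pattern above so that the brackets are dropped from the result.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every [field] reference found in a rule string.
static List<string> ExtractFields(string rule)
{
    var fields = new List<string>();
    // \[ ... \] matches one bracketed field; group 1 is the name inside.
    foreach (Match m in Regex.Matches(rule, @"\[([^\]]+)\]"))
        fields.Add(m.Groups[1].Value);
    return fields;
}

var sampleFields = ExtractFields("[Id] is NULL OR [Active Flg] = \"Y\"");
Console.WriteLine(string.Join(", ", sampleFields)); // prints: Id, Active Flg
```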