Extracting strings from a string array under given conditions, e.g. into a class's properties
Keywords: string, class, properties, condition, array, extract | Updated: 2023-09-27 18:01:18
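Use case: I import a text file and need to read content that consists of 4 lines. From those 4 lines I parse out strings that are not predefined but dynamic.
Here is an example I gathered from @zx81: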
Input:

    on Apr 28, 2014 at 22:00
    an Employee John Doe accessed
    server - TPCX123
    AccessType2 was ReasonType1 - program: Px2x3x, start: No22, 0.0 sec

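So, given the 4 lines above, I am considering either keeping their line breaks (i.e. 4 lines) or joining them into a single string (i.e. one line). I will then extract the values and hold them in memory through the properties of a class, e.g. ReportDate, ReportTime, EmployeeName, ServerName, AccessType, ReasonType, ProgramId, Start, Length.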
Expected output:

    ReportDate = Apr 28, 2014
    ReportTime = 22:00
    EmployeeName = John Doe
    ServerName = TPCX123
    AccessType = AccessType2
    ReasonType = ReasonType1
    ProgramId = Px2x3x
    Start = No22
    Length = 0.0 sec

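That is what I want: every item on the right-hand side of the equals sign, i.e. the strings assigned to specific properties of an in-memory object, which ultimately correspond to the columns of a database table. In the example above, the property EmployeeName always sits at the same position (between specific strings), so its value can be parsed out, e.g. "John Doe". Of course, these values will differ for every file I import, hence the dynamic part.
I hope that makes sense. Thanks.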
Given your data, code like this will output what you want:
Output:

    ReportDate = Apr 28, 2014
    ReportTime = 22:00
    EmployeeName = John Doe
    ServerName = TPCX123
    AccessType = AccessType2
    ReasonType = ReasonType1
    ProgramId = Px2x3x
    Start = No22
    Length = 0.0 sec

Code:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            string s1 = @"on Apr 28, 2014 at 22:00
    an Employee John Doe accessed
    server - TPCX123
    AccessType2 was ReasonType1 - program: Px2x3x, start: No22, 0.0 sec";
            try
            {
                var myRegex = new Regex(@"(?s)^on\s+([\w, ]+?) at (\d{2}:\d{2}).*?Employee ([\w ]+) accessed.*?server - (\w+).*?(\w+) was (\w+) - program: (\w+), start: (\w+), (\d+\.\d+ \w+)");
                // Run the match once and read every capture group from the same result
                Match m = myRegex.Match(s1);
                string date = m.Groups[1].Value;
                string time = m.Groups[2].Value;
                string name = m.Groups[3].Value;
                string server = m.Groups[4].Value;
                string access = m.Groups[5].Value;
                string reason = m.Groups[6].Value;
                string prog = m.Groups[7].Value;
                string start = m.Groups[8].Value;
                string length = m.Groups[9].Value;
                Console.WriteLine("ReportDate = " + date);
                Console.WriteLine("ReportTime = " + time);
                Console.WriteLine("EmployeeName = " + name);
                Console.WriteLine("ServerName = " + server);
                Console.WriteLine("AccessType = " + access);
                Console.WriteLine("ReasonType = " + reason);
                Console.WriteLine("ProgramId = " + prog);
                Console.WriteLine("Start = " + start);
                Console.WriteLine("Length = " + length);
            }
            catch (ArgumentException ex)
            {
                // The pattern has a syntax error
                Console.WriteLine("Invalid regex: " + ex.Message);
            }
            Console.WriteLine("\nPress Any Key to Exit.");
            Console.ReadKey();
        } // END Main
    } // END Program

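The question asks for the values to end up in the properties of a class rather than on the console. One way that might look is sketched below, reusing the same pattern with named groups so each capture maps to a property by name. The `AccessReport` class and its `Parse` method are hypothetical names introduced for illustration; they are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container class; the property names follow the question.
class AccessReport
{
    public string ReportDate { get; set; }
    public string ReportTime { get; set; }
    public string EmployeeName { get; set; }
    public string ServerName { get; set; }
    public string AccessType { get; set; }
    public string ReasonType { get; set; }
    public string ProgramId { get; set; }
    public string Start { get; set; }
    public string Length { get; set; }

    // Same pattern as the answer's code, with named groups instead of numbered ones.
    static readonly Regex Pattern = new Regex(
        @"(?s)^on\s+(?<date>[\w, ]+?) at (?<time>\d{2}:\d{2}).*?Employee (?<name>[\w ]+) accessed" +
        @".*?server - (?<server>\w+).*?(?<access>\w+) was (?<reason>\w+)" +
        @" - program: (?<prog>\w+), start: (?<start>\w+), (?<length>\d+\.\d+ \w+)");

    // Returns null when the text does not follow the expected 4-line layout.
    public static AccessReport Parse(string text)
    {
        Match m = Pattern.Match(text);
        if (!m.Success) return null;
        return new AccessReport
        {
            ReportDate = m.Groups["date"].Value,
            ReportTime = m.Groups["time"].Value,
            EmployeeName = m.Groups["name"].Value,
            ServerName = m.Groups["server"].Value,
            AccessType = m.Groups["access"].Value,
            ReasonType = m.Groups["reason"].Value,
            ProgramId = m.Groups["prog"].Value,
            Start = m.Groups["start"].Value,
            Length = m.Groups["length"].Value
        };
    }
}

class Demo
{
    static void Main()
    {
        string input = "on Apr 28, 2014 at 22:00\nan Employee John Doe accessed\n" +
                       "server - TPCX123\n" +
                       "AccessType2 was ReasonType1 - program: Px2x3x, start: No22, 0.0 sec";
        AccessReport r = AccessReport.Parse(input);
        Console.WriteLine(r.EmployeeName + " / " + r.ServerName); // John Doe / TPCX123
    }
}
```

Checking `m.Success` before reading the groups matters here: on a non-matching file every `Groups[...]` value is an empty string, which would otherwise flow silently into the database columns.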
Tweaking it
To tweak it, however, you will have to brush up on your regex.
First, here is a token-by-token explanation of the regex used in the code. After that, I suggest you visit the FAQ, RexEgg, and the other sites mentioned in the FAQ.

    @"
    (? # Use these options for the whole regular expression
    s # Dot matches line breaks
    )
    ^ # Assert position at the beginning of the string
    on # Match the character string “on” literally (case sensitive)
    \s # Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    ( # Match the regex below and capture its match into backreference number 1
    [\w,\ ] # Match a single character present in the list below
    # A “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    # A single character from the list “, ”
    +? # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
    )
    \ at\  # Match the character string “ at ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 2
    \d # Match a single character that is a “digit” (0–9 in any Unicode script)
    {2} # Exactly 2 times
    : # Match the character “:” literally
    \d # Match a single character that is a “digit” (0–9 in any Unicode script)
    {2} # Exactly 2 times
    )
    . # Match any single character
    *? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
    Employee\  # Match the character string “Employee ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 3
    [\w\ ] # Match a single character present in the list below
    # A “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    # The literal character “ ”
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    \ accessed # Match the character string “ accessed” literally (case sensitive)
    . # Match any single character
    *? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
    server\ -\  # Match the character string “server - ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 4
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    . # Match any single character
    *? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
    ( # Match the regex below and capture its match into backreference number 5
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    \ was\  # Match the character string “ was ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 6
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    \ -\ program:\  # Match the character string “ - program: ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 7
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    ,\ start:\  # Match the character string “, start: ” literally (case sensitive)
    ( # Match the regex below and capture its match into backreference number 8
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    ,\  # Match the character string “, ” literally
    ( # Match the regex below and capture its match into backreference number 9
    \d # Match a single character that is a “digit” (0–9 in any Unicode script)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    \. # Match the character “.” literally
    \d # Match a single character that is a “digit” (0–9 in any Unicode script)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    \  # Match the character “ ” literally
    \w # Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation)
    + # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    "
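
A commented, multi-line layout like the one above is more than documentation: .NET should accept a pattern written that way when the `Regex` is constructed with `RegexOptions.IgnorePatternWhitespace`, which can make later tweaks easier to follow. The sketch below is a shortened illustration covering only the date and time fields; the `FreeSpacingDemo` class, the `HeaderPattern` field, and the group names are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class FreeSpacingDemo
{
    // IgnorePatternWhitespace lets the pattern span several lines and carry # comments.
    // Unescaped whitespace is ignored, so a literal space is written as "\ "
    // or placed inside a character class, where it is always taken literally.
    public static readonly Regex HeaderPattern = new Regex(@"
        ^on\s+
        (?<date> [\w, ]+? )                # e.g. Apr 28, 2014
        \ at\ (?<time> \d{2}:\d{2} )       # e.g. 22:00
        ", RegexOptions.IgnorePatternWhitespace);

    static void Main()
    {
        Match m = HeaderPattern.Match("on Apr 28, 2014 at 22:00");
        Console.WriteLine(m.Groups["date"].Value); // Apr 28, 2014
        Console.WriteLine(m.Groups["time"].Value); // 22:00
    }
}
```

This also explains the escaped spaces in the explanation above: that listing is in the same free-spacing form, where an unescaped space would be dropped from the pattern.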