C# - Delete lines that match a regular expression

Keywords: regular expression, delete | Updated: 2023-09-27 18:02:40

I have some data that looks like this:

0423 222222 ADH, TEXTEXT 
0424 1234 ADH,MORE TEXT 
0425 98765 ADH, TEXT 3609 
2000 98765-4 LBL,IUC,PCA,S/N 
0010 99999-27 LBL,IUI,1.0x.25 
9000 12345678 HERE IS MORE, TEXT
9010 123-123 SOMEMORE,TEXT1231
9100 SD178 YAYFOR, TEXT01
9999 90123 HEY:HOW-TO DOTHIS

I want to delete every entire line that starts with 9xxx. So far I have tried replacing the value with Regex. Here is my attempt:

output = Regex.Replace(output, @"^9[\d]{3}\s+[\d*\-*\w*]+\s+[\d*\w*\-*\,*\:*\;*\.*\d*\w*]+", "");

However, this is really hard to read, and it doesn't actually delete the entire line.


Code: Here is the part of the code I'm using:

        try
        {
            // Resets the formattedTextRichTextBox so multiple files aren't loaded on top of each other.
            formattedTextRichTextBox.ResetText();
            foreach (string line in File.ReadAllLines(openFile.FileName))
            {
                // Uses regular expressions to find a line that has, digit(s), space(s), digit(s) + letter(s),
                // space(s), digit(s), space(s), any character (up to 25 times).
                Match theMatch = Regex.Match(line, @"^[\.*\d]+\s+[\d\w]+\s+[\d\-\w*]+\s+.{25}");
                if (theMatch.Success)
                {
                    // Stores the matched value in string output.
                    string output = theMatch.Value;
                    // Replaces the text with the required layout.
                    output = Regex.Replace(output, @"^[\.*\d]+\s+", "");
                    //output = Regex.Replace(output, @"^9[\d]{3}\s+[\d*\-*\w*]+\s+[\d*\w*\-*\,*\:*\;*\.*\d*\w*]+", "");
                    output = Regex.Replace(output, @"\s+", " ");
                    // Sets the formattedTextRichTextBox to the string output.
                    formattedTextRichTextBox.AppendText(output);
                    formattedTextRichTextBox.AppendText("\n");
                }
            }
        }

Result: I'd like the new data to look like this (with the 9xxx lines removed):

0423 222222 ADH, TEXTEXT 
0424 1234 ADH,MORE TEXT 
0425 98765 ADH, TEXT 3609 
2000 98765-4 LBL,IUC,PCA,S/N 
0010 99999-27 LBL,IUI,1.0x.25 

Questions:

  • Is there an easier way to do this?
  • If so, can I use regular expressions to solve it, or do I have to use a different approach?


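Just reformulate the regex that tests the format so that it matches everything that doesn't start with 9. That way, lines starting with 9 are never added to the rich text box.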
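One way to read this suggestion is to put a negative lookahead at the front of the format test. The following is only a sketch: the class and method names (FormatTest, ShouldKeep) are made up for illustration, and the pattern after the lookahead is a simplified stand-in for the format regex in the question, not a drop-in replacement for it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatTest
{
    // (?!9) is a negative lookahead: the match fails when the line begins with 9.
    // The rest of the pattern is a simplified, assumed version of the format test:
    // four digits, whitespace, a non-whitespace token, whitespace, then the remaining text.
    private static readonly Regex LinePattern = new Regex(@"^(?!9)\d{4}\s+\S+\s+.+");

    // Hypothetical helper: true when the line has the expected layout and does not start with 9.
    public static bool ShouldKeep(string line)
    {
        return LinePattern.IsMatch(line);
    }

    public static void Main()
    {
        string[] samples =
        {
            "0423 222222 ADH, TEXTEXT",
            "9000 12345678 HERE IS MORE, TEXT"
        };

        foreach (string line in samples)
        {
            Console.WriteLine(line + " -> " + ShouldKeep(line));
        }
    }
}
```

In the loop from the question, a check like this would take the place of the existing Regex.Match test, so lines that start with 9 never reach AppendText.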

Try this (uses LINQ):

// Requires: using System.IO; using System.Linq; using System.Text.RegularExpressions;
// Create a regex to identify lines that start with 9XXX.
Regex rgx = new Regex(@"^9\d{3}");
// LINQ query that filters out the lines that start with 9XXX.
var validLines =
(
    // The sequence to pick the data from.
    from ln in File.ReadAllLines(openFile.FileName)
    // The filter to apply: keep only lines that do not match the regex.
    where !rgx.IsMatch(ln)
    // What to select from the filtered data.
    select ln
).ToArray(); // Turns the IEnumerable<string> into a string array (ln in the query above is a string).
// Finally, join the filtered lines with "\n" using String.Join and append the result to the text box.
formattedTextRichTextBox.AppendText(String.Join("\n", validLines));

Yes, there is an easier way. Just use the Regex.Replace method and pass the Multiline option.
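As a rough sketch of what that could look like, assuming the whole file has been read into a single string: with RegexOptions.Multiline, ^ matches at the start of every line, so one Regex.Replace call can drop all 9xxx lines at once. The class and method names (LineFilter, RemoveNineThousandLines) are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineFilter
{
    // Removes every line that starts with 9 followed by three digits.
    // The trailing (?:\r?\n|$) also consumes the line terminator so that
    // no empty line is left behind.
    public static string RemoveNineThousandLines(string text)
    {
        return Regex.Replace(text, @"^9\d{3}.*(?:\r?\n|$)", "", RegexOptions.Multiline);
    }

    public static void Main()
    {
        string input =
            "0423 222222 ADH, TEXTEXT\n" +
            "9000 12345678 HERE IS MORE, TEXT\n" +
            "2000 98765-4 LBL,IUC,PCA,S/N\n" +
            "9999 90123 HEY:HOW-TO DOTHIS\n";

        Console.Write(RemoveNineThousandLines(input));
    }
}
```

In the code from the question, the input could come from File.ReadAllText(openFile.FileName) instead of a per-line loop. Note that this only removes lines; the whitespace cleanup from the question would still have to be applied separately.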

Why not just match the leading 9xxx part and use a wildcard to match the rest of the line? That would be a lot easier to read.

output = Regex.Replace(output, @"^9\d{3}.*", "");