Prepending a character to all instances of a string

Keywords: character, instance, string | Updated: 2023-09-27 18:18:39

I'm using C#, .NET 2.0, and WinForms.

I have a piece of code in an assembly-style format, like this:

VARIABLE = 40
ADDRESS = $60
LDA VARIABLE;
STA ADDRESS;

The output should be:

!VARIABLE = 40
!ADDRESS = $60
LDA !VARIABLE;
STA !ADDRESS;

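Of course there is a lot more of it, around 2000 lines. The point is that the declarations sit at the top of the file, and I can load them, save them, or do whatever I like with them. My problem is that I have to prefix every one of these declarations with "!", including every place where they are used in the original code.

My current approach is: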

        var defs = tab.tb.GetRanges(@"^\s*([A-Za-z0-9_]+){4,255}\s*=", RegexOptions.Multiline); // all declarations are formatted like this, so use a regex to get all of them
        foreach (var def in defs)
        {
            string actual = def.Text.Substring(0, def.Text.IndexOf(' ')); // drop the " =" part, since only the name is used in the actual code
            txt = txt.Replace(actual, "!" + actual);
        }

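However, this approach is very slow: "fixing" all the declarations in a file takes about 3 seconds. Is there a better way? Note that the syntax differs slightly from a normal TextBox, because I'm using http://www.codeproject.com/KB/edit/FastColoredTextBox_.aspx as the text control.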


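I suspect your performance problem lies in doing the replace on the string. Strings in .NET are immutable, so whenever you do anything that changes a string (append, replace, and so on), .NET has to create a new string and copy the old one into it (all 2000 lines of it), plus the change you made. Try loading the string into a StringBuilder, which is mutable, and using its native .Replace() method.

Here is a quick attempt. On my machine it processes a file of 25,000+ lines in < 100 ms.

However, I only had two sample values to go on; the more replace operations there are, the worse performance will get.

Update: I tried another sample, this time with 25,000 lines and 8 unique values to modify. Performance dropped by only a few milliseconds.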
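As a rough illustration of the StringBuilder suggestion, here is a minimal sketch. The class and method names (PrefixWithStringBuilder, AddPrefix) are made up for this example, and it assumes the declared names have already been collected. Whether this is actually faster than repeated String.Replace calls on a given file is something to measure rather than take for granted.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PrefixWithStringBuilder
{
    // Hypothetical helper: prefixes every occurrence of each name with '!'
    // while working on a single mutable buffer.
    // Caveat: this is still a plain substring replace, so a name contained
    // in a longer name (VAR inside VARIABLE) gets prefixed there as well.
    public static string AddPrefix(string source, IEnumerable<string> names)
    {
        StringBuilder sb = new StringBuilder(source);
        foreach (string name in names)
        {
            sb.Replace(name, "!" + name);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string asm = "VARIABLE = 40\nADDRESS = $60\nLDA VARIABLE;\nSTA ADDRESS;\n";
        Console.Write(AddPrefix(asm, new string[] { "VARIABLE", "ADDRESS" }));
    }
}
```

Run against the four sample lines from the question, this should produce the desired output shown there.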

// required namespaces
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

Stopwatch sw = new Stopwatch();
string text = File.ReadAllText( @"C:\temp\so.txt" );
sw.Start();
// find the tokens we will be replacing
Regex tokenFinder = new Regex( @"^(([A-Za-z0-9_]+){4,255})\s*(=)", RegexOptions.Multiline );
// ensure uniqueness, and drop the "=" by taking capture group 1 (the name only)
var tokens = ( from Match m in tokenFinder.Matches( text ) select m.Groups[1].Value ).Distinct();
// perform replace for each token...performance here will greatly vary based on the number of tokens to replace
foreach( string token in tokens )
{
    // the token consists only of [A-Za-z0-9_], so it is safe to use as a pattern without escaping
    Regex replaceRegex = new Regex( token );
    text = replaceRegex.Replace( text, string.Concat( "!", token.Trim() ) );
}
sw.Stop();
Console.WriteLine( "Complete in {0}ms.", sw.ElapsedMilliseconds );
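One possible variation, not part of the answer above, is to do all the replacements in a single pass by joining the declared names into one alternation. The sketch below is an assumption-laden example: the names SinglePassPrefixer and AddPrefix are invented, the declaration pattern uses a single quantifier instead of the nested one, and the word boundaries (\b) are my addition so that VARIABLE is not touched inside a longer name such as VARIABLE2.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SinglePassPrefixer
{
    // Hypothetical helper: finds declarations ("NAME = ...") and prefixes
    // every whole-word occurrence of each declared name with '!'.
    public static string AddPrefix(string source)
    {
        // An identifier of 4 to 255 characters at the start of a line, followed by '='.
        Regex declaration = new Regex(@"^\s*([A-Za-z0-9_]{4,255})\s*=", RegexOptions.Multiline);

        List<string> names = new List<string>();
        foreach (Match m in declaration.Matches(source))
        {
            string name = m.Groups[1].Value;
            if (!names.Contains(name)) names.Add(name); // keep each name once
        }
        if (names.Count == 0) return source;

        // Names contain only [A-Za-z0-9_], so they need no escaping here.
        string pattern = @"\b(?:" + string.Join("|", names.ToArray()) + @")\b";

        // "$0" stands for the whole match; the text is scanned once.
        // Note: running this twice on the same text would add a second '!'.
        return Regex.Replace(source, pattern, "!$0");
    }

    static void Main()
    {
        string asm = "VARIABLE = 40\nADDRESS = $60\nLDA VARIABLE;\nSTA ADDRESS;\n";
        Console.Write(AddPrefix(asm));
    }
}
```

The trade-off is that the text is scanned once regardless of how many names there are, at the cost of a larger pattern; with thousands of distinct names it would be worth timing both approaches.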

You have nested quantifiers in your regex!

([A-Za-z0-9_]+){4,255}

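Take a simple string such as 'aaaaaaaa': what is the regex engine supposed to capture? "a" 8 times, "aa" 4 times, "aa", then "a", then "a", then...

This is very likely the cause of your performance problem. Never do this! What's more, you are trying to match across the whole file. Even though the regex engine will eventually settle on the leftmost, longest match, it will try all the possibilities first.

Remove the +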
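To make the suggestion concrete, here is a small sketch of a declaration pattern with a single quantifier. It is a variation on "remove the +", not a quote from the answer: the capture group is moved around the whole name so that group 1 holds the full identifier, and the names FlatQuantifierDemo and DeclaredNames are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FlatQuantifierDemo
{
    // One character class with one quantifier: each character can be consumed
    // in only one way, so there is nothing for the engine to re-split.
    public static readonly Regex Declaration =
        new Regex(@"^\s*([A-Za-z0-9_]{4,255})\s*=", RegexOptions.Multiline);

    // Hypothetical helper: returns the declared names in order of appearance.
    public static List<string> DeclaredNames(string source)
    {
        List<string> names = new List<string>();
        foreach (Match m in Declaration.Matches(source))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    static void Main()
    {
        string asm = "VARIABLE = 40\nADDRESS = $60\nLDA VARIABLE;\nSTA ADDRESS;\n";
        foreach (string name in DeclaredNames(asm))
        {
            Console.WriteLine(name);
        }
    }
}
```

On the sample from the question this finds VARIABLE and ADDRESS and skips the LDA and STA lines, since those contain no "=" after a leading identifier of at least four characters.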