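Fastest way to escape characters in a string in C#
Keywords: method, escape characters, string | Updated: 2023-09-27 18:03:11
I want to find the fastest way to replace every reserved character in a string with its escaped version.
Two naive approaches come to mind (note that the set of reserved characters is only an example):
A: Using a lookup dictionary and String.Replace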
// requires: using System.Collections.Generic; using System.Linq;
private Dictionary<string, string> _someEscapeTokens = new Dictionary<string, string>()
{
    {"\t", @"\t"},
    {"\n", @"\n"},
    {"\r", @"\r"}
};
public string GetEscapedStringByNaiveLookUp(string s)
{
    // Each Replace call that finds a match allocates a new string.
    foreach (KeyValuePair<string, string> escapeToken in _someEscapeTokens.Where(kvp => s.Contains(kvp.Key)))
    {
        s = s.Replace(escapeToken.Key, escapeToken.Value);
    }
    return s;
}
B: Traversing every character in the string
public string GetEscapedStringByTraversingCharArray(string s)
{
    char[] chars = s.ToCharArray();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < chars.Length; i++)
    {
        switch (chars[i])
        {
            case '\t':
                sb.Append(@"\t"); break;
            case '\n':
                sb.Append(@"\n"); break;
            case '\r':
                sb.Append(@"\r"); break;
            default:
                sb.Append(chars[i]); break;
        }
    }
    return sb.ToString();
}
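As my tests showed, version B easily outperforms version A.
Note: I have already considered Regex.Escape, but its character set does not match mine, so it is not suitable.
Still, is there any other way to approach this problem (with performance in mind)?
I did some more testing; the test code is further down.
I tested on two different systems with .NET Framework 4.0. Either way, the results were nearly identical:
Char Array (short string) average time: 38600 ns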
Foreach (short string) average time: 26680 ns
Char Array (long string) average time: 48,1 ms
Foreach (long string) average time: 64,2 ms
Char Array (escaping only) average time: 13,6 ms
Foreach (escaping only) average time: 17,3 ms
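This leads me to conclude that the foreach version seems slightly faster for short strings, but somehow "falls off" for longer strings. However, we are talking about very small differences here.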
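Differences this small are easy to distort with measurement noise. In .NET, the first call to a method includes JIT compilation time, and timing a single call to a microsecond-scale method is close to the timer's resolution. The following is a minimal sketch of a measurement helper that warms up first and times a whole batch of calls. The class name EscapeBenchmark, the method AverageNanoseconds, and its parameters are hypothetical and not part of the original test code.

```csharp
using System;
using System.Diagnostics;

public static class EscapeBenchmark
{
    // Hypothetical helper: returns the average time per call in nanoseconds.
    public static double AverageNanoseconds(Func<string, string> escape, string input,
                                            int warmupCalls, int measuredCalls)
    {
        // Warm-up calls so that JIT compilation happens before timing starts.
        for (int i = 0; i < warmupCalls; i++)
            escape(input);

        // Time the whole batch instead of each call individually.
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < measuredCalls; i++)
            escape(input);
        sw.Stop();

        // Milliseconds * 1,000,000 = nanoseconds for the batch; divide by the call count.
        return sw.Elapsed.TotalMilliseconds * 1000000.0 / measuredCalls;
    }
}
```

With the methods from the test code, a call could look like EscapeBenchmark.AverageNanoseconds(GetEscapedStringForeach, shortString, 100, 10000). The warm-up and iteration counts are arbitrary choices for illustration.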
Test code:
// requires: using System; using System.Collections.Generic; using System.Diagnostics;
//           using System.IO; using System.Linq; using System.Text;
private static void Main(string[] args)
{
    //around 700 characters
    string shortString = new StackTrace().ToString();
    string longString;
    string pureEscape;
    //loading from a file with 1000000 words http://loremipsum.de/
    using (StreamReader sr = new StreamReader(@"C:\users\ekrueger\desktop\LoremIpsum.txt"))
    {
        longString = sr.ReadToEnd();
    }
    //text file containing only escapable characters (length ~1000000)
    using (StreamReader sr = new StreamReader(@"C:\users\ekrueger\desktop\PureEscape.txt"))
    {
        pureEscape = sr.ReadToEnd();
    }
    List<double> timesCharArrayShortString = new List<double>();
    List<double> timesForeachShortString = new List<double>();
    List<long> timesCharArrayLongString = new List<long>();
    List<long> timesForeachLongString = new List<long>();
    List<long> timesCharArrayPureEscape = new List<long>();
    List<long> timesForeachPureEscape = new List<long>();
    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringByTraversingCharArray(shortString);
        sw.Stop();
        timesCharArrayShortString.Add(sw.Elapsed.TotalMilliseconds * 1000000);
    }
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringForeach(shortString);
        sw.Stop();
        timesForeachShortString.Add(sw.Elapsed.TotalMilliseconds * 1000000);
    }
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringByTraversingCharArray(longString);
        sw.Stop();
        timesCharArrayLongString.Add(sw.ElapsedMilliseconds);
    }
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringForeach(longString);
        sw.Stop();
        timesForeachLongString.Add(sw.ElapsedMilliseconds);
    }
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringByTraversingCharArray(pureEscape);
        sw.Stop();
        timesCharArrayPureEscape.Add(sw.ElapsedMilliseconds);
    }
    for (int i = 0; i < 10; i++)
    {
        sw.Restart();
        GetEscapedStringForeach(pureEscape);
        sw.Stop();
        timesForeachPureEscape.Add(sw.ElapsedMilliseconds);
    }
    Console.WriteLine("Char Array (short string) average time: {0} ns", timesCharArrayShortString.Average());
    Console.WriteLine("Foreach (short string) average time: {0} ns", timesForeachShortString.Average());
    Console.WriteLine("Char Array (long string) average time: {0} ms", timesCharArrayLongString.Average());
    Console.WriteLine("Foreach (long string) average time: {0} ms", timesForeachLongString.Average());
    Console.WriteLine("Char Array (escaping only) average time: {0} ms", timesCharArrayPureEscape.Average());
    Console.WriteLine("Foreach (escaping only) average time: {0} ms", timesForeachPureEscape.Average());
    Console.Read();
}
private static string GetEscapedStringByTraversingCharArray(string s)
{
    if (String.IsNullOrEmpty(s))
        return s;
    char[] chars = s.ToCharArray();
    StringBuilder sb = new StringBuilder(s.Length);
    for (int i = 0; i < chars.Length; i++)
    {
        switch (chars[i])
        {
            case '\t':
                sb.Append(@"\t"); break;
            case '\n':
                sb.Append(@"\n"); break;
            case '\r':
                sb.Append(@"\r"); break;
            // note: '\f' is handled here but not in GetEscapedStringForeach
            case '\f':
                sb.Append(@"\f"); break;
            default:
                sb.Append(chars[i]); break;
        }
    }
    return sb.ToString();
}
public static string GetEscapedStringForeach(string s)
{
    if (String.IsNullOrEmpty(s))
        return s;
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (Char ch in s)
    {
        switch (ch)
        {
            case '\t':
                sb.Append(@"\t"); break;
            case '\n':
                sb.Append(@"\n"); break;
            case '\r':
                sb.Append(@"\r"); break;
            default:
                sb.Append(ch); break;
        }
    }
    return sb.ToString();
}
You do not need to convert the string to a Char[], so the solution can be improved slightly:
public string GetEscapedStringByTraversingCharArray(string s) {
    // do not forget about null...
    if (String.IsNullOrEmpty(s))
        return s;
    // we can be sure that it requires at least s.Length symbols, so let's allocate them;
    // in case we want to trade memory for speed we can even put
    //   StringBuilder sb = new StringBuilder(2 * s.Length);
    // for the worst case when all symbols should be escaped
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (Char ch in s) {
        switch (ch) {
            case '\t':
                sb.Append(@"\t");
                break;
            case '\n':
                sb.Append(@"\n");
                break;
            case '\r':
                sb.Append(@"\r");
                break;
            default:
                sb.Append(ch);
                break;
        }
    }
    return sb.ToString();
}
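If many inputs contain no reserved characters at all, one further variation is to check for that case first and return the original string without allocating anything. The following is a minimal sketch of that idea, not something from the original answer and not benchmarked here. The class name FastPathEscaper, the Reserved array, and the extra capacity of 16 are assumptions made for illustration.

```csharp
using System.Text;

public static class FastPathEscaper
{
    // Assumed example set of reserved characters, matching the question.
    private static readonly char[] Reserved = { '\t', '\n', '\r' };

    public static string Escape(string s)
    {
        if (string.IsNullOrEmpty(s))
            return s;

        // Fast path: nothing to escape, so return the input unchanged.
        int first = s.IndexOfAny(Reserved);
        if (first < 0)
            return s;

        // Copy the clean prefix in one call, then escape the remainder.
        // The extra 16 characters of capacity are an arbitrary guess.
        StringBuilder sb = new StringBuilder(s.Length + 16);
        sb.Append(s, 0, first);
        for (int i = first; i < s.Length; i++)
        {
            char ch = s[i];
            switch (ch)
            {
                case '\t': sb.Append(@"\t"); break;
                case '\n': sb.Append(@"\n"); break;
                case '\r': sb.Append(@"\r"); break;
                default: sb.Append(ch); break;
            }
        }
        return sb.ToString();
    }
}
```

Whether this helps depends on the data: for input that consists mostly of reserved characters, the extra IndexOfAny scan gains nothing.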
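The first option is slower because you create a lot of string objects with:
s = s.Replace(escapeToken.Key, escapeToken.Value);
In the second method there is no need to create the char[], because string also has an indexer. The only thing you can do to improve performance is to initialize the StringBuilder with a capacity so that it does not need to resize. You can still use a Dictionary in the second method.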
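One way to read that last suggestion is to key the dictionary by char and look up each character while appending to a pre-sized StringBuilder. The following is a minimal sketch under that assumption. The class name DictionaryEscaper and the field EscapeTokens are hypothetical, and no claim is made that this is faster than the switch version.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DictionaryEscaper
{
    // Hypothetical lookup table: reserved character -> escaped form.
    private static readonly Dictionary<char, string> EscapeTokens =
        new Dictionary<char, string>
        {
            { '\t', @"\t" },
            { '\n', @"\n" },
            { '\r', @"\r" }
        };

    public static string Escape(string s)
    {
        if (string.IsNullOrEmpty(s))
            return s;

        // Pre-size the builder so short inputs do not trigger a resize.
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char ch in s)
        {
            string escaped;
            if (EscapeTokens.TryGetValue(ch, out escaped))
                sb.Append(escaped);   // reserved character: append its escaped form
            else
                sb.Append(ch);        // ordinary character: copy as-is
        }
        return sb.ToString();
    }
}
```

The dictionary keeps the reserved set configurable in one place, at the cost of a hash lookup per character instead of a compiled switch.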