Continuously searching a string for a separator

Keywords: find, separator, string, search, continuous | Last updated: 2023-09-27 18:17:55

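I'm writing my first program, and I've found that it runs slowly. It converts one text format into another. In doing so, I often need to search for a certain string and then parse the text from that point on. This might look like: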

const string separator = "Parameters Table";
// Get to the first separator
var cuttedWords = backgroundWords.SkipWhile(x => x != separator).Skip(1);
// Run as long as there is anything left to scan
while (cuttedWords.Any())
{
    // Take data from the last found separator until the next one, exclusively
    var variable = cuttedWords.TakeWhile(x => x != separator).ToArray();
    // Step through cuttedWords to update where the last found separator was
    cuttedWords = cuttedWords.Skip(variable.Length + 1);
    // Do what you want with the sub-array containing information

} 

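I'd like to know whether there is a more efficient way to do the same thing (that is, search for a string and process the sub-array of strings between it and the next occurrence of the same string). Thank you.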


More directly (this treats backgroundWords as a single string):

var startIndex = backgroundWords.IndexOf(separator) + separator.Length;
var endIndex = backgroundWords.IndexOf(separator, startIndex);
// Substring takes a start index and a length, not an end index
var cuttedWords = backgroundWords.Substring(startIndex, endIndex - startIndex);

You can keep cutting this way. When you want to move on, you can do this:

// continue right after the separator that was found at endIndex,
// so that the section following it is not skipped
startIndex = endIndex + separator.Length;
endIndex = backgroundWords.IndexOf(separator, startIndex);
cuttedWords = backgroundWords.Substring(startIndex, endIndex - startIndex);

So your modified snippet would look something like this:

const string separator = "Parameters Table";
// Find the first separator (backgroundWords is treated as a single string here)
var startIndex = backgroundWords.IndexOf(separator);
while (startIndex >= 0)
{
    // Move past the separator that was just found
    startIndex += separator.Length;
    var endIndex = backgroundWords.IndexOf(separator, startIndex);
    // Take data from the last found separator until the next one, exclusively;
    // the last section runs to the end of the string
    var cuttedWords = endIndex >= 0
        ? backgroundWords.Substring(startIndex, endIndex - startIndex)
        : backgroundWords.Substring(startIndex);
    // Do what you want with cuttedWords here

    // Continue from the separator that ended this section (-1 ends the loop)
    startIndex = endIndex;
}

The simplest approach is just to split the string.

const string separator = "Parameters Table";
var text = "ThisParameters TableisParameters TableaParameters Tabletest";
var split = text.Split(new[] { separator }, StringSplitOptions.None);
foreach(var x in split)
{
    // do something
}

Is there any reason this wouldn't work?
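
One difference from the loop in the question is worth noting: Split also returns the text that precedes the first separator, whereas the original code skips it. The following is a minimal sketch of how that could be handled, assuming the input is one string. The class and method names (SplitExample, SectionsAfterFirstSeparator) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitExample
{
    // Hypothetical helper: returns the sections that follow the first separator.
    public static List<string> SectionsAfterFirstSeparator(string text, string separator)
    {
        // Split keeps the text before the first separator as element 0,
        // so skip it to mirror the SkipWhile(...).Skip(1) in the question.
        return text
            .Split(new[] { separator }, StringSplitOptions.None)
            .Skip(1)
            .ToList();
    }
}
```

With the sample text above, this returns the three sections "is", "a" and "test". If the text contains no separator at all, the result is empty.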

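Convert it to a list before entering the loop. Otherwise LINQ repeats its internal work on every iteration, because the query is re-evaluated from the source each time it is enumerated.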
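
To see why this matters, here is a small sketch that counts how many elements are pulled from the source. It is only an illustration: Words() is a hypothetical stand-in for the original word sequence, and DeferredDemo and ReadsForTwoPasses are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Number of elements pulled from the source so far.
    public static int Reads;

    // Hypothetical stand-in for backgroundWords; counts every element it yields.
    public static IEnumerable<string> Words()
    {
        foreach (var w in new[] { "x", "Parameters Table", "a", "b" })
        {
            Reads++;
            yield return w;
        }
    }

    // Enumerates the same query twice and reports how often the source was read.
    public static int ReadsForTwoPasses(bool materializeFirst)
    {
        Reads = 0;
        IEnumerable<string> words = Words().SkipWhile(w => w != "Parameters Table").Skip(1);
        if (materializeFirst)
        {
            words = words.ToList();   // runs the query once and caches the result
        }
        _ = words.ToArray();          // first pass
        _ = words.ToArray();          // second pass
        return Reads;
    }
}
```

ReadsForTwoPasses(false) reads the four source elements twice (8 reads), while ReadsForTwoPasses(true) reads them only once (4 reads). In the question's loop the effect is stronger, since every iteration stacks another Skip on top of the previous query.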

You could try something like this:

var lst = cuttedWords.ToList();
while (lst.Count > 0)
{
    // Take data from the last found separator until the next one, exclusively
    var variable = lst.TakeWhile(x => x != separator).ToArray();
    // Step through cuttedWords to update where the last found separator was
    lst = lst.Skip(variable.Length + 1).ToList();
    // Do what you want with the sub-array containing information

}
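
Note that the list version still copies the remaining words on every iteration. Another option, which is not taken from the answers above, is to walk the sequence once and collect the sections along the way. The sketch below assumes backgroundWords is a sequence of strings in which the separator appears as a whole element, as in the question. SequenceSplitter and SplitAfterFirst are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceSplitter
{
    // Yields the elements between consecutive separators, ignoring everything
    // before the first separator. Each input element is visited exactly once.
    public static IEnumerable<List<string>> SplitAfterFirst(IEnumerable<string> words, string separator)
    {
        List<string> current = null;   // stays null until the first separator is seen
        foreach (var word in words)
        {
            if (word == separator)
            {
                // A separator closes the current section (if any) and opens a new one
                if (current != null) yield return current;
                current = new List<string>();
            }
            else
            {
                current?.Add(word);
            }
        }
        // Emit the trailing section only if it has content, like the original loop
        if (current != null && current.Count > 0) yield return current;
    }
}
```

It can be consumed with a plain foreach, for example foreach (var variable in SequenceSplitter.SplitAfterFirst(backgroundWords, separator)), with the per-section work placed in the loop body.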