How do I get the comma-separated values that appear after the term I search for?

Last updated: 2023-09-27 18:04:19

My code is as follows:

public void DeserialStream(string filePath)
{
    using (StreamReader sr = new StreamReader(filePath))
    {
        string currentline;
        while ((currentline = sr.ReadLine()) != null)
        {
            if (currentline.IndexOf("Count", StringComparison.CurrentCultureIgnoreCase) >= 0)
            {
                Console.WriteLine(currentline);
            }
        }
    }
}

I would like to know how to grab the comma-separated values that appear after the term I search for.

For example, suppose I have a CSV file containing this data:

"Date","dd/mm/yyyy"
"ExpirationDate","dd/mm/yyyy"
"DataType","Count"
"Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"
"DataType","Net"
"Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"

How do I grab the table of values that comes after Count but before Net?

In other words, only the data inside the square brackets is what I want to parse:

"Date","dd/mm/yyyy"
    "ExpirationDate","dd/mm/yyyy"
    "DataType","Count"
   [ "Location","Unknown","Variable1","Variable2","Variable3"
    "A(Loc3, Loc4)","Unknown","5656","787","42"
    "A(Loc5, Loc6)","Unknown","25","878","921"]
    "DataType","Net"
    "Location","Unknown","Variable1","Variable2","Variable3"
    "A(Loc3, Loc4)","Unknown","5656","787","42"
    "A(Loc5, Loc6)","Unknown","25","878","921"

I was thinking that maybe I should use a regular expression, or is there a simpler way that builds on the method above?

You can use a regular expression like this:

\"DataType\",\"(?:Count|Net)\"((?!\"DataType\").)*

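This matches everything up to the next DataType line. Because the block spans several lines, the pattern has to be applied to the whole file content with RegexOptions.Singleline so that . also matches newline characters.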
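
To make the regex approach concrete, here is a minimal sketch of how the pattern might be applied in C#. The class name DataTypeBlockDemo and the method ExtractBlock are hypothetical names for illustration. The pattern is slightly adapted from the one above: the tempered part is wrapped in a capturing group so the block body can be read from Groups[1], and the data type is passed in as a parameter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DataTypeBlockDemo
{
    // Hypothetical helper: returns the text between the "DataType","<dataType>"
    // header and the next "DataType" line, or null if the header is not found.
    public static string ExtractBlock(string content, string dataType)
    {
        // (?:(?!"DataType").)* consumes one character at a time, as long as
        // the text at that position does not start with "DataType".
        string pattern = "\"DataType\",\"" + Regex.Escape(dataType)
                       + "\"((?:(?!\"DataType\").)*)";

        // Singleline lets '.' match newlines, so the match can span rows.
        Match m = Regex.Match(content, pattern,
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    public static void Main()
    {
        // In-memory stand-in for File.ReadAllText(path).
        string content =
            "\"DataType\",\"Count\"\n" +
            "\"Location\",\"Unknown\",\"Variable1\"\n" +
            "\"A(Loc3, Loc4)\",\"Unknown\",\"5656\"\n" +
            "\"DataType\",\"Net\"\n" +
            "\"Location\",\"Unknown\",\"Variable1\"\n";

        Console.WriteLine(ExtractBlock(content, "Count"));
    }
}
```

The extracted block is still raw text, so each row would still need to be split into fields afterwards.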

You can use LINQ:

List<string> lines = File.ReadLines(path)
   .SkipWhile(l => l.IndexOf("\"Count\"", StringComparison.InvariantCultureIgnoreCase) == -1)
   .Skip(1) // skip the "Count"-line
   .TakeWhile(l => l.IndexOf("\"Net\"",   StringComparison.InvariantCultureIgnoreCase) == -1)
   .ToList();

Use String.Split to get a string[] for each line. In general, I would use an existing CSV parser that handles edge cases and bad data rather than reinventing the wheel.

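Edit: If you want to split the fields into a List<string>, you should use a CSV parser, because your data already uses a quote character, so a comma wrapped in quotes must not be treated as a separator.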
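
The following sketch illustrates the point on one row from the question. The answer does not name a particular parser; TextFieldParser from the Microsoft.VisualBasic.FileIO namespace is used here only as one example of a parser that ships with .NET, and it assumes the Microsoft.VisualBasic assembly is available to the project. QuotedFieldDemo and ParseLine are hypothetical names.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedFieldDemo
{
    // Hypothetical helper: parses a single CSV line, honoring quoted fields.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    public static void Main()
    {
        string line = "\"A(Loc3, Loc4)\",\"Unknown\",\"5656\",\"787\",\"42\"";

        // Naive split also cuts at the comma inside "A(Loc3, Loc4)".
        Console.WriteLine(line.Split(',').Length);   // 6 pieces

        // The parser keeps the quoted field in one piece.
        string[] fields = ParseLine(line);
        Console.WriteLine(fields.Length);            // 5 fields
        Console.WriteLine(fields[0]);                // A(Loc3, Loc4)
    }
}
```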

However, here is another simple but effective approach that uses a StringBuilder:

public static IEnumerable<string> SplitCSV(string csvString)
{
    var sb = new StringBuilder();
    bool quoted = false;
    foreach (char c in csvString)
    {
        if (quoted)
        {
            if (c == '"')
                quoted = false;
            else
                sb.Append(c);
        }
        else
        {
            if (c == '"')
            {
                quoted = true;
            }
            else if (c == ',')
            {
                yield return sb.ToString();
                sb.Length = 0;
            }
            else
            {
                sb.Append(c);
            }
        }
    }
    if (quoted)
        // ArgumentException takes the message first, then the parameter name.
        throw new ArgumentException("Unterminated quotation mark.", "csvString");
    yield return sb.ToString();
}

(Credit: https://stackoverflow.com/a/4150727/284240)

Now you can use SelectMany in the query above to flatten all the tokens:

List<string> allTokens = File.ReadLines(path)
    .SkipWhile(l => l.IndexOf("\"Count\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .Skip(1) // skip the "Count"-line
    .TakeWhile(l => l.IndexOf("\"Net\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .SelectMany(l => SplitCSV(l.Trim()))
    .ToList();

Result:

Location, Unknown, Variable1, Variable2, Variable3, A(Loc3, Loc4), Unknown, 5656, 787, 42, A(Loc5, Loc6), Unknown, 25, 878, 921, ""
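
For comparison, the same Count-to-Net selection can also be written as a plain loop with a state flag, closer in style to the DeserialStream method in the question. This is only a sketch: SectionReader and LinesBetween are hypothetical names, the comparison uses OrdinalIgnoreCase rather than the culture-sensitive options used above, and the sample array stands in for File.ReadLines(path).

```csharp
using System;
using System.Collections.Generic;

public static class SectionReader
{
    // Yields the lines after the first line containing startMarker
    // and before the next line containing endMarker.
    public static IEnumerable<string> LinesBetween(
        IEnumerable<string> lines, string startMarker, string endMarker)
    {
        bool inSection = false;
        foreach (string line in lines)
        {
            if (!inSection)
            {
                // Still looking for the start marker; the marker line itself is skipped.
                if (line.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase) >= 0)
                    inSection = true;
                continue;
            }

            // Stop at the end marker without returning that line.
            if (line.IndexOf(endMarker, StringComparison.OrdinalIgnoreCase) >= 0)
                yield break;

            yield return line;
        }
    }

    public static void Main()
    {
        var sample = new[]
        {
            "\"DataType\",\"Count\"",
            "\"Location\",\"Unknown\",\"Variable1\"",
            "\"A(Loc3, Loc4)\",\"Unknown\",\"5656\"",
            "\"DataType\",\"Net\"",
            "\"Location\",\"Unknown\",\"Variable1\""
        };

        foreach (string row in LinesBetween(sample, "\"Count\"", "\"Net\""))
            Console.WriteLine(row);
    }
}
```

Because the method returns whole rows, each row can then be passed to a field splitter separately, which keeps the row structure instead of flattening every token into one list.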