How to search for multiple strings and keep a counter for each
Last updated: 2023-09-27 18:04:13
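What I'm trying to do is this: I have hundreds of log files that I need to search and run some counts on. The basic idea is to take a .txt file, read each line, and if search term 1 is found, increment the counter for search term 1; if search term 2 is found, increment the counter for search term 2, and so on. For example, if the file contains something like: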
a b c
d e f
g h i
j k h
and I specify the searchables as e and h, the output should be:
e : 1
h : 2
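The number of search terms is variable: the user might give 1 search term or 10, so I'm not sure how to implement n counters based on the number of searchables.
Below is what I have so far. It's just a basic approach to see what works and what doesn't. At the moment it only records the count for one search term. For now I'm writing the results to the console for testing; eventually they will be written to a .txt or .xlsx file. Any help would be much appreciated!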
    string line;
    int Scounter = 0;
    int Mcounter = 0;
    List<string> searchables = new List<string>();

    private void search_Log(string p)
    {
        searchables.Add("S");
        searchables.Add("M");
        StreamReader reader = new StreamReader(p);
        while ((line = reader.ReadLine()) != null)
        {
            for (int i = 0; i < searchables.Count(); i++)
            {
                if (line.Contains(searchables[i]))
                {
                    Scounter++;
                }
            }
        }
        reader.Close();
        Console.WriteLine("# of S: " + Scounter);
        Console.WriteLine("# of M: " + Mcounter);
    }
A common approach is to use a Dictionary<string, int> to keep track of the values and their counts:

    // Initialise the dictionary:
    Dictionary<string, int> counters = new Dictionary<string, int>();

Then:

    if (line.Contains(searchables[i]))
    {
        if (counters.ContainsKey(searchables[i]))
        {
            counters[searchables[i]]++;
        }
        else
        {
            counters.Add(searchables[i], 1);
        }
    }

Then, when you have finished processing:

    // Add in any searches which had no results:
    foreach (var searchTerm in searchables)
    {
        if (counters.ContainsKey(searchTerm) == false)
        {
            counters.Add(searchTerm, 0);
        }
    }

    foreach (var item in counters)
    {
        Console.WriteLine("Value {0} occurred {1} times", item.Key, item.Value);
    }
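Putting the pieces of this answer together, here is a minimal sketch, assuming each term should be counted once per line that contains it. The names TermCounter and CountMatchingLines are made up for illustration and are not from the answer above. Seeding the dictionary with zeros up front is a small variation that makes the ContainsKey checks unnecessary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermCounter
{
    // Hypothetical helper: for each search term, counts the lines that contain it.
    public static Dictionary<string, int> CountMatchingLines(
        IEnumerable<string> lines, IEnumerable<string> searchables)
    {
        // Drop duplicate terms so no term is incremented twice for the same line.
        List<string> terms = searchables.Distinct().ToList();

        // Seed every term with 0 so terms with no hits still appear in the result.
        var counters = new Dictionary<string, int>();
        foreach (var term in terms)
        {
            counters[term] = 0;
        }

        foreach (var line in lines)
        {
            foreach (var term in terms)
            {
                if (line.Contains(term))
                {
                    counters[term]++;
                }
            }
        }

        return counters;
    }
}
```

With the sample from the question (lines "a b c", "d e f", "g h i", "j k h" and terms e and h) this gives e: 1 and h: 2. For a real log file, File.ReadLines(path) could be passed as the lines argument.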
You could use a class for the searchable items, such as:

    public class Searchable
    {
        public string searchTerm;
        public int count;
    }

Then:

    while ((line = reader.ReadLine()) != null)
    {
        foreach (var searchable in searchables)
        {
            if (line.Contains(searchable.searchTerm))
            {
                searchable.count++;
            }
        }
    }

This is just one of many ways to keep track of multiple search terms and their counts.
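The answer doesn't show how the list of Searchable objects gets built from whatever terms the user supplies. One way it might look is sketched below. The SearchableCounter class and its FromTerms and CountLines methods are hypothetical names introduced for this example, and the Searchable class is repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Searchable
{
    public string searchTerm;
    public int count;
}

public static class SearchableCounter
{
    // Hypothetical helper: wraps each user-supplied term in a Searchable starting at 0.
    public static List<Searchable> FromTerms(IEnumerable<string> terms)
    {
        return terms
            .Distinct()
            .Select(t => new Searchable { searchTerm = t, count = 0 })
            .ToList();
    }

    // Hypothetical helper: increments a Searchable once for every line containing its term.
    public static void CountLines(IEnumerable<string> lines, List<Searchable> searchables)
    {
        foreach (var line in lines)
        {
            foreach (var searchable in searchables)
            {
                if (line.Contains(searchable.searchTerm))
                {
                    searchable.count++;
                }
            }
        }
    }
}
```

Because each counter lives on its own object, the number of terms can be 1 or 10 without any change to the loop.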
You can use LINQ here:

    string lines = reader.ReadToEnd();
    var result = lines.Split(new string[] { " ", "\r\n" }, StringSplitOptions.RemoveEmptyEntries)
        .GroupBy(x => x)
        .Select(g => new
        {
            Alphabet = g.Key,
            Count = g.Count()
        });

Input:
a b c
d e f
Output:
a: 1
b: 1
c: 1
d: 1
e: 1
f: 1
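The query above counts every token in the file rather than only the requested search terms. A possible variation that reports just the searchables is sketched here. TokenCounter and CountTokens are illustrative names, and the sketch assumes terms should match whole whitespace-separated tokens (unlike Contains, which also matches inside longer words).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenCounter
{
    // Hypothetical helper: splits the text into tokens and counts exact matches per term.
    public static Dictionary<string, int> CountTokens(string text, IEnumerable<string> searchables)
    {
        // Split on spaces and on both Windows and Unix line endings.
        var separators = new[] { " ", "\r\n", "\n" };
        string[] tokens = text.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        // One dictionary entry per distinct term; terms that never occur map to 0.
        return searchables
            .Distinct()
            .ToDictionary(term => term, term => tokens.Count(token => token == term));
    }
}
```

For the question's sample text and the terms e and h, the resulting dictionary holds e: 1 and h: 2.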
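This version counts 1..n search terms that each occur 1..n times per line of the file. It takes into account the possibility that a term appears more than once on a line.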
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;

    namespace ConsoleApplication5
    {
        class Program
        {
            static void Main(string[] args)
            {
                Func<string, string[], Dictionary<string, int>> searchForCounts = null;
                searchForCounts = (filePathAndName, searchTerms) =>
                {
                    Dictionary<string, int> results = new Dictionary<string, int>();

                    // Nothing to count if the file is missing.
                    if (string.IsNullOrEmpty(filePathAndName) || !File.Exists(filePathAndName))
                        return results;

                    using (TextReader tr = File.OpenText(filePathAndName))
                    {
                        string line = null;
                        while ((line = tr.ReadLine()) != null)
                        {
                            for (int i = 0; i < searchTerms.Length; ++i)
                            {
                                // Lower-cased term is the dictionary key; matching ignores case.
                                var searchTerm = searchTerms[i].ToLower();

                                // An empty term would match at every position and never advance.
                                if (searchTerm.Length == 0)
                                    continue;

                                var index = 0;
                                while (index > -1)
                                {
                                    index = line.IndexOf(searchTerm, index, StringComparison.OrdinalIgnoreCase);
                                    if (index > -1)
                                    {
                                        if (results.ContainsKey(searchTerm))
                                            results[searchTerm] += 1;
                                        else
                                            results[searchTerm] = 1;

                                        // Continue after this match (non-overlapping occurrences).
                                        index += searchTerm.Length;
                                    }
                                }
                            }
                        }
                    }
                    return results;
                };

                var counts = searchForCounts("D:\\Projects\\ConsoleApplication5\\ConsoleApplication5\\TestLog.txt", new string[] { "one", "two" });
                Console.WriteLine("----Counts----");
                foreach (var keyPair in counts)
                {
                    Console.WriteLine("Term: " + keyPair.Key.PadRight(10, ' ') + " Count: " + keyPair.Value.ToString());
                }
                Console.ReadKey(true);
            }
        }
    }
Input:
OnE, TwO
Output:
----Counts----
Term: one        Count: 7
Term: two        Count: 15
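Since the question mentions hundreds of log files, the same per-line counting could be run across a whole folder and totalled. The sketch below is one way that might look, not part of the answer above. LogFolderCounter and CountInFolder are hypothetical names, and it assumes the logs are *.txt files sitting directly in one folder and that matches should be case-insensitive and non-overlapping.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LogFolderCounter
{
    // Hypothetical helper: totals occurrences of each term over every *.txt file in a folder.
    public static Dictionary<string, int> CountInFolder(string folder, string[] searchTerms)
    {
        // Case-insensitive keys, seeded with 0 so unmatched terms are still reported.
        var totals = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var term in searchTerms)
        {
            if (!string.IsNullOrEmpty(term))
            {
                totals[term] = 0;
            }
        }

        // Copy the keys so the dictionary values can be updated while looping.
        var terms = new List<string>(totals.Keys);

        foreach (var file in Directory.EnumerateFiles(folder, "*.txt"))
        {
            // File.ReadLines streams the file one line at a time.
            foreach (var line in File.ReadLines(file))
            {
                foreach (var term in terms)
                {
                    var index = line.IndexOf(term, StringComparison.OrdinalIgnoreCase);
                    while (index > -1)
                    {
                        totals[term]++;
                        // Resume searching just past the match that was found.
                        index = line.IndexOf(term, index + term.Length, StringComparison.OrdinalIgnoreCase);
                    }
                }
            }
        }

        return totals;
    }
}
```

The returned dictionary can be printed with the same foreach loop as above, or written out with File.WriteAllLines once the target format is decided.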