Given a collection of strings, count the number of times each word appears in a List
Keywords: List, collection, string, count, word | Updated: 2023-09-27 18:10:30
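Input 1: a List<string>, for example:
"hello", "world", "stack", "overflow".
Input 2: a List<Foo> (two properties, string a and string b), for example:
Foo 1: a: "Hello!" b: string.Empty
Foo 2: a: "I love Stack Overflow" b: "It's the best site ever!"
So I want to end up with a Dictionary<string,int>: each word and the number of times it appears in the List<Foo>, whether in the a or the b field.
My current first-pass, off-the-top-of-my-head code, which is far too slow: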
var occurrences = new Dictionary<string, int>();
foreach (var word in uniqueWords /* input1 */)
{
    // Each Count call scans every Foo, so the loop costs (number of words) x (number of Foos).
    var aOccurrences = foos.Count(x => !string.IsNullOrEmpty(x.a) && x.a.Contains(word));
    var bOccurrences = foos.Count(x => !string.IsNullOrEmpty(x.b) && x.b.Contains(word));
    occurrences.Add(word, aOccurrences + bOccurrences);
}
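Roughly:
- Build a dictionary (occurrences) from the first input, optionally using a case-insensitive comparer.
- For each Foo in the second input, use RegEx to split a and b into words.
- For each word, check whether the key exists in occurrences. If it does, increment the value and update it in the dictionary.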
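The steps above can be sketched roughly as follows. This is only a minimal sketch, not the answerer's actual code: the shape of Foo (string members a and b, as in the question), the names WordCounter and CountOccurrences, and the word pattern [A-Za-z']+ are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Assumed shape of Foo, based on the question: two string members, a and b.
public class Foo
{
    public string a { get; set; }
    public string b { get; set; }
}

// Hypothetical helper; the name is illustrative only.
public static class WordCounter
{
    // Assumed definition of a "word": a run of ASCII letters and apostrophes.
    private static readonly Regex WordPattern = new Regex("[A-Za-z']+");

    public static Dictionary<string, int> CountOccurrences(
        IEnumerable<string> uniqueWords, IEnumerable<Foo> foos)
    {
        // Step 1: seed the dictionary from the first input, case-insensitively.
        var occurrences = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var word in uniqueWords)
        {
            occurrences[word] = 0;
        }

        // Step 2: split a and b of every Foo into words.
        foreach (var foo in foos)
        {
            foreach (var text in new[] { foo.a, foo.b })
            {
                if (string.IsNullOrEmpty(text)) continue;

                foreach (Match match in WordPattern.Matches(text))
                {
                    // Step 3: increment only words that are already keys.
                    if (occurrences.TryGetValue(match.Value, out var count))
                    {
                        occurrences[match.Value] = count + 1;
                    }
                }
            }
        }

        return occurrences;
    }
}
```

Each Foo is scanned once here, so the work grows with the total length of the text and not with (number of words) x (number of Foos). Note that this counts whole-word matches, whereas the Contains-based code in the question counts the Foos whose field contains the word as a substring, so the two can give different numbers.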
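You could try concatenating the two strings, a + b, and then running a regular expression to extract all the words into a collection. Finally, use a group by query to index them.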
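For example:
// Requires: using System.Linq; using System.Text.RegularExpressions;
void Main()
{
    var a = "Hello there!";
    var b = "It's the best site ever!";
    var ab = a + " " + b;
    // Note: "[A-Za-z]+" splits "It's" into "It" and "s".
    var matches = Regex.Matches(ab, "[A-Za-z]+");
    var occurrences = from x in matches.OfType<System.Text.RegularExpressions.Match>()
                      let word = x.Value.ToLowerInvariant()
                      group word by word into g
                      select new { Word = g.Key, Count = g.Count() };
    var result = occurrences.ToDictionary(x => x.Word, x => x.Count);
    // Print each pair; Console.WriteLine(result) would only print the type name.
    foreach (var pair in result)
    {
        Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}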
Edit: here is the example with some of the suggested changes. Having re-read the requirements... they are a bit odd, but hey...
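// Assumes Foo exposes settable string members a and b, as in the question.
void Main()
{
    var counts = GetCount(new[] {
        new Foo { a = "Hello there!", b = "It's the best site ever!" }
    });
    // Print each pair; Console.WriteLine(counts) would only print the type name.
    foreach (var pair in counts)
    {
        Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
public IDictionary<string, int> GetCount(IEnumerable<Foo> inputs)
{
    // Extract the words from both fields; a null field is treated as empty.
    var allWords = from input in inputs
                   let matchesA = Regex.Matches(input.a ?? string.Empty, "[A-Za-z']+").OfType<System.Text.RegularExpressions.Match>()
                   let matchesB = Regex.Matches(input.b ?? string.Empty, "[A-Za-z']+").OfType<System.Text.RegularExpressions.Match>()
                   from x in matchesA.Concat(matchesB)
                   select x.Value;
    // Group case-insensitively and count each group.
    var occurrences = allWords.GroupBy(x => x, (x, y) => new { Key = x, Count = y.Count() }, StringComparer.OrdinalIgnoreCase);
    var result = occurrences.ToDictionary(x => x.Key, x => x.Count, StringComparer.OrdinalIgnoreCase);
    return result;
}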