Check whether a string contains any of a list of substrings, and save the matching substrings
Updated: 2023-09-27 17:49:27
Here is my situation: I have a string holding some text
string myText = "Text to analyze for words, bar, foo";
and a list of words to search for
List<string> words = new List<string> {"foo", "bar", "xyz"};
I would like to know the most efficient way, if there is one, to get the list of words that the text contains, something like this:
List<string> matches = myText.findWords(words);
Apart from having to use the Contains method, this query needs no special handling, so you can try this:
string myText = "Text to analyze for words, bar, foo";
List<string> words = new List<string> { "foo", "bar", "xyz" };
// Requires: using System.Linq;
var result = words.Where(w => myText.Contains(w)).ToList();
// result: foo, bar (in the order of the words list)
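Contains does a case-sensitive substring match, so "Foo" in the text would not match "foo" in the list. A minimal sketch of a case-insensitive variant follows, assuming ordinal comparison is acceptable for your data. The names SubstringSearch and FindSubstrings are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringSearch
{
    // Returns the words that occur anywhere in text, ignoring case.
    // The result keeps the order of the words sequence.
    public static List<string> FindSubstrings(string text, IEnumerable<string> words)
    {
        return words
            .Where(w => text.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

For example, SubstringSearch.FindSubstrings("Text with Foo and bar", new[] { "foo", "bar", "xyz" }) returns foo and bar. This is still substring matching, so "bar" would also be found inside "barn".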
You can use a HashSet<string> and intersect the two sets:
string myText = "Text to analyze for words, bar, foo";
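// Split on spaces and commas, dropping the empty entries left by ", ".
string[] splitWords = myText.Split(new[] { ' ', ',' }, StringSplitOptions.RemoveEmptyEntries);
HashSet<string> hashWords = new HashSet<string>(splitWords, StringComparer.OrdinalIgnoreCase);
HashSet<string> words = new HashSet<string>(new[] { "foo", "bar" }, StringComparer.OrdinalIgnoreCase);
hashWords.IntersectWith(words);
// hashWords now contains only the words present in both sets (bar and foo).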
A regex solution:
var words = new string[]{"Lucy", "play", "soccer"};
var text = "Lucy loves going to the field and play soccer with her friend";
var match = new Regex(String.Join("|",words)).Match(text);
var result = new List<string>();
while (match.Success) {
result.Add(match.Value);
match = match.NextMatch();
}
//Result ["Lucy", "play", "soccer"]
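The pattern above joins the words unescaped and matches them as substrings, so a word containing regex metacharacters such as "." is not matched literally, and "play" would also match inside "players". A possible refinement is sketched below, assuming the search words start and end with word characters (so that \b applies) and that the list is not empty. The names WholeWordSearch and FindWholeWords are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordSearch
{
    // Returns the distinct search words that occur in text as whole words,
    // in the order they first appear in the text.
    // Assumes words is non-empty.
    public static List<string> FindWholeWords(string text, IEnumerable<string> words)
    {
        // Escape each word so it is matched literally, then require word boundaries.
        string alternatives = string.Join("|", words.Select(w => Regex.Escape(w)));
        string pattern = @"\b(?:" + alternatives + @")\b";

        return Regex.Matches(text, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .Distinct()
            .ToList();
    }
}
```

With this sketch, WholeWordSearch.FindWholeWords("Lucy likes players", new[] { "Lucy", "play" }) returns only Lucy, because "play" appears solely as part of "players".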
Playing off the idea that you want to be able to call myText.findWords(words), you can write an extension method on the string class that does what you want.
public static class StringExtensions
{
public static List<string> findWords(this string str, List<string> words)
{
// Keep every word that occurs as a substring of str (requires using System.Linq).
return words.Where(w => str.Contains(w)).ToList();
}
}
Usage:
string myText = "Text to analyze for words, bar, foo";
List<string> words = new List<string> { "foo", "bar", "xyz" };
List<string> matches = myText.findWords(words);
Console.WriteLine(String.Join(", ", matches.ToArray()));
Console.ReadLine();
Result:
foo, bar
Here is a simple solution that accounts for whitespace and punctuation:
static void Main(string[] args)
{
string sentence = "Text to analyze for words, bar, foo";
// Split on runs of non-word characters (spaces, commas, and so on).
var words = Regex.Split(sentence, @"\W+");
var searchWords = new List<string> { "foo", "bar", "xyz" };
var foundWords = words.Intersect(searchWords);
foreach (var item in foundWords)
{
Console.WriteLine(item);
}
Console.ReadLine();
}