How to match duplicate words in a paragraph using C# or jQuery

Last updated: 2023-09-27 17:59:38

I want to find the words that two strings have in common.

// C# code
string str1, str2;

I want to check for five words among the first fifteen words of the string str1.

str1 = "Musician Ben Sollee on the Ravages of Coal and the Wonders of the Bicycle";
str2 = "There is Wonders of Musician Ben Sollee on the Ravages of Coal";

I want to skip common function words such as "of", "on", and "the" when comparing the strings above, and check only the remaining words.

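Given the strings above, I want to compare str2 with str1 and, if they share five duplicate words, show a message saying that it contains some duplicate words.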

How can I do this comparison and check whether there are duplicates? I would be happy with an answer in either jQuery or C#.


Bonus points for a one-line LINQ query?

string[] str1Words = ...
string[] str2Words = ...
string[] dontCheck = {"of", "a", "the"};
var greaterThanFive = str1Words.Join(str2Words, s1 => s1, s2 => s2, (r1, r2) => r1)
                               .Distinct()
                               .Where(s => !dontCheck.Contains(s))
                               .Count() > 5;

Get the words from each string:

string[] str1Words = str1.Split(' ');
string[] str2Words = str2.Split(' ');

Specify the words you don't want to check:

string[] dontCheck = {"of", "a", "the"}; // etc..

See how many duplicates there are:

// Keep each word of str1 that also occurs in str2 and is not in the ignore list.
string[] duplicates = Array.FindAll(
    str1Words, str1Word =>
        Array.Exists(str2Words, str2Word => string.Equals(str1Word, str2Word))
        && !Array.Exists(dontCheck, dontCheckWord => string.Equals(dontCheckWord, str1Word))
);
if (duplicates.Length > 5)
{
    // Give message
}
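Since the question also accepts a jQuery answer, the same three steps can be sketched in plain JavaScript (jQuery itself is not needed for string handling). This is only a minimal sketch: the function name `countSharedWords` is hypothetical, each shared word is counted once, and the `>= 5` threshold is an assumption taken from the question's "five duplicate words" wording rather than the `> 5` used in the C# above.

```javascript
// Hypothetical helper: counts the distinct words that occur in both strings,
// skipping every word in the ignore list.
function countSharedWords(str1, str2, ignored) {
  const skip = new Set(ignored);
  // Split on runs of whitespace and drop empty entries.
  const words1 = str1.split(/\s+/).filter(Boolean);
  const words2 = new Set(str2.split(/\s+/).filter(Boolean));
  const shared = new Set();
  for (const word of words1) {
    if (words2.has(word) && !skip.has(word)) {
      shared.add(word);
    }
  }
  return shared.size;
}

const str1 = "Musician Ben Sollee on the Ravages of Coal and the Wonders of the Bicycle";
const str2 = "There is Wonders of Musician Ben Sollee on the Ravages of Coal";
const count = countSharedWords(str1, str2, ["of", "on", "the"]);
if (count >= 5) { // assumed threshold: "five duplicate words"
  console.log("Contains some duplicate words");
}
```

Using a `Set` for the second string and for the ignore list keeps each lookup cheap, and collecting matches in a `Set` means a word repeated in str1 is not counted twice.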

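You can split both strings into lists of words by splitting on a space (" "). Then simply iterate over the words of the first list and check whether the second list contains each one.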

You should also maintain a list or file containing the words to ignore.

List<string> str1List = new List<string>(str1.Split(' '));
List<string> str2List = new List<string>(str2.Split(' '));
foreach (string word in str1List)
{
    if (str2List.Contains(word))
    {
        // do something
    }
}

// Requires System.Linq and System.Collections.Generic.
string str1 = "Musician Ben Sollee on the Ravages of Coal and the Wonders of the Bicycle";
string str2 = "There is Wonders of Musician Ben Sollee on the Ravages of Coal";
string[] DontCheck = new string[] { "is", "of", "the" };
List<string> List1 = new List<string>(str1.Split(' '));
List<string> List2 = new List<string>(str2.Split(' '));

// True when more than five words of str1 (ignored words excluded) also occur in str2.
var Result = (from s in List1
              where List2.Contains(s) && !DontCheck.Contains(s)
              select s).Count() > 5;
if (Result)
{
    // It contains some duplicate words
}

Updated code that counts only distinct duplicates:

// Requires System.Linq and System.Collections.Generic.
string str1 = "Official Facebook iPad App Coming Soon to the App Store Official Facebook iPad";
string str2 = "App Coming Soon to Official Facebook iPad the Store App Official App App App App";
string[] DontCheck = new string[] { "is", "of", "the", "to" };
// A HashSet keeps each word once, so repeated words are not counted twice.
HashSet<string> Set = new HashSet<string>(str1.Split(' '));
HashSet<string> Set2 = new HashSet<string>(str2.Split(' '));
var Result = (from s in Set
              where Set2.Contains(s) && !DontCheck.Contains(s)
              select s).Count() > 5;
if (Result)
{
    // It contains more than 5 duplicate words
}
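None of the snippets above restricts the check to the first fifteen words of str1, which the question asks for. A rough JavaScript sketch of that variant follows. The name `hasSharedWords`, its `limit` and `threshold` parameters, and the case-insensitive comparison are all assumptions made for illustration, not part of the original answers.

```javascript
// Hypothetical helper: true when at least `threshold` distinct words among
// the first `limit` words of str1 also occur anywhere in str2.
// Comparison is case-insensitive (an assumption, not stated in the question).
function hasSharedWords(str1, str2, ignored, limit = 15, threshold = 5) {
  const normalize = (s) => s.toLowerCase().split(/\s+/).filter(Boolean);
  const skip = new Set(ignored.map((w) => w.toLowerCase()));
  const head = normalize(str1).slice(0, limit); // first `limit` words only
  const words2 = new Set(normalize(str2));
  const shared = new Set(head.filter((w) => words2.has(w) && !skip.has(w)));
  return shared.size >= threshold;
}

const title1 = "Musician Ben Sollee on the Ravages of Coal and the Wonders of the Bicycle";
const title2 = "There is Wonders of Musician Ben Sollee on the Ravages of Coal";
if (hasSharedWords(title1, title2, ["of", "on", "the"])) {
  console.log("Contains some duplicate words");
}
```

The word limit is applied before the ignored words are filtered out, so ignored words still count toward the fifteen. If they should not, filter first and slice afterwards.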