Find how many words two strings have in common

Keywords: words, how many, strings, two | Updated: 2023-09-27 18:01:16

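I have this function in which I want to compare two strings and return how many words they have in common, but the code below does not work. I always seem to get 0 for SameWordCount and 1 for MasterAddressWordCount.

Any ideas?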

// some more string cleaning
        mastermkAddressKey = mastermkAddressKey.Replace(",", " ").Replace(".", " ").Trim();
        mastermkAddressKey = Encoding.ASCII.GetString(Encoding.GetEncoding("Cyrillic").GetBytes(mastermkAddressKey));
        mastermkAddressKey = mastermkAddressKey.Replace("  ", " |").Replace("| ", "").Replace("|", "");
        mastermkAddressKey = QbaseStrings.RemoveDuplicateWords(mastermkAddressKey);
        duplicatemkAddressKey = duplicatemkAddressKey.Replace(",", " ").Replace(".", " ").Trim();
        duplicatemkAddressKey = Encoding.ASCII.GetString(Encoding.GetEncoding("Cyrillic").GetBytes(duplicatemkAddressKey));
        duplicatemkAddressKey = duplicatemkAddressKey.Replace("  ", " |").Replace("| ", "").Replace("|", "");
        duplicatemkAddressKey = QbaseStrings.RemoveDuplicateWords(duplicatemkAddressKey);
        string[] masterAddressSeparateWords = mastermkAddressKey.Split(new char[' '], StringSplitOptions.RemoveEmptyEntries);
        string[] duplicateAddressSeparateWords = duplicatemkAddressKey.Split(new char[' '], StringSplitOptions.RemoveEmptyEntries);
        int SameWordCount = 0;
        int MasterAddressWordCount = 0;
        foreach (string masterWord in masterAddressSeparateWords)
        {
            foreach (string duplicateWord in duplicateAddressSeparateWords)
            {
                if (masterWord == duplicateWord) { SameWordCount++; }
            }
            MasterAddressWordCount++;
        }
        int WordDifference = MasterAddressWordCount - SameWordCount;
        if (WordDifference == 0) { return "sure"; }
        if (WordDifference > 0 && WordDifference < 3) { return SameWordCount.ToString() + " " + MasterAddressWordCount.ToString(); }
        if (WordDifference > 2 && WordDifference < 5) { return "possible"; }


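Your problem is caused by new char[' ']; what you meant was new char[] {' '}. The compiler implicitly converts the char ' ' to an int here, so the expression is read as an array-size expression, char[int]. This means that: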

new char[' ']

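is actually the same as:

new char[32]

This ends up as a uselessly large char[] of 32 elements, each of them '\0', rather than the single space you were after. Because '\0' never appears in your addresses, Split finds no separator and returns each string as a single element, which is why MasterAddressWordCount is always 1 and SameWordCount stays 0 unless the two strings are identical.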
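The effect is easy to reproduce in isolation. The following is a minimal sketch, not code from the question: the class name SeparatorDemo, its two helper methods, and the sample address are made up for illustration. It splits the same string once with the accidental 32-element array and once with the intended one-element array.

```csharp
using System;

public static class SeparatorDemo
{
    // Hypothetical helper: splits with the accidental separator array from the question.
    public static string[] SplitWithBuggySeparator(string text)
    {
        // ' ' is implicitly converted to 32, so this is an array of 32 '\0' characters.
        char[] separators = new char[' '];
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Hypothetical helper: splits with the intended one-element separator array.
    public static string[] SplitWithFixedSeparator(string text)
    {
        char[] separators = new char[] { ' ' };
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        Console.WriteLine(new char[' '].Length);                             // 32
        Console.WriteLine(SplitWithBuggySeparator("12 High Street").Length); // 1
        Console.WriteLine(SplitWithFixedSeparator("12 High Street").Length); // 3
    }
}
```

With the buggy separator, the whole address comes back as one "word", which matches the counts reported in the question.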


You can see this clearly by looking at the IL generated for:

var a = new char[' '];

which is:

IL_0001:  ldc.i4.s    20
IL_0003:  newarr      System.Char
IL_0008:  stloc.0     // a

20 is the hexadecimal representation of 32.

I have solved this by changing the following lines:

string[] masterAddressSeparateWords = mastermkAddressKey.Split(new char[' '], StringSplitOptions.RemoveEmptyEntries);
string[] duplicateAddressSeparateWords = duplicatemkAddressKey.Split(new char[' '], StringSplitOptions.RemoveEmptyEntries);

to:

string[] masterAddressSeparateWords = mastermkAddressKey.Split(' ');
string[] duplicateAddressSeparateWords = duplicatemkAddressKey.Split(' ');
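One thing to keep in mind with Split(' '): without StringSplitOptions.RemoveEmptyEntries, consecutive spaces produce empty strings in the result. As a possible alternative, the sketch below keeps that option and replaces the nested loops with a set intersection. It is only an illustration under stated assumptions: the class WordOverlap and the method CountSharedWords are hypothetical names, the comparison is ordinal and case-sensitive (as with == in the question), and the cleaning steps from the question are assumed to have run beforehand.

```csharp
using System;
using System.Collections.Generic;

public static class WordOverlap
{
    // Hypothetical helper: counts the distinct words that occur in both strings.
    public static int CountSharedWords(string first, string second)
    {
        char[] separators = { ' ' }; // one-element array holding a single space

        // A HashSet drops duplicate words; StringComparer.Ordinal matches the == comparison.
        var sharedWords = new HashSet<string>(
            first.Split(separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.Ordinal);

        // Keep only the words that also appear in the second string.
        sharedWords.IntersectWith(
            second.Split(separators, StringSplitOptions.RemoveEmptyEntries));

        return sharedWords.Count;
    }

    public static void Main()
    {
        // "12", "High" and "London" occur in both addresses.
        Console.WriteLine(CountSharedWords("12 High Street London", "12  High Road London")); // 3
    }
}
```

Because the set removes repeated words, each shared word is counted once even if it appears several times in either string.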