String == operator: how did Microsoft write it?

Keywords: operator, Microsoft, String | Updated: 2023-09-27 18:16:29

I would like to know how Microsoft implemented its string comparison algorithms:

String.Equals and String.Compare

Do they compare character by character, like this:

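// Assumes str1 and str2 are non-null and have the same length.
int matched = 0; // start at 0 so that matched == str1.Length means every character matched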
for (int i = 0; i < str1.Length; i++)
{
    if (str1[i] == str2[i])
    {
        matched++;
    }
    else
    {
        break;
    }
}
if (matched == str1.Length) return true;

Or do they match everything at once:

if (str1[0] == str2[0] && str1[1] == str2[1] && str1[2] == str2[2]) return true;

I tried pressing F12 on the String.Equals function, but that only took me to the function declaration, not the actual code. Thanks.

After Thilo mentioned looking at the source code, I was able to find this. This is what Microsoft wrote:

public static bool Equals(String a, String b) {
    if ((Object)a==(Object)b) {
        return true;
    }
    if ((Object)a==null || (Object)b==null) {
        return false;
    }
    if (a.Length != b.Length)
        return false;
    return EqualsHelper(a, b);
}

But this raises a question: is it faster to check character by character, or to match everything at once?


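Looking at the source code (copied below):

  1. Null check
  2. Reference identity
  3. Different lengths => not equal
  4. Compares the binary encoding of the characters in an unrolled loop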
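The steps listed above can be sketched in safe C# as follows. This is only an illustration, not the framework's code: `StringEqualsSketch` and `SketchEquals` are hypothetical names, the loop is not unrolled, and the reference check runs before the null check (as in the static `Equals` quoted in the question) so that two null arguments compare equal.

```csharp
using System;

public static class StringEqualsSketch
{
    // Hypothetical helper mirroring the steps; not the actual framework implementation.
    public static bool SketchEquals(string a, string b)
    {
        // Reference identity: the same object (or two nulls) is trivially equal.
        if (ReferenceEquals(a, b)) return true;

        // Null check: exactly one side is null at this point, so they differ.
        if (a == null || b == null) return false;

        // Different lengths => not equal, without looking at any character.
        if (a.Length != b.Length) return false;

        // Ordinal comparison of the UTF-16 code units, stopping at the first mismatch.
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(SketchEquals("abc", "abc")); // True
        Console.WriteLine(SketchEquals("abc", "abd")); // False
        Console.WriteLine(SketchEquals(null, "abc"));  // False
    }
}
```

The cheap checks come first so that many calls return before any character is read.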

"This raises a question: is it faster to check character by character, or to match everything at once?"

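I don't understand the question. You cannot do a "complete match" without checking every character. All you can do is exit as soon as you find a mismatch. That shaves a little off the running time, but it does not change the fact that the comparison is O(n).
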
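To make the early-exit point concrete, here is a minimal sketch that counts how many character pairs are inspected before the answer is known. `EarlyExitDemo` and `CountComparisons` are hypothetical names introduced for illustration, and the helper assumes both strings are non-null and of equal length.

```csharp
using System;

public static class EarlyExitDemo
{
    // Hypothetical helper: returns how many character pairs were inspected
    // before the equality of two equal-length strings was decided.
    public static int CountComparisons(string a, string b)
    {
        int inspected = 0;
        for (int i = 0; i < a.Length; i++)
        {
            inspected++;
            if (a[i] != b[i]) break; // the first mismatch settles the answer
        }
        return inspected;
    }

    public static void Main()
    {
        Console.WriteLine(CountComparisons("hello world", "jello world")); // 1
        Console.WriteLine(CountComparisons("hello world", "hello worle")); // 11
        Console.WriteLine(CountComparisons("hello world", "hello world")); // 11
    }
}
```

A mismatch at the start costs one comparison, while a mismatch at the end, or no mismatch at all, costs all n of them, which is why the worst case stays linear.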
        // Determines whether two strings match.
        [Pure]
        [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
        public bool Equals(String value) {
            if (this == null)                        //this is necessary to guard against reverse-pinvokes and
                throw new NullReferenceException();  //other callers who do not use the callvirt instruction
            if (value == null)
                return false;
            if (Object.ReferenceEquals(this, value))
                return true;
            if (this.Length != value.Length)
                return false;
            return EqualsHelper(this, value);
        }
        [System.Security.SecuritySafeCritical]  // auto-generated
        [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
        private unsafe static bool EqualsHelper(String strA, String strB)
        {
            Contract.Requires(strA != null);
            Contract.Requires(strB != null);
            Contract.Requires(strA.Length == strB.Length);
            int length = strA.Length;
            fixed (char* ap = &strA.m_firstChar) fixed (char* bp = &strB.m_firstChar)
            {
                char* a = ap;
                char* b = bp;
                // unroll the loop
#if AMD64
                // for AMD64 bit platform we unroll by 12 and
                // check 3 qword at a time. This is less code
                // than the 32 bit case and is shorter
                // pathlength
                while (length >= 12)
                {
                    if (*(long*)a     != *(long*)b) return false;
                    if (*(long*)(a+4) != *(long*)(b+4)) return false;
                    if (*(long*)(a+8) != *(long*)(b+8)) return false;
                    a += 12; b += 12; length -= 12;
                }
#else
                while (length >= 10)
                {
                    if (*(int*)a != *(int*)b) return false;
                    if (*(int*)(a+2) != *(int*)(b+2)) return false;
                    if (*(int*)(a+4) != *(int*)(b+4)) return false;
                    if (*(int*)(a+6) != *(int*)(b+6)) return false;
                    if (*(int*)(a+8) != *(int*)(b+8)) return false;
                    a += 10; b += 10; length -= 10;
                }
#endif
                // This depends on the fact that the String objects are
                // always zero terminated and that the terminating zero is not included
                // in the length. For odd string sizes, the last compare will include
                // the zero terminator.
                while (length > 0) 
                {
                    if (*(int*)a != *(int*)b) break;
                    a += 2; b += 2; length -= 2;
                }
                return (length <= 0);
            }
        }
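
The idea behind the AMD64 branch above, comparing four 16-bit chars per 64-bit test, can be approximated without unsafe code. The sketch below is an illustration under stated assumptions, not the framework's implementation: it assumes a runtime that provides `Span` support (.NET Core 2.1 or later, or the System.Memory package), `WideCompareSketch` and `WideEquals` are hypothetical names, and the leftover characters are compared one at a time instead of relying on the zero terminator as the original does.

```csharp
using System;
using System.Runtime.InteropServices;

public static class WideCompareSketch
{
    // Hypothetical helper: ordinal equality that compares 4 chars (64 bits) per test.
    public static bool WideEquals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        if (x.Length != y.Length) return false;

        ReadOnlySpan<char> a = x.AsSpan();
        ReadOnlySpan<char> b = y.AsSpan();

        // Reinterpret the UTF-16 data as 64-bit words; Cast drops any remainder chars.
        ReadOnlySpan<ulong> wa = MemoryMarshal.Cast<char, ulong>(a);
        ReadOnlySpan<ulong> wb = MemoryMarshal.Cast<char, ulong>(b);
        for (int i = 0; i < wa.Length; i++)
        {
            if (wa[i] != wb[i]) return false; // one test covers 4 chars
        }

        // Compare the 0 to 3 chars that did not fill a whole 64-bit word.
        for (int i = wa.Length * 4; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(WideEquals("abcdefghij", "abcdefghij")); // True
        Console.WriteLine(WideEquals("abcdefghij", "abcdefghiX")); // False
    }
}
```

Wider tests reduce the number of loop iterations and branches, but every character is still read, so the comparison remains O(n).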