Is it possible to speed up byte parsing?

Keywords: speed, bytes | Updated: 2023-09-27 17:57:43

We are doing some performance optimization in our project, and with the profiler I came across the following method:

private int CalculateAdcValues(byte lowIndex)
{
    byte middleIndex = (byte)(lowIndex + 1);
    byte highIndex = (byte)(lowIndex + 2);
    // samples is a byte[]
    return (int)((int)(samples[highIndex] << 24)
        + (int)(samples[middleIndex] << 16) + (int)(samples[lowIndex] << 8));
}

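This method is already very fast, at about 1 µs per execution, but it is called about 100,000 times per second, so it takes about 10% of the CPU.

Does anyone have an idea how to improve this method further?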

Edit:

Current solution:

fixed (byte* p = samples)
{
    for (; loopIndex < 61; loopIndex += 3)
    {
        adcValues[k++] = *((int*)(p + loopIndex)) << 8;
    }
}

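This takes less than 40% of the time (the "whole method" previously took about 35 µs per call and now takes about 13 µs). The for loop now actually takes more time than the calculation...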

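I strongly suspect that after casting to byte, your indexes are being converted back to int anyway for use in the array indexing operation. That will be cheap, but may not be entirely free. So get rid of the casts, unless you were using the conversion to byte to effectively keep the index within the range 0..255. At that point you can get rid of the separate local variables, too.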

Additionally, your casts to int are no-ops, because the shift operations are only defined on int and larger types.

Finally, using | may be faster than +:

private int CalculateAdcValues(byte lowIndex)
{
    return (samples[lowIndex + 2] << 24) |
           (samples[lowIndex + 1] << 16) |
           (samples[lowIndex] << 8);
}

(Why is there nothing in the bottom 8 bits? Is that deliberate? Note that the result will be negative if samples[lowIndex + 2] has its top bit set. Is that okay?)
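To make the points above concrete, here is a minimal sketch of what the shift-and-or version produces. The class name AdcSketch, the method name Pack, and the sample bytes are made up for illustration. The final shift assumes the samples are 24-bit two's complement values, which the question does not state.

```csharp
using System;

static class AdcSketch
{
    // Same bit layout as the method above: three bytes placed in bits 8..31.
    public static int Pack(byte[] samples, int lowIndex)
    {
        return (samples[lowIndex + 2] << 24) |
               (samples[lowIndex + 1] << 16) |
               (samples[lowIndex] << 8);
    }

    static void Main()
    {
        // Hypothetical sample data, not taken from the question.
        byte[] samples = { 0x01, 0x02, 0x03, 0xFF, 0xFF, 0xFF };

        int positive = Pack(samples, 0);   // 0x03020100
        int negative = Pack(samples, 3);   // 0xFFFFFF00, top bit set

        Console.WriteLine(positive);       // 50462976
        Console.WriteLine(negative);       // -256
        // Assumption: 24-bit two's complement samples. An arithmetic
        // right shift then recovers the signed 24-bit value.
        Console.WriteLine(negative >> 8);  // -1

        // Casting an index to byte wraps around at 256 (unchecked context).
        byte lowIndex = 255;
        Console.WriteLine((byte)(lowIndex + 1)); // 0
    }
}
```

If the wrap-around at 256 is not wanted, plain int indexes avoid it. If the sign is not wanted, the result would have to be treated as uint instead.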

Seeing that you have a friendly endianness, go unsafe:

unsafe int CalculateAdcValuesFast1(int lowIndex)
{
  fixed (byte* p = &samples[lowIndex])
  {
    return *(int*)p << 8;
  }
}

This is about 30% faster on x86, which is not as much of a gain as I had hoped for. On x64 it is about 40% faster.
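The pointer read relies on little-endian byte order, and it touches four bytes rather than three. The following sketch checks both points without unsafe code by using BitConverter.ToInt32 as a stand-in for the pointer dereference. The class and method names and the sample bytes are made up for illustration, and this is not a claim about how the original code was benchmarked.

```csharp
using System;

static class LittleEndianSketch
{
    // Shift-and-or version: reads samples[lowIndex .. lowIndex + 2].
    public static int PackShifts(byte[] samples, int lowIndex)
    {
        return (samples[lowIndex + 2] << 24) |
               (samples[lowIndex + 1] << 16) |
               (samples[lowIndex] << 8);
    }

    // Safe counterpart of *(int*)p << 8: reads four bytes in machine byte
    // order, then the shift pushes the fourth byte out of the top.
    public static int PackWordRead(byte[] samples, int lowIndex)
    {
        return BitConverter.ToInt32(samples, lowIndex) << 8;
    }

    static void Main()
    {
        if (!BitConverter.IsLittleEndian)
        {
            Console.WriteLine("Big-endian machine: the word read does not apply.");
            return;
        }

        // Hypothetical sample data.
        byte[] samples = { 0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70 };

        // The word read needs four bytes, so the last usable start index is
        // Length - 4, one less than for the shift-and-or version.
        for (int i = 0; i <= samples.Length - 4; i++)
        {
            Console.WriteLine(PackShifts(samples, i) == PackWordRead(samples, i)); // True
        }
    }
}
```

With the unsafe version there is no bounds check, so a start index of Length - 3 would read one byte past the end of the array. BitConverter.ToInt32 throws in that case instead.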

As suggested by @CodeInChaos:

  var bounds = samples.Length - 3;
  fixed (byte* p = samples)
  {
    for (int i = 0; i < 1000000000; i++)
    {
      var r = CalculateAdcValuesFast2(p, i % bounds); // about 2x faster
      // or inlined (use instead of the line above):
      // var r = *((int*)(p + i % bounds)) << 8; // about 3x faster
      // do something
    }
  }

unsafe int CalculateAdcValuesFast2(byte* p1, int p2)
{
  return *((int*)(p1 + p2)) << 8;
}

The following might be a little faster. I have removed the casts to int.

        var middleIndex = (byte)(lowIndex + 1);
        var highIndex = (byte)(lowIndex + 2);
        return (this.samples[highIndex] << 24) + (this.samples[middleIndex] << 16) + (this.samples[lowIndex] << 8);