Does using the Parallel class improve performance when copying scanlines from bitmap to bitmap?
Keywords: bitmap, Parallel, performance, scanline, copy | Updated: 2023-09-27 18:10:41
I need to copy scanlines from one byte buffer to another.
To do this, I use RtlMoveMemory:
[DllImport("Kernel32.dll", EntryPoint = "RtlMoveMemory", SetLastError = false)]
private static unsafe extern void MoveMemory(void* dest, void* src, int size);
The normal way would be something like:
for (int y = 0; y < height; y++)
{
    // Copy only the visible bytes of the line, then step each pointer by its own stride.
    MoveMemory(dst_ptr, src_ptr, line_width);
    src_ptr += src_stride;
    dst_ptr += dst_stride;
}
My question is: would using the Parallel class be faster?
Parallel.For(0, height, (y) =>
{
    byte* src_ptr = src_base + y * src_stride;
    byte* dst_ptr = dst_base + y * dst_stride;
    MoveMemory(dst_ptr, src_ptr, line_width);
});
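Or would it hurt performance?
Actually, my tests show that copying the lines in parallel is faster. For example, I created an array representing a 1024 x 768 bitmap, and the parallel version was 45% faster.
With longer scanlines the parallel version is much faster. Below about 1 kilobyte per scanline, the single-threaded version is faster.
Tested on .NET 4.5 with Visual Studio 2013, in 64-bit mode. Compiled in Release mode and run without the debugger attached.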
// Needs: using System; using System.Diagnostics; using System.Linq;
//        using System.Runtime.InteropServices; using System.Threading.Tasks;
private const int NumLines = 1024;
private const int LineLength = 768*3;
private const int ArraySize = NumLines*LineLength;

[DllImport("Kernel32.dll", EntryPoint = "RtlMoveMemory", SetLastError = false)]
private static unsafe extern void MoveMemory(void* dest, void* src, int size);
unsafe public void Test()
{
    // initialize a big array to test the copy
    var source = Enumerable.Range(0, ArraySize).Select(x => (byte)x).ToArray();
    var dest = new byte[ArraySize];
    fixed (byte* pSource = source, pDest = dest)
    {
        // test single threaded
        // do it once for the JIT
        Console.WriteLine("Testing single threaded...");
        MoveSingleThread(pSource, pDest, NumLines, LineLength);
        // Okay, time it.
        var sw = Stopwatch.StartNew();
        MoveSingleThread(pSource, pDest, NumLines, LineLength);
        sw.Stop();
        Console.WriteLine("Single threaded: {0:N0} ticks", sw.ElapsedTicks);
        var singleTicks = sw.ElapsedTicks;

        Console.WriteLine("Testing parallel");
        // do it once for JIT
        MoveParallel(pSource, pDest, NumLines, LineLength);
        sw = Stopwatch.StartNew();
        MoveParallel(pSource, pDest, NumLines, LineLength);
        sw.Stop();
        Console.WriteLine("Parallel: {0:N0} ticks", sw.ElapsedTicks);

        // diff is negative when the parallel run is faster;
        // pct is the parallel time as a fraction of the single-threaded time.
        var diff = sw.ElapsedTicks - singleTicks;
        var pct = (double) sw.ElapsedTicks/singleTicks;
        Console.WriteLine("Difference: {0:N0} ticks {1:P2}", diff, pct);
    }
}

private unsafe void MoveSingleThread(byte* source, byte* dest, int nLines, int lineLength)
{
    var srcPtr = source;
    var dstPtr = dest;
    for (int y = 0; y < nLines; ++y)
    {
        MoveMemory(dstPtr, srcPtr, lineLength);
        srcPtr += lineLength;
        dstPtr += lineLength;
    }
}

unsafe void MoveParallel(byte* source, byte* dest, int nLines, int lineLength)
{
    Parallel.For(0, nLines, (y) =>
    {
        byte* srcPtr = source + y * lineLength;
        byte* dstPtr = dest + y * lineLength;
        MoveMemory(dstPtr, srcPtr, lineLength);
    });
}
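I have done this before, and in my experience Parallel.For is slower when it is used to copy each scanline individually. Instead, try Parallel.For over sections of the image (tens or hundreds of scanlines per thread). The "magic" number of scanlines varies with the machine, memory, and CPU, so you will need to tune it to get the best transfer rate.
Of course, as suggested above, you should try each approach and see how it performs in your particular case.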
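The idea of running Parallel.For over sections of the image rather than over single scanlines could look roughly like the sketch below. It is only an illustration under a few assumptions: the class name ChunkedScanlineCopy and the linesPerChunk parameter are hypothetical, and it uses Buffer.BlockCopy on managed arrays in place of RtlMoveMemory so that it runs without unsafe code or P/Invoke. The same loop structure should carry over to the pointer-based MoveMemory version.

```csharp
using System;
using System.Threading.Tasks;

public static class ChunkedScanlineCopy
{
    // Copies `height` scanlines of `lineWidth` bytes each from src to dst.
    // Each parallel iteration handles a block of `linesPerChunk` scanlines.
    // linesPerChunk is a hypothetical tuning knob, not a recommended value.
    public static void Copy(byte[] src, int srcStride,
                            byte[] dst, int dstStride,
                            int lineWidth, int height, int linesPerChunk)
    {
        if (linesPerChunk <= 0)
            throw new ArgumentOutOfRangeException("linesPerChunk");

        // Ceiling division, so a final partial chunk is still processed.
        int chunkCount = (height + linesPerChunk - 1) / linesPerChunk;

        Parallel.For(0, chunkCount, chunk =>
        {
            int firstLine = chunk * linesPerChunk;
            int endLine = Math.Min(firstLine + linesPerChunk, height); // exclusive

            // Inside a chunk the copy is an ordinary sequential loop.
            for (int y = firstLine; y < endLine; y++)
            {
                Buffer.BlockCopy(src, y * srcStride, dst, y * dstStride, lineWidth);
            }
        });
    }
}
```

A call such as ChunkedScanlineCopy.Copy(src, srcStride, dst, dstStride, lineWidth, height, 64) gives each iteration 64 scanlines. Timing a few different chunk sizes on the target machine is the only reliable way to pick one.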