Measuring the CPU cycles of a function call

Keywords: cycles, CPU, function call, measurement | Updated: 2023-09-27 18:28:07

I'm looking for a way to measure the CPU cycles that a function call consumes on a thread.

Pseudocode example:

void HostFunction()
{
    var startTick = CurrentThread.CurrentTick;  // does not exist
    ChildFunction();
    var endTick = CurrentThread.CurrentTick;  // does not exist
    var childFunctionCost = endTick - startTick;
}
void ChildFunction() 
{
    //Do whatever...
    Thread.Sleep(3000);
    //Do some more...
}

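I don't want to use Stopwatch or any other wall-clock measurement, because it would include any time the thread spends sleeping, and I don't want to measure that. I only want to measure real work.

The measurement has to happen at runtime, as in my pseudocode, because the result is used to decide whether the child function should be allowed to keep running (my real scenario is a plugin-style architecture), so a profiling tool won't help me.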

You can P/Invoke QueryThreadCycleTime(). See the link for details.

Some sample code:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
class Program {
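    static void Main(string[] args) {
        ulong start, end;
        start = NativeMethods.GetThreadCycles();
        // Sleeping takes wall-clock time but consumes almost no cycles on this thread.
        System.Threading.Thread.Sleep(1000);
        end = NativeMethods.GetThreadCycles();
        ulong cycles = end - start;
        // The threshold is arbitrary; it only demonstrates that the sleep is not counted.
        Debug.Assert(cycles < 200000);
    }
    static class NativeMethods {
        // Returns the CPU cycles consumed so far by the calling thread.
        public static ulong GetThreadCycles() {
            ulong cycles;
            if (!QueryThreadCycleTime(PseudoHandle, out cycles))
                throw new System.ComponentModel.Win32Exception();
            return cycles;
        }
        [DllImport("kernel32.dll", SetLastError = true)]
        private static extern bool QueryThreadCycleTime(IntPtr hThread, out ulong cycles);
        // (HANDLE)-2 is the pseudo handle that GetCurrentThread() returns for the calling thread.
        private static readonly IntPtr PseudoHandle = (IntPtr)(-2);
    }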
}
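
Since the question needs the measurement at runtime around an arbitrary child function, the same P/Invoke can be wrapped in a small helper. This is only a sketch: `CycleMeter` and `Measure` are names made up for illustration, and it assumes the work runs entirely on the calling thread. Cycles spent on any other thread the delegate starts are not counted, and the first call of a delegate also includes its JIT cost.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper built on the QueryThreadCycleTime P/Invoke shown above (Windows only).
static class CycleMeter
{
    // Pseudo handle that refers to the calling thread.
    private static readonly IntPtr CurrentThreadPseudoHandle = (IntPtr)(-2);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool QueryThreadCycleTime(IntPtr hThread, out ulong cycles);

    private static ulong GetThreadCycles()
    {
        ulong cycles;
        if (!QueryThreadCycleTime(CurrentThreadPseudoHandle, out cycles))
            throw new Win32Exception();
        return cycles;
    }

    // Runs work on the calling thread and returns the cycles that thread consumed meanwhile.
    public static ulong Measure(Action work)
    {
        ulong start = GetThreadCycles();
        work();
        return GetThreadCycles() - start;
    }
}
```

For example, `CycleMeter.Measure(() => ChildFunction())` returns the cycle cost of the call, which the host can compare against whatever budget it allows a plugin. A call that mostly sleeps should report far fewer cycles than one that spins in a compute loop.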

Following up on the comment I posted above, consider the following code:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Threading;
namespace FunctionTiming
{
    class Program
    {
        private static Thread _thread;
        private static IntPtr _threadHandle;
        static void Main(string[] args)
        {
            _thread = new Thread(new ThreadStart(Program.TargetFunction));
            _thread.Start();
            _thread.Join();
            System.Runtime.InteropServices.ComTypes.FILETIME start, end, rawKernelTime, rawUserTime;
            bool result = GetThreadTimes(_threadHandle, out start, out end, out rawKernelTime, out rawUserTime);
            Debug.Assert(result);
            // The FILETIME fields are signed ints; cast through uint so a low half
            // with its top bit set is not sign-extended when widened to ulong.
            ulong uLow = (uint)rawKernelTime.dwLowDateTime;
            ulong uHigh = (uint)rawKernelTime.dwHighDateTime;
            uHigh = uHigh << 32;
            long kernelTime = (long)(uHigh | uLow);
            uLow = (uint)rawUserTime.dwLowDateTime;
            uHigh = (uint)rawUserTime.dwHighDateTime;
            uHigh = uHigh << 32;
            long userTime = (long)(uHigh | uLow);
            Debug.WriteLine("Kernel time: " + kernelTime);
            Debug.WriteLine("User time: " + userTime);
            Debug.WriteLine("Combined raw execution time: " + (kernelTime + userTime));
            // FILETIME values are in 100-nanosecond units, so 10,000 units make one millisecond.
            long functionTime = (kernelTime + userTime) / 10000;
            Debug.WriteLine("Function time: " + functionTime + " milliseconds");
        }
        static void TargetFunction()
        {
            IntPtr processHandle = GetCurrentProcess();
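            // GetCurrentThread() returns a pseudo handle that always means "the calling thread",
            // so duplicate it into a real handle that Main can pass to GetThreadTimes.
            bool result = DuplicateHandle(processHandle, GetCurrentThread(), processHandle, out _threadHandle, 0, false, (uint)DuplicateOptions.DUPLICATE_SAME_ACCESS);
            Debug.Assert(result);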
            double value = 9876543.0d;
            for (int i = 0; i < 100000; ++i)
                value = Math.Cos(value);
            Thread.Sleep(3000);
            value = 9876543.0d;
            for (int i = 0; i < 100000; ++i)
                value = Math.Cos(value);
        }
        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool GetThreadTimes(IntPtr hThread,
            out System.Runtime.InteropServices.ComTypes.FILETIME lpCreationTime, out System.Runtime.InteropServices.ComTypes.FILETIME lpExitTime,
            out System.Runtime.InteropServices.ComTypes.FILETIME lpKernelTime, out System.Runtime.InteropServices.ComTypes.FILETIME lpUserTime);
        [DllImport("kernel32.dll")]
        private static extern IntPtr GetCurrentThread();
        [DllImport("kernel32.dll", SetLastError = true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        static extern bool DuplicateHandle(IntPtr hSourceProcessHandle,
            IntPtr hSourceHandle, IntPtr hTargetProcessHandle, out IntPtr lpTargetHandle,
            uint dwDesiredAccess, [MarshalAs(UnmanagedType.Bool)] bool bInheritHandle, uint dwOptions);
        [Flags]
        public enum DuplicateOptions : uint
        {
            DUPLICATE_CLOSE_SOURCE = (0x00000001),// Closes the source handle. This occurs regardless of any error status returned.
            DUPLICATE_SAME_ACCESS = (0x00000002), //Ignores the dwDesiredAccess parameter. The duplicate handle has the same access as the source handle.
        }
        [DllImport("kernel32.dll")]
        static extern IntPtr GetCurrentProcess();
    }
}

It produces the following results (on my older machine):

Kernel time: 0
User time: 156250
Combined raw execution time: 156250
Function time: 15 milliseconds

As you can clearly see, the 3 seconds of sleep are not included. Hope this helps.
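
The FILETIME arithmetic in the GetThreadTimes answer can also be factored out. FILETIME and TimeSpan both count in 100-nanosecond units, so the combined value can be handed straight to TimeSpan.FromTicks instead of dividing by 10000. The sketch below is an illustration, not part of the original answer; `FileTimeHelper` and its method names are invented here.

```csharp
using System;
using System.Runtime.InteropServices.ComTypes;

// Hypothetical helper for turning GetThreadTimes output into a TimeSpan.
static class FileTimeHelper
{
    // Combines the two 32-bit halves of a FILETIME into a count of 100-ns units.
    public static long ToTicks(FILETIME ft)
    {
        // Cast through uint so a negative int half is not sign-extended.
        ulong high = (uint)ft.dwHighDateTime;
        ulong low = (uint)ft.dwLowDateTime;
        return (long)((high << 32) | low);
    }

    // Kernel time plus user time, expressed as a TimeSpan.
    public static TimeSpan ToCpuTime(FILETIME kernel, FILETIME user)
    {
        return TimeSpan.FromTicks(ToTicks(kernel) + ToTicks(user));
    }
}
```

With the values from the run above (kernel time 0, user time 156250), `FileTimeHelper.ToCpuTime(kernel, user).TotalMilliseconds` is 15.625, which the integer division in the answer truncates to 15.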