快速读取 C 结构(当它包含字符数组时)

本文关键字:字符 包含 数组 读取 结构 | 更新日期: 2023-09-27 17:56:44

>我有以下 C 结构

struct MyStruct {
    char chArray[96];
    __int64 offset;
    unsigned count;
}

我现在有一堆用 C 语言创建的文件,其中包含数千个这样的结构。我需要使用 C# 读取它们,速度是一个问题。

我在 C# 中完成了以下操作

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Size = 108)]
public struct PreIndexStruct {
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 96)]
    public string Key;
    public long Offset;
    public int Count;
}

然后我使用

using (BinaryReader br = new BinaryReader(
       new FileStream(pathToFile, FileMode.Open, FileAccess.Read, 
                      FileShare.Read, bufferSize))) 
{
    long length = br.BaseStream.Length;
    long position = 0;
    byte[] buff = new byte[structSize];
    GCHandle buffHandle = GCHandle.Alloc(buff, GCHandleType.Pinned);
    while (position < length) {
        br.Read(buff, 0, structSize);
        PreIndexStruct pis = (PreIndexStruct)Marshal.PtrToStructure(
            buffHandle.AddrOfPinnedObject(), typeof(PreIndexStruct));
        structures.Add(pis);
        position += structSize;
    }
    buffHandle.Free();
}

这工作得很好,我可以从文件中很好地检索数据。

我读到,如果不使用 GCHandle.Alloc/Marshal.PtrToStructure,我可以使用 C++/CLI 或 C# 不安全代码,我可以加快速度。我找到了一些例子,但它们只引用没有固定大小数组的结构。

我的

问题是,对于我的具体情况,是否有一种更快的方法来使用 C++/CLI 或 C# 不安全代码来做事?

编辑

其他性能信息(我使用了 ANTS 性能探查器 7.4):

我 66% 的 CPU 时间用于调用 Marshal.PtrToStructure。

关于 I/O,105 毫秒中只有 6 毫秒用于读取文件。

快速读取 C 结构(当它包含字符数组时)

在这种情况下,您不需要显式使用 P/Invoke,因为您不必在托管代码和本机代码之间来回传递结构。所以你可以这样做。它将避免这种无用的 GC 句柄分配,并仅分配所需的内容。

public struct PreIndexStruct {
    public string Key;
    public long Offset;
    public int Count;
}
while (...) {
    ...
    PreIndexStruct pis = new PreIndexStruct();
    pis.Key = Encoding.Default.GetString(reader.ReadBytes(96));
    pis.Offset = reader.ReadInt64();
    pis.Count = reader.ReadInt32();
    structures.Add(pis);
}

我不确定你能比这快得多。

可能更准确地说,你想使用非托管代码,这就是我要做的:

  1. 创建一个 C++/CLI 项目,并在其中移植并运行现有的 C# 代码
  2. 确定瓶颈所在(使用探查器)
  3. 用直接C++重写该部分代码,从 C++/CLI 代码调用它并确保它有效,再次分析它
  4. 用"#pragma 非托管"将新代码括起来
  5. 再次分析它

您可能会获得一定程度的速度提高,但这可能不是您所期望的。

可以

非常繁琐地读取某些结构数组,但是由于此技术需要 blitable 类型,因此唯一的方法是为 Key 创建一个固定的字节缓冲区,而不是使用字符串。

如果你这样做,你必须使用不安全的代码,所以它可能不值得。

然而,只是为了好奇,这就是你如何对这些结构进行超级快速读写,代价是必须允许不安全的代码和大量的小提琴:

using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.IO;
using System.Runtime.InteropServices;

namespace Demo
{
    public static class Program
    {
        [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Size = 108)]
        public struct PreIndexStruct
        {
            public unsafe fixed byte Key[96];
            public long Offset;
            public int Count;
        }
        private static void Main(string[] args)
        {
            PreIndexStruct[] a = new PreIndexStruct[100];
            for (int i = 0; i < a.Length; ++i)
            {
                a[i].Count = i;
                unsafe
                {
                    fixed (byte* key = a[i].Key)
                    {
                        for (int j = 0; j < 10; ++j)
                        {
                            key[j] = (byte)i;
                        }
                    }
                }
            }
            using (var output = File.Create(@"C:'TEST'TEST.BIN"))
            {
                FastWrite(output, a, 0, a.Length);
            }
            using (var input = File.OpenRead(@"C:'TEST'TEST.BIN"))
            {
                var b = FastRead<PreIndexStruct>(input, a.Length);
                for (int i = 0; i < b.Length; ++i)
                {
                    Console.Write("Count = " + b[i].Count + ", Key =");
                    unsafe
                    {
                        fixed (byte* key = b[i].Key)
                        {
                            // Here you would access the bytes in Key[], which would presumably be ANSI chars.
                            for (int j = 0; j < 10; ++j)
                            {
                                Console.Write(" " + key[j]);
                            }
                        }
                    }
                    Console.WriteLine();
                }
            }
        }
        /// <summary>
        /// Writes a part of an array to a file stream as quickly as possible,
        /// without making any additional copies of the data.
        /// </summary>
        /// <typeparam name="T">The type of the array elements.</typeparam>
        /// <param name="fs">The file stream to which to write.</param>
        /// <param name="array">The array containing the data to write.</param>
        /// <param name="offset">The offset of the start of the data in the array to write.</param>
        /// <param name="count">The number of array elements to write.</param>
        /// <exception cref="IOException">Thrown on error. See inner exception for <see cref="Win32Exception"/></exception>
        [SuppressMessage("Microsoft.Reliability", "CA2001:AvoidCallingProblematicMethods", MessageId="System.Runtime.InteropServices.SafeHandle.DangerousGetHandle")]
        public static void FastWrite<T>(FileStream fs, T[] array, int offset, int count) where T: struct
        {
            int sizeOfT = Marshal.SizeOf(typeof(T));
            GCHandle gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
            try
            {
                uint bytesWritten;
                uint bytesToWrite = (uint)(count * sizeOfT);
                if
                (
                    !WriteFile
                    (
                        fs.SafeFileHandle.DangerousGetHandle(),
                        new IntPtr(gcHandle.AddrOfPinnedObject().ToInt64() + (offset*sizeOfT)),
                        bytesToWrite,
                        out bytesWritten,
                        IntPtr.Zero
                    )
                )
                {
                    throw new IOException("Unable to write file.", new Win32Exception(Marshal.GetLastWin32Error()));
                }
                Debug.Assert(bytesWritten == bytesToWrite);
            }
            finally
            {
                gcHandle.Free();
            }
        }
        /// <summary>
        /// Reads array data from a file stream as quickly as possible,
        /// without making any additional copies of the data.
        /// </summary>
        /// <typeparam name="T">The type of the array elements.</typeparam>
        /// <param name="fs">The file stream from which to read.</param>
        /// <param name="count">The number of elements to read.</param>
        /// <returns>
        /// The array of elements that was read. This may be less than the number that was
        /// requested if the end of the file was reached. It may even be empty.
        /// NOTE: There may still be data left in the file, even if not all the requested
        /// elements were returned - this happens if the number of bytes remaining in the
        /// file is less than the size of the array elements.
        /// </returns>
        /// <exception cref="IOException">Thrown on error. See inner exception for <see cref="Win32Exception"/></exception>
        [SuppressMessage("Microsoft.Reliability", "CA2001:AvoidCallingProblematicMethods", MessageId="System.Runtime.InteropServices.SafeHandle.DangerousGetHandle")]
        public static T[] FastRead<T>(FileStream fs, int count) where T: struct
        {
            int sizeOfT = Marshal.SizeOf(typeof(T));
            long bytesRemaining  = fs.Length - fs.Position;
            long wantedBytes     = count * sizeOfT;
            long bytesAvailable  = Math.Min(bytesRemaining, wantedBytes);
            long availableValues = bytesAvailable / sizeOfT;
            long bytesToRead     = (availableValues * sizeOfT);
            if ((bytesRemaining < wantedBytes) && ((bytesRemaining - bytesToRead) > 0))
            {
                Debug.WriteLine("Requested data exceeds available data and partial data remains in the file.", "Dmr.Common.IO.Arrays.FastRead(fs,count)");
            }
            T[] result = new T[availableValues];
            GCHandle gcHandle = GCHandle.Alloc(result, GCHandleType.Pinned);
            try
            {
                uint bytesRead = 0;
                if
                (
                    !ReadFile
                    (
                        fs.SafeFileHandle.DangerousGetHandle(),
                        gcHandle.AddrOfPinnedObject(),
                        (uint)bytesToRead,
                        out bytesRead,
                        IntPtr.Zero
                    )
                )
                {
                    throw new IOException("Unable to read file.", new Win32Exception(Marshal.GetLastWin32Error()));
                }
                Debug.Assert(bytesRead == bytesToRead);
            }
            finally
            {
                gcHandle.Free();
            }
            return result;
        }

        /// <summary>See the Windows API documentation for details.</summary>
        [SuppressMessage("Microsoft.Interoperability", "CA1415:DeclarePInvokesCorrectly")]
        [DllImport("kernel32.dll", SetLastError=true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        private static extern bool ReadFile
        (
            IntPtr hFile,
            IntPtr lpBuffer,
            uint nNumberOfBytesToRead,
            out uint lpNumberOfBytesRead,
            IntPtr lpOverlapped
        );
        /// <summary>See the Windows API documentation for details.</summary>
        [SuppressMessage("Microsoft.Interoperability", "CA1415:DeclarePInvokesCorrectly")]
        [DllImport("kernel32.dll", SetLastError=true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        private static extern bool WriteFile
        (
            IntPtr hFile,
            IntPtr lpBuffer,
            uint nNumberOfBytesToWrite,
            out uint lpNumberOfBytesWritten,
            IntPtr lpOverlapped
        );
    }
}