Initializing a large jagged array takes more than 1 GB of RAM and crashes with a StackOverflowException

Last updated: 2023-09-27 18:07:38

When I compile this C# code (shown in full) and run ArrayTest.exe, the process hangs for a few seconds, consumes 1 GB of memory, and then crashes with a StackOverflowException. Why?

public struct Point { }
public class ArrayTest {
    public static void Main(string[] args) {
        Point[][] array = {
            new Point[]{new Point(), new Point(), /* ... 296 omitted ... */ new Point(), new Point()},
            new Point[]{new Point(), new Point(), /* ... 296 omitted ... */ new Point(), new Point()},
            /* ... 296 omitted ... */
            new Point[]{new Point(), new Point(), /* ... 296 omitted ... */ new Point(), new Point()},
            new Point[]{new Point(), new Point(), /* ... 296 omitted ... */ new Point(), new Point()},
        };
        /* Do nothing and return */
    }
}

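I am using Microsoft (R) Visual C# Compiler version 4.0.30319.33440 for Microsoft (R) .NET Framework 4.5. I simply invoke csc.exe on the command line and run the compiled EXE. When I add the /optimize flag to csc, the problem disappears. The snippet above really is the entire code I am testing: Main() does no useful work after the array initialization.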


Background: I am trying to hard-code a set of numerical test cases into a program. In Java, JavaScript, or Python, the code would look roughly like this and work fine:

class Point { int x; int y; Point(int x, int y) { this.x = x; this.y = y; } }
Point[][] data = {  // About 1000 entries
    {new Point(1, 2)},
    {new Point(5, 3), new Point(0, 6), new Point(1, 8)},  // Different length
    // ... et cetera ...
};
for (Point[] thing : data)
    test(thing);
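But when I try code like this in C#, the array initialization alone takes a noticeable amount of time (~5 seconds), before the for loop that calls test() even starts executing.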

I have reduced my actual code to the MVCE above, in which struct Point has no fields and Main() contains only the array initialization and does no useful work.


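OK, I started by compiling Debug and Release builds of your class file. With the VS 2015 compiler (version 14.0 of the tools), the IL output is essentially the same for both. That explains why some people did not notice a problem.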

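With the previous compiler, the one used in VS 2013, Debug versus Release is a very different story. The executable produced in Debug mode is 2091 KB. The IL of the Release build shows that the compiler simply ignores the actual objects, since they are never used. OK, fine. So I will compare the VS 2015 Debug IL with the VS 2013 Debug IL.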

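For brevity, I shrank the array to roughly 3x3. Note that the listings below actually create four rows of three elements each.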

Here is the IL output from the 2015 compiler:

  .method public hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       45 (0x2d)
    .maxstack  4
    .locals init (valuetype Point[][] V_0)
    IL_0000:  nop
    IL_0001:  ldc.i4.4
    IL_0002:  newarr     valuetype Point[]
    IL_0007:  dup
    IL_0008:  ldc.i4.0
    IL_0009:  ldc.i4.3
    IL_000a:  newarr     Point
    IL_000f:  stelem.ref
    IL_0010:  dup
    IL_0011:  ldc.i4.1
    IL_0012:  ldc.i4.3
    IL_0013:  newarr     Point
    IL_0018:  stelem.ref
    IL_0019:  dup
    IL_001a:  ldc.i4.2
    IL_001b:  ldc.i4.3
    IL_001c:  newarr     Point
    IL_0021:  stelem.ref
    IL_0022:  dup
    IL_0023:  ldc.i4.3
    IL_0024:  ldc.i4.3
    IL_0025:  newarr     Point
    IL_002a:  stelem.ref
    IL_002b:  stloc.0
    IL_002c:  ret
  } // end of method ArrayTest::Main

The main difference from the Release-mode code is the additional nop instruction.

Here is the output from the 2012/2013 version of the compiler:

  .method public hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       307 (0x133)
    .maxstack  4
    .locals init (valuetype Point[][] V_0,
             valuetype Point[][] V_1,
             valuetype Point[] V_2,
             valuetype Point V_3)
    IL_0000:  nop
    IL_0001:  ldc.i4.4
    IL_0002:  newarr     valuetype Point[]
    IL_0007:  stloc.1
    IL_0008:  ldloc.1
    IL_0009:  ldc.i4.0
    IL_000a:  ldc.i4.3
    IL_000b:  newarr     Point
    IL_0010:  stloc.2
    IL_0011:  ldloc.2
    IL_0012:  ldc.i4.0
    IL_0013:  ldelema    Point
    IL_0018:  ldloca.s   V_3
    IL_001a:  initobj    Point
    IL_0020:  ldloc.3
    IL_0021:  stobj      Point
    IL_0026:  ldloc.2
    IL_0027:  ldc.i4.1
    IL_0028:  ldelema    Point
    IL_002d:  ldloca.s   V_3
    IL_002f:  initobj    Point
    IL_0035:  ldloc.3
    IL_0036:  stobj      Point
    IL_003b:  ldloc.2
    IL_003c:  ldc.i4.2
    IL_003d:  ldelema    Point
    IL_0042:  ldloca.s   V_3
    IL_0044:  initobj    Point
    IL_004a:  ldloc.3
    IL_004b:  stobj      Point
    IL_0050:  ldloc.2
    IL_0051:  stelem.ref
    IL_0052:  ldloc.1
    IL_0053:  ldc.i4.1
    IL_0054:  ldc.i4.3
    IL_0055:  newarr     Point
    IL_005a:  stloc.2
    IL_005b:  ldloc.2
    IL_005c:  ldc.i4.0
    IL_005d:  ldelema    Point
    IL_0062:  ldloca.s   V_3
    IL_0064:  initobj    Point
    IL_006a:  ldloc.3
    IL_006b:  stobj      Point
    IL_0070:  ldloc.2
    IL_0071:  ldc.i4.1
    IL_0072:  ldelema    Point
    IL_0077:  ldloca.s   V_3
    IL_0079:  initobj    Point
    IL_007f:  ldloc.3
    IL_0080:  stobj      Point
    IL_0085:  ldloc.2
    IL_0086:  ldc.i4.2
    IL_0087:  ldelema    Point
    IL_008c:  ldloca.s   V_3
    IL_008e:  initobj    Point
    IL_0094:  ldloc.3
    IL_0095:  stobj      Point
    IL_009a:  ldloc.2
    IL_009b:  stelem.ref
    IL_009c:  ldloc.1
    IL_009d:  ldc.i4.2
    IL_009e:  ldc.i4.3
    IL_009f:  newarr     Point
    IL_00a4:  stloc.2
    IL_00a5:  ldloc.2
    IL_00a6:  ldc.i4.0
    IL_00a7:  ldelema    Point
    IL_00ac:  ldloca.s   V_3
    IL_00ae:  initobj    Point
    IL_00b4:  ldloc.3
    IL_00b5:  stobj      Point
    IL_00ba:  ldloc.2
    IL_00bb:  ldc.i4.1
    IL_00bc:  ldelema    Point
    IL_00c1:  ldloca.s   V_3
    IL_00c3:  initobj    Point
    IL_00c9:  ldloc.3
    IL_00ca:  stobj      Point
    IL_00cf:  ldloc.2
    IL_00d0:  ldc.i4.2
    IL_00d1:  ldelema    Point
    IL_00d6:  ldloca.s   V_3
    IL_00d8:  initobj    Point
    IL_00de:  ldloc.3
    IL_00df:  stobj      Point
    IL_00e4:  ldloc.2
    IL_00e5:  stelem.ref
    IL_00e6:  ldloc.1
    IL_00e7:  ldc.i4.3
    IL_00e8:  ldc.i4.3
    IL_00e9:  newarr     Point
    IL_00ee:  stloc.2
    IL_00ef:  ldloc.2
    IL_00f0:  ldc.i4.0
    IL_00f1:  ldelema    Point
    IL_00f6:  ldloca.s   V_3
    IL_00f8:  initobj    Point
    IL_00fe:  ldloc.3
    IL_00ff:  stobj      Point
    IL_0104:  ldloc.2
    IL_0105:  ldc.i4.1
    IL_0106:  ldelema    Point
    IL_010b:  ldloca.s   V_3
    IL_010d:  initobj    Point
    IL_0113:  ldloc.3
    IL_0114:  stobj      Point
    IL_0119:  ldloc.2
    IL_011a:  ldc.i4.2
    IL_011b:  ldelema    Point
    IL_0120:  ldloca.s   V_3
    IL_0122:  initobj    Point
    IL_0128:  ldloc.3
    IL_0129:  stobj      Point
    IL_012e:  ldloc.2
    IL_012f:  stelem.ref
    IL_0130:  ldloc.1
    IL_0131:  stloc.0
    IL_0132:  ret
  } // end of method ArrayTest::Main

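So, with the 2012/2013 compiler you are using, Debug mode does a very large amount of stack allocation, possibly so that you can inspect the whole jagged array structure with IntelliSense during Edit and Continue, or possibly so that you can step into the construction of each individual object. I am not at all sure.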

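I am no IL expert, but it looks to me as if it allocates for every Point, then again for every array, and then again for the jagged array, which adds up to far too many allocations. In the listing above, each row is stored in a temporary local (V_2) and every element is written individually through an ldelema / initobj / ldloc / stobj sequence on V_3, whereas the 2015 output only emits one newarr per row.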
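The VS 2015 listing above only emits one newarr per row and never writes individual elements, which works because a freshly allocated array of a struct type already holds default values. A minimal sketch of getting the same effect at source level, independent of compiler version, is to allocate the rows in a loop instead of spelling out every new Point(). The class name JaggedArrayDemo and the helper CreateRows are hypothetical names introduced for illustration, and the 300 x 300 size is inferred from the "296 omitted" comments in the question. This is not the original poster's code, and it has not been verified that it avoids the crash on the 2012/2013 toolchain.

```csharp
using System;

public struct Point { }

public static class JaggedArrayDemo
{
    // Hypothetical helper: builds a jagged array with 'rows' rows of 'cols' elements.
    // No per-element initializer is needed, because new Point[cols] already
    // contains default(Point) in every slot.
    public static Point[][] CreateRows(int rows, int cols)
    {
        var array = new Point[rows][];
        for (int i = 0; i < rows; i++)
        {
            array[i] = new Point[cols];
        }
        return array;
    }

    public static void Main()
    {
        // Size assumed from the question: 4 visible entries + 296 omitted = 300.
        Point[][] array = CreateRows(300, 300);
        Console.WriteLine(array.Length + " x " + array[0].Length); // prints "300 x 300"
    }
}
```

The method body stays a few lines long regardless of the array size, so the generated IL does not grow with the number of elements. For test data with real coordinates and rows of different lengths, the values would still have to come from somewhere (for example a data file), which this sketch does not cover.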