What is the point of the "short" IL notation?

Keywords: notation, this, IL | Last updated: 2023-09-27 17:51:11

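Every time I run into them in IL: br_S, ldc_i4_S, ldarg_S, and so on... so I just have to ask the question:

I mean... if you are JIT-compiling a language from IL to native assembler, it shouldn't really matter in terms of performance, right? So what is the purpose of these "shorthand" notations? Is it only to have fewer bytes in the IL binaries (that is, as a compression mechanism), or is there some other reason?

And if it is just a compression mechanism, why not use a compression algorithm like deflate?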


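Sure, it is a micro-optimization. But the golden rule of micro-optimizations applies heavily here: they turn macro when you can do them over and over again. That is certainly the case here. Method bodies are small, so a large majority of branches are short; methods have a limited number of arguments and local variables, so a single byte is enough to address them; and small constants such as 0 through 9 show up very often in real programs.

Add them all up and you have a macro-optimization on a large assembly, with many kilobytes shaved off. That does matter at runtime: it is IL that does not have to be page-faulted into RAM. Warm-start time in a jitted program has always been an issue and has been attacked from every possible angle.

More generally, the .NET Framework is relentlessly micro-optimized, largely because Microsoft cannot assume that their code will not end up on the critical path of a program.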
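To make the byte counts concrete, here is a minimal sketch that compares the encoded size of a few short forms with their general counterparts. It only relies on `OpCode.Size` and `OpCode.OperandType` from `System.Reflection.Emit`; the class name `ShortFormSizes` and the helper `EncodedSize` are made up for this illustration, and only the operand kinds used by these opcodes are handled.

```csharp
using System;
using System.Reflection.Emit;

static class ShortFormSizes
{
    // Bytes taken by the inline operand, for the operand kinds used below.
    static int OperandSize(OperandType type)
    {
        switch (type)
        {
            case OperandType.InlineNone:
                return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar:
                return 1;
            case OperandType.InlineVar:
                return 2;
            case OperandType.InlineBrTarget:
            case OperandType.InlineI:
                return 4;
            default:
                throw new NotSupportedException(type.ToString());
        }
    }

    // Total encoded size: opcode bytes (1 or 2) plus operand bytes.
    public static int EncodedSize(OpCode op)
    {
        return op.Size + OperandSize(op.OperandType);
    }

    static void Main()
    {
        OpCode[] ops =
        {
            OpCodes.Br_S, OpCodes.Br,
            OpCodes.Ldc_I4_S, OpCodes.Ldc_I4, OpCodes.Ldc_I4_0,
            OpCodes.Ldarg_S, OpCodes.Ldarg, OpCodes.Ldarg_0
        };
        foreach (OpCode op in ops)
        {
            Console.WriteLine("{0,-10} {1} byte(s)", op.Name, EncodedSize(op));
        }
    }
}
```

With this helper, `br.s` comes out at 2 bytes against 5 for `br`, and `ldarg.0` at 1 byte against 4 for `ldarg` (which has a two-byte opcode plus a two-byte index).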

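Not a response: there is already a response, and it postulates that without the "short" opcodes the assembly size would balloon. Is that true? Yes... but we have (I have, because I had nothing to do this evening) to prove it :-)

How many bytes are gained by using opcodes with "short" (in truth, byte-sized) values, such as Bge_S versus Bge? And how many from using opcodes that include a "fixed" index or number, such as Ldarg_0 versus Ldarg, or Ldc_I4_0 and Ldc_I4_M1 versus Ldc_I4?

We can write a program to check it! :-) The result, for mscorlib 4.5.2: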

4.5.2 or later
Assembly: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\mscorlib.dll
Size: 5217440
Of which resources: 948657

Skipped: 9496 methods
Parsed: 63720 methods

Gained 420786 bytes from short arguments
Gained [...] bytes from "fixed" number

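So with the "optimized opcodes", mscorlib is 5.2 MB. Without them it would be 6.6 MB, counting only the operand sizes. That is about 25% more! (And this ignores the fact that, of the 5.2 MB, 0.9 MB are resources, so the percentage increase in the size of the code itself is even bigger.)

(I am using mscorlib because, while it is not your typical assembly, it is certainly full of code of every kind.)

The program (I am using Mono.Reflection):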

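using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;
using Microsoft.Win32;
using Mono.Reflection;

// The members below belong inside a class (for example Program).
public static void Main(string[] args)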
{
    Console.WriteLine(CheckFor45DotVersion(GetReleaseKey()));
    string assemblyName = "mscorlib";
    Assembly assembly = Assembly.Load(assemblyName);
    Console.WriteLine("Assembly: {0}", assembly.Location);
    long fullSize = new FileInfo(assembly.Location).Length;
    Console.WriteLine("Size: {0}", fullSize);
    var allOpCodes = typeof(OpCodes).GetFields(BindingFlags.Static | BindingFlags.Public)
        .Where(x => x.FieldType == typeof(OpCode))
        .Select(x => (OpCode)x.GetValue(null));
    Dictionary<OpCode, int> opcodes = allOpCodes.ToDictionary(x => x, x => 0);
    long resourcesLength = 0;
    int skippedMethods = 0;
    int parsedMethods = 0;
    ParseAssembly(assembly, resource =>
    {
        ManifestResourceInfo info = assembly.GetManifestResourceInfo(resource);
        if (info.ResourceLocation.HasFlag(ResourceLocation.Embedded))
        {
            using (Stream stream = assembly.GetManifestResourceStream(resource))
            {
                resourcesLength += stream.Length;
            }
        }
    }, method =>
    {
        if (method.MethodImplementationFlags.HasFlag(MethodImplAttributes.InternalCall))
        {
            skippedMethods++;
            return;
        }
        if (method.Attributes.HasFlag(MethodAttributes.PinvokeImpl))
        {
            skippedMethods++;
            return;
        }
        if (method.IsAbstract)
        {
            skippedMethods++;
            return;
        }
        parsedMethods++;
        IList<Instruction> instructions = method.GetInstructions();
        foreach (Instruction instruction in instructions)
        {
            int num;
            opcodes.TryGetValue(instruction.OpCode, out num);
            opcodes[instruction.OpCode] = num + 1;
        }
    });
    Console.WriteLine("Of which resources: {0}", resourcesLength);
    Console.WriteLine();
    Console.WriteLine("Skipped: {0} methods", skippedMethods);
    Console.WriteLine("Parsed: {0} methods", parsedMethods);
    int gained = 0;
    int gainedFixedNumber = 0;
    // m1: Ldc_I4_M1
    var shortOpcodes = opcodes.Where(x =>
        x.Key.Name.EndsWith(".s") ||
        x.Key.Name.EndsWith(".m1") ||
            // .0 - .9
        x.Key.Name[x.Key.Name.Length - 2] == '.' && char.IsNumber(x.Key.Name[x.Key.Name.Length - 1]));
    foreach (var @short in shortOpcodes)
    {
        OpCode opCode = @short.Key;
        string name = opCode.Name.Remove(opCode.Name.LastIndexOf('.'));
        OpCode equivalentLong = opcodes.Keys.First(x => x.Name == name);
        int lengthShort = GetLength(opCode.OperandType);
        int lengthLong = GetLength(equivalentLong.OperandType);
        int gained2 = @short.Value * (lengthLong - lengthShort);
        if (opCode.Name.EndsWith(".s"))
        {
            gained += gained2;
        }
        else
        {
            gainedFixedNumber += gained2;
        }
    }
    Console.WriteLine();
    Console.WriteLine("Gained {0} bytes from short arguments", gained);
    Console.WriteLine("Gained {0} bytes from \"fixed\" number", gainedFixedNumber);
}
private static int GetLength(OperandType operandType)
{
    switch (operandType)
    {
        case OperandType.InlineNone:
            return 0;
        case OperandType.ShortInlineVar:
        case OperandType.ShortInlineI:
        case OperandType.ShortInlineBrTarget:
            return 1;
        case OperandType.InlineVar:
            return 2;
        case OperandType.InlineI:
        case OperandType.InlineBrTarget:
            return 4;
    }
    throw new NotSupportedException();
}
private static void ParseAssembly(Assembly assembly, Action<string> resourceAction, Action<MethodInfo> action)
{
    string[] names = assembly.GetManifestResourceNames();
    foreach (string name in names)
    {
        resourceAction(name);
    }
    Module[] modules = assembly.GetModules();
    foreach (Module module in modules)
    {
        ParseModule(module, action);
    }
}
private static void ParseModule(Module module, Action<MethodInfo> action)
{
    MethodInfo[] methods = module.GetMethods(BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);
    foreach (MethodInfo method in methods)
    {
        action(method);
    }
    Type[] types = module.GetTypes();
    foreach (Type type in types)
    {
        ParseType(type, action);
    }
}
private static void ParseType(Type type, Action<MethodInfo> action)
{
    if (type.IsInterface)
    {
        return;
    }
    // Skip delegates (in .NET all delegates derive from MulticastDelegate)
    if (type != typeof(MulticastDelegate) && typeof(MulticastDelegate).IsAssignableFrom(type))
    {
        return;
    }
    MethodInfo[] methods = type.GetMethods(BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);
    foreach (MethodInfo method in methods)
    {
        action(method);
    }
    Type[] nestedTypes = type.GetNestedTypes(BindingFlags.Public | BindingFlags.NonPublic);
    foreach (Type nestedType in nestedTypes)
    {
        ParseType(nestedType, action);
    }
}
// Adapted from https://msdn.microsoft.com/en-us/library/hh925568.aspx
private static int GetReleaseKey()
{
    using (RegistryKey ndpKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32).OpenSubKey("SOFTWARE\\Microsoft\\NET Framework Setup\\NDP\\v4\\Full\\"))
    {
        // .NET 4.0
        if (ndpKey == null)
        {
            return 0;
        }
        int releaseKey = (int)ndpKey.GetValue("Release");
        return releaseKey;
    }
}
// Checking the version using >= will enable forward compatibility,  
// however you should always compile your code on newer versions of 
// the framework to ensure your app works the same. 
private static string CheckFor45DotVersion(int releaseKey)
{
    if (releaseKey >= 393273)
    {
        return "4.6 RC or later";
    }
    if ((releaseKey >= 379893))
    {
        return "4.5.2 or later";
    }
    if ((releaseKey >= 378675))
    {
        return "4.5.1 or later";
    }
    if ((releaseKey >= 378389))
    {
        return "4.5 or later";
    }
    // This line should never execute. A non-null release key should mean 
    // that 4.5 or later is installed. 
    return "No 4.5 or later version detected";
}

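Inspired by xanatos's code: the same approach applied to the raw IL bytes can give us an even better estimate of how much is gained. I have divided the gains and losses into categories to get a clear picture.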

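using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

class Program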
{
    internal class Calculator
    {
        static Calculator()
        {
            Register(OpCodes.Beq_S, 1, 3); // category 1: short-hand notations
            Register(OpCodes.Bge_S, 1, 3);
            Register(OpCodes.Bge_Un_S, 1, 3);
            Register(OpCodes.Bgt_S, 1, 3);
            Register(OpCodes.Bgt_Un_S, 1, 3);
            Register(OpCodes.Ble_S, 1, 3);
            Register(OpCodes.Ble_Un_S, 1, 3);
            Register(OpCodes.Blt_S, 1, 3);
            Register(OpCodes.Blt_Un_S, 1, 3);
            Register(OpCodes.Bne_Un_S, 1, 3);
            Register(OpCodes.Br_S, 1, 3);
            Register(OpCodes.Brfalse_S, 1, 3);
            Register(OpCodes.Brtrue_S, 1, 3);
            Register(OpCodes.Conv_I, 2, 4); // category 2: types can be generalized
            Register(OpCodes.Conv_I1, 2, 4);
            Register(OpCodes.Conv_I2, 2, 4);
            Register(OpCodes.Conv_I4, 2, 4);
            Register(OpCodes.Conv_I8, 2, 4);
            Register(OpCodes.Conv_Ovf_I, 2, 4);
            Register(OpCodes.Conv_Ovf_I_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I1, 2, 4);
            Register(OpCodes.Conv_Ovf_I1_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I2, 2, 4);
            Register(OpCodes.Conv_Ovf_I2_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I4, 2, 4);
            Register(OpCodes.Conv_Ovf_I4_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_I8, 2, 4);
            Register(OpCodes.Conv_Ovf_I8_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U, 2, 4);
            Register(OpCodes.Conv_Ovf_U_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U1, 2, 4);
            Register(OpCodes.Conv_Ovf_U1_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U2, 2, 4);
            Register(OpCodes.Conv_Ovf_U2_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U4, 2, 4);
            Register(OpCodes.Conv_Ovf_U4_Un, 2, 4);
            Register(OpCodes.Conv_Ovf_U8, 2, 4);
            Register(OpCodes.Conv_Ovf_U8_Un, 2, 4);
            Register(OpCodes.Conv_R_Un, 2, 4);
            Register(OpCodes.Conv_R4, 2, 4);
            Register(OpCodes.Conv_R8, 2, 4);
            Register(OpCodes.Conv_U, 2, 4);
            Register(OpCodes.Conv_U1, 2, 4);
            Register(OpCodes.Conv_U2, 2, 4);
            Register(OpCodes.Conv_U4, 2, 4);
            Register(OpCodes.Conv_U8, 2, 4);
            Register(OpCodes.Ldarg_0, 3, 2);
            Register(OpCodes.Ldarg_1, 3, 2);
            Register(OpCodes.Ldarg_2, 3, 2);
            Register(OpCodes.Ldarg_3, 3, 2);
            Register(OpCodes.Ldarg_S, 1, 2);
            Register(OpCodes.Ldarga, 3, 2);
            Register(OpCodes.Ldarga_S, 1, 2);
            Register(OpCodes.Ldc_I4_0, 3, 2); // category 3: small value loads
            Register(OpCodes.Ldc_I4_1, 3, 2);
            Register(OpCodes.Ldc_I4_2, 3, 2);
            Register(OpCodes.Ldc_I4_3, 3, 2);
            Register(OpCodes.Ldc_I4_4, 3, 2);
            Register(OpCodes.Ldc_I4_5, 3, 2);
            Register(OpCodes.Ldc_I4_6, 3, 2);
            Register(OpCodes.Ldc_I4_7, 3, 2);
            Register(OpCodes.Ldc_I4_8, 3, 2);
            Register(OpCodes.Ldc_I4_M1, 3, 2);
            Register(OpCodes.Ldc_I4_S, 1, 3);
            Register(OpCodes.Ldelem_I, 2, 4);
            Register(OpCodes.Ldelem_I1, 2, 4);
            Register(OpCodes.Ldelem_I2, 2, 4);
            Register(OpCodes.Ldelem_I4, 2, 4);
            Register(OpCodes.Ldelem_I8, 2, 4);
            Register(OpCodes.Ldelem_R4, 2, 4);
            Register(OpCodes.Ldelem_R8, 2, 4);
            Register(OpCodes.Ldelem_Ref, 2, 4);
            Register(OpCodes.Ldelem_U1, 2, 4);
            Register(OpCodes.Ldelem_U2, 2, 4);
            Register(OpCodes.Ldelem_U4, 2, 4);
            Register(OpCodes.Ldind_I, 2, 4);
            Register(OpCodes.Ldind_I1, 2, 4);
            Register(OpCodes.Ldind_I2, 2, 4);
            Register(OpCodes.Ldind_I4, 2, 4);
            Register(OpCodes.Ldind_I8, 2, 4);
            Register(OpCodes.Ldind_R4, 2, 4);
            Register(OpCodes.Ldind_R8, 2, 4);
            Register(OpCodes.Ldind_Ref, 2, 4);
            Register(OpCodes.Ldind_U1, 2, 4);
            Register(OpCodes.Ldind_U2, 2, 4);
            Register(OpCodes.Ldind_U4, 2, 4);
            Register(OpCodes.Ldloc_0, 3, 1);
            Register(OpCodes.Ldloc_1, 3, 1);
            Register(OpCodes.Ldloc_2, 3, 1);
            Register(OpCodes.Ldloc_3, 3, 1);
            Register(OpCodes.Ldloc_S, 1, 1);
            Register(OpCodes.Ldloca_S, 1, 1);
            Register(OpCodes.Leave_S, 1, 3);
            Register(OpCodes.Starg_S, 1, 1);
            Register(OpCodes.Stelem_I, 2, 4);
            Register(OpCodes.Stelem_I1, 2, 4);
            Register(OpCodes.Stelem_I2, 2, 4);
            Register(OpCodes.Stelem_I4, 2, 4);
            Register(OpCodes.Stelem_I8, 2, 4);
            Register(OpCodes.Stelem_R4, 2, 4);
            Register(OpCodes.Stelem_R8, 2, 4);
            Register(OpCodes.Stelem_Ref, 2, 4);
            Register(OpCodes.Stind_I, 2, 4);
            Register(OpCodes.Stind_I1, 2, 4);
            Register(OpCodes.Stind_I2, 2, 4);
            Register(OpCodes.Stind_I4, 2, 4);
            Register(OpCodes.Stind_I8, 2, 4);
            Register(OpCodes.Stind_R4, 2, 4);
            Register(OpCodes.Stind_R8, 2, 4);
            Register(OpCodes.Stind_Ref, 2, 4);
            Register(OpCodes.Stloc_0, 3, 1);
            Register(OpCodes.Stloc_1, 3, 1);
            Register(OpCodes.Stloc_2, 3, 1);
            Register(OpCodes.Stloc_3, 3, 1);
            Register(OpCodes.Stloc_S, 1, 1);
        }
        private Calculator() { }
        private static void Register(OpCode opCode, int category, int delta)
        {
            dict[opCode] = new KeyValuePair<int, int>(category, delta);
        }
        private static Dictionary<OpCode, KeyValuePair<int, int>> dict = new Dictionary<OpCode, KeyValuePair<int, int>>();
        public static void Update(int[] data, OpCode opcode)
        {
            KeyValuePair<int, int> kv;
            if (dict.TryGetValue(opcode, out kv))
            {
                data[kv.Key] += kv.Value;
            }
        }
    }
    internal class Decompiler
    {
        public Decompiler() { }
        static Decompiler()
        {
            singleByteOpcodes = new OpCode[0x100];
            multiByteOpcodes = new OpCode[0x100];
            FieldInfo[] infoArray1 = typeof(OpCodes).GetFields();
            for (int num1 = 0; num1 < infoArray1.Length; num1++)
            {
                FieldInfo info1 = infoArray1[num1];
                if (info1.FieldType == typeof(OpCode))
                {
                    OpCode code1 = (OpCode)info1.GetValue(null);
                    ushort num2 = (ushort)code1.Value;
                    if (num2 < 0x100)
                    {
                        singleByteOpcodes[(int)num2] = code1;
                    }
                    else
                    {
                        if ((num2 & 0xff00) != 0xfe00)
                        {
                            throw new Exception("Invalid opcode: " + num2.ToString());
                        }
                        multiByteOpcodes[num2 & 0xff] = code1;
                    }
                }
            }
        }
        private static OpCode[] singleByteOpcodes;
        private static OpCode[] multiByteOpcodes;
        public int[] Delta = new int[5];
        public void Decompile(MethodBase mi, byte[] ildata)
        {
            Module module = mi.Module;
            int position = 0;
            while (position < ildata.Length)
            {
                OpCode code = OpCodes.Nop;
                ushort b = ildata[position++];
                if (b != 0xfe)
                {
                    code = singleByteOpcodes[b];
                }
                else
                {
                    b = ildata[position++];
                    code = multiByteOpcodes[b];
                    b |= (ushort)(0xfe00);
                    Delta[4]++;
                }
                switch (code.OperandType)
                {
                    case OperandType.InlineBrTarget:
                        position += 4;
                        break;
                    case OperandType.InlineField:
                        position += 4;
                        break;
                    case OperandType.InlineI:
                        position += 4;
                        break;
                    case OperandType.InlineI8:
                        position += 8;
                        break;
                    case OperandType.InlineMethod:
                        position += 4;
                        break;
                    case OperandType.InlineNone:
                        break;
                    case OperandType.InlineR:
                        position += 8;
                        break;
                    case OperandType.InlineSig:
                        position += 4;
                        break;
                    case OperandType.InlineString:
                        position += 4;
                        break;
                    case OperandType.InlineSwitch:
                        int count = BitConverter.ToInt32(ildata, position);
                        position += count * 4 + 4;
                        break;
                    case OperandType.InlineTok:
                    case OperandType.InlineType:
                        position += 4;
                        break;
                    case OperandType.InlineVar:
                        position += 2;
                        break;
                    case OperandType.ShortInlineBrTarget:
                        position += 1;
                        break;
                    case OperandType.ShortInlineI:
                        position += 1;
                        break;
                    case OperandType.ShortInlineR:
                        position += 4;
                        break;
                    case OperandType.ShortInlineVar:
                        position += 1;
                        break;
                    default:
                        throw new Exception("Unknown instruction operand; cannot continue. Operand type: " + code.OperandType);
                }
                Calculator.Update(Delta, code);
            }
        }
    }
    static void Main(string[] args)
    {
        string assemblyName = "mscorlib";
        Assembly assembly = Assembly.Load(assemblyName);
        int skippedMethods = 0;
        int parsedMethods = 0;
        Decompiler decompiler = new Decompiler();
        long totalbytes = 0;
        Console.WriteLine("Assembly: {0}", assembly.Location);
        foreach (var type in assembly.GetTypes())
        {
            foreach (var method in type.GetMethods(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.Instance))
            {
                if (method.MethodImplementationFlags.HasFlag(MethodImplAttributes.InternalCall) ||
                    method.Attributes.HasFlag(MethodAttributes.PinvokeImpl) ||
                    method.IsAbstract)
                {
                    skippedMethods++;
                }
                else
                {
                    var body = method.GetMethodBody();
                    byte[] bytes;
                    if (body != null && (bytes = body.GetILAsByteArray()) != null)
                    {
                        decompiler.Decompile(method, bytes);
                        ++parsedMethods;
                        totalbytes += bytes.Length;
                    }
                    else 
                    {
                        skippedMethods++;
                    }
                }
            }
        }
        var delta = decompiler.Delta;
        Console.WriteLine("{0} methods parsed, {1} methods skipped", parsedMethods, skippedMethods);
        Console.WriteLine("- {0} bytes total", totalbytes);
        Console.WriteLine("- {0} bytes gained from generalizing short-hand notations", delta[1]);
        Console.WriteLine("- {0} bytes gained from generalizing type notations", delta[2]);
        Console.WriteLine("- {0} bytes gained from generalizing loads notations", delta[3]);
        Console.WriteLine("- {0} bytes lost from multi-byte opcodes", delta[4]);
        Console.ReadLine();
    }
}
Results:

Assembly: C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscorlib.dll
56117 methods parsed, 10605 methods skipped
- 2489275 bytes total
- 361193 bytes gained from generalizing short-hand notations
- 126724 bytes gained from generalizing type notations
- 618858 bytes gained from generalizing loads notations
- 9447 bytes lost from multi-byte opcodes

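So what does this mean? First, the total is based on the actual IL bytes of the methods. The short-hand notation gain is the number of bytes saved by using the _S opcodes instead of their general forms. The type notation gain comes from opcodes that could have been generalized by taking a type token instead of having one opcode per type (the conversions such as conv_i4, for example). The load gain is the number of bytes saved by load specializations (such as ldc_i4_0). Finally, because there are now too many opcodes to fit in a single byte, some instructions need an extra opcode byte, which is counted as a loss.

Adding all of this up, without the compact forms the IL would be 3586603 bytes instead of 2489275 bytes, which is an overall compression of roughly 30%.

At the end of the day, the short codes try to make the code "smaller". There is no semantic difference.

It is not because of compression, since every byte has a "name representation".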
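The total above can be re-derived from the four figures in the program output. The following is just a worked check of that arithmetic; the class `GainTotals` and the helper `GeneralizedSize` are names invented for this sketch.

```csharp
using System;

static class GainTotals
{
    // Size the IL would have if every compact form were replaced by its
    // general form: add the three gains back, subtract the bytes that the
    // existing two-byte opcodes already cost.
    public static long GeneralizedSize(long actual, long shortGain,
                                       long typeGain, long loadGain,
                                       long multiByteLoss)
    {
        return actual + shortGain + typeGain + loadGain - multiByteLoss;
    }

    static void Main()
    {
        long actual = 2489275; // "bytes total" from the output above
        long generalized = GeneralizedSize(actual, 361193, 126724, 618858, 9447);
        double saved = 1.0 - (double)actual / generalized;

        Console.WriteLine("Generalized size: {0} bytes", generalized);
        Console.WriteLine("Fraction saved by the compact forms: {0:F3}", saved);
    }
}
```

The sum comes to 3586603 bytes, and the saved fraction is a little over 0.30, which matches the "roughly 30%" figure.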

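That means the byte 0x00 (which represents the nop instruction) is stored as 00 rather than as 6e 6f 70 (the ASCII encoding of the string "nop").
These "short" notations are short for debugging purposes, even though ldarg.0 is harder to read than "Load argument 0 onto the stack".

The full list of CIL instructions can be found here: http://en.wikipedia.org/wiki/List_of_CIL_instructions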
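As a small illustration of the point about 00 versus 6e 6f 70, the sketch below prints the binary encoding of the nop opcode next to the ASCII bytes of its mnemonic. `NopEncoding` and `OpcodeBytes` are hypothetical names used only for this example; the opcode value and size come from `System.Reflection.Emit.OpCodes`.

```csharp
using System;
using System.Reflection.Emit;
using System.Text;

static class NopEncoding
{
    // Returns the one or two bytes that encode the opcode itself
    // (two-byte opcodes carry a 0xFE prefix in the high byte of Value).
    public static byte[] OpcodeBytes(OpCode op)
    {
        ushort value = (ushort)op.Value;
        return op.Size == 1
            ? new[] { (byte)value }
            : new[] { (byte)(value >> 8), (byte)value };
    }

    static void Main()
    {
        // Binary form of the instruction as stored in a method body.
        Console.WriteLine(BitConverter.ToString(OpcodeBytes(OpCodes.Nop)));       // 00
        // ASCII bytes of the textual mnemonic, for comparison.
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes("nop"))); // 6E-6F-70
    }
}
```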


Please excuse my oversimplified explanation; I am not fluent in English.