Foreach vs IEnumerable.Concat() efficiency

Keywords: efficiency, Concat, IEnumerable, Foreach | Updated: 2023-09-27 18:17:35

I'm trying to add items to a dictionary for each group of items found in an assemblyPaths collection.

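My question is which is the fastest way to add to the resulting dictionary: Method 1 or Method 2 (see the code below). In Method 1, I use Concat to merge the newly created dictionary with the main dictionary (pluginTypes). In Method 2, I use a simple foreach to add each entry of the sub-dictionary to the main dictionary. Is there really a performance difference between the two approaches?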

Dictionary<Type, string> pluginTypes = new Dictionary<Type, string>();
foreach (string assemblyPath in assemblyPaths)
{
    Assembly assembly = RuntimeContext.Current.LoadAssembly(assemblyPath);
    var plugins = (from type in assembly.GetTypes()
                   let attribs = type.GetCustomAttributes(true)
                   where attribs.Select(x => (x as Attribute).GetType().FullName).Contains("PluginAttribute")
                   select type)
                  .ToDictionary(k => k, v => assemblyPath);

    // METHOD 1 (use either this or Method 2, not both)
    pluginTypes = pluginTypes
        .Concat(plugins)
        .ToDictionary(k => k.Key, v => v.Value);

    // METHOD 2
    foreach (Type type in plugins.Keys)
        pluginTypes.Add(type, plugins[type]);
}

Personally, I prefer Method 1.


Well, you should measure them to be sure, but just from looking at the code I suspect Method 2 is faster, for the following reasons:

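Concat enumerates the original dictionary, which is unnecessary.
ToDictionary creates a brand-new dictionary by iterating over the concatenated items of both dictionaries.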

Method 2 is cleaner, uses less memory, and is probably faster.
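Since the advice is to measure, here is a minimal timing sketch. It is only an illustration under stated assumptions: it uses int keys in place of Type so it runs without any plugin assemblies, and the names MergeBenchmark, MakeBatches and Time are hypothetical helpers, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class MergeBenchmark
{
    // Method 1: rebuild the main dictionary with Concat + ToDictionary for every batch.
    public static Dictionary<int, string> MergeWithConcat(IEnumerable<Dictionary<int, string>> batches)
    {
        var result = new Dictionary<int, string>();
        foreach (var batch in batches)
            result = result.Concat(batch).ToDictionary(p => p.Key, p => p.Value);
        return result;
    }

    // Method 2: add each entry to the existing dictionary in place.
    public static Dictionary<int, string> MergeWithAdd(IEnumerable<Dictionary<int, string>> batches)
    {
        var result = new Dictionary<int, string>();
        foreach (var batch in batches)
            foreach (var pair in batch)
                result.Add(pair.Key, pair.Value);
        return result;
    }

    // Hypothetical test data: batchCount dictionaries with batchSize unique keys each.
    public static List<Dictionary<int, string>> MakeBatches(int batchCount, int batchSize)
    {
        return Enumerable.Range(0, batchCount)
            .Select(b => Enumerable.Range(b * batchSize, batchSize)
                                   .ToDictionary(k => k, k => "batch" + b))
            .ToList();
    }

    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Action action)
    {
        var watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

To compare the two, build the batches once with MergeBenchmark.MakeBatches and pass each merge call to MergeBenchmark.Time as a lambda. A single run is noisy, so repeat the measurement several times and discard the first (JIT warm-up) run before drawing conclusions.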

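As a side note, you don't need to create the intermediate dictionary as part of the query; when adding to the main dictionary you can iterate over the query result directly: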

var plugins = (from type in assembly.GetTypes()
               let attribs = type.GetCustomAttributes(true)
               where attribs.Select(x => (x as Attribute).GetType().FullName).Contains("PluginAttribute")
               select type);
foreach (Type type in plugins)
    pluginTypes.Add(type, assemblyPath);

Without actually testing the two methods (which is, of course, always the best approach), you can try to attack the problem logically.

Method 2 takes an existing collection and adds some new items to it.

Method 1 takes all the elements from two different collections and builds a completely new collection from them.

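It looks to me as though Method 1 does roughly three times the work on the first iteration. After that it becomes four times the work, then five times, because you keep pulling more items out of pluginTypes and building a new dictionary from them.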
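The growth described above can be made concrete by counting dictionary insertions. This is a rough model, not a measurement: it assumes every batch has the same size, ignores the cost of building the intermediate plugins dictionary, and the class name MergeCost is hypothetical.

```csharp
using System;

public static class MergeCost
{
    // Entries written by Method 1: each pass re-inserts every old entry plus the new batch.
    public static long InsertionsWithConcat(int batchCount, int batchSize)
    {
        long total = 0;
        long currentSize = 0;
        for (int i = 0; i < batchCount; i++)
        {
            currentSize += batchSize;   // old entries plus the new batch
            total += currentSize;       // ToDictionary inserts all of them again
        }
        return total;
    }

    // Entries written by Method 2: each entry is inserted exactly once.
    public static long InsertionsWithAdd(int batchCount, int batchSize)
    {
        return (long)batchCount * batchSize;
    }
}
```

With 10 batches of 100 items, InsertionsWithConcat returns 5500 while InsertionsWithAdd returns 1000. In general Method 1 grows quadratically with the number of batches (n * k * (k + 1) / 2 insertions for k batches of n items), while Method 2 grows linearly.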

In fact, why build plugins as a dictionary at all? You only ever pull the items out of it linearly.

What you really want is:

 Assembly assembly = RuntimeContext.Current.LoadAssembly(assemblyPath);
 foreach (Type type in assembly.GetTypes())
 {
   if (type.GetCustomAttributes(true)
           .Select(x => (x as Attribute).GetType().FullName).Contains("PluginAttribute"))
      pluginTypes.Add(type, assemblyPath);
 }

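Going further, we need to ask what exactly we are looking for among the custom attributes. An attribute named exactly "PluginAttribute"? An attribute whose name contains "PluginAttribute" (such as "WidgetPluginAttribute")? An attribute derived from PluginAttribute? I'm guessing the last one is probably closest, so let's go with that.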

 foreach (Type type in assembly.GetTypes())
 {
   // Matches PluginAttribute itself and any attribute derived from it.
   if (type.GetCustomAttributes(true)
           .Any(ca => typeof(PluginAttribute).IsAssignableFrom(ca.GetType())))
      pluginTypes.Add(type, assemblyPath);
 }
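The three interpretations give different answers for the same type, which the following sketch tries to show. The attribute classes and SampleWidget are hypothetical stand-ins defined only so the example compiles. Note also that Type.FullName includes the namespace, so comparing FullName against the bare string "PluginAttribute" only matches an attribute declared in the global namespace; the sketch compares Name instead.

```csharp
using System;
using System.Linq;

// Hypothetical attribute types, defined here only for illustration.
[AttributeUsage(AttributeTargets.Class)]
public class PluginAttribute : Attribute { }

public class WidgetPluginAttribute : PluginAttribute { }

// A sample type decorated with the derived attribute.
[WidgetPlugin]
public class SampleWidget { }

public static class PluginMatching
{
    // Interpretation 1: an attribute whose type name is exactly "PluginAttribute".
    public static bool HasExactName(Type type)
    {
        return type.GetCustomAttributes(true)
                   .Any(a => a.GetType().Name == "PluginAttribute");
    }

    // Interpretation 2: an attribute whose type name contains "PluginAttribute".
    public static bool NameContains(Type type)
    {
        return type.GetCustomAttributes(true)
                   .Any(a => a.GetType().Name.Contains("PluginAttribute"));
    }

    // Interpretation 3: PluginAttribute itself or anything derived from it.
    // GetCustomAttributes(Type, bool) returns attributes assignable to the given type.
    public static bool DerivesFrom(Type type)
    {
        return type.GetCustomAttributes(typeof(PluginAttribute), true).Length > 0;
    }
}
```

For SampleWidget, HasExactName is false while NameContains and DerivesFrom are both true. The type-based check is usually the more robust choice, since it survives renames and does not depend on string matching.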

Before thinking about performance, consider the following points:

You are creating a Dictionary<Type, string> whose values can be obtained from the keys themselves (using Type.Assembly.Location), so there is little point in having a dictionary.
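If you have a PluginAttribute class defined, you can compare against the attribute type itself rather than its name.

That leads to this code: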

private ISet<Type> GetTypes(IEnumerable<string> assemblyPaths)
{
    return new HashSet<Type>(assemblyPaths
        .Select(RuntimeContext.Current.LoadAssembly)
        .SelectMany(a => a.GetTypes())
        .Where(t => t.GetCustomAttribute<PluginAttribute>() != null));
}

Or use an IList instead, depending on whether you want to iterate over the result or use it for dictionary-like lookups.
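The point that the path is recoverable from the type can be sketched as follows. TypeLocations is a hypothetical helper name. One assumption to check before relying on this: Assembly.Location is not always the path originally passed to the loader, and it can be an empty string, for example for an assembly loaded from a byte array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypeLocations
{
    // Returns the file path of the assembly that defines the type.
    // May be an empty string, e.g. for assemblies loaded from a byte array.
    public static string PathOf(Type type)
    {
        return type.Assembly.Location;
    }

    // Groups a plain collection of types by assembly path on demand,
    // so no Dictionary<Type, string> has to be maintained.
    public static ILookup<string, Type> GroupByPath(IEnumerable<Type> types)
    {
        return types.ToLookup(t => PathOf(t));
    }
}
```

With this in place, a set of types is enough: the path for any one type is TypeLocations.PathOf(type), and GroupByPath gives the reverse mapping from path to types when it is needed.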

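Method 2 should generally be faster. ToDictionary always builds a new dictionary, because Concat returns an IEnumerable. Why go from an existing dictionary to an IEnumerable only to recreate the dictionary?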

If the keys are not unique, both approaches throw an exception (an ArgumentException for the duplicate key).
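A small sketch of the duplicate-key behavior, using string keys instead of Type for brevity; the class name DuplicateKeys is hypothetical. It also shows the indexer, which overwrites instead of throwing, as one option if duplicates are expected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeys
{
    // Dictionary.Add throws ArgumentException when the key already exists.
    public static bool AddThrows()
    {
        var d = new Dictionary<string, int> { { "a", 1 } };
        try { d.Add("a", 2); return false; }
        catch (ArgumentException) { return true; }
    }

    // ToDictionary over the concatenated pairs throws for the same reason.
    public static bool ConcatToDictionaryThrows()
    {
        var first = new Dictionary<string, int> { { "a", 1 } };
        var second = new Dictionary<string, int> { { "a", 2 } };
        try
        {
            first.Concat(second).ToDictionary(p => p.Key, p => p.Value);
            return false;
        }
        catch (ArgumentException) { return true; }
    }

    // The indexer replaces the existing value instead of throwing.
    public static int IndexerOverwrites()
    {
        var d = new Dictionary<string, int> { { "a", 1 } };
        d["a"] = 2;
        return d["a"];
    }
}
```

Whether overwriting is acceptable depends on the scenario; for plugin types loaded from different paths, a duplicate may well indicate a real problem that should surface as an error.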

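There is a LINQ operation that does exactly this: it takes a sequence of items, projects each item to a new sequence, and returns one sequence representing all of the projected sequences concatenated together. That operation is SelectMany. Your variation of it, calling Concat on each query, is semantically the same but introduces many extra levels of indirection and a lot of unnecessary overhead.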

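On top of that, every iteration of the outer loop creates dictionaries and then throws them away. Since constructing a dictionary costs both time and memory, this has a significant negative effect on performance.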

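The second solution does eliminate some of the unnecessary overhead, and it does so by mutating an object rather than using query operations. That doesn't mean you can't solve this problem efficiently with query operations; you just need to use the right ones.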

Dictionary<Type, string> pluginTypes =
    (from assemblyPath in assemblyPaths
        let assembly = RuntimeContext.Current.LoadAssembly(assemblyPath)
        from type in assembly.GetTypes()
        let attribs = type.GetCustomAttributes(true)
        where attribs.Select(x => (x as Attribute).GetType().FullName)
        .Contains("PluginAttribute")
        select new
        {
            key = type,
            value = assemblyPath
        })
    .ToDictionary(pair => pair.key, pair => pair.value);
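For readers who prefer method syntax, the query above corresponds roughly to a SelectMany call with a result selector. The sketch below is an assumption-laden illustration: RuntimeContext is not available here, so the assembly loader and the plugin test are passed in as delegates, and the names PluginScanner, FindPluginTypes, loadAssembly and isPlugin are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class PluginScanner
{
    public static Dictionary<Type, string> FindPluginTypes(
        IEnumerable<string> assemblyPaths,
        Func<string, Assembly> loadAssembly,   // stand-in for RuntimeContext.Current.LoadAssembly
        Func<Type, bool> isPlugin)             // stand-in for the attribute check
    {
        return assemblyPaths
            // Flatten all types of all assemblies, keeping the path each type came from.
            .SelectMany(
                path => loadAssembly(path).GetTypes(),
                (path, type) => new { Type = type, Path = path })
            .Where(x => isPlugin(x.Type))
            // Build the dictionary exactly once, at the very end.
            .ToDictionary(x => x.Type, x => x.Path);
    }
}
```

The dictionary is constructed a single time after all filtering is done, which avoids both the repeated rebuilds of Method 1 and the intermediate per-assembly dictionaries.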