Finding the intersection of two dictionaries
Keywords: two | Last updated: 2023-09-27 18:35:39
I have two dictionaries, say Dic1<string,string> and Dic2<string,string>.
I want to build a new list of the values for the keys that exist in both Dic1 and Dic2.
So, for example, if
Dic1: <12,"hi">, <14,"bye">
Dic2: <12,"hello">, <18,"bye">
then the list should be: "hi", "hello"
I tried to work with Dic1.Keys.Intersect, but couldn't quite figure it out.
What I tried: Dic1.Keys.Intersect(Dic2.Keys).ToList(t => t.Values);
Here you go:
var dic1 = new Dictionary<int, string> { { 12, "hi" }, { 14, "bye" } };
var dic2 = new Dictionary<int, string> { { 12, "hello" }, { 18, "bye" } };
HashSet<int> commonKeys = new HashSet<int>(dic1.Keys);
commonKeys.IntersectWith(dic2.Keys);
var result =
    dic1
        .Where(x => commonKeys.Contains(x.Key))
        .Concat(dic2.Where(x => commonKeys.Contains(x.Key)))
        // .Select(x => x.Value) // With this additional select you'll get only the values.
        .ToList();
The resulting list contains { 12, "hi" } and { 12, "hello" }.
HashSet is very useful for intersections.
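As a small illustration of the set operation used above, the following sketch isolates the `IntersectWith` step on the same sample data. It assumes C# 9 or later so that top-level statements compile; in older versions the same lines would go inside a `Main` method.

```csharp
using System;
using System.Collections.Generic;

var dic1 = new Dictionary<int, string> { { 12, "hi" }, { 14, "bye" } };
var dic2 = new Dictionary<int, string> { { 12, "hello" }, { 18, "bye" } };

// Copy the keys first: IntersectWith modifies the set it is called on.
var commonKeys = new HashSet<int>(dic1.Keys);
commonKeys.IntersectWith(dic2.Keys);

// commonKeys now holds only the keys present in both dictionaries.
Console.WriteLine(string.Join(", ", commonKeys)); // prints "12"
```

Because the set is built from a copy of the keys, neither dictionary is changed by the intersection.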
Out of curiosity I compared all six solutions (hopefully I didn't miss any). The timings are as follows:
@EZI Intersect2 GroupBy ~149ms
@Selman22 Intersect3 Keys.Intersect ~41ms
@dbc Intersect4 Where1 ~22ms
@dbc Intersect5 Where2 ~18ms
@dbc Intersect7 Classic ~11ms
@t3chb0t Intersect1 HashSet ~66ms
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var dic1 = new Dictionary<int, string>();
        var dic2 = new Dictionary<int, string>();
        Random rnd = new Random(DateTime.Now.Millisecond);
        // Fill both dictionaries with 100000 distinct random keys each.
        for (int i = 0; i < 100000; i++)
        {
            int id = 0;
            do { id = rnd.Next(0, 1000000); } while (dic1.ContainsKey(id));
            dic1.Add(id, "hi");
            do { id = rnd.Next(0, 1000000); } while (dic2.ContainsKey(id));
            dic2.Add(id, "hello");
        }
        // Time each variant once; AutoStopwatch stops the shared stopwatch on Dispose.
        List<List<string>> results = new List<List<string>>();
        using (new AutoStopwatch(true)) { results.Add(Intersect1(dic1, dic2)); }
        Console.WriteLine("Intersect1 elapsed in {0}ms (HashSet)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        using (new AutoStopwatch(true)) { results.Add(Intersect2(dic1, dic2)); }
        Console.WriteLine("Intersect2 elapsed in {0}ms (GroupBy)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        using (new AutoStopwatch(true)) { results.Add(Intersect3(dic1, dic2)); }
        Console.WriteLine("Intersect3 elapsed in {0}ms (Keys.Intersect)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        using (new AutoStopwatch(true)) { results.Add(Intersect4(dic1, dic2)); }
        Console.WriteLine("Intersect4 elapsed in {0}ms (Where1)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        using (new AutoStopwatch(true)) { results.Add(Intersect5(dic1, dic2)); }
        Console.WriteLine("Intersect5 elapsed in {0}ms (Where2)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        using (new AutoStopwatch(true)) { results.Add(Intersect7(dic1, dic2)); }
        Console.WriteLine("Intersect7 elapsed in {0}ms (Old style :-)", AutoStopwatch.Stopwatch.ElapsedMilliseconds);
        Console.ReadKey();
    }

    // HashSet of common keys, then filter both dictionaries.
    static List<string> Intersect1(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        HashSet<int> commonKeys = new HashSet<int>(dic1.Keys);
        commonKeys.IntersectWith(dic2.Keys);
        var result =
            dic1
                .Where(x => commonKeys.Contains(x.Key))
                .Concat(dic2.Where(x => commonKeys.Contains(x.Key)))
                .Select(x => x.Value)
                .ToList();
        return result;
    }

    // Concatenate both dictionaries and keep the keys that occur more than once.
    static List<string> Intersect2(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result = dic1.Concat(dic2)
            .GroupBy(x => x.Key)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.Select(x => x.Value))
            .ToList();
        return result;
    }

    // Keys.Intersect, then read both values through the indexer.
    static List<string> Intersect3(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Keys
                .Intersect(dic2.Keys)
                .SelectMany(key => new[] { dic1[key], dic2[key] })
                .ToList();
        return result;
    }

    // Filter the pairs of dic1 by dic2.ContainsKey.
    static List<string> Intersect4(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Where(pair => dic2.ContainsKey(pair.Key))
                .SelectMany(pair => new[] { dic2[pair.Key], pair.Value })
                .ToList();
        return result;
    }

    // Filter the keys of dic1 by dic2.ContainsKey.
    static List<string> Intersect5(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var result =
            dic1
                .Keys
                .Where(dic2.ContainsKey)
                .SelectMany(k => new[] { dic1[k], dic2[k] })
                .ToList();
        return result;
    }

    // Plain foreach loop without Linq.
    static List<string> Intersect7(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
    {
        var list = new List<string>();
        foreach (var key in dic1.Keys)
        {
            if (dic2.ContainsKey(key))
            {
                list.Add(dic1[key]);
                list.Add(dic2[key]);
            }
        }
        return list;
    }
}

// Resets (and optionally starts) a shared Stopwatch on construction and stops it on Dispose.
class AutoStopwatch : IDisposable
{
    public static readonly Stopwatch Stopwatch = new Stopwatch();

    public AutoStopwatch(bool start)
    {
        Stopwatch.Reset();
        if (start) Stopwatch.Start();
    }

    public void Dispose()
    {
        Stopwatch.Stop();
    }
}
var d1 = new Dictionary<int, string>() { { 12, "hi" }, { 14, "bye" } };
var d2 = new Dictionary<int, string>() { { 12, "hello" }, { 18, "bye" } };
var res = d1.Concat(d2)
    .GroupBy(x => x.Key)
    .Where(g => g.Count() > 1)
    .SelectMany(g => g.Select(x => x.Value))
    .ToList();
You can use Where to filter the keys in Dic1 and then convert them to values, like this:
var values = Dic1.Keys.Where(Dic2.ContainsKey).SelectMany(k => new[] { Dic1[k], Dic2[k] })
    .ToList();
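This should be about as efficient as the lookup operations on Dic1 and Dic2 themselves, which are typically O(log n) or better (close to O(1) for Dictionary<TKey, TValue>), and it does not require building any temporary hash sets or lookup tables.
Here is a version that avoids the extra lookup in Dic1, at the cost of being a little less pretty: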
var values = Dic1.Where(pair => Dic2.ContainsKey(pair.Key)).SelectMany(pair => new[] { pair.Value, Dic2[pair.Key] })
    .ToList();
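The version above still touches Dic2 twice per matching key (ContainsKey, then the indexer). As a hedged sketch rather than part of the original answer, `TryGetValue` can merge those into a single lookup. `IntersectValues` is a hypothetical helper name, no timing is claimed for it, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects the values of every key present in both dictionaries.
static List<string> IntersectValues(Dictionary<int, string> first, Dictionary<int, string> second)
{
    var values = new List<string>();
    foreach (var pair in first)
    {
        // TryGetValue checks for the key and fetches its value in one lookup.
        if (second.TryGetValue(pair.Key, out var other))
        {
            values.Add(pair.Value);
            values.Add(other);
        }
    }
    return values;
}

var dic1 = new Dictionary<int, string> { { 12, "hi" }, { 14, "bye" } };
var dic2 = new Dictionary<int, string> { { 12, "hello" }, { 18, "bye" } };

var result = IntersectValues(dic1, dic2);
Console.WriteLine(string.Join(", ", result)); // prints "hi, hello"
```

Iterating over the key/value pairs of the first dictionary also removes the need to index into it again.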
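Update
My timing tests (using t3chb0t's handy test harness) show that the first version actually runs faster. It is also simpler, so of the two, prefer that one. But the fastest approach I have found so far does not use Linq at all: 7 ms for 1000000 intersections, versus 13 ms for the Linq version: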
static List<string> Intersect7(Dictionary<int, string> dic1, Dictionary<int, string> dic2)
{
    var list = new List<string>();
    foreach (var key in dic1.Keys)
    {
        if (dic2.ContainsKey(key))
        {
            list.Add(dic1[key]);
            list.Add(dic2[key]);
        }
    }
    return list;
}
It is old-style, though, so you might not want to use it.
You just need to use the indexer to get the values:
var values = Dic1.Keys.Intersect(Dic2.Keys)
    .SelectMany(key => new[] { Dic1[key], Dic2[key] })
    .ToList();