Top three documents per group using LINQ
Last updated: 2023-09-27 17:57:47
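What I am trying to get is the top three documents per AccessGroup. By "top three" I mean the documents with the highest count. My current solution returns: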
DocumentId AccessGroupId Count
2 1 5
1 1 3
3 1 2
5 1 2
4 1 1
6 1 1
8 1 1
10 1 1
... 2 ...
What I want is:
DocumentId AccessGroupId Count
2 1 5
1 1 3
3 1 2
... 2 ...
I made a runnable LINQPad program (GitHub Gist):
void Main()
{
    var sampleData = new List<Foo>();
    sampleData.Add(new Foo { DocumentId = 1, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 1, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 1, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 5, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 6, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 5, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 8, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 10, AccessGroupId = 1 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 2, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 3, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 2 });
    sampleData.Add(new Foo { DocumentId = 4, AccessGroupId = 2 });

    var x = (from entry in sampleData
             group entry by new { entry.DocumentId, entry.AccessGroupId } into g
             orderby g.Key.AccessGroupId, g.Count() descending
             select new { DocumentId = g.Key.DocumentId, AccessGroupId = g.Key.AccessGroupId, Count = g.Count() }
            );
    Console.WriteLine(x);
}

public class Foo {
    public int DocumentId { get; set; }
    public int AccessGroupId { get; set; }
}
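Thanks for your help!
You can group by AccessGroupId first, then group by DocumentId within each access group, order by count, and take the top 3. Then you can use SelectMany to flatten the top 3 documents of each access group.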
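var topDocs = sampleData
    .GroupBy(entry => entry.AccessGroupId)
    .Select(accessGroup => new
    {
        AccessGroupId = accessGroup.Key,
        // Count each document inside this access group and keep the three largest counts.
        TopThreeDocs = accessGroup.GroupBy(entry => entry.DocumentId)
                                  .OrderByDescending(subg => subg.Count())
                                  .Take(3)
    })
    // Flatten to one row per (access group, document).
    .SelectMany(ag => ag.TopThreeDocs.Select(doc => new
    {
        ag.AccessGroupId,
        DocumentId = doc.Key,
        Count = doc.Count()
    }));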
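For anyone who wants to try the nested-grouping idea outside LINQPad, here is a minimal self-contained sketch. The `Entry` class and the `TopDocuments.TopPerGroup` helper are hypothetical names introduced only for this illustration (`Entry` stands in for the question's `Foo`), and the explicit `OrderBy` on the group key is an added assumption so that the rows come out sorted by access group.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Foo class.
public class Entry
{
    public int DocumentId { get; set; }
    public int AccessGroupId { get; set; }
}

public static class TopDocuments
{
    // Returns up to n rows per access group, highest count first within each group.
    public static List<(int AccessGroupId, int DocumentId, int Count)> TopPerGroup(
        IEnumerable<Entry> entries, int n)
    {
        return entries
            .GroupBy(e => e.AccessGroupId)
            .OrderBy(accessGroup => accessGroup.Key)
            .SelectMany(accessGroup => accessGroup
                // Count occurrences of each document inside this access group.
                .GroupBy(e => e.DocumentId)
                .Select(docs => (AccessGroupId: accessGroup.Key,
                                 DocumentId: docs.Key,
                                 Count: docs.Count()))
                .OrderByDescending(row => row.Count)
                .Take(n))
            .ToList();
    }
}
```

Calling `TopDocuments.TopPerGroup(entries, 3)` on a list of `Entry` objects yields at most three rows per access group; groups with fewer than three distinct documents simply return fewer rows.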
Change the LINQ query as follows.
var x = (from entry in sampleData
         group entry by new { entry.DocumentId, entry.AccessGroupId } into g
         orderby g.Count() descending
         select new { DocumentId = g.Key.DocumentId, AccessGroupId = g.Key.AccessGroupId, Count = g.Count() }
        ).Take(3);
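Take(3) will pick the top three; hope this helps. Note that applied this way it returns the three highest counts across all access groups combined, not the top three within each AccessGroupId.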
Using LINQ
If you are dealing with a particularly large data set, you may prefer to do this outside of LINQ.
sampleData
    .GroupBy(a => new { a.AccessGroupId, a.DocumentId })
    .Select(a => new { Count = a.Count(), a.Key.AccessGroupId, a.Key.DocumentId })
    .OrderByDescending(a => a.Count)
    .GroupBy(a => a.AccessGroupId)
    .Select(a => new { AccessGroupId = a.Key, Values = a.Take(3) });
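There is a working fiddle if you want to check it out.
Using a Dictionary
It would almost certainly be more efficient to use a Dictionary<int,Dictionary<int,int>> to store the counts.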
var cache = new Dictionary<int,Dictionary<int,int>>();
foreach (var item in sampleData)
{
    // Create the inner dictionary the first time an access group is seen.
    if (!cache.ContainsKey(item.AccessGroupId))
    {
        cache[item.AccessGroupId] = new Dictionary<int,int>();
    }
    // Start the counter at zero the first time a document is seen in this group.
    if (!cache[item.AccessGroupId].ContainsKey(item.DocumentId))
    {
        cache[item.AccessGroupId][item.DocumentId] = 0;
    }
    cache[item.AccessGroupId][item.DocumentId]++;
}
var results = cache
    .Select(a => new {
        AccessGroupId = a.Key,
        Values = a.Value.OrderByDescending(b => b.Value)
                        .Select(b => new { DocumentId = b.Key, Count = b.Value })
                        .Take(3)
    });
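It is less readable, but it is cheaper than using GroupBy, unless you are using LINQ to Something (that is, a query provider that translates the query rather than running it in memory). A fiddle is available for this version too if you want to check it.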
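The counting loop can also be written with `TryGetValue`, which avoids looking up the same key several times per item. The sketch below is only one possible variant: `DictionaryCounting.TopDocumentIds` is a hypothetical helper name, the input is modelled as value tuples instead of `Foo` so the block stands alone, and the `ThenBy` tie-breaker on the document id is an added assumption that the original answer does not make.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DictionaryCounting
{
    // Counts occurrences per (access group, document) with nested dictionaries,
    // then returns the ids of the n most frequent documents for each access group.
    public static Dictionary<int, List<int>> TopDocumentIds(
        IEnumerable<(int AccessGroupId, int DocumentId)> entries, int n)
    {
        var counts = new Dictionary<int, Dictionary<int, int>>();
        foreach (var (accessGroupId, documentId) in entries)
        {
            // One lookup fetches the inner dictionary, creating it on first use.
            if (!counts.TryGetValue(accessGroupId, out var perDocument))
            {
                perDocument = new Dictionary<int, int>();
                counts[accessGroupId] = perDocument;
            }
            // current is left at 0 when the document has not been seen yet.
            perDocument.TryGetValue(documentId, out var current);
            perDocument[documentId] = current + 1;
        }

        return counts.ToDictionary(
            pair => pair.Key,
            pair => pair.Value
                .OrderByDescending(kv => kv.Value)
                .ThenBy(kv => kv.Key) // assumed tie-breaker: lower document id first
                .Select(kv => kv.Key)
                .Take(n)
                .ToList());
    }
}
```

The result maps each access group id to its top document ids, so `result[1]` would hold the ids for access group 1.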
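I found this to be the simplest way:
You first need to get the count for each (AccessGroupId, DocumentId) pair, then order by that count, take the top 3 of each access group, and finally use SelectMany to flatten the list: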
var results = (from entry in sampleData
               group entry by new { entry.AccessGroupId, entry.DocumentId } into g
               select new
               {
                   AccessGroupId = g.Key.AccessGroupId,
                   DocumentId = g.Key.DocumentId,
                   Count = g.Count()
               }).OrderByDescending(x => x.Count)
                 .GroupBy(x => x.AccessGroupId)
                 .SelectMany(x => x.Take(3));
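One detail worth keeping in mind: in the sample data, documents 3 and 5 both have a count of 2 in access group 1. In LINQ to Objects, OrderByDescending is a stable sort, so document 3 wins only because it appears first in the input. If the result should not depend on input order, a secondary sort key can be added. The sketch below assumes that ties should go to the lower DocumentId; `TieBreaking.TopPerGroup` is a hypothetical helper, and the input is modelled as value tuples rather than `Foo`.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TieBreaking
{
    // Top n documents per access group, with ties on Count resolved by DocumentId.
    public static List<(int AccessGroupId, int DocumentId, int Count)> TopPerGroup(
        IEnumerable<(int AccessGroupId, int DocumentId)> entries, int n)
    {
        return entries
            // Count each (access group, document) pair.
            .GroupBy(e => new { e.AccessGroupId, e.DocumentId })
            .Select(g => (AccessGroupId: g.Key.AccessGroupId,
                          DocumentId: g.Key.DocumentId,
                          Count: g.Count()))
            .GroupBy(row => row.AccessGroupId)
            .OrderBy(g => g.Key)
            .SelectMany(g => g
                .OrderByDescending(row => row.Count)
                .ThenBy(row => row.DocumentId) // assumed rule: lower id wins a tie
                .Take(n))
            .ToList();
    }
}
```

With this ordering, the third row for an access group is the same no matter how the input happens to be ordered.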