多个排序列表的分页
本文关键字:分页 列表 排序 | 更新日期: 2023-09-27 18:18:04
我有一个未知数量的有序列表,我需要对它们进行分页。例如,当页面大小为6时,这3个列表的页面看起来应该是这样的。
- List1: 01、02、03、04、05、06、07、08、09、10
- 清单2:11、12、13、14、15
- List3: 16、17、18、19日,20日,21日,22日,23日,24日,25日,26日,27日、28日
结果页面:
- Page1: 01,11,16,02,12,17 所以Page2
- : 03, 13日,18日04,14日19
- Page3: 05年,15日,20日,21日,07年 08年
- Page4: 22日,23日,09年,24日10
- page5: 25日,26日,27日、28日
当给定页码时,从每个列表(起始索引和项目数量)中获取哪些项目的最有效方法是什么?
考虑到每个列表可能有几十万个元素,所以遍历所有元素的效率不高。
我不能说这是否是最有效的方法,但这里有一个算法具有O(M*Log2(M))时间复杂度,其中M是列表的数量。它的工作原理如下。输入集合按照项目Count
升序分组和排序,迭代直到有效的开始索引适合当前范围,跳过之前的范围。这是可能的,因为在每一步中我们都知道它是最小计数,因此所有剩余的列表都在该范围内。完成这些之后,我们从剩下的列表中发出页面项。
函数如下:
static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int pageIndex)
{
int start = pageIndex * pageSize;
var counts = new int[itemLists.Count];
for (int i = 0; i < counts.Length; i++)
counts[i] = itemLists[i].Count;
Array.Sort(counts);
int listCount = counts.Length;
int itemIndex = 0;
for (int i = 0; i < counts.Length; i++)
{
int itemCount = counts[i];
if (itemIndex < itemCount)
{
int rangeLength = listCount * (itemCount - itemIndex);
if (start < rangeLength) break;
start -= rangeLength;
itemIndex = itemCount;
}
listCount--;
}
if (listCount > 0)
{
var listQueue = new List<T>[listCount];
listCount = 0;
foreach (var list in itemLists)
if (itemIndex < list.Count) listQueue[listCount++] = list;
itemIndex += start / listCount;
int listIndex = 0;
int skipCount = start % listCount;
int nextCount = 0;
int yieldCount = 0;
while (true)
{
var list = listQueue[listIndex];
if (skipCount > 0)
skipCount--;
else
{
yield return list[itemIndex];
if (++yieldCount >= pageSize) break;
}
if (itemIndex + 1 < list.Count)
{
if (nextCount != listIndex)
listQueue[nextCount] = list;
nextCount++;
}
if (++listIndex < listCount) continue;
if (nextCount == 0) break;
itemIndex++;
listIndex = 0;
listCount = nextCount;
nextCount = 0;
}
}
}
and test:
static void Main(string[] args)
{
var data = new List<List<int>>
{
new List<int> { 01, 02, 03, 04, 05, 06, 07, 08, 09, 10 },
new List<int> { 11, 12, 13, 14, 15 },
new List<int> { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 },
};
int totalCount = data.Sum(list => list.Count);
int pageSize = 6;
int pageCount = 1 + (totalCount - 1) / pageSize;
for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
Console.WriteLine("Page #{0}: {1}", pageIndex + 1, string.Join(", ", GetPageItems(data, pageSize, pageIndex)));
Console.ReadLine();
}
我认为两步就可以很好地完成:
- 将你的列表平铺成一个列表(按你描述的方式排序)。
- 从该列表中获取所需页面的项目。
要完成第1步,我将做如下建议:合并多个列表
那么,(假设你的页面项目是整型,就像你的例子一样),这里有一个很好的方法,可以找到你想要的:
static IEnumerable<int> GetPageItems(IEnumerable<List<int>> itemLists, int pageSize, int page)
{
var mergedOrderedItems = itemLists.SelectMany(x => x.Select((s, index) => new { s, index }))
.GroupBy(x => x.index)
.SelectMany(x => x.Select(y => y.s));
// assuming that the first page is page 1, not page 0:
var startingIndex = pageSize * (page - 1);
var pageItems = mergedOrderedItems.Skip(startingIndex)
.Take(pageSize);
return pageItems;
}
注意-您不必担心传入的页#超过了给定总条目数可能存在的总页数…由于Linq的魔力,这个方法将简单地返回一个空IEnumerable。同样,如果Take(pageSize)的结果小于"pageSize",它只返回它确实找到的项目。
我将提交另一个基于Bear的实现。S对我第一个答案的反馈。这个很低级,但性能很好。它有两个主要部分:
-
找出哪个项目应该首先出现在页面上(特别是包含它的列表的索引是什么,以及该项目在该列表中的索引是什么)。
-
根据需要,按照正确的顺序从所有列表中取出项目(直到我们拥有所有需要的项目或耗尽项目)。
该实现在步骤1中不迭代单个列表。它确实使用了List。计数属性,但这是一个0(1)的操作。
因为我们在这里要考虑性能,所以代码不一定像我希望的那样自我描述,所以我放了一些注释来帮助解释逻辑:
static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int page)
{
if (page < 1)
{
return new List<T>();
}
// a simple copy so that we don't change the original (the individual Lists inside are untouched):
var lists = itemLists.ToList();
// Let's find the starting indexes for the first item on this page:
var currItemIndex = 0;
var currListIndex = 0;
var itemsToSkipCount = pageSize * (page - 1); // <-- assuming that the first page is page 1, not page 0
// I'll just break out of this loop manually, because I think this configuration actually makes
// the logic below a little easier to understand. Feel free to change it however you see fit :)
while (true)
{
var listsCount = lists.Count;
if (listsCount == 0)
{
return new List<T>();
}
// Let's consider a horizontal section of items taken evenly from all lists (based on the length of
// the shortest list). We don't need to iterate any items in the lists; Rather, we'll just count
// the total number of items we could get from this horizontal portion, and set our indexes accordingly...
var shortestListCount = lists.Min(x => x.Count);
var itemsWeAreConsideringCount = listsCount * (shortestListCount - currItemIndex);
// Does this horizontal section contain at least as many items as we must skip?
if (itemsWeAreConsideringCount >= itemsToSkipCount)
{ // Yes: So mathematically find the indexes of the first page item, and we're done.
currItemIndex += itemsToSkipCount / listsCount;
currListIndex = itemsToSkipCount % listsCount;
break;
}
else
{ // No: So we need to keep going. Let's increase currItemIndex to the end of this horizontal
// section, remove the shortest list(s), and the loop will continue with the remaining lists:
currItemIndex = shortestListCount;
lists.RemoveAll(x => x.Count == shortestListCount);
itemsToSkipCount -= itemsWeAreConsideringCount;
}
}
// Ok, we've got our starting indexes, and the remaining lists that still have items in the index range.
// Let's get our items from those lists:
var pageItems = new List<T>();
var largestListCount = lists.Max(x => x.Count);
// Loop until we have enough items to fill the page, or we run out of items:
while (pageItems.Count < pageSize && currItemIndex < largestListCount)
{
// Taking from one list at a time:
var currList = lists[currListIndex];
// If the list has an element at this index, get it:
if (currItemIndex < currList.Count)
{
pageItems.Add(currList[currItemIndex]);
}
// else... this list has no more elements.
// We could throw away this list, since it's pointless to iterate over it any more, but that might
// change the indices of other lists... for simplicity, I'm just gonna let it be... since the above
// logic simply ignores an empty list.
currListIndex++;
if (currListIndex == lists.Count)
{
currListIndex = 0;
currItemIndex++;
}
}
return pageItems;
}
下面是一些测试代码,使用了三个列表。我可以在几毫秒内从页面1,000,000中抓取6个条目:)
var list1 = Enumerable.Range(0, 10000000).ToList();
var list2 = Enumerable.Range(10000000, 10000000).ToList();
var list3 = Enumerable.Range(20000000, 10000000).ToList();
var lists = new List<List<int>> { list1, list2, list3 };
var timer = new Stopwatch();
timer.Start();
var items = GetPageItems(lists, 6, 1000000).ToList();
var count = items.Count();
timer.Stop();