最有效的结构,快速访问随机元素+最后一个元素的一个标准

本文关键字:元素 最后一个 标准 一个 随机 结构 有效 访问 | 更新日期: 2023-09-27 18:18:37

我们有一个函数,它被告知我们收到了一个特定时间戳的项。

这样做的目的是等待一个特定的时间戳,我们等待我们收到我们期望的每个项目,然后在我们与所有项目"同步"后进一步推送通知。

目前,我们有一个Dictionary<DateTime, TimeSlot>来存储非同步的TimeSlot(TimeSlot =我们收到的特定时间戳的所有项目的列表)。

//Let's assume that this method is not called concurrently, and only once per "MyItem"
public void HandleItemReceived(DateTime timestamp, MyItem item){
    TimeSlot slot;
    //_pendingTimeSlot is a Dictionary<DateTime,TimeSlot>
    if(!_pendingTimeSlot.TryGetValue(timestamp, out slot)){
        slot = new TimeSlot(timestamp);
        _pendingTimeSlot.Add(timestamp,slot );
        //Sometimes we don't receive all the items for one timestamps, which may leads to some ghost-incomplete TimeSlot
        if(_pendingTimeSlot.Count>_capacity){
            TimeSlot oldestTimeSlot = _pendingTimeSlot.OrderBy(t=>t.Key).Select(t=>t.Value).First();
            _pendingTimeSlot.Remove(oldestTimeSlot.TimeStamp);
            //Additional work here to log/handle this case
        }
    }
    slot.HandleItemReceived(item);
    if(slot.IsComplete){
        PushTimeSlotSyncronized(slot);
        _pendingTimeSlot.Remove(slot.TimeStamp);
    }
}

对于不同组的项目,我们有几个并行的"Synchronizer"实例。

它工作得很好,除了当系统处于高负载下时,我们有更多的不完整时隙,并且应用程序使用更多的CPU。分析器似乎表明LINQ查询的Compare花费了大量时间(大部分时间)。所以我试图找到一些结构来保存这些引用(替换字典)

以下是一些指标:

  • 我们有几个(可变的,但在10到20之间)这个同步器的实例
  • 同步器当前最大容量(_capacity)为500项
  • 我们可以在两个不同的时间戳之间拥有的最短间隔是100ms(因此每个同步器每秒有10个新字典条目)(大多数情况下超过1项/秒)
  • 对于每个时间戳,我们期望接收300-500个条目。

因此,对于一个同步器,我们将执行每秒(最坏情况):

  • 1添加
  • 500年获得
  • <
  • 3 - 5类/gh>
我最好的招数是什么?我想到了SortedDictionary,但我没有找到任何文档显示我如何根据键取第一个元素。

最有效的结构,快速访问随机元素+最后一个元素的一个标准

你可以尝试的第一件事是消除OrderBy -所有你需要的是最小键,不需要排序得到它:

if (_pendingTimeSlot.Count > _capacity) {
    // No Enumerable.Min(DateTime), so doing it manually
    var oldestTimeStamp = DateTime.MaxValue;
    foreach (var key in _pendingTimeSlot.Keys)
        if (oldestTimeStamp > key) oldestTimestamp = key;
    _pendingTimeSlot.Remove(oldestTimeStamp);
    //Additional work here to log/handle this case
}

那么SortedDictionary呢,它肯定是一个选项,尽管它会消耗更多的内存。由于它是排序的,您可以简单地使用sortedDictionary.First()来获取具有最小键的键值对(因此在您的示例中是最老的元素)。

UPDATE:这是一种使用字典进行快速查找和使用有序双链表进行其他场景的混合方法。

class MyItem
{
    // ...
}
class TimeSlot
{
    public readonly DateTime TimeStamp;
    public TimeSlot(DateTime timeStamp)
    {
        TimeStamp = timeStamp;
        // ...
    }
    public bool IsComplete = false;
    public void HandleItemReceived(MyItem item)
    {
        // ...
    }
    // Dedicated members
    public TimeSlot PrevPending, NextPending;
}
class Synhronizer
{
    const int _capacity = 500;
    Dictionary<DateTime, TimeSlot> pendingSlotMap = new Dictionary<DateTime, TimeSlot>(_capacity + 1);
    TimeSlot firstPending, lastPending;
    //Let's assume that this method is not called concurrently, and only once per "MyItem"
    public void HandleItemReceived(DateTime timeStamp, MyItem item)
    {
        TimeSlot slot;
        if (!pendingSlotMap.TryGetValue(timeStamp, out slot))
        {
            slot = new TimeSlot(timeStamp);
            Add(slot);
            //Sometimes we don't receive all the items for one timestamps, which may leads to some ghost-incomplete TimeSlot
            if (pendingSlotMap.Count > _capacity)
            {
                // Remove the oldest, which in this case is the first
                var oldestSlot = firstPending;
                Remove(oldestSlot);
                //Additional work here to log/handle this case
            }
        }
        slot.HandleItemReceived(item);
        if (slot.IsComplete)
        {
            PushTimeSlotSyncronized(slot);
            Remove(slot);
        }
    }
    void Add(TimeSlot slot)
    {
        pendingSlotMap.Add(slot.TimeStamp, slot);
        // Starting from the end, search for a first slot having TimeStamp < slot.TimeStamp
        // If the TimeStamps almost come in order, this is O(1) op.
        var after = lastPending;
        while (after != null && after.TimeStamp > slot.TimeStamp)
            after = after.PrevPending;
        // Insert the new slot after the found one (if any).
        if (after != null)
        {
            slot.PrevPending = after;
            slot.NextPending = after.NextPending;
            after.NextPending = slot;
            if (slot.NextPending == null) lastPending = slot;
        }
        else
        {
            if (firstPending == null)
                firstPending = lastPending = slot;
            else
            {
                slot.NextPending = firstPending;
                firstPending.PrevPending = slot;
                firstPending = slot;
            }
        }
    }
    void Remove(TimeSlot slot)
    {
        pendingSlotMap.Remove(slot.TimeStamp);
        if (slot.NextPending != null)
            slot.NextPending.PrevPending = slot.PrevPending;
        else
            lastPending = slot.PrevPending;
        if (slot.PrevPending != null)
            slot.PrevPending.NextPending = slot.NextPending;
        else
            firstPending = slot;
        slot.PrevPending = slot.NextPending = null;
    }
    void PushTimeSlotSyncronized(TimeSlot slot)
    {
        // ...
    }
}

一些附加用法:

从旧的到最新的迭代:

for (var slot = firstPending; slot != null; slot = slot.NextPending)
{
    // do something
}

从最旧的到最新的迭代,并根据标准删除项:

for (TimeSlot slot = firstPending, nextSlot; slot != null; slot = nextSlot)
{
    nextSlot = slot.NextPending;
    if (ShouldRemove(slot))
        Remove(slot);
}

对于反向场景也是一样,但是使用lastPendingPrevPending成员代替。

下面是一个简单的示例。列表中的insert方法消除了元素交换。

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Data> inputs = new List<Data>() {
                new Data() { date = DateTime.Parse("10/22/15 6:00AM"), data = "abc"},
                new Data() { date = DateTime.Parse("10/22/15 4:00AM"), data = "def"},
                new Data() { date = DateTime.Parse("10/22/15 6:30AM"), data = "ghi"},
                new Data() { date = DateTime.Parse("10/22/15 12:00AM"), data = "jkl"},
                new Data() { date = DateTime.Parse("10/22/15 3:00AM"), data = "mno"},
                new Data() { date = DateTime.Parse("10/22/15 2:00AM"), data = "pqr"},
            };
            Data data = new Data();
            foreach (Data input in inputs)
            {
                data.Add(input);
            }
        }
    }
    public class Data
    {
        public static List<Data> sortedData = new List<Data>();
        public DateTime date { get; set; }
        public string data { get; set;}

        public void Add(Data newData)
        {
            if(sortedData.Count == 0)
            {
                sortedData.Add(newData);
            }
            else
            {
               Boolean added = false;
               for(int index = sortedData.Count - 1; index >= 0; index--)
               {
                   if(newData.date > sortedData[index].date)
                   {
                       sortedData.Insert(index + 1, newData);
                       added = true;
                       break;
                   }
               }
               if (added == false)
               {
                   sortedData.Insert(0, newData);
               }
            }
        }
    }
}​