LINQ: aggregating 30-minute intervals into hours

Keywords: hour, minute, aggregate, LINQ | Updated: 2023-09-27 18:33:41

I'm no LINQ expert, and I have the following data supplied by a third party:

Start: 6:00
End: 6:30
value: 1 
Start: 7:00
End: 7:30
value: 1
Start: 8:00
End: 8:30
value: 1
Start: 9:00
End: 9:30
value: 1
Start: 10:00
End: 10:30
value: 1
Start: 11:00
End: 11:30
value: 1
Start: 12:00
End: 12:30
value: 1
Start: 13:00
End: 13:30
value: 1
Start: 14:00
End: 14:30
value: 1
...
Start: 05:00
End: 05:30
value: 1

The data runs for a week, then for 30 days and 365 days.

I need to combine the 30-minute blocks into one-hour blocks.

For example:

Start: 6:00
End: 7:00
Value: 2
Start: 7:00
End: 8:00
Value: 2
...

Assuming Start, End and Value together make up one row, can someone help me with how to achieve the above?


This query can group by a given AggerationType (the groupType parameter), and it can use the second parameter, checkType, to filter out incomplete groups.

private enum AggerationType { Year = 1, Month = 2, Day = 3, Hour = 4 }
private IList<Data> RunQuery(AggerationType groupType, AggerationType checkType)
{
    // The actual query which does the trick
    var result =
        from d in testList
        group d by new {
            d.Start.Year,
            Month = (int)groupType >= (int)AggerationType.Month ? d.Start.Month : 1,
            Day = (int)groupType >= (int)AggerationType.Day ? d.Start.Day : 1,
            Hour = (int)groupType >= (int)AggerationType.Hour ? d.Start.Hour : 1
        } into g
        // The where clause checks how much data needs to be in the group
        where CheckAggregation(g.Count(), checkType)
        select new Data() { Start = g.Min(m => m.Start), End = g.Max(m => m.End), Value = g.Sum(m => m.Value) };
    return result.ToList();
}
private bool CheckAggregation(int groupCount, AggerationType checkType)
{
    int requiredCount = 1;
    switch(checkType)
    {
        // For year all data must be multiplied by 12 months
        case AggerationType.Year:
            requiredCount = requiredCount * 12; 
            goto case AggerationType.Month;
        // For months all data must be multiplied by days in month
        case AggerationType.Month:
            // I use 30 but this depends on the given month and year
            requiredCount = requiredCount * 30; 
            goto case AggerationType.Day;
        // For days all data need to be multiplied by 24 hour
        case AggerationType.Day:
            requiredCount = requiredCount * 24;
            goto case AggerationType.Hour;
        // For hours all data need to be multiplied by 2 (because slots of 30 minutes)
        case AggerationType.Hour:
            requiredCount = requiredCount * 2;
            break;
    }
    return groupCount == requiredCount;
}

Here is some test data, if you need it:

class Data
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public int Value { get; set; }
}
// Just set up some test data similar to your example
IList<Data> testList = new List<Data>();
DateTime date = DateTime.Parse("6:00"); 
// This loop fills just some data over several years, months and days
for (int year = date.Year; year > 2010; year--)
{
    for(int month = date.Month; month > 0; month--)
    {
        for (int day = date.Day; day > 0; day--)
        {
            for(int hour = date.Hour; hour > 0; hour--)
            {
                DateTime testDate = date.AddHours(-hour).AddDays(-day).AddMonths(-month).AddYears(-(date.Year - year));
                testList.Add(new Data() { Start = testDate, End = testDate.AddMinutes(30), Value = 1 });
                testList.Add(new Data() { Start = testDate.AddMinutes(30), End = testDate.AddHours(1), Value = 1 });
            }
        }
    }
}
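
For the hourly case alone, the grouping idea can be sketched more compactly. This is a minimal sketch, not the answer's own code: it reuses the `Data` class from the test data, while `HourlyAggregation` and `ToHourly` are hypothetical names introduced here. It assumes every slot lies within a single calendar hour and that a complete hour consists of exactly two slots.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public int Value { get; set; }
}

public static class HourlyAggregation
{
    // Hypothetical helper: groups half-hour slots by the calendar hour of
    // their Start and keeps only hours for which both slots are present.
    public static List<Data> ToHourly(IEnumerable<Data> slots)
    {
        return slots
            .GroupBy(d => new DateTime(d.Start.Year, d.Start.Month, d.Start.Day, d.Start.Hour, 0, 0))
            .Where(g => g.Count() == 2)          // drop incomplete hours
            .OrderBy(g => g.Key)
            .Select(g => new Data
            {
                Start = g.Min(d => d.Start),
                End = g.Max(d => d.End),
                Value = g.Sum(d => d.Value)
            })
            .ToList();
    }

    public static void Main()
    {
        var day = new DateTime(2023, 1, 1);
        var slots = new List<Data>
        {
            new Data { Start = day.AddHours(6),   End = day.AddHours(6.5), Value = 1 },
            new Data { Start = day.AddHours(6.5), End = day.AddHours(7),   Value = 1 },
            // Only one slot for the 7:00 hour, so that hour is filtered out.
            new Data { Start = day.AddHours(7),   End = day.AddHours(7.5), Value = 1 },
        };

        foreach (var h in ToHourly(slots))
            Console.WriteLine($"Start: {h.Start:H:mm} End: {h.End:H:mm} Value: {h.Value}");
    }
}
```

Unlike `RunQuery`, this sketch hard-codes the hour as the grouping key, so it gives up the year/month/day flexibility in exchange for brevity.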

Here is the code. Because of the switch statement it looks a bit ugly. It would be better to refactor it, but it should show the idea.

var items = input.Split('\n');
Func<string, string> f = s =>
{
    var strings = s.Split(new[] {':'}, 2);
    var key = strings[0];
    var value = strings[1];
    switch (key.ToLower())
    {
        case "start":
            return s;
        case "value":
            return String.Format("{0}: {1}", key, Int32.Parse(value) + 1);
        case "end": 
            return String.Format("{0}: {1:H:mm}", key,
                DateTime.Parse(value) +
                TimeSpan.FromMinutes(30));
        default:
            return "";
    }
};
var resultItems = items.Select(f);
Console.Out.WriteLine("result = {0}",
                          String.Join(Environment.NewLine, resultItems));
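
It is actually hard to solve this completely with pure LINQ. To make life easier, you need to write at least one helper method that lets you transform the enumeration. See the example below. Here I work with an IEnumerable of TimeInterval and have a custom Split method (implemented with a C# iterator) that joins two consecutive elements together in a Tuple: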

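class TimeInterval
{
    // Public properties so the object initializers below compile
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public int Value { get; set; }
}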
IEnumerable<TimeInterval> ToHourlyIntervals(
    IEnumerable<TimeInterval> halfHourlyIntervals)
{
    return
        from pair in Split(halfHourlyIntervals)
        select new TimeInterval
        {
            Start = pair.Item1.Start,
            End = pair.Item2.End,
            Value = pair.Item1.Value + pair.Item2.Value
        };
}
static IEnumerable<Tuple<T, T>> Split<T>(
    IEnumerable<T> source)
{
    using (var enumerator = source.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            T first = enumerator.Current;
            if (enumerator.MoveNext())
            {            
                T second = enumerator.Current;
                yield return Tuple.Create(first, second);
            }
        }
    }
}

The same approach can be applied to the first part of the problem (extracting the half-hourly TimeInterval objects from a list of strings):

IEnumerable<TimeInterval> ToHalfHourlyIntervals(
    IEnumerable<string> inputLines)
{
    return
        from triple in TripleSplit(inputLines)
        select new TimeInterval
        {
            Start = DateTime.Parse(triple.Item1.Replace("Start: ", "")),
            End = DateTime.Parse(triple.Item2.Replace("End: ", "")),
            Value = Int32.Parse(triple.Item3.Replace("value: ", ""))
        };
}

Here I use a custom TripleSplit method that returns a Tuple<T, T, T> (which is easy to write). With that in place, the complete solution looks like this:

// Read data lazily from disk (or any other source)
var lines = File.ReadLines(path);
var halfHourlyIntervals = ToHalfHourlyIntervals(lines);
var hourlyIntervals = ToHourlyIntervals(halfHourlyIntervals);
foreach (var interval in hourlyIntervals)
{
    // process
}
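
The answer does not show `TripleSplit` itself. The following is a minimal sketch of how it might look, assuming it follows the same iterator pattern as `Split`. The class name `EnumerableHelpers` is hypothetical, and the sketch assumes the input contains only Start/End/value lines in that order (no blank lines or `...` separators).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableHelpers
{
    // Yields consecutive, non-overlapping triples. A trailing group of fewer
    // than three elements is dropped, just as Split drops an unpaired element.
    public static IEnumerable<Tuple<T, T, T>> TripleSplit<T>(IEnumerable<T> source)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                T first = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                T second = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                T third = enumerator.Current;
                yield return Tuple.Create(first, second, third);
            }
        }
    }

    public static void Main()
    {
        var lines = new[]
        {
            "Start: 6:00", "End: 6:30", "value: 1",
            "Start: 7:00", "End: 7:30", "value: 1",
            "Start: 8:00" // incomplete record, ignored
        };

        foreach (var t in TripleSplit(lines))
            Console.WriteLine($"{t.Item1} | {t.Item2} | {t.Item3}");
    }
}
```

Because it is an iterator, this helper stays deferred as well: it pulls only three lines from the source for each triple it yields.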

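The nice thing about this solution is that it is fully deferred. It processes one line at a time, which lets you work through arbitrarily large sources without any danger of an out-of-memory exception. That seems important given your stated requirement:

The data runs for a week, then for 30 days and 365 days.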