历史Linq过滤器行差异

本文关键字:过滤器 Linq 历史 | 更新日期: 2023-09-27 18:09:32

我有一个存储产品更改历史的表,并且希望获得Col1或Col2或Col3中有更改的记录列表,但不显示这三列中任何一列中没有更改的记录。

下面是一个用SQL完成的示例。你和Linq相处得怎么样?

创建测试临时表

CREATE TABLE #ProductHistorical(
    IdProductHistorical int IDENTITY(1,1) NOT NULL,
    IdProduct int NOT NULL,
    DateChange datetime NULL,
    Col1 int NOT NULL,
    Col2 int NOT NULL,
    Col3 int NOT NULL,
 CONSTRAINT PK_ProductHistorical PRIMARY KEY CLUSTERED (IdProductHistorical ASC))
GO

插入测试数据

INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13900000000 AS DateTime), 1, 2, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13A00000000 AS DateTime), 1, 2, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13B00000000 AS DateTime), 1, 2, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13C00000000 AS DateTime), 1, 1, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13D00000000 AS DateTime), 1, 1, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13E00000000 AS DateTime), 2, 2, 2)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A13F00000000 AS DateTime), 2, 2, 2)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A14000000000 AS DateTime), 2, 2, 2)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (1, CAST(0x0000A14100000000 AS DateTime), 1, 2, 3)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (2, CAST(0x0000A14200000000 AS DateTime), 1, 1, 1)
INSERT #ProductHistorical ( IdProduct, DateChange, Col1, Col2, Col3) VALUES (2, CAST(0x0000A14300000000 AS DateTime), 1, 1, 2)
SQL查询

SELECT  phWithChanges.DateChange,
       phWithChanges.Col1,
       phWithChanges.Col2,
       phWithChanges.Col3 
FROM #ProductHistorical ph 
       CROSS APPLY (
          SELECT TOP 1 * 
          FROM #ProductHistorical phPost 
          WHERE phPost.IdProduct=ph.IdProduct AND
                phPost.IdProductHistorical>ph.IdProductHistorical AND
                (phPost.Col1<>ph.Col1 OR phPost.Col2<>ph.Col2 OR phPost.Col2<>ph.Col2)
          ORDER BY phPost.IdProductHistorical ASC) phWithChanges
WHERE ph.IdProduct=1
GROUP BY phWithChanges.DateChange,phWithChanges.Col1,phWithChanges.Col2,phWithChanges.Col3
UNION
    --Add First Row
    SELECT * FROM
       (SELECT TOP 1 
          phFirst.DateChange,
          phFirst.Col1,
          phFirst.Col2,
          phFirst.Col3
       FROM #ProductHistorical phFirst
       WHERE phFirst.IdProduct=1 ORDER BY phFirst.IdProductHistorical) rowFirst
ORDER BY 1

数据
IdProductHistorical IdProduct   DateChange              Col1        Col2        Col3
------------------- ----------- ----------------------- ----------- ----------- -----------
1                   1           2013-01-01 00:00:00.000 1           2           3
2                   1           2013-01-02 00:00:00.000 1           2           3
3                   1           2013-01-03 00:00:00.000 1           2           3
4                   1           2013-01-04 00:00:00.000 1           1           3
5                   1           2013-01-05 00:00:00.000 1           1           3
6                   1           2013-01-06 00:00:00.000 2           2           2
7                   1           2013-01-07 00:00:00.000 2           2           2
8                   1           2013-01-08 00:00:00.000 2           2           2
9                   1           2013-01-09 00:00:00.000 1           2           3
10                  2           2013-01-10 00:00:00.000 1           1           1
11                  2           2013-01-11 00:00:00.000 1           1           2
结果

DateChange              Col1        Col2        Col3
----------------------- ----------- ----------- -----------
2013-01-01 00:00:00.000 1           2           3
2013-01-04 00:00:00.000 1           1           3
2013-01-06 00:00:00.000 2           2           2
2013-01-09 00:00:00.000 1           2           3

你用Linq做得怎么样?

第一种方法

var query=(
        from ph in ProductHistorical.Where(p=>p.IdProduct==1)
        orderby ph.DateChange ascending 
        select new ProductHistoricalItem
                   {
                       DateChange = ph.DataChange,
                       Col1 = ph.Col1,
                       Col2 = ph.Col2,
                       Col3 = ph.Col3
                   });

List<ProductHistoricalItem> listResult=new List<ProductHistoricalItem>();
ProductHistoricalItem previous = null;
foreach (ProductHistoricalItem item in query)
{
    if (previous == null ||
        previous.Col1 != item.Col1 ||
        previous.Col2 != item.Col2 ||
        previous.Col3 != item.Col3)
    {
        listResult.Add(item);
        previous = item;
    }
}

这不是很有效。我怎么能不使用循环呢?

历史Linq过滤器行差异

基本上,我试图完全应用您的逻辑并将其转换为Linq代码。

var linqQuery = context.ProductHistoricals
    .SelectMany(ph => context.ProductHistoricals, (ph, phPost) => new { ph = ph, phPost = phPost }) // cross join
    .Where(a => a.ph.IdProduct == a.phPost.IdProduct
            && a.ph.IdProductHistorical > a.phPost.IdProductHistorical
            && (
                a.phPost.Col1 != a.ph.Col1 
             || a.phPost.Col2 != a.ph.Col2 
             || a.phPost.Col3 != a.ph.Col3))
    .Select(a => a.ph)
    .GroupBy(p => new { p.IdProduct, p.Col1, p.Col2, p.Col3 })
    .Select(p => p.OrderBy(phPost => phPost.IdProductHistorical).FirstOrDefault())
    .Union
    (
       // add first row
       context.ProductHistoricals
              .GroupBy(t => t.IdProduct)
              .Select(t => t.OrderBy(p => p.IdProductHistorical).FirstOrDefault())
    );

这个查询返回

1   2013-01-01  1   2   3
1   2013-01-04  1   1   3
1   2013-01-06  2   2   2
1   2013-01-09  1   2   3
2   2013-01-10  1   1   1
2   2013-01-11  1   1   2

作为参考,下面是生成的SQL:

SELECT [t10].[test], [t10].[IdProductHistorical], [t10].[IdProduct], [t10].[DateChange], [t10].[Col1], [t10].[Col2], [t10].[Col3]
FROM (
    SELECT [t5].[test], [t5].[IdProductHistorical], [t5].[IdProduct], [t5].[DateChange], [t5].[Col1], [t5].[Col2], [t5].[Col3]
    FROM (
        SELECT [t0].[IdProduct], [t0].[Col1], [t0].[Col2], [t0].[Col3]
        FROM [dbo].[ProductHistorical] AS [t0], [dbo].[ProductHistorical] AS [t1]
        WHERE ([t0].[IdProduct] = [t1].[IdProduct]) AND ([t0].[IdProductHistorical] > [t1].[IdProductHistorical]) AND (([t1].[Col1] <> [t0].[Col1]) OR ([t1].[Col2] <> [t0].[Col2]) OR ([t1].[Col3] <> [t0].[Col3]))
        GROUP BY [t0].[IdProduct], [t0].[Col1], [t0].[Col2], [t0].[Col3]
        ) AS [t2]
    OUTER APPLY (
        SELECT TOP (1) 1 AS [test], [t3].[IdProductHistorical], [t3].[IdProduct], [t3].[DateChange], [t3].[Col1], [t3].[Col2], [t3].[Col3]
        FROM [dbo].[ProductHistorical] AS [t3], [dbo].[ProductHistorical] AS [t4]
        WHERE ([t2].[IdProduct] = [t3].[IdProduct]) AND ([t2].[Col1] = [t3].[Col1]) AND ([t2].[Col2] = [t3].[Col2]) AND ([t2].[Col3] = [t3].[Col3]) AND ([t3].[IdProduct] = [t4].[IdProduct]) AND ([t3].[IdProductHistorical] > [t4].[IdProductHistorical]) AND (([t4].[Col1] <> [t3].[Col1]) OR ([t4].[Col2] <> [t3].[Col2]) OR ([t4].[Col3] <> [t3].[Col3]))
        ORDER BY [t3].[IdProductHistorical]
        ) AS [t5]
    UNION
    SELECT [t9].[test], [t9].[IdProductHistorical], [t9].[IdProduct], [t9].[DateChange], [t9].[Col1], [t9].[Col2], [t9].[Col3]
    FROM (
        SELECT [t6].[IdProduct]
        FROM [dbo].[ProductHistorical] AS [t6]
        GROUP BY [t6].[IdProduct]
        ) AS [t7]
    OUTER APPLY (
        SELECT TOP (1) 1 AS [test], [t8].[IdProductHistorical], [t8].[IdProduct], [t8].[DateChange], [t8].[Col1], [t8].[Col2], [t8].[Col3]
        FROM [dbo].[ProductHistorical] AS [t8]
        WHERE [t7].[IdProduct] = [t8].[IdProduct]
        ORDER BY [t8].[IdProductHistorical]
        ) AS [t9]
    ) AS [t10]

如果你愿意实现你自己的LINQ扩展方法,你可以使用"WhereWithPrevious"版本(最初的实现是SelectWithPrevious,从这里开始),像这样:

public static IEnumerable<TSource> WhereWithPrevious<TSource>
(this IEnumerable<TSource> source,
 Func<TSource, TSource, bool> selector)
{
    using (var iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
             yield break;
        }
        TSource previous = default(TSource);
        // return the first item always
        yield return iterator.Current;
        while (iterator.MoveNext())
        {
            if(previous != null && selector(previous, iterator.Current))
            {
                yield return iterator.Current;
            }
            previous = iterator.Current;
        }
    }
}

,查询看起来像这样:

ProductHistorical.Where(p => p.Id == 1)
        .Select(
            p => new
            {
                DateChange = p.Date,
                Col1 = p.Col1,
                Col2 = p.Col2,
                Col3 = p.Col3
            })
        .OrderBy(p => p.DateChange)
        .WhereWithPrevious((prev, curr) =>
               prev.Col1 != curr.Col1
            || prev.Col2 != curr.Col2
            || prev.Col3 != curr.Col3)

这个怎么样?

int? col1 = null;
int? col2 = null;
int? col3 = null;
ProductHistorical.GroupBy(p=>p.IdProduct)
.Select(grp => grp.OrderBy(p=> p.DateChange)
    .Select(p => {
        var changed = col1 != p.Col1 || col2 != p.Col2 || col3 != p.Col3;
        col1 = p.Col1; col2 = p.Col2; col3 = P.Col3;
        return new { p = p, changed = changed };
    }
    .Where(p => p.changed)
    .Select(p => p.p)
)
.SelectMany(p => p)