查询以从每个组中删除早于n个活动日期的记录
本文关键字:活动日 活动 日期 记录 删除 查询 | 更新日期: 2023-09-27 18:25:10
我有一个数据集,需要每天修剪。它是从一个定期将记录写入表的进程中填充的。
我目前有一个简单的查询可以做到这一点:
DELETE FROM dataTable WHERE entryDate < dateadd(day, -5, GETDATE())
但问题是,这个过程是不可靠的;可能有几天根本没有写入任何数据。
因此,我真正需要的是一个查询,它可以追溯到写入数据的5天(可能是非连续的),而不是5个日历天。
例如,如果我运行以下查询:
SELECT cast(entryDate as date) as LogDate
FROM dataTable
group by category, cast(entryDate as date)
order by cast(entryDate as date) desc
我可能会得到一个结果:
Category Date
Foo 2015-11-30
Foo 2015-11-29
Foo 2015-11-26
Foo 2015-11-25
Foo 2015-11-21
Foo 2015-11-20 <-- Start Pruning here, not the 25th.
Foo 2015-11-19
Foo 2015-11-18
Bar 2015-11-30
Bar 2015-11-29
Bar 2015-11-28
Bar 2015-11-27
Bar 2015-11-26
Bar 2015-11-25 <-- This one is OK to prune at the 25th.
Bar 2015-11-24
Bar 2015-11-23
在删除之前,我需要查询一直返回到第20个。
您可以使用row_number
来获取表有条目的最后5天。然后根据生成的数字进行删除。
SQL Fiddle
with rownums as (SELECT row_number() over(partition by category order by cast(entryDate as date) desc) as rn
,*
FROM dataTable
)
delete from rownums where rn <= 5 --use > 5 for records prior to the last 5 days
如果每天可以有多个条目,请使用dense_rank
对行进行编号。
with rownums as (SELECT dense_rank() over(partition by category order by cast(entryDate as date) desc) as rn
,*
FROM dataTable)
delete from rownums where rn > 5;
试试这样的方法。
;WITH orderedDates (LogDate, RowNum)
AS
(
SELECT [CACHEDATE] AS LogDate, ROW_NUMBER() OVER (ORDER BY CACHEDATE DESC) AS RowNum
FROM
dataTable
GROUP BY CACHEDATE
)
DELETE dataTable
WHERE CACHEDATE IN
(SELECT LogDate FROM orderedDates
WHERE ROWNUM > 5) --or however many you need to go back