如何创建大小相等的组
本文关键字:何创建 创建 | 更新日期: 2024-10-18 16:19:27
我有一个按日期显示汽车信息的表。
car_id date <aditional info>
1 20160101
2 20160102
1 20160103
2 20160104
1 20160105
1 20160106
1 20160107
1 20160108
1 20160109
1 20160110
如果汽车有<= 5
记录,那么将只有一个组。CASE car_id = 2
如果汽车有<= 10
记录,那么将有两个相等的组。CASE car_id = 1
按日期排序
- 20160101-20160106
GROUP 1
- 20160107-20160110
GROUP 2
如果汽车有">10",那么将有三个均匀分布的组。
渴望输出
car_id date group_id
1 20160101 1
1 20160103 1
1 20160105 1
1 20160106 1
------------------------
1 20160107 2
1 20160108 2
1 20160109 2
1 20160110 2
------------------------
2 20160102 1
2 20160104 1
我试着用ntile()
,但不能使组号不变。
SQL Fiddle演示
SELECT car_id,
"date",
ntile(3) over (partition by car_id order by "date") as group_id
FROM Table1
如果可以直接在
C# LINQ
上完成,则可以获得加分,否则我将在postgres上创建一个函数。
传统信息我将以不同的颜色(组)显示历史汽车信息,因此数据量较小的汽车将以单一颜色显示。并且颜色的最大数量将是3。
您可以使用ntile
:
SELECT car_id, "date",
ntile(CASE WHEN c <= 5 THEN 1
WHEN c <= 10 THEN 2
ELSE 3
END) OVER (PARTITION BY car_id ORDER BY "date") AS group_id
FROM (SELECT car_id, "date",COUNT(*) OVER(PARTITION BY car_id) AS c
FROM Table1) AS s
SqlFiddleDemo
输出:
╔════════╦══════════╦══════════╗
║ car_id ║ date ║ group_id ║
╠════════╬══════════╬══════════╣
║ 1 ║ 20160101 ║ 1 ║
║ 1 ║ 20160103 ║ 1 ║
║ 1 ║ 20160105 ║ 1 ║
║ 1 ║ 20160106 ║ 1 ║
║ 1 ║ 20160107 ║ 2 ║
║ 1 ║ 20160108 ║ 2 ║
║ 1 ║ 20160109 ║ 2 ║
║ 1 ║ 20160110 ║ 2 ║
║ 2 ║ 20160102 ║ 1 ║
║ 2 ║ 20160104 ║ 1 ║
╚════════╩══════════╩══════════╝
我会使用row_number()
和count()
:手动计算组
select t1.*,
(case when cnt <= 5 then 1
when car_id * 2 <= cnt then 1
else 2
end) as grp
from (select t1.*,
row_number() over (partition by car_id order by date) as seqnum,
count(*) over (partition by car_id) as cnt
from table1
) t
where cnt <= 10
order by car_id, grp, date;
SQL Fiddle演示
正如戈登所建议的那样。
- 首先计算每个
car_id
的日期数 - 然后根据
cnt
分配我想要的组数 - 然后使用
groups
作为ntile(grp)
分析函数
WITH car_dates_count as (
select t1.*,
count(*) over (partition by car_id) as cnt
from table1 t1
),
car_groups as (
select cdc.*,
(case when cnt <= 5 then 1
when cnt <= 10 then 2
else 3
end) as grp
from car_dates_count cdc
)
SELECT *,
ntile(grp) over (partition by car_id order by "date") as group_id
FROM car_groups;
输出
| car_id | date | cnt | grp | group_id |
|--------|----------|-----|-----|----------|
| 1 | 20160101 | 9 | 2 | 1 | '
| 1 | 20160102 | 9 | 2 | 1 | |
| 1 | 20160103 | 9 | 2 | 1 | |
| 1 | 20160104 | 9 | 2 | 1 | |
| 1 | 20160105 | 9 | 2 | 1 | |=> (cnt 9 <= 10) Mean two groups
| 1 | 20160106 | 9 | 2 | 2 | |
| 1 | 20160107 | 9 | 2 | 2 | |
| 1 | 20160108 | 9 | 2 | 2 | |
| 1 | 20160109 | 9 | 2 | 2 | /
--------------------------------------------
| 2 | 20160101 | 5 | 1 | 1 | '
| 2 | 20160102 | 5 | 1 | 1 | |
| 2 | 20160103 | 5 | 1 | 1 | |=> (cnt 5 <= 5) Mean one group
| 2 | 20160104 | 5 | 1 | 1 | |
| 2 | 20160105 | 5 | 1 | 1 | /
--------------------------------------------
| 3 | 20160101 | 16 | 3 | 1 | '
| 3 | 20160102 | 16 | 3 | 1 | |
| 3 | 20160103 | 16 | 3 | 1 | |
| 3 | 20160104 | 16 | 3 | 1 | |
| 3 | 20160105 | 16 | 3 | 1 | |
| 3 | 20160106 | 16 | 3 | 1 | |
| 3 | 20160107 | 16 | 3 | 2 | |
| 3 | 20160108 | 16 | 3 | 2 | |=> (cnt 16 > 10) Mean three groups
| 3 | 20160109 | 16 | 3 | 2 | |
| 3 | 20160110 | 16 | 3 | 2 | |
| 3 | 20160111 | 16 | 3 | 2 | |
| 3 | 20160112 | 16 | 3 | 3 | |
| 3 | 20160113 | 16 | 3 | 3 | |
| 3 | 20160114 | 16 | 3 | 3 | |
| 3 | 20160115 | 16 | 3 | 3 | |
| 3 | 20160116 | 16 | 3 | 3 | /
假设您有如下定义的Car
类,则可以使用Linq
来完成此操作。
public class Car
{
public int car_id;
public DateTime date;
// additional info
}
我们需要应用分组两次,一次在car_id
上,下一次在均匀拆分为多个组时。我更喜欢使用Linq
。
var grouped = cars.GroupBy(c=>c.car_id)
.Select(c =>
new
{
car_grp_id = c.Key,
splits = c.Select((s,i)=>
new
{
grp_id = i/(c.Count() <= 5 ? 5 :(c.Count() %2 ==0)? c.Count() /2 : (c.Count() /3 +1)),
item = s
}),
})
.Select(s=>
new
{
grouponcars = s.splits.GroupBy(g=>g.grp_id)
.Select(x=>
new
{
group_id = x.Key,
cars = x.Select(y=>y.item)
})
})
.ToList();
输出
groupid : 0 - Car_Id : 1 - Date -1/1/2016 12:00:00 AM
groupid : 0 - Car_Id : 1 - Date -1/3/2016 12:00:00 AM
groupid : 0 - Car_Id : 1 - Date -1/5/2016 12:00:00 AM
groupid : 0 - Car_Id : 1 - Date -1/6/2016 12:00:00 AM
----------------------------------------
groupid : 1 - Car_Id : 1 - Date -1/7/2016 12:00:00 AM
groupid : 1 - Car_Id : 1 - Date -1/8/2016 12:00:00 AM
groupid : 1 - Car_Id : 1 - Date -1/9/2016 12:00:00 AM
groupid : 1 - Car_Id : 1 - Date -1/10/2016 12:00:00 AM
----------------------------------------
groupid : 0 - Car_Id : 2 - Date -1/2/2016 12:00:00 AM
groupid : 0 - Car_Id : 2 - Date -1/4/2016 12:00:00 AM
检查此Demo