Complex LINQ query
Keywords: query, LINQ, complex | Updated: 2023-09-27 18:33:23
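I have a table in the database with 2 fields: index (int), email (varchar(100)).
I need to do the following:
- Group all emails by domain name (all emails are already lowercased).
- Select the emails from all groups, such that no domain contributes more than 20% of the total number of emails counted before step 1.
Code example: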
DataContext db = new DataContext();
//Domains to group by
List<string> domains = new List<string>() { "gmail.com", "yahoo.com", "hotmail.com" };
Dictionary<string, List<string>> emailGroups = new Dictionary<string, List<string>>();
//Init dictionary
foreach (string thisDomain in domains)
{
    emailGroups.Add(thisDomain, new List<string>());
}
//Get distinct emails (materialized once so the query is not run twice)
var emails = db.Clients.Select(x => x.Email).Distinct().ToList();
//Total emails
int totalEmails = emails.Count;
//One percent of total emails (double, so fewer than 100 emails does not give 0)
double onePercent = totalEmails / 100.0;
//Run on each email
foreach (var thisEmail in emails)
{
    //Run on each domain
    foreach (string thisDomain in emailGroups.Keys)
    {
        //If email from this domain (match the part after '@', not any substring)
        if (thisEmail.EndsWith("@" + thisDomain, StringComparison.Ordinal))
        {
            //Add to dictionary
            emailGroups[thisDomain].Add(thisEmail);
        }
    }
}
//Will store the final result
List<string> finalEmails = new List<string>();
//Run on each domain
foreach (string thisDomain in emailGroups.Keys)
{
    //Get percent of emails in group
    double thisDomainPercents = emailGroups[thisDomain].Count / onePercent;
    //More than 20%
    if (thisDomainPercents > 20)
    {
        //Take only 20% and join to the final result
        finalEmails = finalEmails.Union(emailGroups[thisDomain].Take((int)(20 * onePercent))).ToList();
    }
    else
    {
        //Join all to the final result
        finalEmails = finalEmails.Union(emailGroups[thisDomain]).ToList();
    }
}
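Does anyone know a better way to do this?
I can't think of a way to do this without hitting the database at least twice: once for the grouping and once for the total count. You could try something like: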
var query = from u in db.Users
            group u by u.Email.Split('@')[1] into g
            select new
            {
                Domain = g.Key,
                Users = g.ToList()
            };
query = query.Where(x => x.Users.Count <= (db.Users.Count() * 0.2));
Assuming you want the last items of each group, in ascending order:
int m = (int)(input.Count() * 0.2);
var result = input.GroupBy(x => x.email.Split('@')[1],
                           (key, g) => g.OrderByDescending(x => x.index).Take(m)
                                        .OrderBy(x => x.index))
                  .SelectMany(g => g); //If you want to get the last result without grouping
Or this:
var result = input.GroupBy(x => x.email.Split('@')[1],
                           (key, g) => g.OrderBy(x => x.index)
                                        .Skip(g.Count() - m))
                  .SelectMany(g => g); //If you want to get the last result without grouping
//Take() needs an int, so cast the 20% cap
var maxCount = (int)(db.Users.Count() * 0.2);
var query = (from u in db.Users
             group u by u.Email.Split('@')[1] into g
             select new
             {
                 Domain = g.Key,
                 Users = g.Take(maxCount).ToList()
             })
            .SelectMany(x => x.Users);
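For comparison, here is a minimal in-memory sketch of the per-domain cap described in the question. It assumes the distinct emails have already been loaded from the database, which sidesteps the question of whether string calls such as Split translate to SQL for a given provider. `EmailCap`, `CapPerDomain` and the `percent` parameter are hypothetical names for illustration, not something from the posts above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmailCap
{
    // Hypothetical helper: keeps at most `percent` percent of the
    // distinct emails for each domain.
    public static List<string> CapPerDomain(IEnumerable<string> emails, int percent = 20)
    {
        // Materialize once so the total and the grouping see the same data.
        var distinct = emails.Distinct().ToList();

        // Integer arithmetic (rounds down), so small inputs can give a cap of 0.
        int cap = distinct.Count * percent / 100;

        return distinct
            // Skip entries that have no '@' at all.
            .Where(e => e.IndexOf('@') >= 0)
            // The domain is everything after the last '@'.
            .GroupBy(e => e.Substring(e.LastIndexOf('@') + 1))
            // Keep the first `cap` emails of each domain, then flatten.
            .SelectMany(g => g.Take(cap))
            .ToList();
    }
}
```

Called as `EmailCap.CapPerDomain(db.Clients.Select(x => x.Email).ToList())`, this costs one round trip to the database and does the rest in memory, which is probably only reasonable when the table is small enough to load.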