加速linq实体查询

本文关键字:查询 实体 linq 加速 | 更新日期: 2023-09-27 18:12:04

我目前有一个查询,不需要很长时间,有时崩溃,因为在数据库中的数据量。

有人能注意到我能做些什么来加快速度吗?
public IList<Report> GetReport(CmsEntities context, long manufacturerId, long? regionId, long? vehicleTypeId)
        {
            var now = DateTime.Now;
            var today = new DateTime(now.Year, now.Month, 1);
            var date1monthago = today.AddMonths(-1);
            var date2monthago = today.AddMonths(-2);
            var date3monthago = today.AddMonths(-3);
            var date4monthago = today.AddMonths(-4);
            var date5monthago = today.AddMonths(-5);
            var date6monthago = today.AddMonths(-6);
            today = TimeManager.EndOfDay(new DateTime(now.AddMonths(-1).Year, today.AddMonths(-1).Month, DateTime.DaysInMonth(now.Year, today.AddMonths(-1).Month)));             
            var query = from item in context.Invoices
                         where item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Select(x => x.ManufacturerId).Contains(manufacturerId)
                         && (item.InvoiceDate >= date6monthago && item.InvoiceDate <= today)
                         && (regionId.HasValue && regionId.Value > 0 ? item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Select(x => x.RegionId).Contains(regionId.Value) : true)
                         && (item.InvType == "I" || item.InvType == null)
                         && (vehicleTypeId.HasValue && vehicleTypeId.Value > 0 ? item.Repair.Job.Vehicle.Model.VehicleTypes.Select(x => x.Id).Contains(vehicleTypeId.Value) : true)
                         select item;
            var query2 = from item in query
                         group item by new { item.Repair.Job.Bodyshop } into g
                         let manufJobs = query.Where(x => x.Repair.Job.Vehicle.Model.ManufacturerId == manufacturerId && x.Repair.Job.BodyshopId == g.Key.Bodyshop.Id)
                         let allJobs = query.Where(x => x.Repair.Job.BodyshopId == g.Key.Bodyshop.Id)
                         select new tReport
                         {                                     
    MonthSixManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date6monthago.Month && x.InvoiceDate.Year == date6monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthSixJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date6monthago.Month && x.InvoiceDate.Year == date6monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthFiveManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date5monthago.Month && x.InvoiceDate.Year == date5monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthFiveJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date5monthago.Month && x.InvoiceDate.Year == date5monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthFourManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date4monthago.Month && x.InvoiceDate.Year == date4monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthFourJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date4monthago.Month && x.InvoiceDate.Year == date4monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthThreeManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date3monthago.Month && x.InvoiceDate.Year == date3monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthThreeJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date3monthago.Month && x.InvoiceDate.Year == date3monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthTwoManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date2monthago.Month && x.InvoiceDate.Year == date2monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthTwoJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date2monthago.Month && x.InvoiceDate.Year == date2monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthOneManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date1monthago.Month && x.InvoiceDate.Year == date1monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthOneJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date1monthago.Month && x.InvoiceDate.Year == date1monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    ManufTotal = manufJobs.GroupBy(x => x.Repair.Job).Count(),
    Total = allJobs.GroupBy(x => x.Repair.Job).Count(),
    PercentageOf = ((decimal)manufJobs.GroupBy(x => x.Repair.Job).Count() / (decimal)allJobs.GroupBy(x => x.Repair.Job).Count()) * 100
                         };
            return query2.OrderBy(x => x).ToList();
        }

编辑
var query = from item in context.Invoices.AsNoTracking()
                    where item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(x => x.ManufacturerId == manufacturerId)
                    && (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
                    && (item.InvType == "I" || item.InvType == null)
                    select item;
        if (regionId.HasValue && regionId.Value > 0)
        {
            query = query.Where(item => item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Select(x => x.RegionId).Contains(regionId.Value));
        }
        if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
        {
            query = query.Where(item => item.Repair.Job.Vehicle.Model.VehicleTypes.Select(x => x.Id).Contains(vehicleTypeId.Value));
        }

              var query2 = from item in hey
                     group item by new { item.Repair.Job.Bodyshop, item.InvoiceDate.Month } into m
                     select new TReport
                     {
                         Bodyshop = m.Key.Bodyshop.Name,
                         Bays = m.Key.Bodyshop.Bays,
                         Region = m.Key.Bodyshop.Manufacturer2Bodyshop.FirstOrDefault(x => x.ManufacturerId == manufacturerId).Region.Name,
                         BodyshopCode = m.Key.Bodyshop.Manufacturer2Bodyshop.FirstOrDefault(x => x.ManufacturerId == manufacturerId).BodyshopCode,
                         Total = m.Count(),
                         ManufTotal = m.Where(x => x.Repair.Job.Vehicle.Model.ManufacturerId == manufacturerId).Count(),
                         Totals = m.GroupBy(j => j.InvoiceDate.Month).Select(j => new TPercentReportInner
                         {
                             Month = j.Key,
                             ManufTotal = j.Where(x => x.Repair.Job.Vehicle.Model.ManufacturerId == manufacturerId).Count(),
                             AllTotal = j.Count()
                         })
                     };

我把这个问题删减了。但即使这样,现在的表现也比以前差了吗?

加速linq实体查询

我将首先从查询中删除硬编码的可选条件,这将允许查询优化器根据您拥有的参数使用不同的查询计划,例如:

var query = from item in context.Invoices.AsNoTracking()
            where item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Select(x => x.ManufacturerId).Contains(manufacturerId)
            && (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
            && (item.InvType == "I" || item.InvType == null)
            select item;
if (regionId.HasValue && regionId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Select(x => x.RegionId).Contains(regionId.Value));
if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Vehicle.Model.VehicleTypes.Select(x => x.Id).Contains(vehicleTypeId.Value));
var query2 = from item in query
             group item by new { item.InvoiceDate.Month, item.Repair.Job.Bodyshop } into g
             select new TReport
             {
                 BodyshopId = g.Key.Bodyshop.Id,  
                 Month = g.Key.Month,
                 MonthAllJobTotal = g.Count()
             };
return query2.ToList();

您还可以检查转换.Select(x=>x.id).Contains(id).Any(x=>x.Id==id)是否执行得更快,尽管我认为它们在查询计划和执行速度上相似。这将给你:

var query = from item in context.Invoices.AsNoTracking()
            where item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.ManufacturerId==manufacturerId)
            && (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
            && (item.InvType == "I" || item.InvType == null)
            select item;
if (regionId.HasValue && regionId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.RegionId==regionId.Value));
if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Vehicle.Model.VehicleTypes.Any(v=>v.Id==vehicleTypeId.Value));
var query2 = from item in query
             group item by new { item.InvoiceDate.Month, item.Repair.Job.Bodyshop } into g
             select new TReport
             {
                 BodyshopId = g.Key.Bodyshop.Id,  
                 Month = g.Key.Month,
                 MonthAllJobTotal = g.Count()
             };
return query2.ToList();

根据你所拥有的,我猜.AsNoTracking()对你的帮助很小,但它不会伤害你。当检索大量实体时,它有更大的影响,而这似乎没有做。

然后,我将通过删除硬编码的ManufacturerId来清理和标准化您的查询,这将给您:

var query = from item in context.Invoices.AsNoTracking()
            where (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
            && (item.InvType == "I" || item.InvType == null)
            select item;
if (manufacturerId.HasValue && manufacturerId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.ManufacturerId==manufacturerId));
if (regionId.HasValue && regionId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.RegionId==regionId.Value));
if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
    query=query.Where(item=>item.Repair.Job.Vehicle.Model.VehicleTypes.Any(v=>v.Id==vehicleTypeId.Value));
var query2 = from item in query
             group item by new { item.InvoiceDate.Month, item.Repair.Job.Bodyshop } into g
             select new TReport
             {
                 BodyshopId = g.Key.Bodyshop.Id,  
                 Month = g.Key.Month,
                 MonthAllJobTotal = g.Count()
             };
return query2.ToList();

,然后最后,我会返回一个IQueryable而不是列表,这样,如果你不需要一个或多个列,他们可以从最终查询中删除,以及像:

public IQueryable<Report> GetReport(CmsEntities context, long? manufacturerId, long? regionId, long? vehicleTypeId)
    {
{
        var now = DateTime.Now;
        var today = new DateTime(now.Year, now.Month, 1);
        var date1monthago = today.AddMonths(-1);
        var date2monthago = today.AddMonths(-2);
        var date3monthago = today.AddMonths(-3);
        var date4monthago = today.AddMonths(-4);
        var date5monthago = today.AddMonths(-5);
        var date6monthago = today.AddMonths(-6);
        today = TimeManager.EndOfDay(new DateTime(now.AddMonths(-1).Year, today.AddMonths(-1).Month, DateTime.DaysInMonth(now.Year, today.AddMonths(-1).Month)));             
    var query = from item in context.Invoices.AsNoTracking()
                where (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
                && (item.InvType == "I" || item.InvType == null)
                select item;
    if (manufacturerId.HasValue && manufacturerId.Value > 0)
        query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.ManufacturerId==manufacturerId));
    if (regionId.HasValue && regionId.Value > 0)
        query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.RegionId==regionId.Value));
    if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
        query=query.Where(item=>item.Repair.Job.Vehicle.Model.VehicleTypes.Any(v=>v.Id==vehicleTypeId.Value));
    var query2 = from item in query
                 group item by new { item.InvoiceDate.Month, item.Repair.Job.Bodyshop } into g
                 select new TReport
                 {
                     BodyshopId = g.Key.Bodyshop.Id,  
                     Month = g.Key.Month,
                     MonthAllJobTotal = g.Count()
                 };
    return query2;
}

然后我将它们分开并将它们转换为扩展方法:

public static class MyExtensions 
{
    public static IQueryable<Invoice> Recent(this IQueryable<Invoice> context,long? manufacturerId=null,long? regionId=null,long? vehicleId=null)
    {
        var now = DateTime.Now;
        var today = new DateTime(now.Year, now.Month, 1);
        var date1monthago = today.AddMonths(-1);
        var date2monthago = today.AddMonths(-2);
        var date3monthago = today.AddMonths(-3);
        var date4monthago = today.AddMonths(-4);
        var date5monthago = today.AddMonths(-5);
        var date6monthago = today.AddMonths(-6);
        today = TimeManager.EndOfDay(new DateTime(now.AddMonths(-1).Year, today.AddMonths(-1).Month, DateTime.DaysInMonth(now.Year, today.AddMonths(-1).Month)));             
        var query = from item in context.Invoices.AsNoTracking()
                where (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today)
                && (item.InvType == "I" || item.InvType == null)
                select item;
        if (manufacturerId.HasValue && manufacturerId.Value > 0)
            query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.ManufacturerId==manufacturerId));
        if (regionId.HasValue && regionId.Value > 0)
            query=query.Where(item=>item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(m=>m.RegionId==regionId.Value));
        if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
            query=query.Where(item=>item.Repair.Job.Vehicle.Model.VehicleTypes.Any(v=>v.Id==vehicleTypeId.Value));
    return query;
}
public static IQueryable<Report> ToReport(this IQueryable<Invoice> context)
{
    return (from item in query
                 group item by new { item.InvoiceDate.Month, item.Repair.Job.Bodyshop } into g
                 select new TReport
                 {
                     BodyshopId = g.Key.Bodyshop.Id,  
                     Month = g.Key.Month,
                     MonthAllJobTotal = g.Count()
                 });
}
}

现在你可以做以下的事情:

var reports=db.Invoices.Recent.ToReport(); 

var reports=db.Invoices.Recent(ManufacturerEnum.Toyota).ToReport();

您可以实现分页以避免物化所有结果。我是说,你可以实现Skip和Take linq方法

基于您的代码的简单示例:

public IList<Report> GetReport(CmsEntities context, long manufacturerId, long? regionId, long? vehicleTypeId, int pageSize, int currentPage)
        {
        //Code removed to simplify
        return  query2.Skip(pageSize * currentPage).Take(pageSize );
        }

首先,一些通用的SQL优化技巧:

在尝试进行优化之前,您应该始终分析。分析的综合优势在于,它可以为您提供客观的描述或您在性能方面所处的位置,并提示您应该从哪里开始进行优化工作。作为一个免费的额外好处,在一天结束的时候,你可以用清晰的客观数字(甚至是ppt爱好者的绩效图表)向管理层证明你的努力工作是合理的。

正如一些人建议的那样,您可以尝试用两种不同的方式进行优化:

  • 优化SQL计划(可能是一个耗时的任务)

    1. 找出SQL中的查询内容
    2. 在SQL管理工作室或类似的工具中运行查询,以定义最佳的执行计划(您可能需要在进程中添加索引)。您应该从那里知道优化是否足以匹配您的性能标准。
    3. 研究当前的执行计划,检查当前的查询是否使用了最优的索引/键,连接。
    4. 修改代码或使用中间SQL对象(存储过程或视图)使linq to SQL代码使用优化的执行计划。
  • 优化代码处理数据的方式(减少耗时的任务)

    1. 缓存(这可以通过多种方式实现,但原则是获取所需的数据并在代码中处理它)。当查询优化器无法有效地管理您的查询(旧数据库版本/引擎不太好)时,它特别有效。
    2. 编译链接到SQL查询。Joe Albahari解释得比我好。

您首先选择哪种方法仅仅取决于您期望从中获得的性能提升以及您从分析会话中获得的结果。

从linq to sql代码的复杂性来看,生成的sql查询很可能不是最优的。另一方面,与使用远程SQL数据库的I/O相比,内存对象管理中的linq相当快。

请注意,如果您不缓存查询,它将每次获取数据。缓存可以这样做:

var allJobs = query.Where(x => x.Repair.Job.BodyshopId == g.Key.Bodyshop.Id).ToArray();

作为一般的经验法则,K keep It S simple S愚蠢和你的查询不是。您可以稍微重构一下,因为看起来有相当多的冗余代码:

MonthSixManufJobTotal = manufJobs.Where(x => x.InvoiceDate.Month == date6monthago.Month && x.InvoiceDate.Year == date6monthago.Year).GroupBy(x => x.Repair.Job).Count(),
    MonthSixJobTotal = allJobs.Where(x => x.InvoiceDate.Month == date6monthago.Month && x.InvoiceDate.Year == date6monthago.Year).GroupBy(x => x.Repair.Job).Count(),

有一个相当普遍的信念,通过将所有代码分组在同一个方法中(以类似于汇编代码的方式),您可以限制方法调用的数量,然后在性能方面表现得更好。然而,事实是反直觉的。使用分而治之的规则,方法块代码实际上最终会表现得更好,因为它更容易维护、优化、增强和重构。

(注意:我是基于你的编辑查询)

有几件事你可以尝试。首先,您可以将.Select().Contains()的使用简化为使用.Any()。这个可以导致EF生成一个更简单的查询,尽管如果不使用SQL Server Profiler或Visual Studio Debugger来查看实际的SQL查询很难判断。例如,修改".Select(x => x.ManufacturerId).Contains(manufacturerId)"为.Any(x => x.ManufacturerId == manufacturerId) .

要尝试的第二件事是对SQL Server执行第一个查询,并在应用程序中处理第二个查询。EF延迟查询的执行,直到出现结果枚举(例如.ToList()foreach)。因此,您可以在第一个查询上尝试(from item in context ... select item).ToList(),这将导致第二个查询在应用程序中执行,而不是在SQL Server上执行。如果第二个查询中的group by由SQL Server而不是应用程序本身导致性能下降,那么这将有所帮助。

然而,如果您尝试第二个建议,如果它是virtual导航属性,则item.Repair.Job.Bodyshop分组可能会产生负面影响,因为EF必须单独获得该对象(而不是在一个查询中获得所有对象)。这可以通过将第一个查询更改为context.Invoices.AsNoTracking().Include("Repair.Job")context.Invoices.AsNoTracking().Include(x => x.Repair.Job)来缓解(第二个选项不可用较旧的EF版本),并将第二个查询更改为group item by new { item.InvoiceDate.Month, item.Repair.Job.BodyshopId } .

我建议您将其分解为较小的位大小部分。将初始数据放入单独的列表中,然后可能将它们缝合在一起。

你可以得到很多里程到ToDictionary和tollookup扩展。

我敢打赌,如果你看一下你发送到数据库的SQL,它将是疯狂的。

随着不必要的where条件的消除,我将把赌注押在group by子句上。

您将item.Repair.Job.Bodyshop作为分组字段之一。每当您使用这样的东西时,EF将生成一个SQL GROUP BY子句,其中包括所有来自相应表的字段。我不知道在db表中有多少列对应于您的Bodyshop实体,但在任何情况下,使用这种方式很可能不允许创建一个好的SQL执行计划。

我建议您尝试以下与您的精简查询等价的语句:

var query = context.Invoices.AsNoTracking().Where(item =>
    (item.InvType == "I" || item.InvType == null) &&
    (item.InvoiceDate >= date12monthago && item.InvoiceDate <= today));
if (regionId.HasValue && regionId.Value > 0)
    query = query.Where(item =>
        item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(source =>
            source.ManufacturerId == manufacturerId && source.RegionId == regionId.Value));
else
    query = query.Where(item => 
        item.Repair.Job.Bodyshop.Manufacturer2Bodyshop.Any(source => 
            source.ManufacturerId == manufacturerId));
if (vehicleTypeId.HasValue && vehicleTypeId.Value > 0)
    query = query.Where(item =>
        item.Repair.Job.Vehicle.Model.VehicleTypes.Any(vehicleType => 
            vehicleType.Id == vehicleTypeId.Value);
var query2 = query
    .GroupBy(item => new { Month = item.InvoiceDate.Month, BodyshopId = item.Repair.Job.Bodyshop.Id })
    .Select(g => new TReport { BodyshopId = g.Key.BodyshopId, Month = g.Key.Month, MonthAllJobTotal = g.Count() });
var result = query2.ToList();