Which method performs better: .Any() vs .Count() > 0?

In the System.Linq namespace, we can now extend our IEnumerable collections with the Any() and Count() extension methods.

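I was told recently that if I want to check whether a collection contains one or more items, I should use the .Any() extension method instead of .Count() > 0, because the .Count() extension method has to iterate through all the items.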

Secondly, some collections have a property (not an extension method) named Count or Length. Would it be better to use those instead of .Any() or .Count()?

Yea / nay?


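If you are starting with something that has a .Length or .Count property (such as ICollection<T>, IList<T>, List<T>, etc.), then that will be the fastest option, since it does not need to go through the GetEnumerator() / MoveNext() / Dispose() sequence that Any() requires to check for a non-empty IEnumerable<T> sequence.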
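To make the property-versus-extension-method distinction concrete, here is a minimal sketch. The sample data and variable names are made up for illustration, and it assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, not taken from the original question.
var list = new List<int> { 1, 2, 3 };
int[] array = { 1, 2, 3 };

// Property reads: the size is already stored, nothing is enumerated.
bool listHasItems = list.Count > 0;
bool arrayHasItems = array.Length > 0;

// The LINQ extension method gives the same answer for the same data.
bool viaAny = list.Any();

Console.WriteLine($"{listHasItems} {arrayHasItems} {viaAny}"); // prints "True True True"
```

All three checks agree on the result; the difference discussed above is only in how much work is done to get there.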

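For a plain IEnumerable<T>, Any() will generally be quicker, as it only has to look at one iteration. However, note that the LINQ-to-Objects implementation of Count() does check for ICollection<T> (using .Count as an optimisation), so if your underlying data source is directly a list/collection, there won't be a huge difference. Don't ask me why it doesn't use the non-generic ICollection...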
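The "one iteration" point can be checked with a lazy sequence that counts how many items it hands out. This is only a sketch: Numbers() and the yielded counter are hypothetical helpers, and it assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int yielded = 0;

// Hypothetical lazy source that records how many items it has produced.
IEnumerable<int> Numbers()
{
    for (int i = 0; i < 1000; i++)
    {
        yielded++;
        yield return i;
    }
}

// Any() stops after the first successful MoveNext().
bool any = Numbers().Any();
int yieldedByAny = yielded;

// Count() has to walk the whole sequence, since it is not an ICollection<T>.
yielded = 0;
bool countPositive = Numbers().Count() > 0;
int yieldedByCount = yielded;

Console.WriteLine($"Any(): {yieldedByAny} item(s), Count(): {yieldedByCount} item(s)");
// prints "Any(): 1 item(s), Count(): 1000 item(s)"
```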

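Of course, if you have used LINQ to filter it etc. (Where and so on), you will have an iterator-block based sequence, and so this ICollection<T> optimisation is useless.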
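A quick way to see why the optimisation no longer applies after filtering is to test the runtime type of the sequence. The data below is made up, and the sketch assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var list = new List<int> { 1, 2, 3, 4 };

// Where() returns a lazy iterator wrapped around the list.
IEnumerable<int> filtered = list.Where(x => x % 2 == 0);

bool listIsCollection = list is ICollection<int>;
bool filteredIsCollection = filtered is ICollection<int>;

Console.WriteLine($"{listIsCollection} {filteredIsCollection}"); // prints "True False"
```

Since the filtered sequence does not expose a stored count, Count() has to enumerate it, while Any() can still stop at the first match.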

In general, with IEnumerable<T>: stick with Any() ;p


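Note: I wrote this answer when Entity Framework 4 was current. The point of this answer was not to get into trivial .Any() vs .Count() performance testing. The point was to show that EF is far from perfect. Newer versions are better... but if part of your code is slow and it uses EF, test it against direct T-SQL and compare the performance rather than relying on assumptions (such as .Any() ALWAYS being faster than .Count() > 0).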


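While I agree with the most up-voted answer and comments, especially on the point that Any signals developer intent better than Count() > 0, I have had a situation in which Count was faster by an order of magnitude on SQL Server (EntityFramework 4).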

Here is the query with Any that threw a timeout exception (on ~200,000 records):

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && !a.NewsletterLogs.Any(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr)
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

The Count version executed in a matter of milliseconds:

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && a.NewsletterLogs.Count(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) == 0
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

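I need to find a way to see the exact SQL that both LINQ queries produce, but it is obvious that there is a huge performance difference between Count and Any in some cases, and unfortunately it seems you can't just stick with Any in all cases.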

EDIT: Here is the generated SQL. Beauties, as you can see ;)

ANY

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Extent1].[ContactId] AS [ContactId], 
        [Extent1].[CompanyId] AS [CompanyId], 
        [Extent1].[ContactName] AS [ContactName], 
        [Extent1].[FullName] AS [FullName], 
        [Extent1].[ContactStatusId] AS [ContactStatusId], 
        [Extent1].[Created] AS [Created]
        FROM [dbo].[Contact] AS [Extent1]
        WHERE ([Extent1].[CompanyId] = @p__linq__0) AND ([Extent1].[ContactStatusId] <= 3) AND ( NOT EXISTS (SELECT 
            1 AS [C1]
            FROM [dbo].[NewsletterLog] AS [Extent2]
            WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])
        ))
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

COUNT

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Project1].[ContactId] AS [ContactId], 
        [Project1].[CompanyId] AS [CompanyId], 
        [Project1].[ContactName] AS [ContactName], 
        [Project1].[FullName] AS [FullName], 
        [Project1].[ContactStatusId] AS [ContactStatusId], 
        [Project1].[Created] AS [Created]
        FROM ( SELECT 
            [Extent1].[ContactId] AS [ContactId], 
            [Extent1].[CompanyId] AS [CompanyId], 
            [Extent1].[ContactName] AS [ContactName], 
            [Extent1].[FullName] AS [FullName], 
            [Extent1].[ContactStatusId] AS [ContactStatusId], 
            [Extent1].[Created] AS [Created], 
            (SELECT 
                COUNT(1) AS [A1]
                FROM [dbo].[NewsletterLog] AS [Extent2]
                WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])) AS [C1]
            FROM [dbo].[Contact] AS [Extent1]
        )  AS [Project1]
        WHERE ([Project1].[CompanyId] = @p__linq__0) AND ([Project1].[ContactStatusId] <= 3) AND (0 = [Project1].[C1])
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

It seems that a plain Where with EXISTS performs much worse than calculating the Count and then doing a Where with Count == 0.

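Let me know if you see any errors in my findings. What can be taken from all this, regardless of the Any vs Count discussion, is that any more complex LINQ query is way better off when rewritten as a stored procedure ;).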


Since this is a rather popular topic and the answers differ, I had to take a fresh look at the problem.

Test environment: EF 6.1.3, SQL Server, 300k records

Table model:

class TestTable
{
    [Key]
    public int Id { get; set; }

    public string Name { get; set; }

    public string Surname { get; set; }
}

Test code:

using System;
using System.Linq;

class Program
{
    static void Main()
    {
        using (var context = new TestContext())
        {
            context.Database.Log = Console.WriteLine;

            context.TestTables.Where(x => x.Surname.Contains("Surname")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname")).Count(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Count(x => x.Id > 1000);

            Console.ReadLine();
        }
    }
}

Results:

Any() ~ 3ms

Count() ~ 230ms for the first query, ~ 400ms for the second

Remarks:

In my case, EF did not generate SQL like the one @Ben mentioned in his post.
