[Analyzer] Prefer .Length/Count/IsEmpty over Any() #75933

stephentoub · 2022-09-20T21:44:58Z

If .Any() is used on an expression strongly-type as an ICollection<T> or an implementing type with a Count property (or a T[] with a Length property), the code should be using .Count != 0/.Length != 0 instead (or using !IsEmpty if it has an IsEmpty property).

(I really thought we had this analyzer, but I can't find any record of it now.)

The text was updated successfully, but these errors were encountered:

ghost · 2022-09-20T21:45:06Z

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

If .Any() is used on an expression strongly-type as an ICollection/ICollection<T> or an implementing type with a Count property (or a T[] with a Length property), the code should be using .Count != 0/.Length != 0 instead (or using !IsEmpty if it has an IsEmpty property).

(I really thought we had this analyzer, but I can't find any record of it now.)

Author:	stephentoub
Assignees:	-
Labels:	`area-System.Collections`, `code-analyzer`, `code-fixer`
Milestone:	8.0.0

terrajobst · 2022-09-27T18:33:53Z

Video

Looks good as proposed

Severity: Info
Category: Performance

CollinAlpert · 2022-10-12T21:04:04Z

I'd like to do this one.

stephentoub · 2022-10-12T21:37:31Z

Excellent. Thanks!

CollinAlpert · 2022-10-15T14:07:41Z

I just found this Analyzer. Should I rather extend that one's functionality or create a new Analyzer only for .Length and .IsEmpty?

stephentoub · 2022-10-24T14:58:39Z

If extending that analyzer results in cleaner code overall, then I think that's probably fine, e.g. if you'd otherwise be duplicating the majority of that analyzer's code and the new analysis requires just a few additional lines. But regardless of whether it's augmenting an existing analyzer or adding a new one, it should have its own new diagnostic ID; which component raises that ID is an implementation detail.

fowl2 · 2022-10-25T03:00:52Z

Isn't there some way to make the runtime performance of Any() equivalent to .Length != 0 etc. rather than encouraging code churn and obscuring intent of all the calling code?'

Default interface implementation plus inlining attribute? Some sort of "stronger" inlining attribute? Enhancing source generators?

stephentoub · 2022-10-25T14:10:32Z

Isn't there some way to make the runtime performance of Any() equivalent to .Length != 0 etc. rather than encouraging code churn and obscuring intent of all the calling code?'

I don't believe if (array.Length != 0) is obscuring intent. If anything, I personally think if (array.Any()) is more obscure... I have to mentally translate that, and my thought process is along the lines of "array.Any()? Oh, does the array contain anything? Ok, so this is checking whether the length is not 0? Or... is it checking for non-default/non-null values?" As for runtime performance, Any() will do a type check for ICollection<T> and return its .Count != 0, but there's still the Any method invocation (it's unlikely to be inlined nor would forcing it be a good idea in the majority case), there's a type check, and there's an interface invocation. It's a whole lot better to just access .Length/Count, and IMHO, clearer on the intent as well.

fowl2 · 2022-11-03T12:23:35Z

I don't believe if (array.Length != 0) is obscuring intent. If anything, I personally think if (array.Any()) is more obscure... I have to mentally translate that, and my thought process is along the lines of "array.Any()? Oh, does the array contain anything?

I guess this is more about what you're familiar with then. To me, Any() expresses "Is there anything in this collection?" better than "Is the Length a particular value? Or is it Count? Is it 'not is empty'?".

Or... is it checking for non-default/non-null values?"

Is there a common library or other platform that uses a Any() method in this way?

As for runtime performance, Any() [...]

Yeah I was hoping for something to make compile it down to the same runtime cost no matter which syntax is used. I guess that would have to be in the compiler proper at this point and probably not considered to be a big enough gain for the complexity there.

A fixer like this pushes the complexity cost out by scattering it as code churn across the vastly greater quantity of codebases.

RenderMichael · 2022-11-04T14:27:01Z

Would it work to try using TryGetNonEnumeratedCount() in Any(), or would that impact performance too much where that method returns false?

Lancelotbronner · 2022-11-14T14:08:08Z

Could bool IsEmpty => Count/Length is 0; be added to arrays and collections? I've always wondered why it was missing and now that spans have an IsEmpty I wonder even more.

pedrofr · 2022-11-29T11:31:41Z

Could bool IsEmpty => Count/Length is 0; be added to arrays and collections? I've always wondered why it was missing and now that spans have an IsEmpty I wonder even more.

In the same vein, can't IEnumerable (the non generic one) have an extension method, bundled with Linq, called bool IsEmpty()? Would it be possibly inlined?

stephentoub · 2022-11-29T11:41:39Z

For new API requests, please open new issues. Thanks.

Applesauce314 · 2022-11-29T13:14:01Z

Isn't there some way to make the runtime performance of Any() equivalent to .Length != 0 etc. rather than encouraging code churn and obscuring intent of all the calling code?'

Default interface implementation plus inlining attribute? Some sort of "stronger" inlining attribute? Enhancing source generators?

I get that a code analyzer is a lot easier to implement than then improving the runtime performance of Any(), but the interaction with CA1827 - Do not use Count/LongCount when Any can be used will need to be carefully considered.

CA1827 plus this new rule will result in these instruction to programmers:

use Any instead of Count
use Count or Length instead of Any

obviously in the background the rules are actually:

use Any() instead of Count() (the function call one)
use Count or Length instead of Any() (the property ones)
But I am sure this will confuse more than one person.

I general I have been using Any() instead of Count or Length as it makes the "is there anything in this list? (as in a bunch of items, not a C# list)" check have the same appearance whether you are checking a List<T> or an IEnumerable<T>. But I have always wondered if I was paying a performance penalty for this.

pedrofr · 2022-11-29T23:46:24Z

@stephentoub, my comment wasn't exactly a request, but rather exactly what @Applesauce314 is talking about.

The proposed analyzer would be confusing considering CA1827 and the fact that different, but similar in theme, types would have different best practices: Length/Count/Any() for T[]/ICollection<T>/IEnumerable<T>, respectively.

In my opinion this analyzer shouldn't exist if a common, objective and performant API is possible.

aKzenT · 2022-12-16T19:51:46Z

I agree that discouraging Any() in favour of Length and Count hurts readability and hides real user intent. It is also reasonable that there are implementations of collection interfaces where computing Count is more expensive than just checking if a list is empty. E.g. think of some kind of linked list implementation. So hiding real attempt to get performance gains in limitted cases is not a good solution.

On the other hand checking if a collection is empty or not is a very common scenario that should have very low overhead in performance terms. So if there is no easy way to teach the compiler/runtime to optimize the call away, I think having a dedicated IsEmpty property is the best solution overall and would also solve the issue arround confusing and seemingly conflicting rules.

To this end I have created an API proposal to get the discussion started: #79769

Timovzl · 2022-12-19T15:19:29Z

I agree that discouraging Any() in favour of Length and Count hurts readability and hides real user intent.

@aKzenT, I'm with Stephen Toub on this one:

I don't believe if (array.Length != 0) is obscuring intent. If anything, I personally think if (array.Any()) is more obscure... I have to mentally translate that, and my thought process is along the lines of "array.Any()? Oh, does the array contain anything? Ok, so this is checking whether the length is not 0? Or... is it checking for non-default/non-null values?"

Moreover, the Length or Count property gives a clear hint that this information is directly accessible. As a counterexample, an IEnumerable<T> does not expose such a property, since it may use deferred enumeration and/or may not support multiple enumeration. Therefore, the presence and use of a Length or Count property gives me confirmation that no enumeration is happening and that the count can indeed be easily checked.

It is also reasonable that there are implementations of collection interfaces where computing Count is more expensive than just checking if a list is empty. E.g. think of some kind of linked list implementation.

Any such implementation that I have seen exposes a Count property that it simply updates at modification time. But for the sake of argument, if a collection is so specific that it does not, then we are probably dealing with extreme performance (or concurrency) constraints, and using that collection type's dedicated features (most likely in the form of its own IsEmpty property) is in order.

Wouldn't a general IsEmpty property expose yet another way of doing things? It also reduces consistency with checking for other population counts, such as Count == 1.

sfiruch · 2023-01-11T21:57:00Z

This wouldn't be even an issue in an ideal world, right? The performance of .Any() should be strictly better than the alternative. Is it impossible to add an explicit .Any() method to arrays and List<T>?

The general approach of using analyzers to work around usability issues in the frameworks seems wrong to me.

stephentoub · 2023-01-11T22:20:27Z

Is it impossible to add an explicit .Any() method to arrays and List?

Impossible? No, but we're then marching down a path of adding one for every collection type, which isn't viable, and in the limit actually is impossible since we don't control the implementations of nor can we reference most collection implementations in the world.

The performance of .Any() should be strictly better than the alternative.

Why? Please share the implementation of Any() that would achieve this.

sfiruch · 2023-01-12T01:38:22Z

Is it impossible to add an explicit .Any() method to arrays and List?

Impossible? No, but we're then marching down a path of adding one for every collection type, which isn't viable, and in the limit actually is impossible since we don't control the implementations of nor can we reference most collection implementations in the world.

The alternatives appear to be worse:
a) implement analyzer rules for most collection implementations in the world
b) implement a general rule in an analyzer which might be wrong for half of most collection implementations in the world. I have several collection implementations in my code bases where counting is more expensive than checking whether the collection is empty.

The performance of .Any() should be strictly better than the alternative.

Why? Please share the implementation of Any() that would achieve this.

You're absolutely right. I should've written equal or better instead of strictly better. The reasoning is obvious: .Any() computes less information than .Length or .Count. Practically speaking, adding

public bool Any() => Count!=0;

to the most common collection types appears to be a simple solution, which appears to have only upsides:

Usability: No analyzer/training/documentation/configuration/brainpower required
Efficiency: It's efficient no matter which way developers check
Productivity: Developers are not asked to change existing code
Effort: Low (but I'm probably missing some reflection-related compatibility nightmares)

In my view, an analyzer should be only considered if there's not easy way to make the framework do "the right thing" otherwise. Otherwise we should consider a "replace all your code with hand-optimized assembly" analyzer 😉

stephentoub · 2023-01-12T02:58:28Z

I have several collection implementations in my code bases where counting is more expensive than checking whether the collection is empty.

Regardless of this discusson, that's asking for perf problems as your consumers assume Count is cheap. And if you're your only consumer and you don't like the suggestion offered by this diagnostic id, you can turn off the suggestion.

adding to the most common collection types appears to be a simple solution, which appears to have only upsides:

It does not have only upside. For example, how do you propose handling ICollection<T>.Any()? If the answer is "don't", that's a large gap as it's very common to pass around IList<T>, ICollection<T>, etc., and if the answer is DIMs, that's very likely a breaking change in the making due the nature of DIMs and the diamond pattern. And if the answer is add an extension on each of the interfaces, that's also likely to lead to breaking changes due to ambiguity.

Besides that, every single collection needs a new Any() method? That's not upside. And why stop there? What about the rest of LINQ? Add each of those hundreds of methods to every collection?

Usability

You've cited usability as a benefit here. I continue to disagree with that notion. I understand some devs think that .Any() is clearer than Count > 0 or the like; I and others see the opposite. That's especially true with a collection like List<bool>, where I see Any() as being ambiguous and Count > 0 as being unambiguous.

Efficiency

It's efficient if the collection exposes it. The vast majority of collection types will not, nor will any new methods be added to the core collections in past releases, in netstandard2.0, etc. And now anyone reading code needs to know the exact type of the variable in question and know whether foo.Any() will resolve to an instance method or LINQ extension.

Productivity

No one is being asked to do anything. The tool is making a suggestion. Not a warning, not an error, a suggestion. If you don't like the suggestion, you can ignore it, disable it. The suggestion isn't even made in command-line tooling and builds. (And if everyone who defines a collection needs to also add an Any() method, that's a hit in productivity.) If you don't care about perf this much, disable all the perf analyzer suggestions for that matter.

sfiruch · 2023-01-12T10:01:34Z

Regardless of this discusson, that's asking for perf problems as your consumers assume Count is cheap. And if you're your only consumer and you don't like the suggestion offered by this diagnostic id, you can turn off the suggestion.

Right, but in practice there are cases like ConcurrentDictionary, BitArray or lazily produced collections. For those .Any() should be faster than .Count.

You've cited usability as a benefit here. I continue to disagree with that notion. I understand some devs think that .Any() is clearer than Count > 0 or the like; I and others see the opposite. That's especially true with a collection like List<bool>, where I see Any() as being ambiguous and Count > 0 as being unambiguous.

Here I wasn't talking about the specifics of .Any() vs. Count, but instead about the usability of .NET itself. When there are multiple obvious ways to write something, all these ways should perform well. Better analyzers, training or documentation are just work-arounds for a more fundamental problem. Such problems should be fixed one time in the framework, instead of requiring thousands of changes in thousands of repos, and brain cells from your average developer.

Personally, I find .Any() to be clearer and better express my intent. Please note, that I'm not the only one with this view. But it's fine to disagree.

No solution will work perfectly all the time, but shouldn't we try to remove sharp corners and focus on the pit of success?

Besides that, every single collection needs a new Any() method? That's not upside. And why stop there? What about the rest of LINQ? Add each of those hundreds of methods to every collection?

Not every single collection. Commonly used collections should have a performant .Any() implementation. It's a one time cost, for developers which are intimately familiar with the performance details of .NET.

Ideally, all of LINQ would be performant. There are ways to achieve this (ask Joe Duffy or check https://github.com/NetFabric/LinqBenchmarks).

stephentoub added area-System.Collections code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer labels Sep 20, 2022

stephentoub added this to the 8.0.0 milestone Sep 20, 2022

stephentoub added the api-ready-for-review API is ready for review, it is NOT ready for implementation label Sep 27, 2022

terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Sep 27, 2022

stephentoub assigned CollinAlpert Oct 12, 2022

layomia mentioned this issue Oct 24, 2022

System.Collections work planned for .NET 8 #77391

Closed

14 tasks

CollinAlpert mentioned this issue Oct 29, 2022

Analyzer: Prefer .Length/Count/IsEmpty over Any() dotnet/roslyn-analyzers#6236

Merged

jeffhandley added the partner-impact This issue impacts a partner who needs to be kept updated label Nov 14, 2022

buyaa-n mentioned this issue Nov 16, 2022

.NET 8 developers can verify more APIs for correct usage to speed up their development #78442

Closed

tarekgh mentioned this issue Nov 22, 2022

Libraries work to support Cloud Native effort for .NET 8 #78518

Closed

61 tasks

aKzenT mentioned this issue Dec 16, 2022

[API Proposal]: Add IsEmpty property to collections and arrays. #79769

Open

buyaa-n closed this as completed in dotnet/roslyn-analyzers#6236 Jan 30, 2023

CollinAlpert mentioned this issue Feb 17, 2023

Add documentation for CA1860. dotnet/docs#34157

Merged

ghost locked as resolved and limited conversation to collaborators Mar 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Analyzer] Prefer .Length/Count/IsEmpty over Any() #75933

[Analyzer] Prefer .Length/Count/IsEmpty over Any() #75933

stephentoub commented Sep 20, 2022 •

edited

Loading

ghost commented Sep 20, 2022

terrajobst commented Sep 27, 2022 •

edited by dotnet-api-review bot

Loading

CollinAlpert commented Oct 12, 2022

stephentoub commented Oct 12, 2022

CollinAlpert commented Oct 15, 2022

stephentoub commented Oct 24, 2022 •

edited

Loading

fowl2 commented Oct 25, 2022

stephentoub commented Oct 25, 2022 •

edited

Loading

fowl2 commented Nov 3, 2022

RenderMichael commented Nov 4, 2022

Lancelotbronner commented Nov 14, 2022

pedrofr commented Nov 29, 2022

stephentoub commented Nov 29, 2022

Applesauce314 commented Nov 29, 2022 •

edited

Loading

pedrofr commented Nov 29, 2022 •

edited

Loading

aKzenT commented Dec 16, 2022

Timovzl commented Dec 19, 2022

sfiruch commented Jan 11, 2023

stephentoub commented Jan 11, 2023 •

edited

Loading

sfiruch commented Jan 12, 2023

stephentoub commented Jan 12, 2023 •

edited

Loading

sfiruch commented Jan 12, 2023

[Analyzer] Prefer .Length/Count/IsEmpty over Any() #75933

[Analyzer] Prefer .Length/Count/IsEmpty over Any() #75933

Comments

stephentoub commented Sep 20, 2022 • edited Loading

ghost commented Sep 20, 2022

terrajobst commented Sep 27, 2022 • edited by dotnet-api-review bot Loading

CollinAlpert commented Oct 12, 2022

stephentoub commented Oct 12, 2022

CollinAlpert commented Oct 15, 2022

stephentoub commented Oct 24, 2022 • edited Loading

fowl2 commented Oct 25, 2022

stephentoub commented Oct 25, 2022 • edited Loading

fowl2 commented Nov 3, 2022

RenderMichael commented Nov 4, 2022

Lancelotbronner commented Nov 14, 2022

pedrofr commented Nov 29, 2022

stephentoub commented Nov 29, 2022

Applesauce314 commented Nov 29, 2022 • edited Loading

pedrofr commented Nov 29, 2022 • edited Loading

aKzenT commented Dec 16, 2022

Timovzl commented Dec 19, 2022

sfiruch commented Jan 11, 2023

stephentoub commented Jan 11, 2023 • edited Loading

sfiruch commented Jan 12, 2023

stephentoub commented Jan 12, 2023 • edited Loading

sfiruch commented Jan 12, 2023

stephentoub commented Sep 20, 2022 •

edited

Loading

terrajobst commented Sep 27, 2022 •

edited by dotnet-api-review bot

Loading

stephentoub commented Oct 24, 2022 •

edited

Loading

stephentoub commented Oct 25, 2022 •

edited

Loading

Applesauce314 commented Nov 29, 2022 •

edited

Loading

pedrofr commented Nov 29, 2022 •

edited

Loading

stephentoub commented Jan 11, 2023 •

edited

Loading

stephentoub commented Jan 12, 2023 •

edited

Loading