Regex collections should implement generic collection interfaces #13933
Comments
Hey Justin! |
Love the idea. I'm not sure how strict the backwards compatibility policy is for this particular library, but I'm going to assume it's pretty high considering how widespread use of this library is. Based on that, I believe the path with the highest chance of success is adding the following interfaces:
In particular, the above intentionally avoid adding Also note that it won't be possible to use the |
@sharwell what overloads from
public class CaptureCollection : ICollection, ICollection<Capture>, IReadOnlyCollection<Capture>
{
    // Existing...
    public Capture this[int i] { get; }
    public int Count { get; }
    public IEnumerator GetEnumerator();
    Object ICollection.SyncRoot { get; }
    bool ICollection.IsSynchronized { get; }
    void ICollection.CopyTo(Array array, int arrayIndex);
    // Proposed...
    bool ICollection<Capture>.IsReadOnly { get; }
    void ICollection<Capture>.Add(Capture item);
    void ICollection<Capture>.Clear();
    bool ICollection<Capture>.Contains(Capture item);
    void ICollection<Capture>.CopyTo(Capture[] array, int arrayIndex);
    bool ICollection<Capture>.Remove(Capture item);
    IEnumerator<Capture> IEnumerable<Capture>.GetEnumerator();
} |
@justinvp It's not the methods of
public static void Foo(this ICollection collection)
{
}
public static void Foo<T>(this ICollection<T> collection)
{
}
Currently, if you call
|
@sharwell Fair enough :) |
…(fixes #271) * CaptureCollection implements IReadOnlyList<Capture> * GroupCollection implements IReadOnlyList<Group> * MatchCollection implements IReadOnlyList<Match>
How common are such overloads in practice?
public static void Foo(this ICollection collection)
{
}
public static void Foo<T>(this ICollection<T> collection)
{
}
These would already be problematic as they are, because you would already run into the "ambiguous call" issue when used with the most commonly used collections ( I realize that for existing collections, such calls would already be disambiguated in order for the code to compile, and that by implementing Would this be a binary breaking change or just a potential source breaking change? By not implementing |
That may be true, but you do get all the functionality that operates on Since I'm not on the team (just a random [hopeful] contributor), I wanted to start with what seemed to be a path of least resistance. Do you have a motivating example for why it should implement additional interfaces? |
I agree with you on However, there are places that make use of For example, LINQ's
public static TSource Last<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    IList<TSource> list = source as IList<TSource>;
    if (list != null) {
        int count = list.Count;
        if (count > 0) return list[count - 1];
    }
    else {
        using (IEnumerator<TSource> e = source.GetEnumerator()) {
            if (e.MoveNext()) {
                TSource result;
                do {
                    result = e.Current;
                } while (e.MoveNext());
                return result;
            }
        }
    }
    throw Error.NoElements();
}
|
@sharwell This is a compile-time break. However, unless it is proven to be prevalent, it is not what we consider a breaking change from our standpoint (theoretically, any addition could be a breaking change). If we do this, I'd like to make sure there's no serialization impact - I think there is none[1], but I'd like someone to explicitly make sure there isn't; the XML serializer doesn't require an opt-in by the type, and we've broken a few consumers in the past (and backed the changes out before we shipped). [1] In this case, I think the XML serializer needs a public "Add" before it will attempt to serialize this, but please confirm. |
Even if Asterisks make bad footnotes in Markdown¹. ¹ Alt+0185 (number pad only) makes for a good one² 😄 |
If we had a time machine and added From our API review standpoint however, it is perfectly fine to opt into |
So your vote is for implementing both Either way dotnet/corefx#277 should be a good starting point. 😉 |
We treat As far as the serialization impact goes: I don't think implementing the read-only interfaces makes a difference. AFAIK making a type implement EDIT My point is that we should implement both, the read-only interfaces as well as the mutable interfaces. That's the same thing we did for immutable and it's important for interop with existing APIs. |
@terrajobst Yep, but my comment was around implementing the mutable versions as well as the readonly versions. |
Agreed. I've clarified my comment above. |
@terrajobst Your edit made a huge difference. @davkean and @terrajobst : Since I haven't worked with you directly, it can be tough to understand from the wording up until that edit when you are simply pointing out a historical observation, and when you are for/against taking a particular action. I took a conservative viewpoint because I've found it to be generally more acceptable for arbitrary 3rd party projects I contribute to. Since you were both for addressing the issue with a more comprehensive solution, it helps to be more clear about it. |
@sharwell Sorry, these sorts of things will come up in the API review/speclet review, where it will be a little clearer; feedback is usually given in the form of "you must do this", "you must not do that" and "you should consider this". Very similar style to the Framework Design Guidelines themselves. |
If it makes you feel any better, it wasn't long ago (a few months maybe?) that I was about the worst at this aspect. 😯 |
I edited the description for this issue to make it more of a speclet, based on the discussion so far and the proposed API Review process. |
@sharwell In the original description I had said I would work on the implementation for this. I was planning to submit a PR after the issue had been reviewed (per the Contributing guide), but you jumped in with an initial PR. I guess there needs to be a better way of claiming "dibs" on a contribution? Not sure how we'll proceed with this after the issue is tagged "Accepting PRs". |
@justinvp I missed that part. I closed my pull request so it's all yours. I'll reopen it only if you find you aren't going to have time to get it done and post a message back here saying so. |
@justinvp IMO your spec reads well and is comprehensive. The only thing I can even think to change is to clarify that the new explicitly-implemented members are due to overload conflicts ( You might also state that methods which change the collection throw |
@sharwell Thanks! I updated the description based on your feedback. |
As I started working on an initial implementation, I decided to go ahead and implement |
For what it's worth, I am very strongly against implementing mutable interfaces on immutable collections. I am well aware of the interoperability argument, but I believe that intentionally breaking the semantic contract of an interface carries with it a very high burden of proof, and I haven't seen any compelling evidence that such burden has been met, especially given that interoperability concerns have multiple possible mitigations. It is bad enough that we have to live with the existing semantic dissonance caused by years of not having a non-writable indexable interface. We should not be making the problem even worse. |
I'm afraid that ship has already sailed. Immutable collection is marked as stable and already implements the mutable interfaces.
We don't break the semantic contract of an interface. The .NET collection interfaces don't guarantee that the consumer can mutate the instance. That's why both The |
That ship may have sailed for Immutable Collections, but here we are talking about existing Regex collections. I think it is still a valid discussion to have, nor am I the first to suggest it; I am throwing my (limited) weight behind @sharwell's point earlier in this thread.
The .NET collection interfaces don't guarantee mutability according to their documentation; but when half of the methods on the type are for mutating the instance, I would argue that the interface does convey mutability simply by its contents. From an API design perspective the names and signatures of members are a far more important communication mechanism with the API consumer than the documentation is.
I have never in my career seen the IsReadOnly property checked by non-Framework code, nor can I think of many cases where it would make sense. The number of use cases that can reasonably alter their behavior based on the IsReadOnly property is vanishingly small; the vast majority of implementations either write to the list, or they don't. This is also the same argument that was used in 2004 to justify the lack of a read-only interface; yet the arrival of IReadOnlyList in .NET 4.5 is compelling evidence that it doesn't hold water. I am afraid that we have become numb to the absurdity of implementing an interface and then throwing NotSupportedException from half of its members, because we were forced to do exactly that for so long. But we are not forced to do it any longer, and we shouldn't keep doing it just because it is what we have always done. |
We believe that consistency trumps most other aspects, which is why following existing patterns is virtually always good unless there are compelling reasons to abandon them. We re-evaluated whether abandoning would be worthwhile when we added immutable collections and decided against it. I don't see a reason why Checking It's also worth pointing out that read-only collections have existed since .NET 1.0. So it's not that the property |
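The two positions in this exchange can be made concrete with a small sketch. It uses ReadOnlyCollection<T> (the type the issue description itself points to as the model) rather than the Regex collections, and the class and method names are hypothetical, chosen only for illustration. It shows a consumer that honours IsReadOnly, and what happens when a consumer does not.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical helper names; only the BCL types and members are real.
public static class ReadOnlyContractSketch
{
    // A consumer that checks IsReadOnly before mutating.
    // Returns true if the item was added, false if the collection is read-only.
    public static bool TryAdd<T>(ICollection<T> collection, T item)
    {
        if (collection.IsReadOnly)
        {
            return false; // Caller honoured IsReadOnly; no exception is involved.
        }
        collection.Add(item);
        return true;
    }

    // A consumer that ignores IsReadOnly. ReadOnlyCollection<T> implements
    // ICollection<T>.Add explicitly and throws NotSupportedException.
    public static bool AddThrowsOnReadOnly()
    {
        ICollection<int> wrapper = new ReadOnlyCollection<int>(new List<int> { 1, 2, 3 });
        try
        {
            wrapper.Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

Whether callers in practice write the first form or the second is exactly the point under dispute in the surrounding comments; the sketch only shows what each looks like.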
You need to be able to build against this library and run on .NET Framework 4.6.
From: Justin Van Patten
But is it possible for an assembly compiled against .NET 4.5 (or earlier) to run against .NET Core? If not, the argument of binary compatibility is moot, since the code will have to be recompiled anyway... Implementing the generic GetEnumerator() implicitly would be great, since it would make it possible to write foreach (var capture in captures).
That is an excellent point, @thomaslevesque! Personally, I'd really like to be able to expose the generic GetEnumerator() to allow var in foreach. Since this is a new library in .NET Core, this may be the best opportunity to make such a change (before developers take a binary dependency on this library). It'd be a source compatible change, so any existing code ported to .NET Core would continue to work when recompiled, without the developer having to make any changes. Further, if this change is possible, we may also want to take the opportunity to expose a struct enumerator to avoid the enumerator allocation. I'd be happy to make such changes to the PR, if the owners agree the change can be made. |
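The foreach point in the quoted text can be illustrated with a short sketch. It assumes the public GetEnumerator() on MatchCollection keeps returning the non-generic IEnumerator, which is the situation this thread describes; the class name, method name, and pattern are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical names; only Regex, Match and MatchCollection are real APIs.
public static class ForeachSketch
{
    public static int SumOfMatchLengths(string input)
    {
        int total = 0;
        // Spelling out the element type makes the compiler insert a cast
        // from object to Match on each iteration.
        foreach (Match match in Regex.Matches(input, @"\d+"))
        {
            total += match.Length;
        }
        return total;
    }

    // With `var`, the loop variable is typed as object, because foreach binds
    // to the public non-generic GetEnumerator(). The following would not compile:
    //
    //     foreach (var match in Regex.Matches(input, @"\d+"))
    //         total += match.Length;   // 'object' has no member 'Length'
}
```

Exposing a public generic GetEnumerator() would remove the need for the explicit type, at the cost of changing the existing method's return type.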
Ah, I knew I was missing something ;) I guess it wouldn't matter for reference types, but it would break for value types... EDIT: Actually my last comment doesn't make sense... we're talking about specific collections, and there are no value types involved here. |
So if you accept this suggestion (adding the generic interfaces to these types), the changes will have to be added to .NET 4.6 as well? How much time is left to make changes to .NET 4.6? |
Actually my previous comment doesn't make sense... we're talking about specific collections, and there are no value types involved here. But then, couldn't it work in .NET 4.6 ? If the code is bound at compile time to a |
Yes @justinvp, any types that already exist on the full desktop framework will need to be updated there as well if they are updated in .NET Core. We have a high-level goal of making code written for .NET work no matter what .NET platform you are targeting. As for how much time is left to add new public APIs to .NET 4.6: at this point we are not accepting any more unless they are critical. With that said, we fully expect .NET Core to evolve much more quickly than the full .NET Framework that ships in Windows. We just have to be aware that any changes need to eventually make it there as well, so when thinking about breaking changes you cannot just consider .NET Core; you must consider the other platforms as well. If we don't do that, we risk diverging the platforms and making .NET development as a whole more difficult. |
Thanks, @weshaggard. I think it'd be worth updating one of the wiki pages with this information as I don't believe it's explicitly stated anywhere currently. I opened dotnet/corefx#392 to track that. |
@davkean & @weshaggard, this is getting off-topic, but one more related question about the full .NET framework: if I have some code that I compile against the .NET Core regex library, and I run my compiled assembly on the full .NET framework, does it use the regex types in the .NET Core regex library or the regex types in the full .NET framework at runtime? |
@justinvp it would run the code that ships in the .NET Framework. The assembly identity System.Text.RegularExpressions ships in the box and is in the GAC so unless there were other changes to the identity then you wouldn't be running the code from this .NET Core library. |
From the API Review:
These are the three methods:
Why? Is this just to match I can see |
Yes, it's basically consistency with |
Can you show me sample code of how you would use |
@davkean, in the API Review, you call out that
|
@davkean ? |
Sorry for the late reply. I agree with your assessment - I can think of contrived examples, but nothing solid. |
This issue was reviewed today. It looks good as proposed. |
Implement IList<T>, IReadOnlyList<T>, and IList on the Regex collections. Fixes #271
PR dotnet/corefx#1756 is ready for review (against the future branch). Note: I just want to call out that the PR has I had originally proposed
I was actually on board with the feedback to make I was planning to make I just want to call out that I've deviated slightly from this, making Let me know if you'd like me to change |
@justinvp Nope no need to change anything, agree with what you've done here. Can't review the code, but someone on the team will be able to. |
Fixed with dotnet/corefx#1756 |
CaptureCollection, GroupCollection, and MatchCollection currently only implement the non-generic ICollection interface. These collections should implement the generic collection interfaces to better interoperate with more modern APIs, such as LINQ. Since these collections are already indexable, they should implement IList<T> and IReadOnlyList<T>, as well as the non-generic IList (to be consistent with the generic interfaces).
Rationale and Usage
This is certainly a nice-to-have, but it is a long-standing request that developers still ask about. Implementing the generic interfaces will allow these collections to be used more easily with LINQ and interoperate better with more modern framework and library APIs.
For example, to use these collections with LINQ right now you have to know about and remember to use Enumerable.Cast<TSource>() to cast the non-generic IEnumerable into an IEnumerable<T>:
With these changes you'd no longer have to do that:
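A minimal sketch of both forms, assuming a runtime where MatchCollection implements IList<Match> as proposed (true of .NET Core, not of .NET Framework 4.x). The class name, method names, and the \w+ pattern are hypothetical and serve only as illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical names; only the Regex and LINQ APIs are real.
public static class RegexLinqSketch
{
    // Works without the generic interfaces: Cast<Match>() turns the
    // non-generic IEnumerable into an IEnumerable<Match> first.
    public static string LastWordWithCast(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"\w+");
        return matches.Cast<Match>().Last().Value;
    }

    // Assumes MatchCollection implements IList<Match>; on a runtime without
    // the generic interfaces this method does not compile.
    public static string LastWordDirect(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"\w+");
        return matches.Last().Value;
    }
}
```

Both methods throw InvalidOperationException when the input contains no match, since Last() requires a non-empty sequence.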
Plus, in the above example, you'd get a performance improvement when using Enumerable.Last<TSource>() as its implementation has a fast-path for collections that implement IList<T>.
Proposed API
Details
NotSupportedException (like ReadOnlyCollection<T>).
IList should be implemented as well. These collections are indexable and if IList<T> and IReadOnlyList<T> are being implemented, IList should be implemented as well. This does add several more members, but they are all implemented explicitly so they don't add any new public members to intellisense, and the implementations are very straightforward.
ICollection<T>.CopyTo is implemented implicitly (public).
NotSupportedException (like ReadOnlyCollection<T>).
IList members are implemented explicitly to hide non-generic members from intellisense.
IList<T>.IndexOf and ICollection<T>.Contains are implemented explicitly because these methods aren't very useful for these collections and should not be visible in intellisense by default. They're not useful because an implementation using EqualityComparer<T>.Default (consistent with other collections) will search the collection using reference equality due to the fact that Capture, Group, and Match do not implement IEquatable<T> and do not override Equals() and GetHashCode(). Further, these types do not have public constructors -- they are created internally by the regex engine, making it very unlikely that you'd want to search for an item in a collection "A" that was obtained from collection "B".
IEnumerable<T>.GetEnumerator() must be implemented explicitly because the non-generic IEnumerable.GetEnumerator() is already implemented implicitly and we can't overload on return type. This also precludes returning a struct Enumerator (for better enumeration performance) because changing the return type of the existing method would be a binary breaking change. As a result, you'll still have to specify the type when using foreach (e.g. foreach (Capture capture in captures)); you won't be able to use var (e.g. foreach (var capture in captures)), unfortunately.
Open Questions
Should GroupCollection implement IDictionary<string, Group>, IReadOnlyDictionary<string, Group>, and IDictionary? GroupCollection already has a string indexer. Is it worth implementing the dictionary interfaces as part of this? Personally, I'm leaning toward "no" because there isn't a compelling scenario for the dictionary interfaces, and they can always be added in the future when needed.
Pull Request
A PR with the proposed changes is available: dotnet/corefx#1756
Updates
IList. These collections are indexable and it would be strange if IList<T> and IReadOnlyList<T> were implemented alongside ICollection but without IList.
DebuggerDisplay and DebuggerTypeProxy attributes.
ICollection<T>.CopyTo implicit (public).