-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
iter::Fuse is unsound with how specialization currently behaves around HRTB fn pointers #85863
Comments
I think this is a known hole see #68970 on rustc_unsafe_specialization_marker. |
@the8472 That section also says
This issue is a way to “cause use after frees with purely safe code” (not 100% a “use after free” with the current example code, but it can certainly be turned into “use-after-free” UB, too). So it is not a known hole AFAICT. I mean, sure, the fact that On a similar note, there are no other open |
Funny that you mentioned Maybe your remark even helped me a little bit in finding it. (I’m always unable to accurately recall afterwards how exactly I found such a soundness issue, so I’m not sure.) |
Wouldn't that be a breaking API change? |
Sure, it's a breaking change. I suppose this depends on the long term vision for Fuse. I haven't thought about it for too long myself either at the time of writing the idea of making it invariant, and my feeling when suggesting it was “maybe a Fuse type that holds its promise of being a no-op when the iterator implements Anyways, since then, I'm thinking more along the lines of a long term solution of “somehow detect if the With such a long term vision for making |
CC @matthewjasper for clarification on what kind of unsoundness is expected from user-implementable |
"purely safe code" includes library code in this context, which |
That sounds to me like some safety loopholes around specializations on |
For the record, the same thing is indeed also possible using trait objects. |
Assigning priority as discussed in the Zulip thread of the Prioritization Working Group. @rustbot label -I-prioritize +P-critical |
Added a new high-level explanation for both this issue and #85873 at the beginning of this issue. |
Another possible fix for Applying the same idea to |
@rustbot label +T-lang I think this is at least worth putting on the lang-team radar. |
Based on discussion in the @rust-lang/lang meeting today, this is worth discussing further in a language context to talk about how to make specialization safer, but this is not P-critical for lang, just for the specific impls in the standard library that use |
I'm trying to understand how this would work. Your example has a subtype that does implement |
I also realized that all of this was not a problem with the old |
Interesting observation. I wasn’t aware of this change / earlier design. AFAICT it’s indeed true that the worst that could happen with a done-flag design is that a non- @cuviper Your PR states “The former done flag was roughly similar to an Option tag, but left |
I think it’s the other way. My example uses
and
The former implements This terminology also has the effect that “larger” lifetimes create subtypes and “shorter” lifetimes create supertypes, even though “super” sounds more like “larger”. |
Ah, right, function are contravariant in their parameters. In my defense, you also wrote in the OP:
|
It's been a while, but I don't think there was any concrete misuse of |
Fair point, I’ll fix that. As you can tell by my lengthy reply above I’ve needed to double-check everything thoroughly myself, too, to make sure I’m not correcting you on something that you didn’t get wrong. These things are highly non-intuitive. By the way, it doesn’t have anything to do with contravariance of |
I am having trouble following this conversation, so I won't claim to have a deep opinion, but since there is a lot of confusion about which way the variance goes &etc., more cleanly expressing the subtype/supertype relationships seems desirable if we're going to expand the syntax for universal quantification. This is largely aside from the point, though. |
I opened #86765 and #86766 with reduced and eliminated specialization, so we can compare performance. I also considered just changing those |
A diagram with some subtyping relationshipsTo recap, here is a hopefully illustrative diagram summarizing the basic subtyping relationships: I have greyed out the types there that are necessarily / known to be My personal view of the situationI believe the "surprising thing" here is that some of us may have intuitively thought (I personally did!) that subtyping necessarily involved variance, and thus, involved there being different lifetimes. Or, in other words, some of us may have intuitively thought that a But there are: it turns out that a higher order Thus, not specializing over lifetimes does not prevent specializing over {sub,super}types. This, however, in and of itself, is not a problem (≠ specializing over lifetimes, which is definitely a problem even without Indeed, specialization itself is not to blame, here. There could have been two non-generic impls, one for The problem stems from This is no different to some oversight regarding some In practiceThe short answer is to say that that impl for But the issue here is that the type conversion is very sneaky, since it's a subtyping conversion, which can happen at any point under our feet. So I agree with @cuviper that For the case of
Thus, rather than a whole Then, if we have
|
@danielhenrymantilla I believe that writing “outlives” as “⊇” also has potential for confusion. It might be useful if you’re thinking of lifetimes as a set of lines/places in your source code, but I wouldn’t necessarily want to introduce it as actual notation, because a correspondence between supersets (“⊇”) and subtypes – both using |
That diagram would be a great addition to the reference! |
This comment has been minimized.
This comment has been minimized.
@the8472 The original example works just as well with a generic impl<'a> FusedIterator for MyEmptyIterator<fn(&'a ())> {} instead of just impl FusedIterator for MyEmptyIterator<fn(&'static ())> {} The former kind of
|
Objection retracted. |
A day later, I still think that moving back to a |
The other downside of the The performance of #86765 looks stable enough that this would be my preference, although it does lose in-place iteration. |
The current direction is for them to remain distinct types. See
I'm not sure within |
New high-level explanation of this issue, also covering #85873
Fuse<T>
andZip<T, U>
have optimizations relying on specialization if their type parameters implement a trait (FusedIterator
orTrustedRandomAccess
, respectively).These optimizations fundamentally change the way the iterator operates.
All type arguments are covariant. Coercing e.g.
Fuse<T>
toFuse<U>
ifU
is a subtype ofT
can “switch between” these fundamentally different ways of operation ifT: !FusedIterator
andU: FusedIterator
which can bring the iterator into an invalid state that can cause UB; the same kind of problem exists forZip
.For
Zip
, this problem can be avoided ifTrustedRandomAccess
never differs between types and subtypes.Copy
-dependent impls for e.g.vec::IntoIter
have to be removed (PR #85874).Regarding
Fuse
:FusedIterator
is a safe public trait, this is problematic. Some possible fixes: Either changeFuse
to not get into a UB-causing invalid state by such a coercion (which kills some (or perhaps most) of the optimization offered byFuse
), or makeFuse
invariant (but that’s a breaking change!). Here’s another possible approach, but this one requires new language features and could be done in the future if an immediate fix that just makesFuse
less performant is taken now.Specialization not differentiating between e.g.
Foo<'a>
andFoo<'b>
made this issue harder to spot.But
fn(&())
(that is,for<'a> fn(&'a ())
) is asupertype[edited:] subtype offn(&'static ())
and those are entirely different types.Previous start of issue description
(playground)
@rustbot label T-libs, T-libs-impl, T-compiler, A-specialization, A-iterators, A-typesystem, A-lifetimes, A-traits, C-bug
someone please add the unsound labelthanksExplanation
The types
for<'a> fn(&'a ())
(i.e.fn(&())
) andfn(&'static ())
do seem to be distinguished by rust’s specialization. (I.e. the difference between the two types is not erased before the trait impl for specializing onFusedIterator
is chosen.) But it’s also possible to convertFuse<MyEmptyIterator<fn(&i32)>>
intoFuse<MyEmptyIterator<fn(&'static i32)>>
. This way, the first call to.next()
will use unspecialized code and write aNone
value into the field ofFuse
, whereas the second call will use the specialized version that usesintrinsics::unreachable()
to unwrap the containedOption
.I haven’t tested it, but I’d guess thatthe same thing might be possible using HRTB trait objects instead of function pointers. (See comment below.)Possible fixes
This soundness issue would probably go away if, somehow, specialization would also consider types such as
fn(&())
andfn(&'static ())
to be “the same type”. Edit2: I don’t think that this approach is reasonable, and it’s probably even impossible.The easiest / fastest fix for now would probably be to remove all the
intrinsics::unreachable()
assertions in the implementation ofFuse
and replace them either withpanic
s (which results in unexpected panicking but at least no unsoundness), or replace them with code that “properly” handles theNone
case as the iterator being empty. In either case, performance benefits fromFusedIterator
implementations could probably somewhat be retained by somehow marking theSome
case as the hot path and theNone
case as a cold path. (In the approach with theNone
case panicking, this would probably be automatically considered the cold path due to the panic.)Edit: Yet another alternative fix is to make
Fuse<T>
invariant overT
.The text was updated successfully, but these errors were encountered: