-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Codegen ZSTs without an allocation #123936
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Codegen ZSTs without an allocation This makes sure that &[] is equivalent to unsafe code (from_raw_parts(dangling, 0)). No new stable guarantee is intended about whether or not we do this, this is just an optimization. This regressed in rust-lang#67000 (no comments I can see about that regression in the PR, though it did change the test modified here). We had previously performed this optimization since rust-lang#63635.
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
1266e22
to
6081b49
Compare
This comment has been minimized.
This comment has been minimized.
d989121
to
388646a
Compare
@bors try |
Codegen ZSTs without an allocation This makes sure that &[] is equivalent to unsafe code (from_raw_parts(dangling, 0)). No new stable guarantee is intended about whether or not we do this, this is just an optimization. This regressed in rust-lang#67000 (no comments I can see about that regression in the PR, though it did change the test modified here). We had previously performed this optimization since rust-lang#63635.
This comment has been minimized.
This comment has been minimized.
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Some changes occurred in run-make tests. cc @jieyouxu |
Finished benchmarking commit (0d984e9): comparison URL. Overall result: no relevant changes - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 677.239s -> 677.892s (0.10%) |
a353f2c
to
ce11dc8
Compare
Rebased atop #123941, but otherwise this looks ready for review. Perf results indicate neutral runtime performance but a clear, though small, improvement in binary sizes. |
cc @RalfJung This seems totally fine to me, I didn't try to understand the changes to the llvm wrapper yet 🤔 intend to look at it in-depth in a few days and then merge it unless there are any concerns |
Those changes will get squashed/rebased away (already approved in a separate PR). IMO the fact that this found 1 case of UB and 1 broken test is actually pretty good justification for wanting to do it :) |
Sorry, I'm not comfortable approving codegen PRs... |
☀️ Test successful - checks-actions |
Finished benchmarking commit (38104f3): comparison URL. Overall result: ✅ improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 677.444s -> 677.262s (-0.03%) |
FYI, this PR seems to have somehow broken Bevy's rendering with Rust 1.79, but only on Windows, and only with some hardware. Any tips on where to start debugging or resolving this would be appreciated. |
^ and only when using DX12 as a backend |
The issue is most likely that something was inappropriately relying on pointer equalities or inequalities that changed. Statics are not constants, but people slip up and address them as the same thing. Even if the Rust code itself doesn't do address equating, Rust coerces references to pointers, so often people directly pass a reference to a constant into FFI. Some DX12 code could have been happily reading some random uninit garbage and not really caring because it wasn't really using that read for anything important (i.e. "UB, but it had no significant effect in actual executions of a short-lived, very simple demo program"), but with this change, decided to deref a pointer into the zeroth page today, causing a segfault. |
( It is of course entirely plausible I am talking nonsense and it is a much more exotic problem or at least a more interesting bug, but I would first try to eliminate that case, which is why the curiosity about a backtrace in #126442 (comment) ) |
To work around likely bugs in (older versions of) PCRE2. Namely, at one point, PCRE2 would dereference the haystack pointer even when the length was zero. This was reported in #10 and we worked around this in #11 by passing a pointer to a const `&[]`, with the (erroneous) presumption that this would be a valid pointer to dereference. In retrospect though, this was a little silly, because you should never be dereferencing a pointer to an empty slice. It's not valid. Alas, at that time, Rust did actually hand you a valid pointer that could be dereferenced. But [this PR][rust-pull] changed that. And thus, we're back to where we started: handing buggy versions of PCRE2 a zero length haystack with a dangling pointer. So we fix this once and for all by passing a slice of length 1, but with a haystack length of 0, to the PCRE2 search routine when searching an empty haystack. This will guarantee the provision of a dereferencable pointer should PCRE2 decide to dereference it. Fixes #42 [rust-pull]: rust-lang/rust#123936
To work around likely bugs in (older versions of) PCRE2. Namely, at one point, PCRE2 would dereference the haystack pointer even when the length was zero. This was reported in #10 and we worked around this in #11 by passing a pointer to a const `&[]`, with the (erroneous) presumption that this would be a valid pointer to dereference. In retrospect though, this was a little silly, because you should never be dereferencing a pointer to an empty slice. It's not valid. Alas, at that time, Rust did actually hand you a valid pointer that could be dereferenced. But [this PR][rust-pull] changed that. And thus, we're back to where we started: handing buggy versions of PCRE2 a zero length haystack with a dangling pointer. So we fix this once and for all by passing a slice of length 1, but with a haystack length of 0, to the PCRE2 search routine when searching an empty haystack. This will guarantee the provision of a dereferencable pointer should PCRE2 decide to dereference it. Fixes #42 [rust-pull]: rust-lang/rust#123936
This makes sure that &[] is equivalent to unsafe code (from_raw_parts(dangling, 0)). No new stable guarantee is intended about whether or not we do this, this is just an optimization.
This regressed in #67000 (no comments I can see about that regression in the PR, though it did change the test modified here). We had previously performed this optimization since #63635.