Rollup of 9 pull requests #94331
Commits on Feb 20, 2022
- Commit c00f635
- Commit 43dbd83
Commits on Feb 21, 2022
- Commit 8ca47d7: Stop manually SIMDing in `swap_nonoverlapping`

Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize it, since it can know the right chunk size to use better than the "32 bytes always" approach we currently have. It does still need logic to type-erase where appropriate, though: while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`. As a bonus, this also means one no longer gets the spurious `memcpy`(s?) at the end when swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
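For context, the safe API built on top of `swap_nonoverlapping` (and exercised by the benchmarks in the full rollup entry below) is `slice::swap_with_slice`; a minimal usage sketch, with arbitrarily chosen element type and values:

```rust
fn main() {
    let mut a = [1u8, 2, 3, 4];
    let mut b = [9u8, 8, 7, 6];

    // Swap every element of `a` with the corresponding element of `b`;
    // both slices must have the same length or this panics.
    a[..].swap_with_slice(&mut b[..]);

    assert_eq!(a, [9, 8, 7, 6]);
    assert_eq!(b, [1, 2, 3, 4]);
}
```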
Commits on Feb 22, 2022
- Commit da896d3
- Commit fbe1c15
- Commit 3d23477
Commits on Feb 23, 2022
- Commit 8d3de56: Continue improvements on the `--check-cfg` implementation

  - Test the combinations of `--check-cfg` with partial `values()` and `--cfg`
  - Test that we detect an unexpected value when none are expected
- Commit a556a2a
- Commit c73a2f8
- Commit f047af2
Commits on Feb 24, 2022
- Commit 37d9ea7: `scan_escape` currently has a fast path (for when the first char isn't '\\') and a slow path. This commit changes `scan_escape` so that it only handles the slow path, i.e. the actual escaping code. The fast path is inlined into the two call sites. This makes the code faster, because there is no function call overhead on the fast path (`scan_escape` is a big function and doesn't get inlined). It also improves readability, because it removes a bunch of mode checks on the fast paths.
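A minimal sketch of the pattern described above, using hypothetical names rather than the actual lexer code: the cheap no-escape case is handled inline at the call site, and only the rare escape case pays for a call into the larger, out-of-line function:

```rust
// Hypothetical stand-in for the real escape-handling logic: kept out of line,
// and only called once a backslash has actually been seen.
fn handle_escape(rest: &mut std::str::Chars<'_>) -> Option<char> {
    match rest.next()? {
        'n' => Some('\n'),
        't' => Some('\t'),
        '\\' => Some('\\'),
        '"' => Some('"'),
        _ => None, // unrecognized escape
    }
}

// Call site: the fast path (no backslash) is handled inline, so the common
// case never pays the call overhead of `handle_escape`.
fn unescape(s: &str) -> Option<String> {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c); // fast path
        } else {
            out.push(handle_escape(&mut chars)?); // slow path
        }
    }
    Some(out)
}

fn main() {
    assert_eq!(unescape(r"a\tb").as_deref(), Some("a\tb"));
}
```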
- Commit 44308dc: Inline a hot closure in `from_lit_token`. The change looks big because `rustfmt` rearranges things, but the only real change is the inlining annotation.
- Commit 70018c1
- Commit 34319ff
- Commit 8ba7436
- Commit ee98dc8
- Commit 44de8c0: Rollup merge of rust-lang#93714 - compiler-errors:can-type-impl-copy-error-span, r=jackh726

Better `ObligationCause` for normalization errors in `can_type_implement_copy`. Some logic is needed so we can point to the field when given totally nonsense types like `struct Foo(<u32 as Iterator>::Item);`. Fixes rust-lang#93687.
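The nonsense-type example from the description, expanded into a minimal reproduction. Note that this program is intentionally expected to fail to compile; the derive is an assumption about how such a field would reach the `Copy` check, not the exact test case from rust-lang#93687:

```rust
// Expected to FAIL to compile: `u32` does not implement `Iterator`, so the
// projection `<u32 as Iterator>::Item` cannot be normalized. The point of the
// change above is that the resulting error is reported with an
// `ObligationCause` pointing at this field rather than a less useful span.
#[derive(Clone, Copy)]
struct Foo(<u32 as Iterator>::Item);

fn main() {}
```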
- Commit b33098f: Rollup merge of rust-lang#93845 - compiler-errors:in-band-lifetimes, r=cjgillot

Remove in-band lifetimes. As discussed in the t-lang backlog bonanza, the `in_band_lifetimes` FCP closed in favor of the feature not being stabilized. This PR removes `#![feature(in_band_lifetimes)]` in its entirety. Let me know if this PR is too hasty, and if we should instead do something intermediate, such as deprecating the feature first. r? ``@scottmcm`` (or feel free to reassign; just saw your last comment on rust-lang#44524). Closes rust-lang#44524.
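For context, a small sketch of what the removed feature allowed versus the stable spelling (hypothetical function, not from the PR): under `#![feature(in_band_lifetimes)]` one could write `fn first(xs: &'a [u32]) -> &'a u32` without declaring `<'a>`; with the feature gone, the lifetime parameter must be introduced explicitly:

```rust
// Stable, explicit form: the lifetime `'a` is declared in the generic
// parameter list instead of being picked up "in band" from its first use.
fn first<'a>(xs: &'a [u32]) -> &'a u32 {
    &xs[0]
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{}", first(&v));
}
```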
- Commit 92b5d1d: Rollup merge of rust-lang#94175 - Urgau:check-cfg-improvements, r=petrochenkov

Improve the `--check-cfg` implementation. This pull request is a mix of improvements regarding the `--check-cfg` implementation:

- Simpler internal representation (use of `Option` instead of a separate bool)
- Add `--check-cfg` to the unstable book (based on the RFC)
- Improved diagnostics (a sketch follows this entry):
  * List possible values when the value is unexpected
  * Suggest, if possible, a name or value that is similar
- Add more tests (well-known names, mix of combinations, ...)

r? `@petrochenkov`
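As an illustration of the unexpected-value diagnostic described above: with the expected values for `feature` declared via `--check-cfg`, a misspelled value gets flagged and the compiler can suggest the closest expected one. The crate setup and feature names below are hypothetical, and the unstable CLI syntax in the comment is an assumption based on the RFC-era `values(...)` form; it may have changed since:

```rust
// Hypothetical invocation (unstable, syntax from around the time of this PR):
//   rustc --check-cfg 'values(feature, "serde", "std")' -Z unstable-options main.rs
//
// The cfg below uses "serd" instead of "serde"; `--check-cfg` reports an
// unexpected value and can suggest "serde". Without the flag, this typo would
// just silently evaluate to false.
#[cfg(feature = "serd")]
fn init_serde_support() {}

#[cfg(not(feature = "serd"))]
fn init_serde_support() {
    // Fallback so this sketch still builds and runs with no features set.
}

fn main() {
    init_serde_support();
}
```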
- Commit 87f826d: Rollup merge of rust-lang#94212 - scottmcm:swapper, r=dtolnay

Stop manually SIMDing in `swap_nonoverlapping`. Like I previously did for `reverse` (rust-lang#90821), this leaves it to LLVM to pick how to vectorize it, since it can know the right chunk size to use better than the "32 bytes always" approach we currently have. A variety of codegen tests are included to confirm that the various cases are still being vectorized.

It does still need logic to type-erase in some cases, though: while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.

As a bonus, this change also means one no longer gets the spurious `memcpy`(s?) at the end when swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>

<details>
<summary>ASM for this example</summary>

## Before (from godbolt)

Note the `push`/`pop`s and `memcpy`:

```x86
swap_m256_slice:
        push r15
        push r14
        push r13
        push r12
        push rbx
        sub rsp, 32
        cmp rsi, rcx
        jne .LBB0_6
        mov r14, rsi
        shl r14, 5
        je .LBB0_6
        mov r15, rdx
        mov rbx, rdi
        xor eax, eax
.LBB0_3:
        mov rcx, rax
        vmovaps ymm0, ymmword ptr [rbx + rax]
        vmovaps ymm1, ymmword ptr [r15 + rax]
        vmovaps ymmword ptr [rbx + rax], ymm1
        vmovaps ymmword ptr [r15 + rax], ymm0
        add rax, 32
        add rcx, 64
        cmp rcx, r14
        jbe .LBB0_3
        sub r14, rax
        jbe .LBB0_6
        add rbx, rax
        add r15, rax
        mov r12, rsp
        mov r13, qword ptr [rip + memcpy@GOTPCREL]
        mov rdi, r12
        mov rsi, rbx
        mov rdx, r14
        vzeroupper
        call r13
        mov rdi, rbx
        mov rsi, r15
        mov rdx, r14
        call r13
        mov rdi, r15
        mov rsi, r12
        mov rdx, r14
        call r13
.LBB0_6:
        add rsp, 32
        pop rbx
        pop r12
        pop r13
        pop r14
        pop r15
        vzeroupper
        ret
```

## After (from my machine)

Note no `rsp` manipulation; sorry for the different ASM syntax:

```x86
swap_m256_slice:
        cmpq %r9, %rdx
        jne .LBB1_6
        testq %rdx, %rdx
        je .LBB1_6
        cmpq $1, %rdx
        jne .LBB1_7
        xorl %r10d, %r10d
        jmp .LBB1_4
.LBB1_7:
        movq %rdx, %r9
        andq $-2, %r9
        movl $32, %eax
        xorl %r10d, %r10d
        .p2align 4, 0x90
.LBB1_8:
        vmovaps -32(%rcx,%rax), %ymm0
        vmovaps -32(%r8,%rax), %ymm1
        vmovaps %ymm1, -32(%rcx,%rax)
        vmovaps %ymm0, -32(%r8,%rax)
        vmovaps (%rcx,%rax), %ymm0
        vmovaps (%r8,%rax), %ymm1
        vmovaps %ymm1, (%rcx,%rax)
        vmovaps %ymm0, (%r8,%rax)
        addq $2, %r10
        addq $64, %rax
        cmpq %r10, %r9
        jne .LBB1_8
.LBB1_4:
        testb $1, %dl
        je .LBB1_6
        shlq $5, %r10
        vmovaps (%rcx,%r10), %ymm0
        vmovaps (%r8,%r10), %ymm1
        vmovaps %ymm1, (%rcx,%r10)
        vmovaps %ymm0, (%r8,%r10)
.LBB1_6:
        vzeroupper
        retq
```

</details>

This does all its copying operations as either the original type or as `MaybeUninit`s, so as far as I know there should be no potential abstract-machine issues with reading padding bytes as integers.

<details>
<summary>Perf is essentially unchanged</summary>

Though perhaps with more target features this would help more, if it could pick bigger chunks.

## Before

```
running 10 tests
test slice::swap_with_slice_4x_usize_30 ... bench: 894 ns/iter (+/- 11)
test slice::swap_with_slice_4x_usize_3000 ... bench: 99,476 ns/iter (+/- 2,784)
test slice::swap_with_slice_5x_usize_30 ... bench: 1,257 ns/iter (+/- 7)
test slice::swap_with_slice_5x_usize_3000 ... bench: 139,922 ns/iter (+/- 959)
test slice::swap_with_slice_rgb_30 ... bench: 328 ns/iter (+/- 27)
test slice::swap_with_slice_rgb_3000 ... bench: 16,215 ns/iter (+/- 176)
test slice::swap_with_slice_u8_30 ... bench: 312 ns/iter (+/- 9)
test slice::swap_with_slice_u8_3000 ... bench: 5,401 ns/iter (+/- 123)
test slice::swap_with_slice_usize_30 ... bench: 368 ns/iter (+/- 3)
test slice::swap_with_slice_usize_3000 ... bench: 28,472 ns/iter (+/- 3,913)
```

## After

```
running 10 tests
test slice::swap_with_slice_4x_usize_30 ... bench: 868 ns/iter (+/- 36)
test slice::swap_with_slice_4x_usize_3000 ... bench: 99,642 ns/iter (+/- 1,507)
test slice::swap_with_slice_5x_usize_30 ... bench: 1,194 ns/iter (+/- 11)
test slice::swap_with_slice_5x_usize_3000 ... bench: 139,761 ns/iter (+/- 5,018)
test slice::swap_with_slice_rgb_30 ... bench: 324 ns/iter (+/- 6)
test slice::swap_with_slice_rgb_3000 ... bench: 15,962 ns/iter (+/- 287)
test slice::swap_with_slice_u8_30 ... bench: 281 ns/iter (+/- 5)
test slice::swap_with_slice_u8_3000 ... bench: 5,324 ns/iter (+/- 40)
test slice::swap_with_slice_usize_30 ... bench: 275 ns/iter (+/- 5)
test slice::swap_with_slice_usize_3000 ... bench: 28,277 ns/iter (+/- 277)
```

</details>
- Commit 2da2737: Rollup merge of rust-lang#94242 - compiler-errors:fat-uninhabitable-pointer, r=michaelwoerister

Properly handle fat pointers to uninhabitable types. Calculate the pointee metadata size by using `tcx.struct_tail_erasing_lifetimes` instead of duplicating the logic in `fat_pointer_kind`. Open to alternative suggestions on how to fix this. Fixes rust-lang#94149. r? ``@michaelwoerister`` since you touched this code last, I think!
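A minimal sketch of the shape of the problem (an illustration under my own assumptions, not the exact reproduction from rust-lang#94149): a pointer to a slice of an uninhabited type is still a fat pointer carrying length metadata, so layout and debuginfo code must compute the pointee's unsized tail even though the element type has no values:

```rust
// `Void` has no values, but `&[Void]` is still a perfectly valid fat pointer
// (data pointer + length); only the empty slice can ever exist.
enum Void {}

fn main() {
    let empty: &[Void] = &[];
    println!("len = {}", empty.len());
    println!("size of &[Void] = {} bytes", std::mem::size_of::<&[Void]>());
}
```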
- Commit c20ed90: Rollup merge of rust-lang#94308 - tmiasko:normalize-main-ret-ty, r=oli-obk

Normalize main return type during mono item collection & codegen. The issue can be observed with `-Zprint-mono-items=lazy` in:

```rust
#![feature(termination_trait_lib)]
fn main() -> impl std::process::Termination { }
```

```
BEFORE: MONO_ITEM fn std::rt::lang_start::<impl std::process::Termination> @@ t.93933fa2-cgu.2[External]
AFTER:  MONO_ITEM fn std::rt::lang_start::<()> @@ t.df56e625-cgu.1[External]
```
- Commit d4db2be: Rollup merge of rust-lang#94315 - lcnr:auto-trait-lint-update, r=oli-obk

Update auto trait lint for `PhantomData`. cc rust-lang#93367 (comment)
- Commit 25cc094: Rollup merge of rust-lang#94316 - nnethercote:improve-string-literal-unescaping, r=petrochenkov

Improve string literal unescaping. Some easy wins that affect a few popular crates. r? `@matklad`
- Commit 39d8195: Rollup merge of rust-lang#94327 - Mark-Simulacrum:avoid-macro-sp, r=petrochenkov

Avoid emitting full macro body into JSON errors. While investigating rust-lang#94322, it was noted that currently the JSON diagnostics for macro backtraces include the full def_site span -- the whole macro body. It seems like this shouldn't be necessary, so this PR adjusts the span to just be the "guessed head", typically the macro name. It doesn't look like we keep enough information to synthesize a nicer span here at this time. Atop rust-lang#92123, this reduces output for the src/test/ui/suggestions/missing-lifetime-specifier.rs test from 660 KB to 156 KB locally.