Rollup of 9 pull requests #94331

Closed, 25 commits

Commits on Feb 20, 2022

  1. c00f635
  2. 43dbd83

Commits on Feb 21, 2022

  1. Stop manually SIMDing in swap_nonoverlapping

    Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize the swap, since LLVM knows better which chunk size to use than the "32 bytes always" approach we currently have.
    
    It does still need logic to type-erase where appropriate, though: while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.
    
    As a bonus, one no longer gets the spurious `memcpy`(s?) at the end when swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
    scottmcm committed Feb 21, 2022
    8ca47d7

Commits on Feb 22, 2022

  1. da896d3
  2. fbe1c15
  3. 3d23477

Commits on Feb 23, 2022

  1. Continue improvements on the --check-cfg implementation

    - Test the combinations of `--check-cfg` with partial `values()` and `--cfg`
    - Test that we detect an unexpected value when none is expected
    Urgau committed Feb 23, 2022
    8d3de56
  2. a556a2a
  3. c73a2f8
  4. f047af2

Commits on Feb 24, 2022

  1. Improve scan_escape.

    `scan_escape` currently has a fast path (for when the first char isn't
    '\\') and a slow path.
    
    This commit changes `scan_escape` so it only handles the slow path, i.e.
    the actual escaping code. The fast path is inlined into the two call
    sites.
    
    This change makes the code faster, because there is no function call
    overhead on the fast path. (`scan_escape` is a big function and doesn't
    get inlined.)
    
    This change also improves readability, because it removes a bunch of
    mode checks on the fast paths.
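    
    The fast-path/slow-path split described above can be sketched roughly as follows. This is a minimal illustration, not the actual `rustc_lexer` code: `scan_escape_slow` and `unescape` are hypothetical names, and only a handful of escapes are handled.
    
    ```rust
    use std::str::Chars;
    
    /// Hypothetical stand-in for the slow path: decodes the character that
    /// follows a backslash. It is only called once a backslash has been seen.
    pub fn scan_escape_slow(chars: &mut Chars<'_>) -> Result<char, &'static str> {
        match chars.next() {
            Some('n') => Ok('\n'),
            Some('t') => Ok('\t'),
            Some('\\') => Ok('\\'),
            Some('"') => Ok('"'),
            Some(_) => Err("unknown escape"),
            None => Err("lone backslash"),
        }
    }
    
    /// Call site with the fast path inlined: ordinary characters are pushed
    /// directly and never pay for a call into the larger escape function.
    pub fn unescape(src: &str) -> Result<String, &'static str> {
        let mut out = String::with_capacity(src.len());
        let mut chars = src.chars();
        while let Some(c) = chars.next() {
            let decoded = match c {
                '\\' => scan_escape_slow(&mut chars)?, // slow path
                _ => c,                                // fast path, inlined here
            };
            out.push(decoded);
        }
        Ok(out)
    }
    ```
    
    The design point is that the check for `'\\'` lives at the call site, so the common case costs one comparison rather than a function call.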
    nnethercote committed Feb 24, 2022
    37d9ea7
  2. Inline a hot closure in from_lit_token.

    The change looks big because `rustfmt` rearranges things, but the only
    real change is the inlining annotation.
    nnethercote committed Feb 24, 2022
    44308dc
  3. update auto trait lint

    lcnr committed Feb 24, 2022
    70018c1
  4. 34319ff
  5. 8ba7436
  6. ee98dc8
  7. Rollup merge of rust-lang#93714 - compiler-errors:can-type-impl-copy-error-span, r=jackh726
    
    better ObligationCause for normalization errors in `can_type_implement_copy`
    
    Some logic is needed so we can point to the field when given completely nonsensical types like `struct Foo(<u32 as Iterator>::Item);`
    
    Fixes rust-lang#93687
    Dylan-DPC authored Feb 24, 2022
    44de8c0
  8. Rollup merge of rust-lang#93845 - compiler-errors:in-band-lifetimes, r=cjgillot
    
    Remove in band lifetimes
    
    As discussed in the t-lang backlog bonanza, the `in_band_lifetimes` FCP closed in favor of not stabilizing the feature. This PR removes `#![feature(in_band_lifetimes)]` in its entirety.
    
    Let me know if this PR is too hasty, and if we should instead do something intermediate to deprecate the feature first.
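    
    For readers unfamiliar with the removed feature, a small sketch of the difference may help. The function name `first_word` is hypothetical; the commented-out form shows the style the feature used to allow on nightly, and the compiled form is what stable Rust has always required.
    
    ```rust
    // With `#![feature(in_band_lifetimes)]`, a lifetime could be used
    // without being declared first:
    //
    //     fn first_word(s: &'a str) -> &'a str { ... }
    //
    // Without the feature, `'a` must appear in the generic parameter list.
    pub fn first_word<'a>(s: &'a str) -> &'a str {
        s.split_whitespace().next().unwrap_or("")
    }
    ```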
    
    r? ``@scottmcm`` (or feel free to reassign, just saw your last comment on rust-lang#44524)
    Closes rust-lang#44524
    Dylan-DPC authored Feb 24, 2022
    b33098f
  9. Rollup merge of rust-lang#94175 - Urgau:check-cfg-improvements, r=petrochenkov
    
    Improve `--check-cfg` implementation
    
    This pull-request is a mix of improvements regarding the `--check-cfg` implementation:
    
    - Simpler internal representation (usage of `Option` instead of separate bool)
    - Add --check-cfg to the unstable book (based on the RFC)
    - Improved diagnostics:
        * List possible values when the value is unexpected
        * Suggest if possible a name or value that is similar
    - Add more tests (well known names, mix of combinations, ...)
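    
    As a rough illustration of the kind of mistake these diagnostics target, consider the sketch below. The invocation in the comment uses the unstable `values(...)` form from the RFC as it stood at the time and should be read as an assumption; the function names and feature values are hypothetical.
    
    ```rust
    // Assumed invocation (unstable, RFC-era syntax):
    //   rustc -Z unstable-options --check-cfg 'values(feature, "serde", "std")' lib.rs
    
    /// Hypothetical function: reports which optional backend was compiled in.
    pub fn backend() -> &'static str {
        if cfg!(feature = "serde") {
            "serde"
        } else {
            "builtin"
        }
    }
    
    // Typo in the value: "sedre" is not among the expected values, so this is
    // the situation in which the lint is meant to list the expected values
    // and suggest a similar one.
    #[cfg(feature = "sedre")]
    pub fn to_json() {}
    ```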
    
    r? `@petrochenkov`
    Dylan-DPC authored Feb 24, 2022
    92b5d1d
  10. Rollup merge of rust-lang#94212 - scottmcm:swapper, r=dtolnay

    Stop manually SIMDing in `swap_nonoverlapping`
    
    Like I previously did for `reverse` (rust-lang#90821), this leaves it to LLVM to pick how to vectorize the swap, since LLVM knows better which chunk size to use than the "32 bytes always" approach we currently have.
    
    A variety of codegen tests are included to confirm that the various cases are still being vectorized.
    
    It does still need logic to type-erase in some cases, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.
    
    As a bonus, this change also means one no longer gets the spurious `memcpy`(s?) at the end when swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
    
    <details>
    
    <summary>ASM for this example</summary>
    
    ## Before (from godbolt)
    
    note the `push`/`pop`s and `memcpy`
    
    ```x86
    swap_m256_slice:
            push    r15
            push    r14
            push    r13
            push    r12
            push    rbx
            sub     rsp, 32
            cmp     rsi, rcx
            jne     .LBB0_6
            mov     r14, rsi
            shl     r14, 5
            je      .LBB0_6
            mov     r15, rdx
            mov     rbx, rdi
            xor     eax, eax
    .LBB0_3:
            mov     rcx, rax
            vmovaps ymm0, ymmword ptr [rbx + rax]
            vmovaps ymm1, ymmword ptr [r15 + rax]
            vmovaps ymmword ptr [rbx + rax], ymm1
            vmovaps ymmword ptr [r15 + rax], ymm0
            add     rax, 32
            add     rcx, 64
            cmp     rcx, r14
            jbe     .LBB0_3
            sub     r14, rax
            jbe     .LBB0_6
            add     rbx, rax
            add     r15, rax
            mov     r12, rsp
            mov     r13, qword ptr [rip + memcpy@GOTPCREL]
            mov     rdi, r12
            mov     rsi, rbx
            mov     rdx, r14
            vzeroupper
            call    r13
            mov     rdi, rbx
            mov     rsi, r15
            mov     rdx, r14
            call    r13
            mov     rdi, r15
            mov     rsi, r12
            mov     rdx, r14
            call    r13
    .LBB0_6:
            add     rsp, 32
            pop     rbx
            pop     r12
            pop     r13
            pop     r14
            pop     r15
            vzeroupper
            ret
    ```
    
    ## After (from my machine)
    
    Note no `rsp` manipulation, sorry for different ASM syntax
    
    ```x86
    swap_m256_slice:
    	cmpq	%r9, %rdx
    	jne	.LBB1_6
    	testq	%rdx, %rdx
    	je	.LBB1_6
    	cmpq	$1, %rdx
    	jne	.LBB1_7
    	xorl	%r10d, %r10d
    	jmp	.LBB1_4
    .LBB1_7:
    	movq	%rdx, %r9
    	andq	$-2, %r9
    	movl	$32, %eax
    	xorl	%r10d, %r10d
    	.p2align	4, 0x90
    .LBB1_8:
    	vmovaps	-32(%rcx,%rax), %ymm0
    	vmovaps	-32(%r8,%rax), %ymm1
    	vmovaps	%ymm1, -32(%rcx,%rax)
    	vmovaps	%ymm0, -32(%r8,%rax)
    	vmovaps	(%rcx,%rax), %ymm0
    	vmovaps	(%r8,%rax), %ymm1
    	vmovaps	%ymm1, (%rcx,%rax)
    	vmovaps	%ymm0, (%r8,%rax)
    	addq	$2, %r10
    	addq	$64, %rax
    	cmpq	%r10, %r9
    	jne	.LBB1_8
    .LBB1_4:
    	testb	$1, %dl
    	je	.LBB1_6
    	shlq	$5, %r10
    	vmovaps	(%rcx,%r10), %ymm0
    	vmovaps	(%r8,%r10), %ymm1
    	vmovaps	%ymm1, (%rcx,%r10)
    	vmovaps	%ymm0, (%r8,%r10)
    .LBB1_6:
    	vzeroupper
    	retq
    ```
    
    </details>
    
    This does all its copying operations as either the original type or as `MaybeUninit`s, so as far as I know there should be no potential abstract machine issues with reading padding bytes as integers.
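    
    A minimal sketch of the idea, assuming a plain element-by-element loop: every copy is done as `MaybeUninit<T>`, so padding bytes are never read as integers, and the choice of vector width is left to the optimizer. The names `swap_simple` and `swap_slices` are hypothetical, and this is not the code in `core`, which additionally type-erases some element types.
    
    ```rust
    use std::mem::MaybeUninit;
    use std::ptr;
    
    /// Swaps `count` elements between two non-overlapping regions, one
    /// element at a time, with no manual chunking.
    ///
    /// # Safety
    /// Both pointers must be valid for reads and writes of `count` elements,
    /// properly aligned, and the two regions must not overlap.
    pub unsafe fn swap_simple<T>(x: *mut T, y: *mut T, count: usize) {
        let x = x.cast::<MaybeUninit<T>>();
        let y = y.cast::<MaybeUninit<T>>();
        for i in 0..count {
            // SAFETY: guaranteed by this function's safety contract.
            unsafe {
                let a = x.add(i);
                let b = y.add(i);
                let tmp: MaybeUninit<T> = ptr::read(a);
                ptr::copy_nonoverlapping(b, a, 1);
                ptr::write(b, tmp);
            }
        }
    }
    
    /// Safe wrapper: two `&mut` slices can never overlap.
    pub fn swap_slices<T>(a: &mut [T], b: &mut [T]) {
        assert_eq!(a.len(), b.len(), "slices must have equal length");
        // SAFETY: both slices are valid for `a.len()` elements and disjoint.
        unsafe { swap_simple(a.as_mut_ptr(), b.as_mut_ptr(), a.len()) }
    }
    ```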
    
    <details>
    
    <summary>Perf is essentially unchanged</summary>
    
    Though perhaps with more target features this would help more, if it could pick bigger chunks
    
    ## Before
    
    ```
    running 10 tests
    test slice::swap_with_slice_4x_usize_30                            ... bench:         894 ns/iter (+/- 11)
    test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,476 ns/iter (+/- 2,784)
    test slice::swap_with_slice_5x_usize_30                            ... bench:       1,257 ns/iter (+/- 7)
    test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,922 ns/iter (+/- 959)
    test slice::swap_with_slice_rgb_30                                 ... bench:         328 ns/iter (+/- 27)
    test slice::swap_with_slice_rgb_3000                               ... bench:      16,215 ns/iter (+/- 176)
    test slice::swap_with_slice_u8_30                                  ... bench:         312 ns/iter (+/- 9)
    test slice::swap_with_slice_u8_3000                                ... bench:       5,401 ns/iter (+/- 123)
    test slice::swap_with_slice_usize_30                               ... bench:         368 ns/iter (+/- 3)
    test slice::swap_with_slice_usize_3000                             ... bench:      28,472 ns/iter (+/- 3,913)
    ```
    
    ## After
    
    ```
    running 10 tests
    test slice::swap_with_slice_4x_usize_30                            ... bench:         868 ns/iter (+/- 36)
    test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,642 ns/iter (+/- 1,507)
    test slice::swap_with_slice_5x_usize_30                            ... bench:       1,194 ns/iter (+/- 11)
    test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,761 ns/iter (+/- 5,018)
    test slice::swap_with_slice_rgb_30                                 ... bench:         324 ns/iter (+/- 6)
    test slice::swap_with_slice_rgb_3000                               ... bench:      15,962 ns/iter (+/- 287)
    test slice::swap_with_slice_u8_30                                  ... bench:         281 ns/iter (+/- 5)
    test slice::swap_with_slice_u8_3000                                ... bench:       5,324 ns/iter (+/- 40)
    test slice::swap_with_slice_usize_30                               ... bench:         275 ns/iter (+/- 5)
    test slice::swap_with_slice_usize_3000                             ... bench:      28,277 ns/iter (+/- 277)
    ```
    
    </details>
    Dylan-DPC authored Feb 24, 2022
    87f826d
  11. Rollup merge of rust-lang#94242 - compiler-errors:fat-uninhabitable-pointer, r=michaelwoerister
    
    properly handle fat pointers to uninhabitable types
    
    Calculate the pointee metadata size by using `tcx.struct_tail_erasing_lifetimes` instead of duplicating the logic in `fat_pointer_kind`. Open to alternative suggestions on how to fix this.
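    
    At the language level, the role of the struct tail can be illustrated as below. This is only a sketch of observable behavior, not the compiler's internal logic; `Void`, `Wrapper`, and `pointer_words` are hypothetical names. Whether a pointer carries metadata depends on the unsized tail of the pointee, not on whether the pointee is inhabited.
    
    ```rust
    use std::mem::size_of;
    
    /// Uninhabited type: no value of `Void` can exist.
    pub enum Void {}
    
    /// Hypothetical wrapper whose last field (its "struct tail") may be unsized.
    pub struct Wrapper<T: ?Sized> {
        pub header: u8,
        pub tail: T,
    }
    
    /// Number of machine words occupied by a raw pointer to `T`:
    /// 1 for thin pointers, 2 when the pointer also carries metadata.
    pub fn pointer_words<T: ?Sized>() -> usize {
        size_of::<*const T>() / size_of::<usize>()
    }
    ```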
    
    Fixes rust-lang#94149
    
    r? ``@michaelwoerister`` since you touched this code last, I think!
    Dylan-DPC authored Feb 24, 2022
    2da2737
  12. Rollup merge of rust-lang#94308 - tmiasko:normalize-main-ret-ty, r=oli-obk
    
    Normalize main return type during mono item collection & codegen
    
    The issue can be observed with `-Zprint-mono-items=lazy` in:
    
    ```rust
    #![feature(termination_trait_lib)]
    fn main() -> impl std::process::Termination { }
    ```
    ```
    BEFORE: MONO_ITEM fn std::rt::lang_start::<impl std::process::Termination> @@ t.93933fa2-cgu.2[External]
    AFTER:  MONO_ITEM fn std::rt::lang_start::<()> @@ t.df56e625-cgu.1[External]
    ```
    Dylan-DPC authored Feb 24, 2022
    c20ed90
  13. Rollup merge of rust-lang#94315 - lcnr:auto-trait-lint-update, r=oli-obk

    update auto trait lint for `PhantomData`
    
    cc rust-lang#93367 (comment)
    Dylan-DPC authored Feb 24, 2022
    d4db2be
  14. Rollup merge of rust-lang#94316 - nnethercote:improve-string-literal-unescaping, r=petrochenkov
    
    Improve string literal unescaping
    
    Some easy wins that affect a few popular crates.
    
    r? `@matklad`
    Dylan-DPC authored Feb 24, 2022
    25cc094
  15. Rollup merge of rust-lang#94327 - Mark-Simulacrum:avoid-macro-sp, r=petrochenkov
    
    Avoid emitting full macro body into JSON errors
    
    While investigating rust-lang#94322, it was noted that currently the JSON diagnostics for macro backtraces include the full def_site span -- the whole macro body.
    
    It seems like this shouldn't be necessary, so this PR adjusts the span to just be the "guessed head", typically the macro name. It doesn't look like we keep enough information to synthesize a nicer span here at this time.
    
    Atop rust-lang#92123, this reduces output for the src/test/ui/suggestions/missing-lifetime-specifier.rs test from 660 KB to 156 KB locally.
    Dylan-DPC authored Feb 24, 2022
    39d8195