Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable has_thread_local for i686-msvc #123257

Merged
merged 1 commit into from
Apr 13, 2024
Merged

Conversation

ChrisDenton
Copy link
Member

A few years back, has_thread_local was disabled as a workaround for a compiler issue. While the exact cause was never tracked down, it was suspected to be caused by the compiler inlining a thread local access across a dylib boundary. This should be fixed now so let's try again.

@rustbot
Copy link
Collaborator

rustbot commented Mar 31, 2024

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot
Copy link
Collaborator

rustbot commented Mar 31, 2024

These commits modify compiler targets.
(See the Target Tier Policy.)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 31, 2024
@lcnr
Copy link
Contributor

lcnr commented Apr 3, 2024

r? compiler

@rustbot rustbot assigned fmease and unassigned lcnr Apr 3, 2024
@ChrisDenton
Copy link
Member Author

ChrisDenton commented Apr 7, 2024

Note that this is enabled for x86_64, aarch64 and thumbv7a msvc. And it's also set for i686-pc-windows-gnullvm (which uses the same native Windows tls). Leaving i686-msvc as something of an anomaly.

@fmease
Copy link
Member

fmease commented Apr 12, 2024

Sorry for the delay!

This should be fixed now

I suppose you're referring to #108089?

The changes look reasonable (I've read a bit through older issues and PRs) but I'm not an expert on this. However I will r+ this unless you know of someone who might have a clearer picture of this and could quickly look over this.

@bors rollup=never (may cause unclear regressions)

@ChrisDenton
Copy link
Member Author

ChrisDenton commented Apr 12, 2024

Yeah, sorry. We were never able to come up with a good test for failures which makes "yeah, I think this should be ok now" based on vibes the best we can hope for. I'm well aware this is not great and hopefully in the (touch wood) unlikely event it fails again we can at least get a good test case out of it.

@fmease
Copy link
Member

fmease commented Apr 12, 2024

@bors r+

@bors
Copy link
Contributor

bors commented Apr 12, 2024

📌 Commit 17176b8 has been approved by fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2024
@bors
Copy link
Contributor

bors commented Apr 13, 2024

⌛ Testing commit 17176b8 with merge f5aa452...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 13, 2024
Re-enable `has_thread_local` for i686-msvc

A few years back, `has_thread_local` was disabled as a workaround for a compiler issue. While the exact cause was never tracked down, it was suspected to be caused by the compiler inlining a thread local access across a dylib boundary. This should be fixed now so let's try again.
@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-stable failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
failures:

---- [codegen] tests/codegen/target-feature-inline-closure.rs stdout ----

error: verification with 'FileCheck' failed
status: exit status: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/ci-llvm/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/target-feature-inline-closure/target-feature-inline-closure.ll" "/checkout/tests/codegen/target-feature-inline-closure.rs" "--check-prefix=CHECK" "--check-prefix" "NONMSVC" "--allow-unused-prefixes" "--dump-input-context" "100"
--- stderr -------------------------------
/checkout/tests/codegen/target-feature-inline-closure.rs:27:16: error: CHECK-NOT: excluded string found in input
/checkout/tests/codegen/target-feature-inline-closure.rs:27:16: error: CHECK-NOT: excluded string found in input
 // CHECK-NOT: fadd
               ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/target-feature-inline-closure/target-feature-inline-closure.ll:35:69: note: found here
 call fastcc void @_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E(ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %_0, ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %1, ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %0)

Input file: /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/target-feature-inline-closure/target-feature-inline-closure.ll
Check file: /checkout/tests/codegen/target-feature-inline-closure.rs


-dump-input=help explains the following input dump.
Input was:
<<<<<<
<<<<<<
        1: ; ModuleID = 'target_feature_inline_closure.5b935a73a363356c-cgu.0' 
        2: source_filename = "target_feature_inline_closure.5b935a73a363356c-cgu.0" 
        3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" 
        4: target triple = "x86_64-unknown-linux-gnu" 
        5:  
        6: ; core::core_arch::x86::avx::_mm256_add_ps 
        7: ; Function Attrs: inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable 
        8: define internal fastcc void @_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E(ptr noalias nocapture noundef writeonly align 32 dereferenceable(32) %_0, ptr noalias nocapture noundef readonly align 32 dereferenceable(32) %a, ptr noalias nocapture noundef readonly align 32 dereferenceable(32) %b) unnamed_addr #0 { 
        9: start: 
       10:  %0 = load <8 x float>, ptr %a, align 32 
       11:  %1 = load <8 x float>, ptr %b, align 32 
       12:  %2 = fadd <8 x float> %0, %1 
       13:  store <8 x float> %2, ptr %_0, align 32 
       14:  ret void 
       15: } 
       16:  
       17: ; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable 
       18: define void @with_avx(ptr noalias nocapture noundef writeonly sret([32 x i8]) align 32 dereferenceable(32) %_0, ptr noalias nocapture noundef readonly align 32 dereferenceable(32) %x) unnamed_addr #1 { 
       19: start: 
       20:  %0 = load <8 x float>, ptr %x, align 32 
       21:  %1 = fadd <8 x float> %0, %0 
       22:  store <8 x float> %1, ptr %_0, align 32, !alias.scope !3, !noalias !6 
       23:  ret void 
       24: } 
       25:  
       26: ; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable 
       27: define void @without_avx(ptr noalias nocapture noundef writeonly sret([32 x i8]) align 32 dereferenceable(32) %_0, ptr noalias nocapture noundef readonly align 32 dereferenceable(32) %x) unnamed_addr #2 { 
       28: start: 
       29:  %0 = alloca <8 x float>, align 32 
       30:  %1 = alloca <8 x float>, align 32 
       31:  %2 = load <8 x float>, ptr %x, align 32 
       32:  store <8 x float> %2, ptr %1, align 32 
       33:  store <8 x float> %2, ptr %0, align 32 
       34: ; call core::core_arch::x86::avx::_mm256_add_ps 
       35:  call fastcc void @_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E(ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %_0, ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %1, ptr noalias nocapture noundef nonnull align 32 dereferenceable(32) %0) 
not:27                                                                         !~~~                                                                                                                                                                                                                                error: no match expected
       36:  ret void 
       37: } 
       38:  
       39: attributes #0 = { inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" "target-features"="+avx" } 
       40: attributes #1 = { mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" "target-features"="+avx" } 
       41: attributes #2 = { mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(argmem: readwrite) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" } 
       42:  
       43: !llvm.module.flags = !{!0, !1} 
       44: !llvm.ident = !{!2} 
       45:  
       46: !0 = !{i32 8, !"PIC Level", i32 2} 
       47: !1 = !{i32 2, !"RtLibUseGOT", i32 1} 
       48: !2 = !{!"rustc version 1.79.0 (f5aa45222 2024-04-13)"} 
       49: !3 = !{!4} 
       50: !4 = distinct !{!4, !5, !"_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E: %_0"} 
       51: !5 = distinct !{!5, !"_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E"} 
       52: !6 = !{!7, !8, !9, !11} 
       53: !7 = distinct !{!7, !5, !"_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E: %a"} 
       54: !8 = distinct !{!8, !5, !"_ZN4core9core_arch3x863avx13_mm256_add_ps17h6f0bbfadd3fa9ee8E: %b"} 
       55: !9 = distinct !{!9, !10, !"_ZN29target_feature_inline_closure8with_avx28_$u7b$$u7b$closure$u7d$$u7d$17hfa4b1978a7320069E: %x"} 
       56: !10 = distinct !{!10, !"_ZN29target_feature_inline_closure8with_avx28_$u7b$$u7b$closure$u7d$$u7d$17hfa4b1978a7320069E"} 
       57: !11 = distinct !{!11, !10, !"_ZN29target_feature_inline_closure8with_avx28_$u7b$$u7b$closure$u7d$$u7d$17hfa4b1978a7320069E: %y"} 
------------------------------------------



@bors
Copy link
Contributor

bors commented Apr 13, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 13, 2024
@ChrisDenton
Copy link
Member Author

Seems spurious, considering this PR doesn't affect Linux

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 13, 2024
@bors
Copy link
Contributor

bors commented Apr 13, 2024

⌛ Testing commit 17176b8 with merge f96442b...

@bors
Copy link
Contributor

bors commented Apr 13, 2024

☀️ Test successful - checks-actions
Approved by: fmease
Pushing f96442b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 13, 2024
@bors bors merged commit f96442b into rust-lang:master Apr 13, 2024
12 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 13, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f96442b): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.3% [0.3%, 0.4%] 6
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.3% [-0.3%, -0.3%] 1

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 675.861s -> 676.802s (0.14%)
Artifact size: 316.04 MiB -> 316.07 MiB (0.01%)

@rustbot rustbot added the perf-regression Performance regression. label Apr 13, 2024
@lqd
Copy link
Member

lqd commented Apr 13, 2024

These 2 benchmarks seem to be on tiny noise spikes as of late, this is obviously noise.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Apr 13, 2024
@ChrisDenton ChrisDenton deleted the enable-tls branch April 13, 2024 12:53
@workingjubilee workingjubilee added the O-x86_32 Target: x86 processors, 32 bit (like i686-*) label Jul 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. O-x86_32 Target: x86 processors, 32 bit (like i686-*) perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants