Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS destructors on the main thread are not always run #28129

Open
4 tasks
alexcrichton opened this issue Aug 31, 2015 · 19 comments
Open
4 tasks

TLS destructors on the main thread are not always run #28129

alexcrichton opened this issue Aug 31, 2015 · 19 comments
Labels
A-destructors Area: Destructors (`Drop`, …) A-thread-locals Area: Thread local storage (TLS) C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Copy link
Member

alexcrichton commented Aug 31, 2015

There are some platforms where TLS destructors are run when the main thread exits, and there are some platforms where this does not happen,~~ and there are some platforms where things just go crazy ~~. For example, testing this program:

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("wut");
    }
}

thread_local!(static FOO: Foo = Foo);

fn main() {
    FOO.with(|_| {});
}

This should print Foo dtor, but prints nothing on some targets:

  • musl (e.g. i686-unknown-linux-musl, x86_64-unknown-linux-musl)
  • android (e.g. arm-linux-androideabi)
  • Linux with glibc before 2.18
  • Maybe more?

Here's a more complicated testcase that also involves initializing a destructor while destructors are being run; ideally this will be added to the test suite at some point:

struct Bar;

impl Drop for Bar {
    fn drop(&mut self) {
        println!("Bar dtor");
    }
}

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Foo dtor");
        // We initialize another thread-local inside the dtor, which is an interesting corner case.
        thread_local!(static BAR: Bar = Bar);
        BAR.with(|_| {});
    }
}

thread_local!(static FOO: Foo = Foo);

fn main() {
    FOO.with(|_| {});
}

some older notes

@ranma42
Copy link
Contributor

ranma42 commented Sep 17, 2015

The memory corruption in MacOSX seems to be related to the TLS usage during Drop of a TLS value.
If the print! invocation is replaced with stdout().write, no crash occurs.
However, if stdout is used within main, the process dies with the error

thread '<main>' panicked at 'cannot access stdout during shutdown', ../src/libcore/option.rs:333

both if drop calls print! and if it calls write.

bors added a commit that referenced this issue Sep 23, 2015
This is part of some cleanup I did while investigating #28129.
This also ensures that `on_panic` is run even if the user has registered too many callbacks.
bors added a commit that referenced this issue Sep 26, 2015
This is mainly to avoid infinite recursion and make debugging more convenient in the anomalous case in which `on_panic` panics.
I encountered such issues while changing libstd to debug/fix part of #28129.

While writing this I was wondering about which functions belong to `panicking` and which to `unwind`.
I placed them in this way mostly because of convenience, but I would strongly appreciate guidance.
@ranma42
Copy link
Contributor

ranma42 commented Sep 29, 2015

pthread destructors seem to have the same behaviour on MacOS X as on Linux (i.e. they are not run from when the main thread terminates). Googling around seems to suggest that this is a known (expected?) fact. Specifically, TLS destructors are only run when pthread_exit is invoked, which for non-main threads happens implicitly.
The only authoritative source I was able to find is http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_exit.html

@alexcrichton
Copy link
Member Author

Yeah I think solving this in the case of pthreads will either require us documenting "dtors may not run" or perhaps adding our own atexit handler or something like that to handle this (e.g. some custom code on our end). Not entirely sure what that would look like though!

@ranma42
Copy link
Contributor

ranma42 commented Sep 29, 2015

In order to make the atexit handler practical we would need to keep explicitly track the active destructors (at least on the main thread), i.e. basically keep the list of DTORS as in the pthread fallback implementation. This might not be as bad as it looks; in fact it might allow us to reuse some code (currently the Linux fallback dtor and the normal Windows dtor are basically doing the same thing in a slightly different way).

Another option would be to start the main function in a new thread and immediately joining it in lang_start. This would be about as effective as hooking the destructors in the Rust atexit, because for Rust application either way would work just fine and for Rust libraries used by non-Rust applications TLS would not be destroyed in either case.

The C atexit function would suffer from the same limitations as the Rust atexit (need to track dtors), but it might help in non-Rust applications, assuming that go through the normal termination (exit, not _Exit nor quick_exit nor an abort of any kind) as expected by the C runtime.

Documenting "dtors may not run" looks like a reasonable compromise to me, especially if we can guarantee that this will at most happen for the main thread (and, ideally, only on non-Rust apps). I believe that this guarantee is especially important if threads are spawned and joined (instead of managed as a thread pool) to ensure that leaks are bounded.

@alexcrichton
Copy link
Member Author

we would need to keep explicitly track the active destructors

Yeah this probably isn't so bad as we already do it in some cases, so it'd just be a matter of shuffling things around.

Another option would be to start the main function in a new thread and immediately joining it in lang_start

Unfortunately I think this won't work because there's a number of GUI frameworks (or something like that) which only work on the main thread (I think on Windows in particular), so running off-the-main-thread by default may be a bit of a heavy hammer to fix this!

Documenting "dtors may not run" looks like a reasonable compromise to me

I agree it's probably not that bad, but if we could get OSX and Windows working reliably, it's more pressure for us to get pthreads working reliably :). I feel like Windows is pretty easy to fix, it seems like some small error is just being missed there. OSX also feels the same way to me in terms of weird things happening. The pthreads case also isn't super critical on Linux because it's only used on older linuxes.

Overall I think we may still have enough rope left to climb out and close this issue, so I wouldn't be quite willing just yet to close it out by documenting things may not run.

@retep998
Copy link
Member

retep998 commented Mar 1, 2017

there's a number of GUI frameworks (or something like that) which only work on the main thread (I think on Windows in particular)

Windows itself does not actually care which thread you do the UI on, as long as you're consistent. Once you create a window on a given thread, that thread now has a message queue which you're obligated to pump forever.

@Mark-Simulacrum Mark-Simulacrum added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed A-io labels Jun 25, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 22, 2017
@alexcrichton alexcrichton added the A-thread-locals Area: Thread local storage (TLS) label Aug 25, 2017
@mtak-
Copy link
Contributor

mtak- commented Jan 12, 2019

Just confirmed the memory corruption in the OP on OSX occurs for the same reason listed here #57534 (comment). It is not specific to the main thread. AFAICT it is UB on macos to register destructors during thread cleanup via _tlv_atexit,

Centril added a commit to Centril/rust that referenced this issue Jan 17, 2019
…ichton

OSX: fix rust-lang#57534 registering thread dtors while running thread dtors

r? @alexcrichton

- "fast" `thread_local` destructors get run even on the main thread
- "fast" `thread_local` dtors, can initialize other `thread_local`'s

One corner case where this fix doesn't work, is when a C++ `thread_local` triggers the initialization of a rust `thread_local`.

I did not add any std::thread specific flag to indicate that the thread is currently exiting, which would be checked before registering a new dtor (I didn't really know where to stick that). I think this does the trick tho!

Let me know if anything needs tweaking/fixing/etc.

resolves this for macos: rust-lang#28129
fixes: rust-lang#57534
Centril added a commit to Centril/rust that referenced this issue Jan 17, 2019
…ichton

OSX: fix rust-lang#57534 registering thread dtors while running thread dtors

r? @alexcrichton

- "fast" `thread_local` destructors get run even on the main thread
- "fast" `thread_local` dtors, can initialize other `thread_local`'s

One corner case where this fix doesn't work, is when a C++ `thread_local` triggers the initialization of a rust `thread_local`.

I did not add any std::thread specific flag to indicate that the thread is currently exiting, which would be checked before registering a new dtor (I didn't really know where to stick that). I think this does the trick tho!

Let me know if anything needs tweaking/fixing/etc.

resolves this for macos: rust-lang#28129
fixes: rust-lang#57534
bors added a commit that referenced this issue Jan 20, 2019
OSX: fix #57534 registering thread dtors while running thread dtors

r? @alexcrichton

- "fast" `thread_local` destructors get run even on the main thread
- "fast" `thread_local` dtors, can initialize other `thread_local`'s

One corner case where this fix doesn't work, is when a C++ `thread_local` triggers the initialization of a rust `thread_local`.

I did not add any std::thread specific flag to indicate that the thread is currently exiting, which would be checked before registering a new dtor (I didn't really know where to stick that). I think this does the trick tho!

Let me know if anything needs tweaking/fixing/etc.

resolves this for macos: #28129
fixes: #57534
@jonas-schievink jonas-schievink added the A-destructors Area: Destructors (`Drop`, …) label Apr 27, 2020
@RalfJung
Copy link
Member

RalfJung commented Apr 29, 2020

So from the 4 items listed in the OP, the first two are fine, and the third was a separate bug that got fixed. EDIT: oh, the 2nd one is also not fine, I see.

What is the situation on Windows nowadays? (Cc @retep998)

erickt added a commit to erickt/notify that referenced this issue Jun 10, 2021
We had some test failures because crossbeam-channel may panic when trying
to call recv() during thread shutdown. This seems to be similar to this
upstream bug: crossbeam-rs/crossbeam#321.
Unfortunately it seems that some operating systems may tear down thread-local
storage early, rust-lang/rust#28129, which can
trigger panics if trying to interact with TLS during a drop.

To avoid this issue, this switches from using a channel to signal the thread
shutdown to just using the join handle (which we should have been doing
anyway).
erickt added a commit to erickt/notify that referenced this issue Jun 10, 2021
We had some test failures because crossbeam-channel may panic when trying
to call recv() during thread shutdown. This seems to be similar to this
upstream bug: crossbeam-rs/crossbeam#321.
Unfortunately it seems that some operating systems may tear down thread-local
storage early, rust-lang/rust#28129, which can
trigger panics if trying to interact with TLS during a drop.

To avoid this issue, this switches from using a channel to signal the thread
shutdown to just using the join handle (which we should have been doing
anyway).
0xpr03 pushed a commit to notify-rs/notify that referenced this issue Jun 28, 2021
We had some test failures because crossbeam-channel may panic when trying
to call recv() during thread shutdown. This seems to be similar to this
upstream bug: crossbeam-rs/crossbeam#321.
Unfortunately it seems that some operating systems may tear down thread-local
storage early, rust-lang/rust#28129, which can
trigger panics if trying to interact with TLS during a drop.

To avoid this issue, this switches from using a channel to signal the thread
shutdown to just using the join handle (which we should have been doing
anyway).
papazof added a commit to cloudkernels/ttrpc-rust that referenced this issue Nov 24, 2022
Calling Rust from a GCC destructor can result in panics when using pthreads (rust-lang/rust#28129)

Move client request code to a separate thread to work around this issue

Signed-off-by: Kostis Papazafeiropoulos <papazof@gmail.com>
ananos pushed a commit to nubificus/ttrpc-rust that referenced this issue Jun 16, 2023
Calling Rust from a GCC destructor can result in panics when using pthreads (rust-lang/rust#28129)

Move client request code to a separate thread to work around this issue

Signed-off-by: Kostis Papazafeiropoulos <papazof@gmail.com>
papazof added a commit to nubificus/ttrpc-rust that referenced this issue Nov 10, 2023
Calling Rust from a GCC destructor can result in panics when using pthreads (rust-lang/rust#28129)

Move client request code to a separate thread to work around this issue

Signed-off-by: Kostis Papazafeiropoulos <papazof@gmail.com>
@workingjubilee
Copy link
Member

Looks like the only failing one left then is "Linux pthread destructors - appear to not work" -- and where do we even use that? On playground, it also prints wut.

It seems we do not, but rather, we currently effectively require libc support to include thread-local support.

@workingjubilee
Copy link
Member

Wait.

Android and OpenBSD.

cc @quininer Sorry to bother, but you made this PR to add "emulated TLS" support: #117873

Can you confirm if this is still an issue on the appropriate targets?

@quininer
Copy link
Contributor

@workingjubilee I confirmed that emutls on android will print "wut".

@workingjubilee
Copy link
Member

Thank you! It seems this is resolved then? Hopefully?

@RalfJung RalfJung added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Jun 22, 2024
@RalfJung
Copy link
Member

RalfJung commented Jun 22, 2024

I am a bit confused by the macos sitaution: I don't see how the simple program above could be affected by #57655. Very strange. 🤷

Do we have a test checking that main thread destructors run? Reopening as needs-test.

@RalfJung
Copy link
Member

PR with a test is up: #126829

(I couldn't find an existing test.)

bors added a commit to rust-lang-ci/rust that referenced this issue Jun 22, 2024
…ubilee

add test for main thread thread-local destructors

Fixes rust-lang#28129
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 23, 2024
…ubilee

add test for main thread thread-local destructors

Fixes rust-lang#28129
@RalfJung
Copy link
Member

Turns out musl is still broken, I opened a new issue for that: #126858.

bors added a commit to rust-lang-ci/rust that referenced this issue Jun 23, 2024
…ubilee

add test for main thread thread-local destructors

Fixes rust-lang#28129
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 24, 2024
…ubilee

add test for main thread thread-local destructors

Fixes rust-lang#28129
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 24, 2024
…ubilee

add test for main thread thread-local destructors

Fixes rust-lang#28129
@RalfJung
Copy link
Member

RalfJung commented Jul 6, 2024

I gave up on the PR since even some glibc targets are effected (including the one we test on CI), so now we have two issues tracking the same thing -- this and #126858.

EDIT: I have closed the other issue.

@RalfJung RalfJung changed the title TLS destructors on the main thread are a little sketchy TLS destructors on the main thread are not always run Jul 6, 2024
papazof added a commit to nubificus/ttrpc-rust that referenced this issue Sep 30, 2024
Calling Rust from a GCC destructor can result in panics when using pthreads (rust-lang/rust#28129)

Move client request code to a separate thread to work around this issue

Signed-off-by: Kostis Papazafeiropoulos <papazof@gmail.com>
@surban
Copy link
Contributor

surban commented Dec 3, 2024

Testing on FreeBSD 14 shows that TLS destructors are not run for the main thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-destructors Area: Destructors (`Drop`, …) A-thread-locals Area: Thread local storage (TLS) C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.