Revise breadcrumb_writer_t ownership #6766

sdmaclea · 2019-06-11T00:06:37Z

Make breadcrumb_writer_t own m_files
Manage breadcrumb_writer_t with a shared pointer
Prevent premature destruction by adding a shared pointer for the background thread

Fixes #4646 on master

lpereira

Were you able to reproduce this race condition? I'm curious to understand why this happened in the first place.

The patch LGTM with comments, though.

lpereira · 2019-06-11T00:41:48Z

src/corehost/cli/hostpolicy/breadcrumbs.cpp

 {
-    if (m_enabled)
+    auto instance = std::make_shared<breadcrumb_writer_t>(files);


While you're cleaning this up, would it be possible to get rid of begin_write() and end_write(), and use only the constructor/destructor of breadcrumb_writer_t? If that's possible, then make_shared() can be called in run_app_for_context(), and the branch to determine if end_write() has to be called can be junked. This would simplify this class a little bit.

Another thing that might be possible here is to get rid of breadcrumb_writer_t entirely and capture the breadcrumbs with a lambda passed directly to std::thread. I like the call to swap() to own the breadcrumbs map, though, and I don't know if there's a good way to do something similar without it.

Something like this could potentially simplify things: https://gist.github.com/lpereira/48403e8540524b7854743d68b6dd879d

sdmaclea · 2019-06-13T16:42:17Z

@lpereira

Were you able to reproduce this race condition? I'm curious to understand why this happened in the first place.

I was not able to reproduce the failure.

My working hypothesis is that the startup failure in coreclr is leaving the stack in a corrupted state as the process is exiting. Coreclr does lots of magic to handle exceptions so there could be a case which we missed.

I believe this is actually a symptom of a fatal failure in coreclr. The AV is masking the real failure.

My strategy is to move all the thread state off the main stack.

To allow for the join I needed a shared pointer, or a thread object.

I chose a shared pointer because I know exactly what state is stored within it. I can reason about what will happen if

its destructor is prematurely called
its destructor is not called
it is corrupted through a random stack overwrite (but destructor is not called)
it is corrupted through a random stack overwrite (then destructor is called)

In general I like your gist, but I have a much harder time reasoning about the layout of the main stack. I don't really know what assumptions the compiler is allowed to make. Or what state the thread object contains.

Also for backporting to the 2.2 branch, I wanted to keep the major functional changes in the breadcumb* files.

elinor-fung

For the existing code / original fix, it looks like the intention is to always finish writing out the breadcrumbs if exception handling is done is such a way that the destructor is called. Would that no longer be the case now?

Re: how the failure happend in the first place. Maybe silly question, but are we sure that the aspnet side is compiled in such a way that the destructor of breadcrumb_writer_t would have been called? I think that means compiling with \EHa, not just using __try/__except (not sure - the docs are a bit complicated and I have not really dealt with the exception handling model).

elinor-fung · 2019-06-13T19:54:14Z

src/corehost/cli/hostpolicy/breadcrumbs.h

-    ~breadcrumb_writer_t();
-
-    void begin_write();
+    breadcrumb_writer_t(std::unordered_set<pal::string_t> &files);


Can we make the constructor private now? It should always be created through begin_write now, right?

That was my first approach. Unfortunately, std::make_shared requires a public constructor. We could use new and assign a bare pointer to the shared_ptr.

sdmaclea · 2019-06-13T20:36:29Z

are we sure that the aspnet side is compiled in such a way that the destructor of breadcrumb_writer_t would have been called?

I don't really know. I doubt it. I know when I was working on adding tracing of unhandled exceptions for this same host, it was very difficult to get the exception handling right (when prototyping in corerun). I was only able to get it right when I did it inside coreclr with the pal TRY/EXCEPT macros.

I was trying to make sure breadcrumb writer didn't fail even if the host exception handling was wrong.

sdmaclea · 2019-06-13T20:42:02Z

the destructor is called. Would that no longer be the case now?

It would be the case if the smart_ptr destructor is called.
For the apparent case where the destructor is not called, it still would not be the case.
Edit - it would no longer be the case. If we cared, It could possibly be added back.

I am not particularly worried about the breadcrumbs finishing. These fatal exceptions should really only occur in development scenarios. I think of breadcrumbs as being important in a deployed scenario.

sdmaclea · 2019-06-13T21:55:04Z

@pakrym Do you know how the asp.net iis host is configured to catch exceptions? Or how can we find out?

pakrym · 2019-06-17T15:11:48Z

cc @jkotalik

jkotalik · 2019-06-17T15:21:21Z

https://github.com/aspnet/AspNetCore/blob/master/src/Servers/IIS/AspNetCoreModuleV2/InProcessRequestHandler/inprocessapplication.cpp#L427

sdmaclea · 2019-06-17T22:08:19Z

I think that means compiling with \EHa, not just using __try/__except (not sure - the docs are a bit complicated and I have not really dealt with the exception handling model).

My read of the docs says you are correct.

@steveharter's design for #4315 explicitly stated \EHa was required in the host

Looking at the IIS source the exception method seems to be set to \EHs. The destructors are not guaranteed to be called in this case and #4315 will have no affect.

This PR should fix the breadcrumb AV when there is a SEH (asynchronous exception) on the main thread.

Make breadcrumb_writer_t own m_files Manage breadcrumb_writer_t with a shared pointer Prevent premature destruction by adding a shared pointer for the background thread

sdmaclea added the area-Host label Jun 11, 2019

sdmaclea added this to the 3.0 milestone Jun 11, 2019

sdmaclea requested review from lpereira, vitek-karas and elinor-fung June 11, 2019 00:06

sdmaclea self-assigned this Jun 11, 2019

lpereira approved these changes Jun 11, 2019

View reviewed changes

sdmaclea force-pushed the AV branch from 7861cab to 99bf37a Compare June 13, 2019 16:49

elinor-fung reviewed Jun 13, 2019

View reviewed changes

Revise breadcrumb_writer_t ownership

dbbfad6

Make breadcrumb_writer_t own m_files Manage breadcrumb_writer_t with a shared pointer Prevent premature destruction by adding a shared pointer for the background thread

sdmaclea force-pushed the AV branch from 99bf37a to dbbfad6 Compare June 17, 2019 22:40

sdmaclea merged commit 300aa77 into dotnet:master Jun 18, 2019

sdmaclea deleted the AV branch June 18, 2019 16:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revise breadcrumb_writer_t ownership #6766

Revise breadcrumb_writer_t ownership #6766

sdmaclea commented Jun 11, 2019 •

edited

Loading

lpereira left a comment

lpereira Jun 11, 2019

lpereira Jun 11, 2019

sdmaclea commented Jun 13, 2019 •

edited

Loading

elinor-fung left a comment

elinor-fung Jun 13, 2019

sdmaclea Jun 13, 2019 •

edited

Loading

sdmaclea commented Jun 13, 2019

sdmaclea commented Jun 13, 2019 •

edited

Loading

sdmaclea commented Jun 13, 2019

pakrym commented Jun 17, 2019

jkotalik commented Jun 17, 2019

sdmaclea commented Jun 17, 2019 •

edited

Loading

Revise breadcrumb_writer_t ownership #6766

Revise breadcrumb_writer_t ownership #6766

Conversation

sdmaclea commented Jun 11, 2019 • edited Loading

lpereira left a comment

Choose a reason for hiding this comment

lpereira Jun 11, 2019

Choose a reason for hiding this comment

lpereira Jun 11, 2019

Choose a reason for hiding this comment

sdmaclea commented Jun 13, 2019 • edited Loading

elinor-fung left a comment

Choose a reason for hiding this comment

elinor-fung Jun 13, 2019

Choose a reason for hiding this comment

sdmaclea Jun 13, 2019 • edited Loading

Choose a reason for hiding this comment

sdmaclea commented Jun 13, 2019

sdmaclea commented Jun 13, 2019 • edited Loading

sdmaclea commented Jun 13, 2019

pakrym commented Jun 17, 2019

jkotalik commented Jun 17, 2019

sdmaclea commented Jun 17, 2019 • edited Loading

sdmaclea commented Jun 11, 2019 •

edited

Loading

sdmaclea commented Jun 13, 2019 •

edited

Loading

sdmaclea Jun 13, 2019 •

edited

Loading

sdmaclea commented Jun 13, 2019 •

edited

Loading

sdmaclea commented Jun 17, 2019 •

edited

Loading