Make backtrace buffer handling more systematic #33277
SGTM
The linux32 test failure looked like it could be real, but the test logs leave no clue which actual line failed or why (https://build.julialang.org/#/builders/13/builds/5336/steps/2/logs/stdio). I've updated the tests to be more concrete and hopefully provide a bit more of a clue about what is wrong.
Welp, the linux32 test passed this time around, and it would have been surprising if the original failure were related to this change. The win64 packaging error is a problem with code signing (tests passed). Should I go ahead and merge this? I'd like to use the module tracking as a base for trying out better formatting heuristics for backtraces, e.g. #33065 (comment). (Also as a base for trying out some changes in #33065 (comment).)
Ah, unfortunate clash with the nice work in #33190. That's a lot of conflicts.
Yeah, mostly trivial, but we both seem to have added an argument to one of the functions.
Part of this has been extracted into #33380. I'll rebase this once that one is merged.
Rebased. I think this is basically good to go. Should be almost entirely internal cleanup / refactoring, with the slight exception of adding a field to
Bump. Does anyone want a closer look at this before merging? @maleadt perhaps you might be interested as this seems relevant to your memory profiler work. Musing over the design here a little, another way to look at this is that it evolves backtrace buffers toward a special purpose serialization format which is:
I still think it's worth merging more or less as it is, but I do wonder where this leads in the longer term (how many serialization formats do we want, really?)
LGTM.
For allocation tracking, I guess it would be best to add another extended entry frame containing the allocation info (as opposed to making the top frame an allocation one and putting the instruction pointer in the header field like InterpreterIP does, since allocations can come from the interpreter too and an extended entry can't have multiple tags)?
Thanks for having a look! I'll rebase to fix the conflicts again (looks like a collision with #33524) and attempt a little rewording for clarification.
Yes, I think that would probably be good. If I understand correctly, the allocation profiling just dumps the records in the buffer and doesn't need to do anything with them until the end of the profiling run. So serializing the allocation info interleaved/associated with the backtrace data like this should be fairly natural and convenient, I guess.
Increase expressibility of what can be stored in backtrace buffers, while ensuring that the GC can find roots without knowing about the detail.

To do this, introduce a new "extended backtrace entry" format which carries along the number of roots and other data in a bitpacked format. This allows the backtrace buffer to be traversed and the roots collected in a general way, without the GC knowing about interpreter frames. Use this to add the module to InterpreterIP so that the module of interpreted top level thunks can be known.

In the future the extended entry format should allow us to be a lot more flexible with what can be stored in a backtrace. For example, we could

* Compress the backtrace cycles of runaway recursive functions so that stack overflows are much more likely to fit in the fixed-size bt_data array.
* Integrate external or other types of interpreter frames into the backtrace machinery.
Ok I've tried a little renaming and rewording as suggested; hopefully that will be clearer now. I'll go ahead and merge once CI has run.
Increase expressibility of what can be stored in backtrace buffers, while ensuring that the GC can find roots without knowing about the detail.

To do this, introduce a new "extended backtrace entry" format which carries along the number of roots and other data in a bitpacked format. This allows the backtrace buffer to be traversed and the roots collected in a general way, without the GC knowing about interpreter frames. Use this to add the module to InterpreterIP so that the module of interpreted top level thunks can be known (this is infrastructure to help with #33065).

In the future the extended entry format should allow us to be a lot more flexible with what can be stored in a backtrace. For example, we could

* Compress the backtrace cycles of runaway recursive functions so that stack overflows are much more likely to fit in the fixed-size bt_data array.
* Integrate external or other types of interpreter frames into the backtrace machinery.
Implementation details

[Edit] To quote the code comments regarding the buffer format:

A backtrace buffer conceptually contains a stack of instruction pointers ordered from the inner-most frame to the outermost. We store them in a special raw format for two reasons:

`throw()` must populate the trace so it must be as efficient as possible.

The raw buffer layout contains "frame entries" composed of one or several `jl_bt_element_t` values. From the point of view of the GC, an entry is either:

An extended entry `e` is made up of several `jl_bt_element_t` values:

* `e[0] JL_BT_NON_PTR_ENTRY` - Special marker to distinguish extended entries
* `e[1] tags` - A bit packed uintptr_t containing a tag and the number of GC-managed and non-managed values
* `e[2+j]` - GC managed data
* `e[2+ngc+i]` - Non-GC-managed data

The format of `tags` is, from LSB to MSB:

* `0:2 ngc` - Number of GC-managed pointers for this frame entry
* `3:5 nptr` - Number of non-GC-managed buffer elements
* `6:9 tag` - Entry type
* `10:... header` - Entry-specific header data
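
To make the layout quoted above a bit more concrete, here is a minimal C sketch of packing and unpacking the `tags` word and of walking a buffer to count GC roots without looking at the entry type. It is illustrative only and is not the code from this PR: `bt_element_t`, `BT_NON_PTR_ENTRY` (including its value) and all of the `bt_*` helpers are hypothetical stand-ins for whatever the runtime actually defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for jl_bt_element_t: one buffer slot. */
typedef union {
    uintptr_t uintptr; /* raw data, or a native instruction pointer */
    void *ptr;         /* GC-managed pointer (inside an extended entry) */
} bt_element_t;

/* Assumed marker value; the real JL_BT_NON_PTR_ENTRY may differ. */
#define BT_NON_PTR_ENTRY ((uintptr_t)-1)

/* Pack the tags word: bits 0:2 ngc, 3:5 nptr, 6:9 tag, 10:... header. */
static inline uintptr_t bt_pack_tags(unsigned ngc, unsigned nptr,
                                     unsigned tag, uintptr_t header)
{
    assert(ngc <= 7 && nptr <= 7 && tag <= 15);
    return (uintptr_t)ngc | ((uintptr_t)nptr << 3) |
           ((uintptr_t)tag << 6) | (header << 10);
}

static inline unsigned bt_ngc(uintptr_t tags)     { return tags & 0x7; }
static inline unsigned bt_nptr(uintptr_t tags)    { return (tags >> 3) & 0x7; }
static inline unsigned bt_tag(uintptr_t tags)     { return (tags >> 6) & 0xf; }
static inline uintptr_t bt_header(uintptr_t tags) { return tags >> 10; }

/* Number of buffer slots used by the entry starting at e.
   For an extended entry the caller must ensure e[1] is readable. */
static inline size_t bt_entry_size(const bt_element_t *e)
{
    if (e[0].uintptr != BT_NON_PTR_ENTRY)
        return 1; /* plain native instruction pointer */
    return 2 + bt_ngc(e[1].uintptr) + bt_nptr(e[1].uintptr);
}

/* Walk the buffer and count GC-managed slots. Only ngc and nptr are
   consulted, so the walker never needs to understand the entry type. */
static inline size_t bt_count_roots(const bt_element_t *buf, size_t len)
{
    size_t nroots = 0;
    size_t i = 0;
    while (i < len) {
        const bt_element_t *e = buf + i;
        size_t size = 1;
        if (e[0].uintptr == BT_NON_PTR_ENTRY) {
            if (i + 1 >= len)
                break; /* truncated: no tags word */
            size = bt_entry_size(e);
            if (i + size > len)
                break; /* truncated: payload runs past the buffer */
            nroots += bt_ngc(e[1].uintptr);
        }
        i += size;
    }
    return nroots;
}
```

A real GC hook would visit `e[2+j]` for `j < ngc` instead of just counting, but the traversal logic would have the same shape: skip one slot for a native frame, or `2 + ngc + nptr` slots for an extended entry.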