Detailed debug-info (DWARF) support in new backends (initially x64). #2565

cfallin · 2021-01-09T10:54:22Z

This PR propagates "value labels" all the way from CLIF to DWARF
metadata on the emitted machine code. The key idea is as follows:

- Translate value-label metadata on the input into "value_label"
  pseudo-instructions when lowering into VCode. These
  pseudo-instructions take a register as input, denote a value label,
  and semantically are like a "move into value label" -- i.e., they
  update the current value (as seen by debugging tools) of the given
  local. These pseudo-instructions emit no machine code.
- Perform a dataflow analysis at the machine-code level, tracking
  value-labels that propagate into registers and into [SP+constant]
  stack storage. This is a forward dataflow fixpoint analysis where each
  storage location can contain a set of value labels, and each value
  label can reside in a set of storage locations. (Meet function is
  pairwise intersection by storage location.)

  This analysis traces value labels symbolically through loads and
  stores and reg-to-reg moves, so it will naturally handle spills and
  reloads without knowing anything special about them.
- When this analysis converges, we have, at each machine-code offset, a
  mapping from value labels to some number of storage locations; for
  each offset for each label, we choose the best location (prefer
  registers). Note that we can choose any location, as the symbolic
  dataflow analysis is sound and guarantees that the value at the
  value_label instruction propagates to all of the named locations.
- Then we can convert this mapping into a format that the DWARF
  generation code (wasmtime's debug crate) can use.
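
To make the analysis steps above more concrete, here is a minimal sketch of the per-point state, the meet function, and the "prefer registers" choice. It is not the code in this PR: `Loc`, `Label`, `State`, `meet`, `copy_loc`, `define_label`, and `best_loc` are all hypothetical names, and the rule that a `value_label` removes the label from its previous locations is an assumption made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical storage location: a machine register or an [SP+offset] slot.
/// Variant order matters: the derived `Ord` sorts registers before stack slots.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Loc {
    Reg(u8),
    Stack(i64),
}

/// Hypothetical value-label identifier.
type Label = u32;

/// Analysis state at one program point: the set of labels each location holds.
type State = BTreeMap<Loc, BTreeSet<Label>>;

/// Meet at a control-flow join: intersect the label sets location by location.
/// A location is kept only if both predecessors agree on at least one label.
fn meet(a: &State, b: &State) -> State {
    let mut out = State::new();
    for (loc, labels_a) in a {
        if let Some(labels_b) = b.get(loc) {
            let common: BTreeSet<Label> =
                labels_a.intersection(labels_b).copied().collect();
            if !common.is_empty() {
                out.insert(*loc, common);
            }
        }
    }
    out
}

/// Transfer for a reg-to-reg move, load, or store: `dst` now holds exactly
/// the labels that `src` held (possibly none).
fn copy_loc(state: &mut State, src: Loc, dst: Loc) {
    match state.get(&src).cloned() {
        Some(labels) => {
            state.insert(dst, labels);
        }
        None => {
            state.remove(&dst);
        }
    }
}

/// Transfer for a `value_label` pseudo-instruction (assumed semantics):
/// older copies of `label` become stale, and `loc` gains the label.
fn define_label(state: &mut State, label: Label, loc: Loc) {
    for labels in state.values_mut() {
        labels.remove(&label);
    }
    state.retain(|_, labels| !labels.is_empty());
    state.entry(loc).or_default().insert(label);
}

/// Choose one location for `label`, preferring registers over stack slots.
fn best_loc(state: &State, label: Label) -> Option<Loc> {
    state
        .iter()
        .filter(|(_, labels)| labels.contains(&label))
        .map(|(loc, _)| *loc)
        .min()
}
```

In this sketch a spill is just `copy_loc(&mut state, Loc::Reg(3), Loc::Stack(16))`: the label ends up in both locations, and if the register is later lost on one path into a join, `meet` leaves only the stack slot for `best_loc` to pick.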

This PR also adds the new-backend variant to the gdb tests on CI.

cfallin · 2021-01-09T10:57:56Z

@yurydelendik, I think I will need your help on this -- it appears that you wrote all of the debug support initially? Two questions:

Is there a reason that the gdb tests are currently marked #[ignore]? Are they known to be broken for other reasons?
If not, would you be able to help me debug this? I believe the analysis is generating correct, or mostly correct, mappings of labels to registers; but the debug crate in wasmtime is generating DWARF that uses the vmctx register outside of its valid range and I don't understand enough about DWARF or the code here to say why.

Any help from anyone who is passionate about debuginfo would be appreciated -- this is one of the last pieces needed to enable the switch to the new backend by default in wasmtime, and I've dumped a bunch of time into this and need to context-switch for a little bit :-)

yurydelendik · 2021-01-09T19:38:50Z

Is there a reason that the gdb tests are currently marked #[ignore]? Are they known to be broken for other reasons?

They are marked as "ignored" so that they do not run on developers' computers: the tests require lldb/gdb/llvm-dwarfdump to be installed. These tests are run on CI. See

wasmtime/.github/workflows/main.yml

Lines 287 to 293 in cacebfb

    # Test debug (DWARF) related functionality.
    - run: |
        sudo apt-get update && sudo apt-get install -y gdb
        cargo test test_debug_dwarf -- --ignored --test-threads 1
      if: matrix.os == 'ubuntu-latest'
      env:
        RUST_BACKTRACE: 1

If not, would you be able to help me debug this?

I'll try to be useful, yes.

but the debug crate in wasmtime is generating DWARF that uses the vmctx register outside of its valid range

Keeping the vmctx register/value valid for as long as possible is desired functionality, as is knowing it during stack unwinding to provide Instance information. We talked with @sunfishcode about having a fixed location for vmctx some time back.

If vmctx is not available, there is no way to get a variable value stored in the instance's memory and shadow stack. The variables are marked as optimized out in this case, which is an acceptable solution from the point of view of debugging optimized code. Of course, we have to minimize such cases to make developers' lives better.

cfallin · 2021-01-09T20:46:56Z

Keeping the vmctx register/value valid for as long as possible is desired functionality, as is knowing it during stack unwinding to provide Instance information. We talked with @sunfishcode about having a fixed location for vmctx some time back.

If vmctx is not available, there is no way to get a variable value stored in the instance's memory and shadow stack. The variables are marked as optimized out in this case, which is an acceptable solution from the point of view of debugging optimized code. Of course, we have to minimize such cases to make developers' lives better.

Makes sense; I'll think about ways to do this. Perhaps the simplest would be to just add a hack that emits a value_label pseudo-instruction for vmctx at return -- this would keep it alive and ensure it can be accessed at least from a spillslot if not a register.

A question: I was struggling to work out the invariants in the code that converts range-associated data. What do start and end denote for the ValueLabelsRanges? Are these open or closed bounds and do they apply to the start or end of instructions? I had noticed some code that skips any "empty" ranges (start == end) -- if a value is computable only after one instruction, which ends at offset X, should we encode that as start=X, end=X+1?
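
For readers following the range question, here is a small sketch of what a half-open reading would imply. The half-open convention `[start, end)` is an assumption here: DWARF location-list entries use an exclusive end address, but this thread does not establish what ValueLabelsRanges uses. `OffsetRange` and `range_after_def` are hypothetical names, not types from the codebase.

```rust
/// Hypothetical range over machine-code offsets, read as half-open:
/// `start` is included, `end` is the first offset past the range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct OffsetRange {
    start: u32,
    end: u32,
}

impl OffsetRange {
    /// Under the half-open reading, start == end covers no offsets at all.
    fn is_empty(&self) -> bool {
        self.start >= self.end
    }

    /// True if `offset` falls inside the range.
    fn contains(&self, offset: u32) -> bool {
        self.start <= offset && offset < self.end
    }
}

/// Range for a value that becomes available once its defining instruction
/// (ending at `def_end`) has executed and is lost at `lost_at`.
/// Returns None for an empty range, mirroring code that skips such ranges.
fn range_after_def(def_end: u32, lost_at: u32) -> Option<OffsetRange> {
    let range = OffsetRange { start: def_end, end: lost_at };
    if range.is_empty() {
        None
    } else {
        Some(range)
    }
}
```

Under that assumption, `range_after_def(x, x)` is dropped as empty, while `range_after_def(x, x + 1)` covers exactly offset `x`, which is the encoding the question proposes.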

There's something a little more broken going on in this PR right now, as the vmctx value is dead at a certain point, but DWARF info related to its former register is being generated. So I suspect that I'm misunderstanding some of the invariants related to ranges.

Thanks for the help!

cfallin · 2021-01-09T21:07:09Z

@yurydelendik with the vmctx-always-alive hack, this seems to pass the gdb DWARF test locally on the new backend now (and I'm watching CI); thanks!

Question above about ranges still stands; I want to make sure I got the conversions right.

Otherwise, though, I think this is ready for review.

bjorn3 · 2021-01-09T21:07:36Z

Perhaps the simplest would be to just add a hack that emits a value_label pseudo-instruction for vmctx at return -- this would keep it alive and ensure it can be accessed at least from a spillslot if not a register.

It would be nice to have a more general way of keeping a value alive somewhere for debuginfo purposes.

cfallin · 2021-01-09T21:12:54Z

Perhaps the simplest would be to just add a hack that emits a value_label pseudo-instruction for vmctx at return -- this would keep it alive and ensure it can be accessed at least from a spillslot if not a register.

It would be nice to have a more general way of keeping a value alive somewhere for debuginfo purposes.

Yes, I agree. I actually thought of a really simple/ugly (but maybe more robust) way of doing this -- allocate slots on the stack for all ValueLabels, and emit actual stores to these slots every time a labeled value is defined (or the label is associated with a new value). Then the DWARF generation can always use SP-offset expressions.

This would horribly pessimize code performance, though maybe not too much more than non-optimized debug code already is; but it would be much simpler and more likely to be correct in all sorts of edge cases. I see this as sort of in the same spirit as the idea proposed in #2459 for reftypes tracing: in both cases we instrument the CLIF to provide needed functionality so that the core compiler remains simpler and easier to verify. It is a little unconventional, though, so perhaps for another time.
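
As a rough illustration of the stack-slot idea, the sketch below instruments a toy instruction list: every label gets a fixed slot, and each `value_label` is followed by a real store to that slot. Nothing here is CLIF or VCode; `Inst`, `instrument`, and the slot layout (slots handed out in order of first appearance, `slot_size` bytes apart) are assumptions made up for the example.

```rust
use std::collections::BTreeMap;

type Label = u32;
type Value = u32;

/// Toy instruction set, invented for this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Inst {
    /// Some computation that defines a value.
    Def(Value),
    /// `value_label`: the label now refers to this value.
    ValueLabel(Label, Value),
    /// A real store of the value to [SP + sp_offset], added by `instrument`.
    StoreToSlot { value: Value, sp_offset: i64 },
}

/// Insert a store after every `ValueLabel`, and return the rewritten code
/// together with the label -> SP-offset table a DWARF writer could use.
fn instrument(insts: &[Inst], slot_size: i64) -> (Vec<Inst>, BTreeMap<Label, i64>) {
    let mut slots: BTreeMap<Label, i64> = BTreeMap::new();
    let mut out = Vec::new();
    for inst in insts {
        out.push(inst.clone());
        if let Inst::ValueLabel(label, value) = inst {
            // Allocate the next free slot the first time a label is seen.
            let next = slots.len() as i64 * slot_size;
            let sp_offset = *slots.entry(*label).or_insert(next);
            out.push(Inst::StoreToSlot { value: *value, sp_offset });
        }
    }
    (out, slots)
}
```

With this scheme each label's location is a single constant SP offset for the whole function, so no dataflow analysis is needed; the cost is one extra store per label update.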

cfallin · 2021-01-21T22:07:46Z

@fitzgen or @bnjbvr, would you be able to spare some time to review this? I don't want it to sit too long as it's starting to become stale (I'll rebase across the conflicts when handling any review comments). Thanks!

fitzgen · 2021-01-21T22:41:41Z

Sure, I can review.


cfallin · 2021-01-22T00:03:18Z

Thanks! I went over this again quickly and noticed a few typos, and decided to rebase while I was here. Also note there's a ridealong change to remove an outdated doc-comment about the backend pipeline (I can split that out if you'd like).

yurydelendik · 2021-01-22T14:15:55Z

crates/debug/src/transform/expression.rs

@@ -207,6 +225,31 @@ fn append_memory_deref(
                return Ok(false);
            }
        }
+        LabelValueLoc::Reg(r) => {
+            let reg = isa.map_regalloc_reg_to_dwarf(r)? as u8;
+            writer.write_u8(gimli::constants::DW_OP_breg0.0 + reg)?;


ExpressionWriter has write_op_breg
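
For context on what such a helper has to produce, here is a standalone sketch of the DW_OP_breg encoding: one opcode byte (DW_OP_breg0, 0x70, plus the register number for registers 0 through 31) followed by a signed LEB128 offset. This is not the wasmtime ExpressionWriter API, whose signature is not shown in this thread; `encode_breg` and `write_sleb128` are hypothetical names. The range check illustrates why adding a raw register number to the opcode byte deserves care.

```rust
/// Opcode value of DW_OP_breg0 in the DWARF specification.
const DW_OP_BREG0: u8 = 0x70;

/// Append `value` to `out` as a signed LEB128 number.
fn write_sleb128(out: &mut Vec<u8>, mut value: i64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7; // arithmetic shift, so the sign is preserved
        let sign_bit_set = byte & 0x40 != 0;
        let done = (value == 0 && !sign_bit_set) || (value == -1 && sign_bit_set);
        if done {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Encode "register `reg` plus `offset`" as a DW_OP_breg<n> operation.
/// Returns None for registers above 31, which need DW_OP_bregx instead.
fn encode_breg(reg: u16, offset: i64) -> Option<Vec<u8>> {
    if reg > 31 {
        return None;
    }
    let mut out = vec![DW_OP_BREG0 + reg as u8];
    write_sleb128(&mut out, offset);
    Some(out)
}
```

For example, register 6 with offset -8 encodes to the two bytes 0x76 0x78 under this sketch.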

fitzgen

Code looks good, and I really appreciated the top level overview comment but, as a reader, I would have appreciated some more function/method/struct level comments so I didn't have to go searching through the code to figure out how a struct member was used and things like that. A few inline comments below. r=me with them addressed and/or questions answered, etc.

Thanks!

cranelift/codegen/src/machinst/debug.rs

I had missed that the CI config didn't actually run the tests, because (I think) `matrix.target` is not set by default (?). All of our hosts are native x86-64, so we can just gate on OS (Ubuntu) instead. I also discovered that while I had been testing with the gdb tests locally, when *all* `debug::*` tests are run, there are two that do not pass on the new backend because of specific differences in compiled code. One is a value-lifetime issue (the value is "optimized out" at the point the breakpoint is set) and the other has to do with basic-block order (it is trying to match against hardcoded machine-code offsets which have changed).

cfallin · 2021-01-23T00:47:33Z

@fitzgen I addressed your comments, but while double-checking the CI results one last time, realized the debug tests weren't actually running on CI (GitHub Actions config issue). On fixing that, discovered that the tests I was running locally were a subset of all debug tests, and two of the others were failing. These failures were due to (IMHO) not-super-important differences in codegen, so I added directives to ignore; but let me know if this is still r+ or if you'd like me to dig deeper. Thanks!

fitzgen

👍

cfallin requested a review from yurydelendik January 9, 2021 10:54

cfallin force-pushed the debug-value-labels branch from 0a103d7 to f0c68e2 Compare January 9, 2021 10:58

cfallin force-pushed the debug-value-labels branch from f0c68e2 to 1572492 Compare January 9, 2021 21:02

cfallin changed the title ~~Draft: detailed debug-info (DWARF) support in new backends (initially x64).~~ Detailed debug-info (DWARF) support in new backends (initially x64). Jan 9, 2021

cfallin force-pushed the debug-value-labels branch from 1572492 to c1e0be1 Compare January 9, 2021 21:44

cfallin requested review from bnjbvr and fitzgen January 21, 2021 22:06

cfallin added 3 commits January 21, 2021 15:59

Skip value-label analysis if no value labels are present.

997fab5

Fix a few comment typos and add a clarifying comment.

7e12abc

cfallin force-pushed the debug-value-labels branch from 59eb476 to 7e12abc Compare January 22, 2021 00:01

cfallin removed request for bnjbvr and yurydelendik January 22, 2021 00:03

yurydelendik reviewed Jan 22, 2021

fitzgen approved these changes Jan 22, 2021

cfallin added 2 commits January 22, 2021 16:02

Address review comments.

f54d0d0

fitzgen approved these changes Jan 23, 2021

cfallin merged commit 95822a5 into bytecodealliance:main Jan 23, 2021

cfallin deleted the debug-value-labels branch January 23, 2021 01:22

cfallin mentioned this pull request Feb 23, 2021

newBE: value_labels_ranges is very slow #2681

Closed
