-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RemoveDIs] Update migration doc to explain how to handle textual IR updates #91725
Conversation
@llvm/pr-subscribers-debuginfo Author: Stephen Tozer (SLTozer) ChangesRelated review: #91724 This patch updates the RemoveDIs migration document to include details on the textual IR changes, including steps to update any downstream lit tests accordingly. These steps are the same as those used to update the lit tests in the LLVM/Clang lit tests, as detailed in the review linked above. Full diff: https://github.com/llvm/llvm-project/pull/91725.diff 1 Files Affected:
diff --git a/llvm/docs/RemoveDIsDebugInfo.md b/llvm/docs/RemoveDIsDebugInfo.md
index 9e50a2a604aa6..48a58aa7a5811 100644
--- a/llvm/docs/RemoveDIsDebugInfo.md
+++ b/llvm/docs/RemoveDIsDebugInfo.md
@@ -24,12 +24,115 @@ The debug records are not instructions, do not appear in the instruction list, a
# Great, what do I need to do!
-Approximately nothing -- we've already instrumented all of LLVM to handle these new records ("`DbgRecords`") and behave identically to past LLVM behaviour. We plan on turning this on by default some time soon, with IR converted to the intrinsic form of debug info at terminals (textual IR, bitcode) for a short while, before then changing the textual IR and bitcode formats.
+Very little -- we've already instrumented all of LLVM to handle these new records ("`DbgRecords`") and behave identically to past LLVM behaviour. This is currently being turned on by default, so that `DbgRecords` will be used by default in memory, IR, and bitcode.
+
+## API Changes
There are two significant changes to be aware of. Firstly, we're adding a single bit of debug relevant data to the `BasicBlock::iterator` class (it's so that we can determine whether ranges intend on including debug info at the beginning of a block or not). That means when writing passes that insert LLVM IR instructions, you need to identify positions with `BasicBlock::iterator` rather than just a bare `Instruction *`. Most of the time this means that after identifying where you intend on inserting something, you must also call `getIterator` on the instruction position -- however when inserting at the start of a block you _must_ use `getFirstInsertionPt`, `getFirstNonPHIIt` or `begin` and use that iterator to insert, rather than just fetching a pointer to the first instruction.
The second matter is that if you transfer sequences of instructions from one place to another manually, i.e. repeatedly using `moveBefore` where you might have used `splice`, then you should instead use the method `moveBeforePreserving`. `moveBeforePreserving` will transfer debug info records with the instruction they're attached to. This is something that happens automatically today -- if you use `moveBefore` on every element of an instruction sequence, then debug intrinsics will be moved in the normal course of your code, but we lose this behaviour with non-instruction debug info.
+## Textual IR Changes
+
+As we change from using debug intrinsics to debug records, any tools that depend on parsing IR produced by LLVM will need to handle the new format. For the most part, the difference between the printed form of a intrinsic and a record is trivial:
+
+1. An extra 2 spaces of indentation are added.
+2. The call `(tail|notail|musttail)? call void @llvm.dbg._` is replaced with `#dbg_`.
+3. The leading `metadata ` is removed from each argument to the intrinsic.
+4. The DILocation changes from being a debug attachment, `!dbg !<Num>`, to being the final intrinsic argument without a leading `!dbg`.
+
+Altogether, this gives you:
+
+```
+; Intrinsic:
+ call void @llvm.dbg.value(metadata i32 %add, metadata !10, metadata !DIExpression()), !dbg !20
+; Record:
+ #dbg_value(i32 %add, !10, !DIExpression(), !20)
+```
+
+### Test updates
+
+Any tests downstream of the main LLVM repo that test the IR output of LLVM may break as a result of the change to using records. Updating an individual test to expect records instead of intrinsics should be trivial, given the update rules above. Updating many tests may be burdensome however; to update the lit tests in the main repository, the following steps were used:
+
+1. Collect the list of failing lit tests into a single file, `failing-tests.txt`, separated by (and ending with) newlines.
+2. Use the following line to split the failing tests into tests that use update_test_checks and tests that don't:
+ ```
+ $ while IFS= read -r f; do grep -q "Assertions have been autogenerated by" "$f" && echo "$f" >> update-checks-tests.txt || echo "$f" >> manual-tests.txt; done < failing-tests.txt
+ ```
+3. For the tests that use update_test_checks, run the appropriate update_test_checks script - for the main LLVM repo, this was achieved with:
+ ```
+ $ xargs ./llvm/utils/update_test_checks.py --opt-binary ./build/bin/opt < update-checks-tests.txt
+ $ xargs ./llvm/utils/update_cc_test_checks.py --llvm-bin ./build/bin/ < update-checks-tests.txt
+ ```
+4. The remaining tests can be manually updated, although if there is a large number of tests then the following scripts may be useful; firstly, a script used to extract the check-line prefixes from a file:
+ ```
+ $ cat ./get-checks.sh
+ #!/bin/bash
+
+ # Always add CHECK, since it's more effort than it's worth to filter files where
+ # every RUN line uses other check prefixes.
+ # Then detect every instance of "check-prefix(es)=..." and add the
+ # comma-separated arguments as extra checks.
+ for filename in "$@"
+ do
+ echo "$filename,CHECK"
+ allchecks=$(grep -Eo 'check-prefix(es)?[ =][A-Z0-9_,-]+' $filename | sed -E 's/.+[= ]([A-Z0-9_,-]+).*/\1/g; s/,/\n/g')
+ for check in $allchecks; do
+ echo "$filename,$check"
+ done
+ done
+ ```
+ Then a second script to perform the work of actually updating the check-lines in each of the failing tests, with a series of simple substitution patterns:
+ ```
+ $ cat ./substitute-checks.sh
+ #!/bin/bash
+
+ file="$1"
+ check="$2"
+
+ # Any test that explicitly tests debug intrinsic output is not suitable to
+ # update by this script.
+ if grep -q "write-experimental-debuginfo=false" "$file"; then
+ exit 0
+ fi
+
+ sed -i -E -e "
+ /(#|;|\/\/).*$check[A-Z0-9_\-]*:/!b
+ /DIGlobalVariableExpression/b
+ /!llvm.dbg./bpostcall
+ s/((((((no|must)?tail )?call.*)?void )?@)?llvm.)?dbg\.([a-z]+)/#dbg_\7/
+ :postcall
+ /declare #dbg_/d
+ s/metadata //g
+ s/metadata\{/{/g
+ s/DIExpression\(([^)]*)\)\)(,( !dbg)?)?/DIExpression(\1),/
+ /#dbg_/!b
+ s/((\))?(,) )?!dbg (![0-9]+)/\3\4\2/
+ s/((\))?(, ))?!dbg/\3/
+ " "$file"
+ ```
+ Both of these scripts combined can be used on the list in `manual-tests.txt` as follows:
+ ```
+ $ cat manual-tests.txt | xargs ./get-checks.sh | sort | uniq | awk -F ',' '{ system("./substitute-checks.sh " $1 " " $2) }'
+ ```
+ These scripts dealt successfully with the vast majority of checks in `clang/test` and `llvm/test`.
+5. Verify the resulting tests pass, and detect any failing tests:
+ ```
+ $ xargs ./build/bin/llvm-lit -q < failing-tests.txt
+ ********************
+ Failed Tests (5):
+ LLVM :: DebugInfo/Generic/dbg-value-lower-linenos.ll
+ LLVM :: Transforms/HotColdSplit/transfer-debug-info.ll
+ LLVM :: Transforms/ObjCARC/basic.ll
+ LLVM :: Transforms/ObjCARC/ensure-that-exception-unwind-path-is-visited.ll
+ LLVM :: Transforms/SafeStack/X86/debug-loc2.ll
+
+
+ Total Discovered Tests: 295
+ Failed: 5 (1.69%)
+ ```
+6. Some tests may have failed - the update scripts are simplistic and preserve no context across lines, and so there are cases that they will not handle; the remaining cases must be manually updated (or handled by further scripts).
+
# C-API changes
All the functions that have been added are temporary and will be deprecated in the future. The intention is that they'll help downstream projects adapt during the transition period.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks for updating the docs! 😄
@SLTozer Maybe consider merging this as discussed in #93562 (comment)...? |
Yes, that should happen soon - before doing so it had just occurred to me that this guide assumes a unix shell, but it might be appropriate to add some windows equivalents, which I wanted to verify before adding and just haven't gotten to yet. Though if you'd prefer that to go in a separate review afterwards, I can just merge this now. |
Hmm, I guess I'd say it's better to have migration docs available sooner rather than later for anyone who may need them. Not sure how many LLVM developers are on Windows to help prioritise those additions... Maybe aim to merge this sometime this week, and if the Windows bits aren't ready this week, then add them in separate PR...? |
9b29547
to
a8db1d9
Compare
Rebased the docs to remove the existing conflict, and added some minor adjustments to the start of the "Textual IR" section, just cleaning up some of the terminology and unclear sentences. |
…updates (llvm#91725) Related review: llvm#91724 This patch updates the RemoveDIs migration document to include details on the textual IR changes, including steps to update any downstream lit tests accordingly. These steps are the same as those used to update the lit tests in the LLVM/Clang lit tests, as detailed in the review linked above.
Related review: #91724
This patch updates the RemoveDIs migration document to include details on the textual IR changes, including steps to update any downstream lit tests accordingly. These steps are the same as those used to update the lit tests in the LLVM/Clang lit tests, as detailed in the review linked above.