[CSSPGO][llvm-profgen] Fix external address issues of perf reader (return to external addr part)

Previously we had an issue with artificial LBRs whose source is a return. Recall that "an internal code (A) can return to an external address, then from the external address call a new internal code (B), making an artificial branch that looks like a return from A to B, which can confuse the unwinder". We simply ignored the LBRs after such an artificial LBR, which can miss some samples. This change fixes that by unwinding them correctly instead of ignoring them.

Typical scenarios covered by this change:

1) Multiple sequential callbacks happen in external code, e.g.
```
[ext, call, foo] [foo, return, ext] [ext, call, bar]
```
The unwinder should avoid having foo return from bar. The wrong call stack looks like [foo, bar].

2) The call stack before and after an external call should be correctly unwound.
```
{call stack1}                                             {call stack2}
[foo, call, ext]  [ext, call, bar]  [bar, return, ext]  [ext, return, foo ]
```
Call stack 1 should be the same as call stack 2. Neither should be truncated.

3) The call stack should be truncated after a call into external code, since we can't do inlining with external code.
```
[foo, call, ext] [ext, call, bar] [bar, call, baz] [baz, return, bar ] [bar, return, ext]
```
The call stack of code in baz should not include foo.

### Implementation:

We leverage an artificial frame to fix #2 and #3: when we get a return artificial LBR, we push an extra artificial frame onto the stack. When we pop a frame, we check whether the parent is an artificial frame to pop (fix #2). Therefore, a call/return artificial LBR behaves just like a regular LBR, which keeps the call stack. While recording context on the trie, the artificial frame is used as a tag indicating that we should truncate the call stack (fix #3).

To differentiate #1 and #2, we leverage `getCallAddrFromFrameAddr`. Normally the target of the return should be the instruction following a call instruction, and `getCallAddrFromFrameAddr` will return the address of that call instruction. Otherwise, `getCallAddrFromFrameAddr` will return 0, which is the case of #1.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D115550
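The three scenarios can be replayed with a small toy model. This is only a sketch under stated assumptions, not the code from this commit: it walks the bracketed events in chronological order on a plain vector, whereas the real unwinder works on LBR samples and may differ substantially. The names `Event`, `Kind`, `apply`, `context`, and the `<artificial>` marker are all hypothetical, and `ext` stands for any external address.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event model: one bracketed triple from the scenarios,
// e.g. {"foo", Kind::Call, "ext"}. "ext" stands for any external address.
enum class Kind { Call, Return };

struct Event {
  std::string Src;
  Kind K;
  std::string Dst;
};

// Hypothetical marker standing in for external code between internal frames.
static const std::string Artificial = "<artificial>";

static bool isExternal(const std::string &Name) { return Name == "ext"; }

// Replay one event on the modeled stack (back() is the innermost frame).
void apply(std::vector<std::string> &Stack, const Event &E) {
  if (E.K == Kind::Call) {
    // A call into external code leaves a marker; an internal callee is
    // pushed as an ordinary frame.
    Stack.push_back(isExternal(E.Dst) ? Artificial : E.Dst);
    return;
  }
  // A return drops the innermost frame. When external code returns, the
  // frame dropped is the marker, so the caller's stack reappears intact.
  if (!Stack.empty())
    Stack.pop_back();
}

// Context used for recording: only the frames after the last marker, so
// callers on the far side of external code are cut off.
std::vector<std::string> context(const std::vector<std::string> &Stack) {
  std::vector<std::string> Ctx;
  for (const std::string &Frame : Stack) {
    if (Frame == Artificial)
      Ctx.clear();
    else
      Ctx.push_back(Frame);
  }
  return Ctx;
}
```

Replaying scenario 3 from a stack holding only `foo` leaves `foo`, the marker, `bar`, and `baz` on the stack, and `context` reports just `bar` and `baz`. Replaying scenario 2 restores the original stack once the external code returns, and in scenario 1 `foo` is popped before `bar` is pushed, so the two never appear in the same stack.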
Showing 7 changed files with 251 additions and 34 deletions.
Binary file not shown.
28 changes: 28 additions & 0 deletions
llvm/test/tools/llvm-profgen/Inputs/callback-external-addr.perfscript
@@ -0,0 +1,28 @@
7fe8d7620597
4007b0
7fe8d727e493
5541f689495641d7
0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1 0x40072b/0x40074c/P/-/-/5 0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/7 0x7fe8d76205a3/0x400715/P/-/-/1 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/3 0x7fe8d7620589/0x400690/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/10 0x4006be/0x4006ec/P/-/-/2 0x4006e7/0x4006b0/P/-/-/3 0x400747/0x4006d0/P/-/-/2 0x7fe8d7620589/0x400730/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x4007ab/0x400590/P/-/-/2 0x4007bf/0x40077d/P/-/-/1 0x7fe8d76205a3/0x4007b0/P/-/-/2 0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1 0x40072b/0x40074c/P/-/-/4 0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/3 0x7fe8d76205a3/0x400715/P/-/-/4 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/3 0x7fe8d7620589/0x400690/P/-/-/4

4006ec
40074c
7fe8d762058b
4007b0
7fe8d727e493
5541f689495641d7
0x4006be/0x4006ec/P/-/-/2 0x4006e7/0x4006b0/P/-/-/3 0x400747/0x4006d0/P/-/-/2 0x7fe8d7620589/0x400730/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x4007ab/0x400590/P/-/-/2 0x4007bf/0x40077d/P/-/-/3 0x7fe8d76205a3/0x4007b0/P/-/-/2 0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1 0x40072b/0x40074c/P/-/-/7 0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/2 0x7fe8d76205a3/0x400715/P/-/-/1 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/4 0x7fe8d7620589/0x400690/P/-/-/6 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/5 0x4006be/0x4006ec/P/-/-/3 0x4006e7/0x4006b0/P/-/-/3 0x400747/0x4006d0/P/-/-/2 0x7fe8d7620589/0x400730/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x4007ab/0x400590/P/-/-/2 0x4007bf/0x40077d/P/-/-/3 0x7fe8d76205a3/0x4007b0/P/-/-/2 0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1

40074c
7fe8d762058b
4007b0
7fe8d727e493
5541f689495641d7
0x40072b/0x40074c/P/-/-/6 0x4006ce/0x400720/P/-/-/8 0x40071b/0x4006c0/P/-/-/1 0x7fe8d76205a3/0x400715/P/-/-/2 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/1 0x40069e/0x7fe8d762058b/P/-/-/2 0x7fe8d7620589/0x400690/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/2 0x4006be/0x4006ec/P/-/-/2 0x4006e7/0x4006b0/P/-/-/3 0x400747/0x4006d0/P/-/-/2 0x7fe8d7620589/0x400730/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x4007ab/0x400590/P/-/-/2 0x4007bf/0x40077d/P/-/-/1 0x7fe8d76205a3/0x4007b0/P/-/-/2 0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1 0x40072b/0x40074c/P/-/-/4 0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/10 0x7fe8d76205a3/0x400715/P/-/-/1 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/2 0x7fe8d7620589/0x400690/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/6 0x4006be/0x4006ec/P/-/-/4

400720
40074c
7fe8d762058b
4007b0
7fe8d727e493
5541f689495641d7
0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/3 0x7fe8d76205a3/0x400715/P/-/-/1 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/7 0x7fe8d7620589/0x400690/P/-/-/5 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/5 0x4006be/0x4006ec/P/-/-/3 0x4006e7/0x4006b0/P/-/-/3 0x400747/0x4006d0/P/-/-/2 0x7fe8d7620589/0x400730/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x4007ab/0x400590/P/-/-/2 0x4007bf/0x40077d/P/-/-/2 0x7fe8d76205a3/0x4007b0/P/-/-/2 0x40069e/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x400690/P/-/-/1 0x400751/0x7fe8d762058b/P/-/-/1 0x40072b/0x40074c/P/-/-/4 0x4006ce/0x400720/P/-/-/4 0x40071b/0x4006c0/P/-/-/2 0x7fe8d76205a3/0x400715/P/-/-/2 0x4006ae/0x7fe8d7620597/P/-/-/2 0x7fe8d7620595/0x4006a0/P/-/-/3 0x40069e/0x7fe8d762058b/P/-/-/2 0x7fe8d7620589/0x400690/P/-/-/4 0x400590/0x7fe8d7620560/P/-/-/1 0x400710/0x400590/P/-/-/5 0x4006be/0x4006ec/P/-/-/2 0x4006e7/0x4006b0/P/-/-/3
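Each sample in the perfscript input above ends with a line of branch records written as `source/target/...`. A minimal sketch of reading those records follows, assuming the first two slash-separated fields are the branch source and target (as in `perf script` brstack output) and ignoring the remaining fields. The names `LBREntry`, `parseLBREntry`, `parseLBRLine`, and `isInternal` are hypothetical and are not taken from llvm-profgen; the address range passed to `isInternal` is likewise an assumption the caller must supply.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// One branch record; only the two addresses are kept in this sketch.
struct LBREntry {
  std::uint64_t Source = 0;
  std::uint64_t Target = 0;
};

// Parse a token such as "0x40069e/0x7fe8d7620597/P/-/-/2".
// Fields after the second '/' are ignored. Returns false on malformed input.
bool parseLBREntry(const std::string &Token, LBREntry &Out) {
  const std::string::size_type First = Token.find('/');
  if (First == std::string::npos)
    return false;
  const std::string::size_type Second = Token.find('/', First + 1);
  if (Second == std::string::npos)
    return false;
  try {
    // Base 16 accepts an optional "0x" prefix.
    Out.Source = std::stoull(Token.substr(0, First), nullptr, 16);
    Out.Target =
        std::stoull(Token.substr(First + 1, Second - First - 1), nullptr, 16);
  } catch (const std::exception &) {
    return false;
  }
  return true;
}

// Split a whitespace-separated LBR line and keep the entries that parse.
std::vector<LBREntry> parseLBRLine(const std::string &Line) {
  std::vector<LBREntry> Entries;
  std::istringstream In(Line);
  std::string Token;
  while (In >> Token) {
    LBREntry E;
    if (parseLBREntry(Token, E))
      Entries.push_back(E);
  }
  return Entries;
}

// Hypothetical check: treat [Lo, Hi) as the profiled binary's text range.
bool isInternal(std::uint64_t Addr, std::uint64_t Lo, std::uint64_t Hi) {
  return Addr >= Lo && Addr < Hi;
}
```

With an assumed range of `0x400000` to `0x401000`, the first record of the first sample (`0x40069e/0x7fe8d7620597`) has an internal source and an external target, which is the kind of branch into external code that the commit message describes.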
114 changes: 114 additions & 0 deletions
llvm/test/tools/llvm-profgen/callback-external-addr.test
@@ -0,0 +1,114 @@
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/callback-external-addr.perfscript --binary=%S/Inputs/callback-external-addr.perfbin --output=%t --skip-symbolization
; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER

; Test if call stack is correctly truncated.
; CHECK-UNWINDER-NOT: main:3 @ bar
; CHECK-UNWINDER-NOT: main:3 @ foo
; CHECK-UNWINDER-NOT: qux:3 @ baz
; CHECK-UNWINDER-NOT: qux:3 @ bar

; Test if return to wrong internal target
; CHECK-UNWINDER-NOT: baz:0 @ bar
; CHECK-UNWINDER-NOT: bar:0 @ baz
; CHECK-UNWINDER-NOT: baz:0 @ main
; CHECK-UNWINDER-NOT: bar:0 @ foo
; CHECK-UNWINDER-NOT: baz:0 @ qux

; Test for callback return from internal address to external address.
; [foo:2 @ qux:2 @ callBeforeReturn] and [foo:2 @ qux:4 @ callAfterReturn] should exist
; which means the callback return won't interrupt the previous call stack

; CHECK-UNWINDER: [bar]
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 690-69e:12
; CHECK-UNWINDER: 0
; CHECK-UNWINDER: [baz]
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 6a0-6ae:7
; CHECK-UNWINDER: 0
; CHECK-UNWINDER: [foo]
; CHECK-UNWINDER: 2
; CHECK-UNWINDER: 730-747:5
; CHECK-UNWINDER: 74c-751:5
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 747->6d0:5
; CHECK-UNWINDER: [foo:2 @ qux]
; CHECK-UNWINDER: 4
; CHECK-UNWINDER: 6d0-6e7:5
; CHECK-UNWINDER: 6ec-710:6
; CHECK-UNWINDER: 715-71b:7
; CHECK-UNWINDER: 720-72b:6
; CHECK-UNWINDER: 3
; CHECK-UNWINDER: 6e7->6b0:6
; CHECK-UNWINDER: 71b->6c0:7
; CHECK-UNWINDER: 72b->74c:6
; CHECK-UNWINDER: [foo:2 @ qux:2 @ callBeforeReturn]
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 6b0-6be:6
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 6be->6ec:7
; CHECK-UNWINDER: [foo:2 @ qux:4 @ callAfterReturn]
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 6c0-6ce:7
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 6ce->720:7
; CHECK-UNWINDER: [main]
; CHECK-UNWINDER: 2
; CHECK-UNWINDER: 77d-7ab:5
; CHECK-UNWINDER: 7b0-7bf:5
; CHECK-UNWINDER: 1
; CHECK-UNWINDER: 7bf->77d:5

; libcallback.c
; clang -shared -fPIC -o libcallback.so libcallback.c

int callback(int *cnt, int (*func1)(int), int (*func2)(int), int p) {
  (*cnt)++;
  return func1(p) + func2(p);
}

; test.c
; clang test.c -O0 -g -fno-optimize-sibling-calls -fdebug-info-for-profiling -L $PWD -lcallback -fno-inline

#include <stdio.h>

int callbackCnt = 0;

int callback(int *cnt, int (*func1)(int), int (*func2)(int), int p);

int bar(int p) {
  return p + 1;
}

int baz(int p) {
  return p - 1;
}

int callBeforeReturn(int p) {
  return p + 10;
}

int callAfterReturn(int p) {
  return p - 10;
}

int qux(int p) {
  p += 10;
  int ret = callBeforeReturn(p);
  ret = callback(&callbackCnt, bar, baz, ret);
  ret = callAfterReturn(ret);
  return ret;
}

int foo (int p) {
  p -= 10;
  return qux(p);
}

int main(void) {
  int sum = 0;
  for (int i = 0; i < 1000 * 1000; i++) {
    sum += callback(&callbackCnt, foo, bar, i);
  }
  printf("callback count=%d, sum=%d\n", callbackCnt, sum);
}