Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

branch stack reader needs updating for new information #151

Open
algr opened this issue Oct 12, 2023 · 4 comments
Open

branch stack reader needs updating for new information #151

algr opened this issue Oct 12, 2023 · 4 comments

Comments

@algr
Copy link

algr commented Oct 12, 2023

The branch stack entry reader will warn about reserved bits being non-zero, so the reader needs to know about all the fields that the kernel is producing:

diff --git a/src/quipper/kernel/perf_event.h b/src/quipper/kernel/perf_event.h
index 9f44a26..2a9ac28 100644
--- a/src/quipper/kernel/perf_event.h
+++ b/src/quipper/kernel/perf_event.h
@@ -1065,7 +1065,11 @@ struct perf_branch_entry {
       in_tx : 1,     /* in transaction */
       abort : 1,     /* transaction abort */
       cycles : 16,   /* cycle count to last branch */
-      reserved : 44;
+      type : 4,      /* branch type, or 15 to indicate new type */
+      spec : 2,      /* speculation outcome */
+      new_type : 4,  /* branch type (new types) */
+      priv : 3,      /* privilege level */
+      reserved : 31;
 };

 }  // namespace quipper
diff --git a/src/quipper/kernel/perf_internals.h b/src/quipper/kernel/perf_internals.h
index 3906e1f..f72122a 100644
--- a/src/quipper/kernel/perf_internals.h
+++ b/src/quipper/kernel/perf_internals.h
@@ -259,7 +259,11 @@ struct branch_flags {
   u64 in_tx : 1;
   u64 abort : 1;
   u64 cycles : 16;
-  u64 reserved : 44;
+  u64 type : 4;
+  u64 spec : 2;
+  u64 new_type : 4;
+  u64 priv : 3;
+  u64 reserved : 31;
 };

 struct branch_entry {

Probably, it should save at least some of this new information in the entry. The 'type' field may be needed by whoever is consuming this infomation, in order to ignore or distinguish some kinds of branches (e.g. interrupts). If type==15, new_type is used as an extended type field.

(Support for 'type' seems to have been added by the recent "update to latest internal version" merge, but "new_type" is still missing and we are already seeing kernels make use of that.)

@algr
Copy link
Author

algr commented Oct 12, 2023

Another problem with the branch stack reader: in PerfParser::MapBranchStack, if any of the branch stack entries has an unmappable address, the function immediately returns false. The caller, MapSampleEvent, then doesn't increment num_sample_events_mapped. At the end of the parse, if fewer than 95% of samples are mapped, the parse fails.

Statistically, even if only 1% of branch entries were unmapped, this would cause around 30% of sample records to have at least one unmapped branch and the entire parse would fail. If we're using this kind of threshold the best way might be to count each branch stack entry as a sample, and have MapBranchStack return the number of successfully mapped branch entries.

But it's worse than that. It may be that each sample record will contain at least one unmappable branch. E.g. for Arm BRBE we see each stack start with an IRQ entry:

... branch stack: nr:32
.....  0: 000040003a3d3404 -> 0000000000000000 0 cycles  P   0 IRQ
.....  1: 000040003a2de190 -> 000040003a3d33f0 0 cycles  P   0 CALL
.....  2: 000040003a2de610 -> 000040003a2de188 0 cycles  P   0 RET

to_ip is zero so this causes the whole branch stack to fail. And because every sample is like this, the entire parse fails.

Either the reader should apply the 95% threshold to individual entries as suggested above, or test the type/new_type fields (see above) to see if the entry indicates an interrupt or abort, or (simplest) assume that a zero target address indicates an interrupt or abort.

@shenhanc78
Copy link

shenhanc78 commented Nov 6, 2023

Hi Algr, we have not seen "warns about reserved bits being non-zero", I guess the arm perf raw have some data disparate from X86 world, will you be able to submit such a sample perf.data file and I may run and test? (If perf data contains google specific content, we may continue this discussion on bugnizer.)

@shenhanc78
Copy link

Another problem with the branch stack reader: in PerfParser::MapBranchStack, if any of the branch stack entries has an unmappable address, the function immediately returns false. The caller, MapSampleEvent, then doesn't increment num_sample_events_mapped. At the end of the parse, if fewer than 95% of samples are mapped, the parse fails.

Statistically, even if only 1% of branch entries were unmapped, this would cause around 30% of sample records to have at least one unmapped branch and the entire parse would fail. If we're using this kind of threshold the best way might be to count each branch stack entry as a sample, and have MapBranchStack return the number of successfully mapped branch entries.

But it's worse than that. It may be that each sample record will contain at least one unmappable branch. E.g. for Arm BRBE we see each stack start with an IRQ entry:

... branch stack: nr:32
.....  0: 000040003a3d3404 -> 0000000000000000 0 cycles  P   0 IRQ
.....  1: 000040003a2de190 -> 000040003a3d33f0 0 cycles  P   0 CALL
.....  2: 000040003a2de610 -> 000040003a2de188 0 cycles  P   0 RET

to_ip is zero so this causes the whole branch stack to fail. And because every sample is like this, the entire parse fails.

Either the reader should apply the 95% threshold to individual entries as suggested above, or test the type/new_type fields (see above) to see if the entry indicates an interrupt or abort, or (simplest) assume that a zero target address indicates an interrupt or abort.

Could you also submit a sample perf.data file for this problem? (We may continue this discuss if per.data files contain sensitive data)

@algrant-arm
Copy link

Apologies, I can't easily submit a perf.data for this. I hope the problem description is enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants