-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for collecting/visualizing Perf Profile data #77
Conversation
@@ -91,6 +91,8 @@ To see a step-by-step example, please see our example [here](./EXAMPLE.md) | |||
|
|||
`-vv, --verbose --verbose` more verbose messages | |||
|
|||
`--intensive` gather data for which the CPU utilization is high when collected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also indicate what that current includes (so you know what you're getting in exchange for the CPU usage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
src/data/flamegraphs.rs
Outdated
Err(e) => if e.kind() == ErrorKind::NotFound { | ||
error!("Perf command not found"); | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a default case for all other error types.
<div class="tab"> | ||
<button class="tablinks" name="system_info" id="default">SUT Config</button> | ||
<button class="tablinks" name="sysctl">Sysctl Data</button> | ||
<button class="tablinks" name="cpu_utilization">CPU Utilization</button> | ||
<button class="tablinks" name="flamegraphs">Flamegraphs</button> | ||
<button class="tablinks" name="top_functions">Top Functions</button> | ||
<button class="tablinks" name="processes">Processes</button> | ||
<button class="tablinks" name="perfstat">PMU Stats</button> | ||
<button class="tablinks" name="meminfo">Meminfo</button> | ||
<button class="tablinks" name="vmstat">VM Stat</button> | ||
<button class="tablinks" name="kernel_config">Kernel Config</button> | ||
<button class="tablinks" name="sysctl">Sysctl Data</button> | ||
<button class="tablinks" name="vmstat">VM Stat</button> | ||
<button class="tablinks" name="interrupts">Interrupt Data</button> | ||
<button class="tablinks" name="disk_stats">Disk Stats</button> | ||
<button class="tablinks" name="perfstat">PMU Stats</button> | ||
<button class="tablinks" name="netstat">Net Stats</button> | ||
</div> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to start trying to organize these, e.g. the the two new ones would be under a "profiling" category. Static data would be another. Beyond that the grouping isn't as clear, maybe system, devices, and performance (just making names up).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This order is based on what people generally look at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already a mess and is going to get worse.
Related: the profiling-based pages, if shown despite there being no data, should explain why there's no data and how to collect it.
@@ -91,6 +91,8 @@ To see a step-by-step example, please see our example [here](./EXAMPLE.md) | |||
|
|||
`-vv, --verbose --verbose` more verbose messages | |||
|
|||
`--intensive` gather data for which the CPU utilization is high when collected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea of having a more general argument is likely going to be regretted later. At least for these features, I'm starting to think this should be more like --profiling
, and then that opens up the future options --profiling-frequency
and others, naturally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aim was to have a intensive flag which in the future would include other types of data collection. It would also prevent us from having multiple options for per data type. We would then have a --disable <datatype1, datatype2, ..., datatype n> which could control which ones were gathered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know what the intent was, but I am doubting that it will work out (as in it will start getting awkward).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to --profile
.
src/data.rs
Outdated
pub struct PrepareParams { | ||
pub time: u64, | ||
pub file_path: String, | ||
pub dir: String, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of thing will turn into a kitchen sink over time. Along with switching to a command line option to enable profiling, this would be profiling options, and you should not force it upon every data type when it only applies to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are running a custom preparation step, the lib will give you the kitchen sink and you are free to use whichever ones you need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the "--intensive" approach, I don't think this is not going to be ideal long-term.
src/data.rs
Outdated
is_cpu_intensive: false, | ||
prepare_params: PrepareParams::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You really need to find a way such that data types are constructed and added to a list during initialization, rather than all possible ones are statically created and then optionally ignored/skipped at runtime. As we get more types that require more of their own configuration (and optional-ness), this current approach is not going to scale well.
src/data.rs
Outdated
pub fn is_cpu_intensive(&mut self) { | ||
self.is_cpu_intensive = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that starts with is_
should be an accessor, not a mutator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:( . It was supposed to imply 'hey this data type is_cpu_intensive'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_cpu_intensive() or set_cpu_intensive(bool)
or just drop it since you already can access the .is_cpu_intensive attribute directly (and some places do)
The example report demonstrates one of the big problems here, need for debuginfo to get symbols. The flamegraph is mostly There should be some way to let the user know which are necessary (I know if you try to |
b1ac230
to
8f86a18
Compare
impl CollectData for PerfProfileRaw { | ||
fn prepare_data_collector(&mut self, params: PrepareParams) -> Result<()> { | ||
match Command::new("perf") | ||
.args(["record", "-a", "-q", "-g", "-k", "1", "-F", "99", "-e", "cpu-clock:pppH", "-o", ¶ms.data_file_path, "--", "sleep", ¶ms.collection_time.to_string()]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the significance of params.data_file_path
? It's not the final output file here (hasn't been run through the passes like "inject").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the perf.data file. APerf names is to perf_profile_timestamp
. In the report generation side, we pass this through the perf report --stdio -g none --percent-limit 1
command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain what the definition of data_file_path
is. There has to be something very significant about it, particularly in light of having data types that create other files of their own choosing under the data_dir
path. What are you going to do when a data type outputs two files, both of which have to be known at report generation time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During initialization, every data type has a file created by the lib at data_dir/datatype_timestamp.bin
. The data_file_path
is the path of this file. A datatype will only have 1 file created for it. If more files need to be created, use the prepare_data_collector
and do your custom process in data_dir
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the problem is that you're deferring the inject to report time, so you were still getting away with each data type producing only exactly 1 file.
Since inject
has to be done at the end of collection, what are you going to do? Have data_file_path
be the injected file, and then create a temporary under data_dir
for the real-time data collection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the inject step to after_data_collection().
src/data/flamegraphs.rs
Outdated
let out_jit = Command::new("perf") | ||
.args(["inject", "-j", "-i", &file_name, "-o", perf_jit_loc.clone().to_str().unwrap()]) | ||
.status(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't do the inject at report generation time unless you happen to be on the same machine (and more than that, the same processes (as in exact same process, not just the same name) are running)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need to make sure that by the end of collection your resulting file has all symbol names already gathered in it, because they won't necessarily be resolvable at report generation time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The expectation is you run aperf report
on the SUT where aperf record
was run. The inject step can be moved to the aperf record
after_data_collection part.
if perf_jit_loc.exists() { | ||
let out_script = Command::new("perf") | ||
.args(["script", "-f", "-i", perf_jit_loc.to_str().unwrap()]) | ||
.output()?; | ||
write!(script_out, "{}", std::str::from_utf8(&out_script.stdout)?.to_string())?; | ||
Folder::default().collapse_file(Some(script_loc), collapse_out)?; | ||
fg_out = std::fs::OpenOptions::new().read(true).write(true).truncate(true).open(fg_loc)?; | ||
flamegraph::from_files(&mut Options::default(), &vec![collapse_loc.to_path_buf()], fg_out)?; | ||
} | ||
let processed_data = vec![ProcessedData::Flamegraph(profile)]; | ||
Ok(processed_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing in the if
appears to have any side-effects that modify profile
, so how does any subsequent code/UI/whatever behave differently based on whether perf_jit_loc
existed or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before the if, we write the default fg_out
. If the if
is successful, fg_out
gets overwritten. fg_out
is what the UI is looking at. The profile
is something we have to return to the caller.
We are using the perf tool to gather the profiling data. There are two outputs from 1 perf profile data. First, is the raw top functions gathered by using
perf report --stdio --percent-limit 1
. The second, is a flamegraph generated using the output ofperf script
and the flamegraph rust crate. The flamegraph generated this way is very nearly the same as the one generated with Brendan Greggs' perl script.A new flag is introduced for aperf record called
--intensive
. To enable perf profiling, the user must specify the--intensive
flag duringaperf record
.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
compprofile.tar.gz