Aperf update #65

janaknat · 2023-05-05T17:07:01Z

Description of changes:

Move HTML files from src/bin/html_files/ to src/html_files.
Remove the use of HTTP to get the data.
- We now use local files, which are produced by aperf.
- This removes the need to run a local web server.
Move from 2 binaries (aperf-collector, aperf-visualizer) to 1 binary (aperf).
- There are now 2 sub-commands: record, report.
- aperf record => The same functionality as aperf-collector.
  - The data is collected in the directory as before. By default, a tar.gz of the directory is now produced.
- aperf report => The same functionality as aperf-visualizer.
  - aperf report can take the directory containing the data or a tar.gz of the directory as produced by aperf record.
  - A report is now generated which contains all the files needed to visualize the data with a web browser.
Remove the use of environment variables to control debug information of aperf.
- Previously, one had to export APERF_LOG_LEVEL=<debug|trace>.
- Now, use aperf <command> -v for debug information and aperf <command> -vv for trace information.

Attached the output of aperf report.

aperf_report.tar.gz

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

This also includes:
* Remove the use of HTTP to get the data.
* The data is now fed from local files which will be generated at run time by the report generator.
* Use setTimeout to prevent the main UI thread from locking up when rendering the graphs.

All data types are allowed only two types of visualization calls: "keys" and "values".

Previously, aperf had one binary for each part of the process, collector and visualizer. Move to a single binary 'aperf' with subcommands.

Usage:
* aperf record -r <run-name>
  Records the performance data. Similar to aperf-collector.
* aperf report -r <data location>
  Generates a report based on the data collected. Similar to aperf-visualizer.
* aperf record, by default, will collect the data and generate a tar.gz of the directory.
* aperf report, by default, will generate static HTML files along with a tar.gz of the directory. It will also contain a tar of the raw data.

wash-amzn

IMO this should not be doing any tarballing or un-tarballing.

wash-amzn · 2023-05-05T18:40:46Z

README.md

 ```
-./aperf-visualizer -r <COLLECTOR_DIRECTORY> -p <PORT_NUMBER>
+./aperf report -r <COLLECTOR_DIRECTORY> -p <PORT_NUMBER>


port number? ;)

wash-amzn · 2023-05-05T18:42:53Z

README.md

@@ -87,20 +82,18 @@ To see a step-by-step example, please see our example [here](./EXAMPLE.md)
 `-r, --run-name` run name (name of the run for organization purposes, creates directory of the same name, default of aperf_[timestamp])


-`./aperf-visualizer -h`
+`./aperf report -h`

 **Visualizer Flags:**

 `-v, --version` version of APerf visualizer


The commit message says -v is verbose, but this says it's version info. This also isn't specific to the report subcommand.

Fortunately there's also the -v and -vv non-subcommand-specific options to document in the same way.

wash-amzn · 2023-05-05T18:45:21Z

src/bin/aperf.rs

+    match verbose {
+        1 => level = LevelFilter::Debug,
+        2 => level = LevelFilter::Trace,
+        _ => level = LevelFilter::Info,


Doesn't this mean -vvv is info? That would be unfortunate.
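
To illustrate the concern, here is a minimal sketch of a mapping in which extra -v flags never lower the verbosity. It assumes a local stand-in enum (`Level`) instead of the `LevelFilter` type used in the hunk above, so that it compiles with the standard library alone; `level_for` is a hypothetical helper name.

```rust
/// Stand-in for the logging crate's level type (assumption: std-only sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Level {
    Info,
    Debug,
    Trace,
}

/// Map the number of `-v` occurrences to a log level.
fn level_for(verbose: u8) -> Level {
    match verbose {
        0 => Level::Info,
        1 => Level::Debug,
        // Two or more flags: stay at the most detailed level.
        _ => Level::Trace,
    }
}

fn main() {
    for count in 0..4u8 {
        println!("{} x -v => {:?}", count, level_for(count));
    }
}
```

With this shape, -vvv resolves to trace rather than falling through to info.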

wash-amzn · 2023-05-05T18:46:35Z

src/data.rs

@@ -180,6 +180,13 @@ macro_rules! processed_data {
                    )*
                }
            }
+            pub fn get_calls(&mut self) -> Result<Vec<String>> {


Time to stop thinking of these as calls, since it's not done back to a running HTTP server anymore.

But we still 'call' the visualizer code.

wash-amzn · 2023-05-05T18:48:48Z

src/data/interrupts.rs

@@ -33,6 +33,7 @@ impl CollectData for InterruptDataRaw {
        self.time = TimeEnum::DateTime(Utc::now());
        self.data = String::new();
        self.data = std::fs::read_to_string("/proc/interrupts")?;
+        trace!("{:#?}", self.data);


I know trace is detailed, but this looks excessive.

If this is consistent trace behavior across other data types I'm fine with it, but that doesn't appear to be the case (at least it's not being added to the others as part of this diff).

As far as I can tell this is still not appropriate. If I'm wrong, say something.

After running this branch locally, I see that trace mode is pretty intense, so I guess this fits in.

Yeah. The trace functionality is a last resort. It prints EVERYTHING.

wash-amzn · 2023-05-05T19:21:11Z

src/report.rs

+pub fn is_dir(dir: String) -> Result<bool> {
+    let file_type = fs::metadata(dir.clone())?.file_type();
+    if file_type.is_dir() {
+        return Ok(true);
+    }
+    return Ok(false);
+}


Path has an is_dir function for you: https://doc.rust-lang.org/std/path/struct.Path.html#method.is_dir

You have to construct a Path anyway so there's no need for this function.
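
A small sketch of the suggestion, using only the standard library. One behavioural difference is worth noting: the helper in the hunk returns an error when the path does not exist (because `fs::metadata` fails), whereas `Path::is_dir` simply returns false. `is_run_dir` is a hypothetical name used for illustration.

```rust
use std::path::Path;

/// True if `loc` exists and is a directory; false if it is missing,
/// unreadable, or not a directory.
fn is_run_dir(loc: &str) -> bool {
    Path::new(loc).is_dir()
}

fn main() {
    let tmp = std::env::temp_dir();
    let missing = tmp.join("aperf_sketch_does_not_exist");
    println!("{} is_dir = {}", tmp.display(), tmp.is_dir());
    println!("{} is_dir = {}", missing.display(), missing.is_dir());
    println!("is_run_dir(\".\") = {}", is_run_dir("."));
}
```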

wash-amzn · 2023-05-05T19:24:14Z

src/report.rs

+        fs::copy(&archive_name, archive_dst)?;
+
+        /* Delete temp archive */
+        fs::remove_file(&archive_name)?;


Why not just move it instead of copy/delete?

Yeah. That's simpler.
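
For reference, a move can be sketched with `fs::rename`. The fallback to copy and delete is an assumption added here for the case where a plain rename fails, for example when source and destination are on different filesystems; `move_file` is a hypothetical helper, not code from this PR.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Move `src` to `dst`, falling back to copy + delete if the rename fails.
fn move_file(src: &Path, dst: &Path) -> io::Result<()> {
    match fs::rename(src, dst) {
        Ok(()) => Ok(()),
        Err(_) => {
            fs::copy(src, dst)?;
            fs::remove_file(src)
        }
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("aperf_sketch_move");
    fs::create_dir_all(dir.join("archive"))?;
    let src = dir.join("run.tar.gz");
    let dst = dir.join("archive").join("run.tar.gz");
    fs::write(&src, b"archive bytes")?;
    move_file(&src, &dst)?;
    println!("source exists: {}, destination exists: {}", src.exists(), dst.exists());
    fs::remove_dir_all(&dir)
}
```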

wash-amzn · 2023-05-05T19:24:54Z

src/report.rs

+        fs::remove_file(&archive_name)?;
+        return Ok(());
+    }
+    if infer::get_from_path(&dir)?.unwrap().mime_type() == "application/gzip" {


The variable naming leads to confusion. How would a directory (its location is stored in a variable named dir, after all) possibly have a mime type of application/gzip?

I can rename it to run.

wash-amzn · 2023-05-05T19:28:05Z

src/report.rs

+        form_and_copy_archive(dir.clone())?;
+    }
+    /* Generate base HTML, JS files */
+    let _ico_file = File::create("aperf_report/ico")?;


What is that for? ico? Typo related to favicon.ico?

That is for the favicon.ico. Typo. I'll remove the _.

wash-amzn · 2023-05-05T19:29:14Z

src/report.rs

+    }
+    /* Generate aperf_report.tar.gz */
+    info!("Generating aperf_report.tar.gz");
+    let tar_gz = File::create("aperf_report.tar.gz")?;


No way can the output filename always be the same thing. No way.

I was looking at how perf record always generates perf.data. Something similar. We could append all the runs which were used to generate the report.

I don't mind copying perf behavior, but you have to do it completely. Both on the collection and on the reporting side, that includes renaming an existing file out of the way. (I don't know the full extent of perf's behavior here; I believe it only does a single rename. Note that the renames aren't possible with directories, so that's another complication: having to delete something that is in the way.)
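
A rough sketch of the "rename an existing output out of the way" idea, keeping a single backup. The ".old" suffix and the helper names `backup_path` and `make_room_for` are assumptions chosen for illustration; this is not a statement of what perf or aperf actually does. An existing backup is removed first, which covers the directory case mentioned above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Backup name for `output`: the same path with ".old" appended.
fn backup_path(output: &Path) -> PathBuf {
    let mut name = output.as_os_str().to_os_string();
    name.push(".old");
    PathBuf::from(name)
}

/// If `output` exists, rename it out of the way, keeping one backup only.
fn make_room_for(output: &Path) -> io::Result<()> {
    if !output.exists() {
        return Ok(());
    }
    let backup = backup_path(output);
    // Remove a previous backup first; a non-empty directory cannot be renamed over.
    if backup.is_dir() {
        fs::remove_dir_all(&backup)?;
    } else if backup.exists() {
        fs::remove_file(&backup)?;
    }
    fs::rename(output, &backup)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("aperf_sketch_backup");
    fs::create_dir_all(&dir)?;
    let output = dir.join("aperf_report.tar.gz");
    for contents in ["first report", "second report", "third report"] {
        make_room_for(&output)?;
        fs::write(&output, contents)?;
    }
    println!("current: {}", fs::read_to_string(&output)?);
    println!("backup:  {}", fs::read_to_string(backup_path(&output))?);
    fs::remove_dir_all(&dir)
}
```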

* Update README to remove references to port number.
* Print more info messages during record.

wash-amzn · 2023-05-09T14:10:37Z

src/report.rs

        /* Copy archive to aperf_report */
        let archive_dst = format!("aperf_report/data/archive/{}", archive_name);
        fs::copy(&archive_name, archive_dst)?;
-
-        /* Delete temp archive */
-        fs::remove_file(&archive_name)?;


Now instead of copy and delete, you're just copying, so leaving the temporary archive around?

Yes. Say you use aperf to record the performance data. $PWD contains run_data/ and run_data.tar.gz. If you then run aperf report -r run_data/, I should not delete the run_data.tar.gz that was generated by aperf record.

wash-amzn · 2023-05-09T14:12:11Z

src/report.rs

@@ -107,36 +96,50 @@ pub fn report(report: &Report) -> Result<()> {
        dir_paths.push(path.to_str().unwrap().to_string());
    }

+    /* Generate report name */
+    let mut report_name = "aperf_report".to_string();


Really should be trying to use Path objects as much as possible and not repeatedly constructing paths as strings

wash-amzn · 2023-05-09T14:18:40Z

src/data/interrupts.rs

@@ -33,6 +33,7 @@ impl CollectData for InterruptDataRaw {
        self.time = TimeEnum::DateTime(Utc::now());
        self.data = String::new();
        self.data = std::fs::read_to_string("/proc/interrupts")?;
+        trace!("{:#?}", self.data);


If this is consistent trace behavior across other data types I'm fine with it, but that doesn't appear to be the case (at least it's not being added to the others as part of this diff).

wash-amzn · 2023-05-09T14:19:36Z

src/record.rs

-    info!("To see debug messages export APERF_LOG_LEVEL=[debug|trace]");
-
-    let args = Args::parse();
+pub fn record(record: &Record) -> Result<()> {


Still very echo-y in here

wash-amzn · 2023-05-09T14:21:15Z

src/report.rs

+    }
+    /* Generate aperf_report.tar.gz */
+    info!("Generating aperf_report.tar.gz");
+    let tar_gz = File::create("aperf_report.tar.gz")?;


I don't mind copying perf behavior, but you have to do it completely. Both on the collection and on the reporting side, that includes renaming an existing file out of the way. (I don't know the full extent of perf's behavior here; I believe it only does a single rename. Note that the renames aren't possible with directories, so that's another complication: having to delete something that is in the way.)

wash-amzn · 2023-05-09T16:18:30Z

README.md

-`aperf-collector` collects performance data and stores them in a series of files. These files are then viewed using `aperf-visualizer` either on the same machine the performance data was collected on or a remote machine running `aperf-visualizer`. 
-
-To visualize the data using `aperf-visualizer` download the directory created by `aperf-collector` and load the data with `aperf-visualizer`.
+`aperf record` records performance data and stores them in a series of files. A report is then generated with `aperf report` and viewed in any system with a web browser.


s/and viewed in any/and can be viewed in any/

Also, move to using PathBuf when forming paths in the report. Updated the README and EXAMPLE with the new option.

Example: aperf report -r aarch64_run -r x86_run -n compare_runs
This generates compare_runs/ with the report and a compare_runs.tar.gz.

If a report name is not given, aperf defaults to concatenating the run names to form a report name.
aperf report -r aarch64_run -r x86_run
This generates aperf_report_aarch64_run_x86_run/ and an aperf_report_aarch64_run_x86_run.tar.gz.
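
The naming rule described in this commit message can be sketched as follows. This is an illustration built on `PathBuf`, assuming run inputs are plain directory names as in the examples; `report_name` and `archive_name` are hypothetical helpers and may not match the merged code.

```rust
use std::path::{Path, PathBuf};

/// Use the explicit name if one is given; otherwise join the run names
/// onto the "aperf_report" prefix with underscores.
fn report_name(runs: &[&str], name: Option<&str>) -> PathBuf {
    if let Some(n) = name {
        return PathBuf::from(n);
    }
    let mut report = String::from("aperf_report");
    for &run in runs {
        // Keep only the final path component, e.g. "/data/x86_run" -> "x86_run".
        let last = Path::new(run)
            .file_name()
            .and_then(|s| s.to_str())
            .unwrap_or(run);
        report.push('_');
        report.push_str(last);
    }
    PathBuf::from(report)
}

/// Name of the tarball produced next to the report directory.
fn archive_name(report: &Path) -> PathBuf {
    let mut name = report.as_os_str().to_os_string();
    name.push(".tar.gz");
    PathBuf::from(name)
}

fn main() {
    let runs = ["aarch64_run", "x86_run"];
    let default_name = report_name(&runs, None);
    println!("{}", default_name.display());
    println!("{}", archive_name(&default_name).display());
    println!("{}", report_name(&runs, Some("compare_runs")).display());
}
```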

wash-amzn · 2023-05-10T19:10:09Z

EXAMPLE.md

 ```

+The APerf recorder also produces a tarball of the run data. To generate a report with the tarball generated by the recorder us the following command:


wash-amzn · 2023-05-10T19:11:13Z

EXAMPLE.md

+The APerf recorder also produces a tarball of the run data. To generate a report with the tarball generated by the recorder us the following command:
+
+```
+./aperf-v0.1.6-alpha-aarch64/aperf report --run-directory c7g_performance_run_1.tar.gz


--run-directory c7g_performance_run_1.tar.gz

:/ I guess the tarball becomes a directory, so it's kind of not wrong.

--run would be more appropriate if multiple forms of input are accepted

Okay. We can change the arg name.

wash-amzn · 2023-05-10T19:16:32Z

EXAMPLE.md

 ```

+The APerf recorder also produces a tarball of the run data. To generate a report with the tarball generated by the recorder us the following command:


I don't think the behavior of the recorder should be getting described in the visualization section. This part should merely say you can supply either a run's directory or tarball

wash-amzn · 2023-05-10T19:22:18Z

src/report.rs

+        let directory = get_dir(dir.to_string())?;
+        let path = Path::new(&directory);
+        if dir_stems.contains(&path.file_stem().unwrap().to_str().unwrap().to_string()) {
+            error!("Cannot process two directories with the same name");


Not necessarily directories
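
Since a run can be either a directory or a tarball, the duplicate check can be sketched in terms of run names. Note that `Path::file_stem` turns "run.tar.gz" into "run.tar", so this sketch strips the full ".tar.gz" suffix by hand. `run_name` and `first_duplicate` are hypothetical helpers, not code from this PR.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Derive a run name from a directory path or a `.tar.gz` path.
fn run_name(input: &str) -> Option<String> {
    let file_name = Path::new(input).file_name()?.to_str()?;
    let name = file_name.strip_suffix(".tar.gz").unwrap_or(file_name);
    if name.is_empty() {
        None
    } else {
        Some(name.to_string())
    }
}

/// Return the first run name that appears more than once, if any.
/// Inputs without a usable name are skipped.
fn first_duplicate(inputs: &[&str]) -> Option<String> {
    let mut seen = HashSet::new();
    for input in inputs {
        if let Some(name) = run_name(input) {
            if !seen.insert(name.clone()) {
                return Some(name);
            }
        }
    }
    None
}

fn main() {
    let inputs = ["/tmp/run_a", "runs/run_a.tar.gz", "run_b/"];
    match first_duplicate(&inputs) {
        Some(name) => eprintln!("Cannot process two runs with the same name: {}", name),
        None => println!("All run names are unique"),
    }
}
```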

wash-amzn · 2023-05-10T19:28:15Z

src/report.rs

+
+        /* Create a temp archive */
+        let archive_name = format!("{}.tar.gz", &dir_stem);
+        let tar_gz = fs::File::create(&archive_name)?;


If I supply a run that's like /tmp/foo-bar, isn't this going to create (and then leave behind) foo-bar.tar.gz in PWD?
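
The concern can be made concrete with a small sketch. A bare file name built with `format!` is a relative path, so creating it resolves against the current working directory, while joining it onto an explicit destination keeps it out of PWD. `archive_in_cwd`, `archive_in`, and the `dest_dir` parameter are hypothetical names used for illustration.

```rust
use std::path::{Path, PathBuf};

/// Archive path as the quoted hunk builds it: a bare, relative file name.
fn archive_in_cwd(run: &Path) -> Option<PathBuf> {
    let stem = run.file_stem()?.to_str()?;
    Some(PathBuf::from(format!("{}.tar.gz", stem)))
}

/// Alternative: place the archive under an explicit destination directory.
fn archive_in(dest_dir: &Path, run: &Path) -> Option<PathBuf> {
    let name = archive_in_cwd(run)?;
    Some(dest_dir.join(name))
}

fn main() {
    let run = Path::new("/tmp/foo-bar");
    println!("{:?}", archive_in_cwd(run));
    println!("{:?}", archive_in(Path::new("aperf_report/data/archive"), run));
}
```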

wash-amzn · 2023-05-10T19:29:54Z

src/report.rs

+        fs::copy(&archive_name, archive_dst)?;
+        return Ok(());
+    }
+    if infer::get_from_path(&loc)?.unwrap().mime_type() == "application/gzip" {


nitpick: should be an 'else if'
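
As an illustration of the `else if` shape, the input can be classified in one chain. This sketch assumes a std-only check of the gzip magic bytes (0x1f 0x8b) as a stand-in for the `infer` call in the hunk; `RunInput`, `classify`, and `has_gzip_magic` are hypothetical names.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

#[derive(Debug, PartialEq)]
enum RunInput {
    Directory,
    GzipArchive,
    Unknown,
}

/// True if the file starts with the gzip magic bytes 0x1f 0x8b.
fn has_gzip_magic(path: &Path) -> io::Result<bool> {
    let mut magic = [0u8; 2];
    let mut file = File::open(path)?;
    match file.read_exact(&mut magic) {
        Ok(()) => Ok(magic == [0x1f, 0x8b]),
        // A file shorter than two bytes cannot be gzip.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}

/// Exactly one branch is taken for any input.
fn classify(loc: &Path) -> io::Result<RunInput> {
    if loc.is_dir() {
        Ok(RunInput::Directory)
    } else if has_gzip_magic(loc)? {
        Ok(RunInput::GzipArchive)
    } else {
        Ok(RunInput::Unknown)
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("aperf_sketch_classify");
    fs::create_dir_all(&dir)?;
    let gz = dir.join("run.tar.gz");
    fs::write(&gz, [0x1fu8, 0x8b, 0x08])?;
    println!("{:?}", classify(&dir)?);
    println!("{:?}", classify(&gz)?);
    fs::remove_dir_all(&dir)
}
```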

wash-amzn · 2023-05-10T19:32:03Z

src/report.rs

+        let tar_gz = File::open(&dir)?;
+        let tar = GzDecoder::new(tar_gz);
+        let mut archive = tar::Archive::new(tar);
+        archive.unpack(".")?;


What happens if the tarball contains absolute paths in it?

And won't this similarly leave things around, like if I supply /tmp/foo-bar.tar.gz this will leave a foo-bar directory in PWD

The standard tar command has protections in it to reduce the impact of a malicious tarball (https://unix.stackexchange.com/a/276962/398490), but even if we had that (which I don't know if tar::Archive does), there's still the threat of extracting a bunch of garbage files all over the place and leaving a mess
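
One way to reason about the absolute-path question is a check on each entry's path before extraction. The sketch below uses only `std::path` and makes no claim about what `tar::Archive` does internally; `is_safe_entry_path` is a hypothetical helper.

```rust
use std::path::{Component, Path};

/// True if extracting `entry` under a destination directory cannot escape it:
/// no root, no drive prefix, and no `..` components.
fn is_safe_entry_path(entry: &Path) -> bool {
    entry
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}

fn main() {
    let names = [
        "run_data/meminfo.bin",
        "./run_data/cpu",
        "/etc/passwd",
        "../outside",
        "run_data/../../x",
    ];
    for name in names {
        println!("{:<24} safe = {}", name, is_safe_entry_path(Path::new(name)));
    }
}
```

A check like this addresses path escapes only. It does not stop an archive from unpacking a large number of unexpected files into the destination, which is the other half of the concern above.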

The tarball's format is expected to be the same as what the recorder produces.

The run data can either be fed as a directory or as a tarball generated by aperf record. Change arg name from --run-directory to --run.

If the directory or archive fed to aperf report does not contain valid run data, fail gracefully. If the total number of fails is equal to the total number of visualizers then the run data is invalid. If only some of the visualizers fail during init, it could be that some data was not collected. This will help in being backwards compatible.
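
The rule in this commit message can be sketched as a small classification over the per-visualizer init results. The types here are assumptions for illustration: each result is modelled as `Result<(), String>`, and `RunDataStatus` and `assess` are hypothetical names rather than the merged implementation.

```rust
#[derive(Debug, PartialEq)]
enum RunDataStatus {
    /// Every visualizer initialised.
    Complete,
    /// Some visualizers failed; that data may simply not have been collected.
    Partial { failed: usize },
    /// Failures equal the number of visualizers: treat the run data as invalid.
    Invalid,
}

fn assess(init_results: &[Result<(), String>]) -> RunDataStatus {
    let failed = init_results.iter().filter(|r| r.is_err()).count();
    // Following the rule literally, an empty list also counts as invalid.
    if failed == init_results.len() {
        RunDataStatus::Invalid
    } else if failed == 0 {
        RunDataStatus::Complete
    } else {
        RunDataStatus::Partial { failed }
    }
}

fn main() {
    let results = vec![Ok(()), Err("no interrupt data".to_string()), Ok(())];
    println!("{:?}", assess(&results));
}
```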

wash-amzn · 2023-05-15T16:08:43Z

src/lib.rs

+    #[error("Invalid run name")]
+    InvalidRecordName,


Invalid record name? What's the thinking behind the "record" part here, versus calling it InvalidRunName, which would align with InvalidRunData above?

Ah. That is not needed. Holdover from when I was considering limiting what can be a run name. I'll remove.

wash-amzn · 2023-05-15T16:14:38Z

src/report.rs

 pub fn form_and_copy_archive(loc: String, report_name: &PathBuf) -> Result<()> {
    if Path::new(&loc).is_dir() {
        let dir_stem = Path::new(&loc).file_stem().unwrap().to_str().unwrap().to_string();

        /* Create a temp archive */
        let archive_name = format!("{}.tar.gz", &dir_stem);
-        let tar_gz = fs::File::create(&archive_name)?;
+        let archive_path = format!("{}/{}", APERF_TMP, archive_name);


Does this temp file not get cleaned up? Why not just write directly to the destination file?

Towards the end, the directory gets deleted.

An aperf_tmp is used for all intermediate steps aperf performs when forming the report.
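
A scratch directory that is cleaned up at the end can be sketched with a guard whose `Drop` removes it, so the cleanup also happens on early returns. `ScratchDir` and the directory name below are hypothetical; this is not the code added in this PR.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Scratch directory that is removed when the guard goes out of scope.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    fn create(path: impl Into<PathBuf>) -> io::Result<Self> {
        let path = path.into();
        fs::create_dir_all(&path)?;
        Ok(ScratchDir { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best effort: a destructor cannot propagate errors.
        let _ = fs::remove_dir_all(&self.path);
    }
}

fn main() -> io::Result<()> {
    let scratch_path = std::env::temp_dir().join("aperf_tmp_sketch");
    {
        let scratch = ScratchDir::create(&scratch_path)?;
        fs::write(scratch.path().join("run.tar.gz"), b"intermediate data")?;
        println!("exists while in use: {}", scratch.path().exists());
    } // guard dropped here, directory removed
    println!("exists after drop: {}", scratch_path.exists());
    Ok(())
}
```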

janaknat added 5 commits May 5, 2023 16:54

Move HTML files to src/

16adb32

Change data type visualizer calls

1ee8d13

Update README

e9a8d57

Generate aperf binary only

8053f61

wash-amzn reviewed May 5, 2023

Report name includes run names

1d4432a

wash-amzn reviewed May 9, 2023

wash-amzn reviewed May 10, 2023

wash-amzn reviewed May 10, 2023

Change reporter arg name

f594ead

janaknat force-pushed the aperf_update branch from abe8130 to f594ead Compare May 11, 2023 17:07

wash-amzn reviewed May 15, 2023

Allow specifying non-PWD for aperf record/report

40cdb03

janaknat force-pushed the aperf_update branch from 6da66cc to 40cdb03 Compare May 15, 2023 18:33

wash-amzn approved these changes May 15, 2023

janaknat merged commit 6be61ee into main May 15, 2023

janaknat deleted the aperf_update branch July 7, 2023 20:11
