Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add benchmarks/run_tests.sh #6043

Merged
merged 1 commit into from
Dec 8, 2023
Merged

ci: add benchmarks/run_tests.sh #6043

merged 1 commit into from
Dec 8, 2023

Conversation

cota
Copy link
Collaborator

@cota cota commented Dec 7, 2023

Tested locally with:

$ test/benchmarks/run_tests.sh -L
[...]
+ make -C /src/pytorch/xla/test/benchmarks all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
   AGGREGATE --accelerator=a6000 --test=inference --report=speedup
   DIFF a6000.inference.speedup.test
   RM a6000.inference.speedup.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=histogram
   DIFF v100.inference.histogram.test
   RM v100.inference.histogram.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
   RM v100.inference.latest.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=speedup
   DIFF v100.inference.speedup.test
   RM v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

$ test/benchmarks/run_tests.sh -L -V 1
[...]
+ make -C /src/pytorch/xla/test/benchmarks V=1 all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
python3 ../../benchmarks/aggregate.py --accelerator=a6000 --test=inference --report=speedup \
        --input-dirname=. --format=csv > a6000.inference.speedup.test.tmp
git diff --no-index a6000.inference.speedup.test a6000.inference.speedup.test.tmp
rm -f a6000.inference.speedup.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=histogram \
        --input-dirname=. --format=csv > v100.inference.histogram.test.tmp
git diff --no-index v100.inference.histogram.test v100.inference.histogram.test.tmp
rm -f v100.inference.histogram.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=latest \
        --input-dirname=. --format=csv > v100.inference.latest.test.tmp
git diff --no-index v100.inference.latest.test v100.inference.latest.test.tmp
rm -f v100.inference.latest.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=speedup \
        --input-dirname=. --format=csv > v100.inference.speedup.test.tmp
git diff --no-index v100.inference.speedup.test v100.inference.speedup.test.tmp
rm -f v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Continues discussion from #6035

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Nice! CI is running now including the changes in this commit. Will fix.

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Sent #6044 to fix this.

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Rebased to pick up the fix; re-running CI here.

@vanbasten23
Copy link
Collaborator

vanbasten23 commented Dec 7, 2023

Glad it is running. A few questions:

  1. I see the test has a6000 and v100 in the test name. The test doesn't need to run on a6000 or v100, right?
  2. IIUC, files such as v100.inference.latest.test serve as the expected aggregate.py output. If a dev make a valid change to benchmarks/aggregate.py, the tests will also likely fail. How should the dev fix the test?
  3. In the test/benchmarks/Makefile, would $(QUIET_DIFF)git diff --no-index $@ $@.tmp fail if the expected file and actual file differ?

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

  1. I see the test has a6000 and v100 in the test name. The test doesn't need to run on a6000 or v100, right?

Right. The test files could be called anything. It's just useful to have them named like that because the file names match the accelerator data they contain (so when passing --accelerator= we know which csv file to look at).

  1. IIUC, files such as v100.inference.latest.test serve as the expected aggregate.py output. If a dev make a valid change to benchmarks/aggregate.py, the tests will also likely fail. How should the dev fix the test?

That's the point of the test. If the CSV output from aggregate.py changes (e.g. through a valid change that adds a new computation), that means the output "golden" files should also reflect that. This will allow us to catch bugs, i.e. non-functional changes (which are still valid changes) should require no changes to the "golden" files.

  1. In the test/benchmarks/Makefile, would $(QUIET_DIFF)git diff --no-index $@ $@.tmp fail if the expected file and actual file differ?

Indeed. Here's an example, where I've edited the golden file by replacing a correct value with the incorrect string:

$ make v100.inference.latest.test
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
diff --git a/v100.inference.latest.test b/v100.inference.latest.test.tmp
index 17dc9481e..0780011eb 100644
--- a/v100.inference.latest.test
+++ b/v100.inference.latest.test.tmp
@@ -1,3 +1,3 @@
 # WorkloadNumber,Speedup(Inductor/Oldest Inductor),ModelName(Inductor),Speedup(PytorchXLA/Oldest Inductor),ModelName(PytorchXLA)
 0,1.2957024017088707,Background_Matting,0.7341253951099924,Background_Matting
-1,incorrect,BERT_pytorch,1.3260604829383074,BERT_pytorch
+1,1.4722963316097097,BERT_pytorch,1.3260604829383074,BERT_pytorch
make: *** [Makefile:15: v100.inference.latest.test] Error 1

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Another fix was needed (#6045); re-running CI.

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Re-running CI after another fix (#6047) went in.

@cota
Copy link
Collaborator Author

cota commented Dec 7, 2023

Another fix in #6051. Hopefully this is the last one needed.

@vanbasten23
Copy link
Collaborator

The CI fails with:

   DIFF a6000.inference.speedup.test
diff --git a/a6000.inference.speedup.test b/a6000.inference.speedup.test.tmp
index f47718a..cfceaae 100644
make: *** [Makefile:15: a6000.inference.speedup.test] Error 1
--- a/a6000.inference.speedup.test
+++ b/a6000.inference.speedup.test.tmp
@@ -1,2 +1,2 @@
 # Datetime,Speedup(Inductor/Oldest Inductor),Speedup(PytorchXLA/Oldest Inductor)
-2023-11-10,1.0,0.7685582181407948
+2023-11-11,1.0,0.7685582181407948

Seems the date is different.

@vanbasten23
Copy link
Collaborator

One last comment and feel free to address in a follow-up pr. Could you add some documentation about if the benchmark CI fails, how should the users debug, such as what command to run locally to reproduce the issue?

@cota
Copy link
Collaborator Author

cota commented Dec 8, 2023

The CI fails with:

--- a/a6000.inference.speedup.test
+++ b/a6000.inference.speedup.test.tmp
@@ -1,2 +1,2 @@
 # Datetime,Speedup(Inductor/Oldest Inductor),Speedup(PytorchXLA/Oldest Inductor)
-2023-11-10,1.0,0.7685582181407948
+2023-11-11,1.0,0.7685582181407948

Seems the date is different.

Fixed in #6051, re-running now.

@cota cota force-pushed the aggregate_b branch 2 times, most recently from b560bcc to 31012a8 Compare December 8, 2023 15:29
@cota
Copy link
Collaborator Author

cota commented Dec 8, 2023

One last comment and feel free to address in a follow-up pr. Could you add some documentation about if the benchmark CI fails, how should the users debug, such as what command to run locally to reproduce the issue?

Done.

Tested locally with:

```
$ test/benchmarks/run_tests.sh -L
[...]
+ make -C /src/pytorch/xla/test/benchmarks all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
   AGGREGATE --accelerator=a6000 --test=inference --report=speedup
   DIFF a6000.inference.speedup.test
   RM a6000.inference.speedup.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=histogram
   DIFF v100.inference.histogram.test
   RM v100.inference.histogram.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
   RM v100.inference.latest.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=speedup
   DIFF v100.inference.speedup.test
   RM v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

$ test/benchmarks/run_tests.sh -L -V 1
[...]
+ make -C /src/pytorch/xla/test/benchmarks V=1 all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
python3 ../../benchmarks/aggregate.py --accelerator=a6000 --test=inference --report=speedup \
        --input-dirname=. --format=csv > a6000.inference.speedup.test.tmp
git diff --no-index a6000.inference.speedup.test a6000.inference.speedup.test.tmp
rm -f a6000.inference.speedup.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=histogram \
        --input-dirname=. --format=csv > v100.inference.histogram.test.tmp
git diff --no-index v100.inference.histogram.test v100.inference.histogram.test.tmp
rm -f v100.inference.histogram.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=latest \
        --input-dirname=. --format=csv > v100.inference.latest.test.tmp
git diff --no-index v100.inference.latest.test v100.inference.latest.test.tmp
rm -f v100.inference.latest.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=speedup \
        --input-dirname=. --format=csv > v100.inference.speedup.test.tmp
git diff --no-index v100.inference.speedup.test v100.inference.speedup.test.tmp
rm -f v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

```
@cota
Copy link
Collaborator Author

cota commented Dec 8, 2023

Finally CI is passing.

Copy link
Collaborator

@vanbasten23 vanbasten23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cota cota merged commit 0313c6a into master Dec 8, 2023
20 checks passed
chunnienc pushed a commit to chunnienc/xla that referenced this pull request Dec 14, 2023
Tested locally with:

```
$ test/benchmarks/run_tests.sh -L
[...]
+ make -C /src/pytorch/xla/test/benchmarks all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
   AGGREGATE --accelerator=a6000 --test=inference --report=speedup
   DIFF a6000.inference.speedup.test
   RM a6000.inference.speedup.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=histogram
   DIFF v100.inference.histogram.test
   RM v100.inference.histogram.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
   RM v100.inference.latest.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=speedup
   DIFF v100.inference.speedup.test
   RM v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

$ test/benchmarks/run_tests.sh -L -V 1
[...]
+ make -C /src/pytorch/xla/test/benchmarks V=1 all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
python3 ../../benchmarks/aggregate.py --accelerator=a6000 --test=inference --report=speedup \
        --input-dirname=. --format=csv > a6000.inference.speedup.test.tmp
git diff --no-index a6000.inference.speedup.test a6000.inference.speedup.test.tmp
rm -f a6000.inference.speedup.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=histogram \
        --input-dirname=. --format=csv > v100.inference.histogram.test.tmp
git diff --no-index v100.inference.histogram.test v100.inference.histogram.test.tmp
rm -f v100.inference.histogram.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=latest \
        --input-dirname=. --format=csv > v100.inference.latest.test.tmp
git diff --no-index v100.inference.latest.test v100.inference.latest.test.tmp
rm -f v100.inference.latest.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=speedup \
        --input-dirname=. --format=csv > v100.inference.speedup.test.tmp
git diff --no-index v100.inference.speedup.test v100.inference.speedup.test.tmp
rm -f v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'
```
golechwierowicz pushed a commit that referenced this pull request Jan 12, 2024
Tested locally with:

```
$ test/benchmarks/run_tests.sh -L
[...]
+ make -C /src/pytorch/xla/test/benchmarks all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
   AGGREGATE --accelerator=a6000 --test=inference --report=speedup
   DIFF a6000.inference.speedup.test
   RM a6000.inference.speedup.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=histogram
   DIFF v100.inference.histogram.test
   RM v100.inference.histogram.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
   RM v100.inference.latest.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=speedup
   DIFF v100.inference.speedup.test
   RM v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

$ test/benchmarks/run_tests.sh -L -V 1
[...]
+ make -C /src/pytorch/xla/test/benchmarks V=1 all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
python3 ../../benchmarks/aggregate.py --accelerator=a6000 --test=inference --report=speedup \
        --input-dirname=. --format=csv > a6000.inference.speedup.test.tmp
git diff --no-index a6000.inference.speedup.test a6000.inference.speedup.test.tmp
rm -f a6000.inference.speedup.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=histogram \
        --input-dirname=. --format=csv > v100.inference.histogram.test.tmp
git diff --no-index v100.inference.histogram.test v100.inference.histogram.test.tmp
rm -f v100.inference.histogram.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=latest \
        --input-dirname=. --format=csv > v100.inference.latest.test.tmp
git diff --no-index v100.inference.latest.test v100.inference.latest.test.tmp
rm -f v100.inference.latest.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=speedup \
        --input-dirname=. --format=csv > v100.inference.speedup.test.tmp
git diff --no-index v100.inference.speedup.test v100.inference.speedup.test.tmp
rm -f v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'
```
bhavya01 pushed a commit that referenced this pull request Apr 22, 2024
Tested locally with:

```
$ test/benchmarks/run_tests.sh -L
[...]
+ make -C /src/pytorch/xla/test/benchmarks all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
   AGGREGATE --accelerator=a6000 --test=inference --report=speedup
   DIFF a6000.inference.speedup.test
   RM a6000.inference.speedup.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=histogram
   DIFF v100.inference.histogram.test
   RM v100.inference.histogram.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=latest
   DIFF v100.inference.latest.test
   RM v100.inference.latest.test.tmp
   AGGREGATE --accelerator=v100 --test=inference --report=speedup
   DIFF v100.inference.speedup.test
   RM v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'

$ test/benchmarks/run_tests.sh -L -V 1
[...]
+ make -C /src/pytorch/xla/test/benchmarks V=1 all
make: Entering directory '/src/pytorch/xla/test/benchmarks'
python3 ../../benchmarks/aggregate.py --accelerator=a6000 --test=inference --report=speedup \
        --input-dirname=. --format=csv > a6000.inference.speedup.test.tmp
git diff --no-index a6000.inference.speedup.test a6000.inference.speedup.test.tmp
rm -f a6000.inference.speedup.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=histogram \
        --input-dirname=. --format=csv > v100.inference.histogram.test.tmp
git diff --no-index v100.inference.histogram.test v100.inference.histogram.test.tmp
rm -f v100.inference.histogram.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=latest \
        --input-dirname=. --format=csv > v100.inference.latest.test.tmp
git diff --no-index v100.inference.latest.test v100.inference.latest.test.tmp
rm -f v100.inference.latest.test.tmp
python3 ../../benchmarks/aggregate.py --accelerator=v100 --test=inference --report=speedup \
        --input-dirname=. --format=csv > v100.inference.speedup.test.tmp
git diff --no-index v100.inference.speedup.test v100.inference.speedup.test.tmp
rm -f v100.inference.speedup.test.tmp
make: Leaving directory '/src/pytorch/xla/test/benchmarks'
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants