
Added e2e LTC tests #916

Merged 13 commits from henrytu/ltc_tests into llvm:torch_mlir_ltc_backend on Jun 9, 2022
Conversation

@henrytwo (Member) commented Jun 7, 2022

This PR includes multiple test suites that evaluate the Torch MLIR LTC backend, the example backend implementation, and several example models:

  • MLIR text tests, which ensure the emitted MLIR matches a known "good" output
    • This will be removed once we can lower to linalg and execute on the reference backend
  • Numeric tests to compare the outputs using CPU and LTC (ensures losses and model parameters match after several training iterations)
    • Currently we are executing on the JIT graph, but this will be updated in the future to use MLIR
    • We are operating under the assumption that the JIT -> MLIR conversion is working fine, and the MLIR accurately represents the JIT graph
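The numeric tests described above boil down to running the same training loop on CPU and on the LTC backend, then comparing losses and model parameters to a fixed number of digits of precision. A minimal pure-Python sketch of that comparison logic (the function names, tolerance value, and flat-float representation are all illustrative assumptions, not the PR's actual code):

```python
import math

# Hypothetical sketch of the numeric validation: after several training
# iterations, every parameter from the LTC run must match the CPU
# reference run to within a fixed number of digits of precision.
DIGITS_OF_PRECISION = 5  # assumed tolerance, for illustration only

def almost_equal(a, b, digits=DIGITS_OF_PRECISION):
    """Compare two floats to `digits` decimal places."""
    return math.isclose(a, b, abs_tol=10 ** -digits)

def validate_parameters(cpu_params, ltc_params):
    """Return names of layers whose parameters diverge between backends.

    Both arguments are {layer_name: [float, ...]} dicts; real code would
    compare torch tensors rather than flat float lists.
    """
    failing = []
    for name, cpu_values in cpu_params.items():
        ltc_values = ltc_params[name]
        if not all(almost_equal(c, l) for c, l in zip(cpu_values, ltc_values)):
            failing.append(name)  # report the failing layer by name
    return failing
```

Reporting the failing layer by name matches the behavior the commit list below describes ("Print name of the model layer that fails numeric validation").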

Once CI/CD is fixed for the LTC branch, these e2e tests will run automatically.

cc: @antoniojkim @ke1337

@henrytwo henrytwo requested a review from silvasean June 7, 2022 22:46
@henrytwo henrytwo self-assigned this Jun 7, 2022
@silvasean (Contributor)
In terms of testing architecture, checking the .mlir output from the tests is going to be really fragile. I would recommend that you plug into the existing e2e test suite for torchscript (misnomer now). You should be able to do something like we did for the op-by-op eager mode: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir_e2e_test/torchscript/configs/eager_mode.py

and then plug it in here:

if args.config == 'refbackend':
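Plugging a new backend into that dispatch might look roughly like the following sketch; the class names, config key, and `run` signature here are assumptions for illustration, not torch-mlir's actual identifiers:

```python
# Hypothetical sketch of extending an e2e test runner's config dispatch
# with an LTC option. Class and key names are illustrative only.
class RefBackendTestConfig:
    def run(self, module, inputs):
        # Real code would compile through the reference backend and execute.
        return f"refbackend({inputs})"

class LazyTensorCoreTestConfig:
    def run(self, module, inputs):
        # Real code would move the module to the 'lazy' device, run it,
        # and trigger compilation through the LTC backend.
        return f"ltc({inputs})"

CONFIGS = {
    "refbackend": RefBackendTestConfig,
    "lazy_tensor_core": LazyTensorCoreTestConfig,
}

def make_config(name):
    """Look up and instantiate the test config selected on the CLI."""
    try:
        return CONFIGS[name]()
    except KeyError:
        raise ValueError(f"unknown config: {name}")
```

A table-driven lookup like this keeps the runner's `args.config` branch from growing one `if` per backend.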

@henrytwo (Member, Author) commented Jun 8, 2022

In terms of testing architecture, checking the .mlir output from the tests is going to be really fragile. I would recommend that you plug into the existing e2e test suite for torchscript (misnomer now). You should be able to do something like we did for the op-by-op eager mode: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir_e2e_test/torchscript/configs/eager_mode.py

and then plug it in here:

if args.config == 'refbackend':

Got it, I'll look into adding the LTC frontend to that. I have a few questions first though:

  1. Would we still want to keep the numeric tests for the LTC example models (mnist and bert), or are we assuming that if the per op tests pass it should be good?
  2. For the tests in e2e_testing.torchscript.main, do we validate that the numerics are actually correct? I skimmed through the test cases, and it looks like we just make sure the modules compile and run

@silvasean (Contributor)

  1. Yes, the assumption is that the per-op tests combined should yield a valid integration test. That's not necessarily the case, but it's a really good first approximation. We actually have a few e2e tests that run e.g. a whole ResNet or BERT and check numerical correctness, but those are a big testing-time burden, so we avoid them as much as possible.
  2. Those e2e tests always check correctness of the numerical results. See all the reporting in
    def report_results(results: List[TestResult],
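The reporting referred to here amounts to comparing each test's observed outputs against a golden reference run and flagging any divergence. A simplified sketch of that idea (the `TestResult` fields and tolerance are illustrative, not the actual torch_mlir_e2e_test types):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TestResult:
    """Illustrative stand-in for an e2e test result record."""
    name: str
    golden: List[float]    # outputs from the reference (eager) run
    observed: List[float]  # outputs from the backend under test

def report_results(results: List[TestResult], atol: float = 1e-5) -> List[str]:
    """Return the names of tests whose outputs diverge beyond `atol`."""
    failures = []
    for r in results:
        if any(abs(g - o) > atol for g, o in zip(r.golden, r.observed)):
            failures.append(r.name)
    return failures
```

The key point from the discussion is that every e2e test performs this numeric check, not just a compile-and-run smoke test.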

@henrytwo (Member, Author) commented Jun 9, 2022

@silvasean I added a new config for the tests that use LTC. I had to xfail 305 tests due to incompatible ops. Let me know if there's anything else you'd like me to change!
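An xfail set like the one added here typically gates how results are classified: an expected failure reports as XFAIL, while an unexpected pass is flagged so the set can be pruned. A hedged sketch of that bookkeeping (the `classify` helper is hypothetical; only the set name echoes the PR):

```python
# Illustrative sketch of xfail-set gating in a test runner. Tests listed
# in LTC_XFAIL_SET are expected to fail under the LTC config.
LTC_XFAIL_SET = {
    "AdaptiveAvgPool2dNonUnitOutputSizeDynamicModule_basic",
}

def classify(test_name: str, passed: bool, xfail_set=LTC_XFAIL_SET) -> str:
    """Map a raw pass/fail outcome to a reported status."""
    expected_to_fail = test_name in xfail_set
    if passed:
        # An XPASS signals the test can be removed from the xfail set.
        return "XPASS" if expected_to_fail else "PASS"
    return "XFAIL" if expected_to_fail else "FAIL"
```

As ops gain LTC support, tests migrate out of the set by starting to XPASS.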

@silvasean (Contributor) left a comment

Great!

@@ -154,3 +154,312 @@
"TestMultipleTensorReturn_basic",
"AdaptiveAvgPool2dUnitOutputSizeStaticModule_basic",
}

LTC_XFAIL_SET = {
"AdaptiveAvgPool2dNonUnitOutputSizeDynamicModule_basic",
Contributor:

Any triage on why this could be failing?

If we are running the JIT graph directly, I would assume that everything "Just Works".

Member Author:

We don't have support for all ops yet. Some are missing some shape inference functions, some ops are blacklisted, etc.

@silvasean (Contributor) commented Jun 9, 2022:

Ah, got it. It's a little annoying that we are duplicating the shape functions for LTC. Could you collaborate with @eellison to reuse their shape stuff?

We have been upstreaming all our shape functions from https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/dialects/torch/importer/jit_ir/build_tools/shape_lib_gen.py, so this should all be part of a big source of truth upstream. E.g. pytorch/pytorch#76889
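For context, the shape functions generated via shape_lib_gen.py are written as ordinary Python functions from input shapes to output shapes. The example below illustrates that style with the standard broadcasting rule; the function name and signature are illustrative, not entries from the actual shape library:

```python
from typing import List

# Illustration of the shape-function style: a pure function mapping input
# shapes to the output shape. This implements standard right-aligned
# broadcasting, as used by elementwise ops.
def broadcast_shapes(a: List[int], b: List[int]) -> List[int]:
    result = []
    for i in range(max(len(a), len(b))):
        # Missing leading dimensions are treated as size 1.
        da = a[-1 - i] if i < len(a) else 1
        db = b[-1 - i] if i < len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        result.append(max(da, db))
    return list(reversed(result))
```

Because these are plain functions over shape lists, they can be scripted to JIT IR and reused from C++, which is the reuse path discussed below.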

Member Author:

Hm, our shape inference happens in C++, so I don't think there's much we can do as far as reusing that code 😢


All of the shape functions are also stored in C++ FWIW, so I don't think that's a fundamental blocker, but I don't know all the details of what's going on here.

Member Author:

@eellison Can you point me to where that is in the PyTorch repo? I just looked at the PR that Sean linked earlier, which had C++ with some long strings.

Contributor:

It should turn into JIT IR (that's what you see in the strings) that we can then load up and execute from C++.

Member Author:

@silvasean are you suggesting that we apply this at the JIT graph level, or port this over to work with LTC generally?

Member Author:

If it's the former, I don't think we can use it, because we require shape information at the LTC layer before generating a JIT graph.

Contributor:

Oh, interesting. Yeah, that makes things more tricky to reuse. @eellison is the shape inference infra tied specifically to JIT graphs?

I suppose we could create a shapeless jit graph just for the purpose of reusing the shape inference infrastructure, but that's quite a few hoops to jump through.

@henrytwo henrytwo merged commit cfaf125 into llvm:torch_mlir_ltc_backend Jun 9, 2022
@henrytwo henrytwo deleted the henrytu/ltc_tests branch June 9, 2022 19:56
antoniojkim pushed a commit that referenced this pull request Jun 30, 2022
* Added e2e LTC Torch MLIR tests

* Fix seed for reproducibility

* Check if computation is None before getting debug string

* Updated unit tests, and added numeric tests

* Print name of the model layer that fails numeric validation

* Run LTC e2e test with CI/CD

* Set seed in main function, instead of beginning of execution

* Add comment to specify number of digits of precision

* Fixed typo

* Remove tests for LTC example models

* Added LTC option to torchscript e2e

* Implement compile and run for LTC e2e test

* xfail all tests that use ops that aren't currently supported
antoniojkim pushed a commit that referenced this pull request Jun 30, 2022
antoniojkim pushed a commit that referenced this pull request Jul 5, 2022
antoniojkim pushed a commit that referenced this pull request Jul 7, 2022
henrytwo added a commit that referenced this pull request Jul 8, 2022
henrytwo added a commit that referenced this pull request Jul 8, 2022
henrytwo added a commit that referenced this pull request Jul 12, 2022
antoniojkim pushed a commit that referenced this pull request Jul 15, 2022
antoniojkim pushed a commit that referenced this pull request Jul 19, 2022
antoniojkim pushed a commit that referenced this pull request Jul 22, 2022
henrytwo added a commit that referenced this pull request Jul 29, 2022
henrytwo added a commit that referenced this pull request Jul 29, 2022
henrytwo added a commit that referenced this pull request Jul 30, 2022