
Added e2e LTC tests #916

Merged 13 commits from henrytu/ltc_tests into llvm:torch_mlir_ltc_backend on Jun 9, 2022
Conversation

@henrytwo (Member) commented Jun 7, 2022

This PR includes multiple test suites that evaluate the Torch MLIR LTC backend, the example backend implementation, and several example models:

  • MLIR text tests, which ensure the emitted MLIR matches a known "good" output
    • This will be removed once we can lower to linalg and execute on the reference backend
  • Numeric tests to compare the outputs using CPU and LTC (ensures losses and model parameters match after several training iterations)
    • Currently we are executing on the JIT graph, but this will be updated in the future to use MLIR
    • We are operating under the assumption that the JIT -> MLIR conversion is working fine, and the MLIR accurately represents the JIT graph
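The numeric tests described above boil down to running the same training loop on CPU and on the LTC backend, then comparing losses and model parameters to a fixed number of digits of precision. A minimal pure-Python sketch of that comparison logic (the function names, tolerance value, and flat-float representation are all illustrative assumptions, not the PR's actual code):

```python
import math

# Hypothetical sketch of the numeric validation: after several training
# iterations, every parameter from the LTC run must match the CPU
# reference run to within a fixed number of digits of precision.
DIGITS_OF_PRECISION = 5  # assumed tolerance, for illustration only

def almost_equal(a, b, digits=DIGITS_OF_PRECISION):
    """Compare two floats to `digits` decimal places."""
    return math.isclose(a, b, abs_tol=10 ** -digits)

def validate_parameters(cpu_params, ltc_params):
    """Return names of layers whose parameters diverge between backends.

    Both arguments are {layer_name: [float, ...]} dicts; real code would
    compare torch tensors rather than flat float lists.
    """
    failing = []
    for name, cpu_values in cpu_params.items():
        ltc_values = ltc_params[name]
        if not all(almost_equal(c, l) for c, l in zip(cpu_values, ltc_values)):
            failing.append(name)  # report the failing layer by name
    return failing
```

Reporting the failing layer by name matches the behavior the commit list below describes ("Print name of the model layer that fails numeric validation").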

Once CI/CD is fixed for the LTC branch, these e2e tests will run automatically.

cc: @antoniojkim @ke1337

@henrytwo henrytwo requested a review from silvasean June 7, 2022 22:46
@henrytwo henrytwo self-assigned this Jun 7, 2022
@silvasean (Contributor)
In terms of testing architecture, checking the .mlir output from the tests is going to be really fragile. I would recommend that you plug into the existing e2e test suite for torchscript (misnomer now). You should be able to do something like we did for the op-by-op eager mode: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir_e2e_test/torchscript/configs/eager_mode.py

and then plug it in here:

if args.config == 'refbackend':
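Plugging a new backend into that dispatch might look roughly like the following sketch; the class names, config key, and `run` signature here are assumptions for illustration, not torch-mlir's actual identifiers:

```python
# Hypothetical sketch of extending an e2e test runner's config dispatch
# with an LTC option. Class and key names are illustrative only.
class RefBackendTestConfig:
    def run(self, module, inputs):
        # Real code would compile through the reference backend and execute.
        return f"refbackend({inputs})"

class LazyTensorCoreTestConfig:
    def run(self, module, inputs):
        # Real code would move the module to the 'lazy' device, run it,
        # and trigger compilation through the LTC backend.
        return f"ltc({inputs})"

CONFIGS = {
    "refbackend": RefBackendTestConfig,
    "lazy_tensor_core": LazyTensorCoreTestConfig,
}

def make_config(name):
    """Look up and instantiate the test config selected on the CLI."""
    try:
        return CONFIGS[name]()
    except KeyError:
        raise ValueError(f"unknown config: {name}")
```

A table-driven lookup like this keeps the runner's `args.config` branch from growing one `if` per backend.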

@henrytwo (Member, Author) commented Jun 8, 2022

In terms of testing architecture, checking the .mlir output from the tests is going to be really fragile. I would recommend that you plug into the existing e2e test suite for torchscript (misnomer now). You should be able to do something like we did for the op-by-op eager mode: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir_e2e_test/torchscript/configs/eager_mode.py

and then plug it in here:

if args.config == 'refbackend':

Got it, I'll look into adding the LTC frontend to that. I have a few questions first though:

  1. Would we still want to keep the numeric tests for the LTC example models (mnist and bert), or are we assuming that if the per op tests pass it should be good?
  2. For the tests in e2e_testing.torchscript.main, do we validate that the numerics are actually correct? I skimmed through the test cases, and it looks like we just make sure the modules compile and run

@silvasean (Contributor)

  1. Yes, the assumption is that the per-op tests combined should yield a valid integration test. That's not necessarily the case, but it's a really good first approximation. We actually have a few e2e tests that run e.g. a whole ResNet or BERT and check numerical correctness, but those are a big testing-time burden, so we avoid them as much as possible.
  2. Those e2e tests always check correctness of the numerical results. See all the reporting in
    def report_results(results: List[TestResult],
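The reporting referred to here amounts to comparing each test's observed outputs against a golden reference run and flagging any divergence. A simplified sketch of that idea (the `TestResult` fields and tolerance are illustrative, not the actual torch_mlir_e2e_test types):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TestResult:
    """Illustrative stand-in for an e2e test result record."""
    name: str
    golden: List[float]    # outputs from the reference (eager) run
    observed: List[float]  # outputs from the backend under test

def report_results(results: List[TestResult], atol: float = 1e-5) -> List[str]:
    """Return the names of tests whose outputs diverge beyond `atol`."""
    failures = []
    for r in results:
        if any(abs(g - o) > atol for g, o in zip(r.golden, r.observed)):
            failures.append(r.name)
    return failures
```

The key point from the discussion is that every e2e test performs this numeric check, not just a compile-and-run smoke test.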

@henrytwo (Member, Author) commented Jun 9, 2022

@silvasean I added a new config for the tests that use LTC. I had to xfail 305 tests due to incompatible ops. Let me know if there's anything else you'd like me to change!
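An xfail set like the one added here typically gates how results are classified: an expected failure reports as XFAIL, while an unexpected pass is flagged so the set can be pruned. A hedged sketch of that bookkeeping (the `classify` helper is hypothetical; only the set name echoes the PR):

```python
# Illustrative sketch of xfail-set gating in a test runner. Tests listed
# in LTC_XFAIL_SET are expected to fail under the LTC config.
LTC_XFAIL_SET = {
    "AdaptiveAvgPool2dNonUnitOutputSizeDynamicModule_basic",
}

def classify(test_name: str, passed: bool, xfail_set=LTC_XFAIL_SET) -> str:
    """Map a raw pass/fail outcome to a reported status."""
    expected_to_fail = test_name in xfail_set
    if passed:
        # An XPASS signals the test can be removed from the xfail set.
        return "XPASS" if expected_to_fail else "PASS"
    return "XFAIL" if expected_to_fail else "FAIL"
```

As ops gain LTC support, tests migrate out of the set by starting to XPASS.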

@silvasean (Contributor) left a comment

Great!

@@ -154,3 +154,312 @@
"TestMultipleTensorReturn_basic",
"AdaptiveAvgPool2dUnitOutputSizeStaticModule_basic",
}

LTC_XFAIL_SET = {
"AdaptiveAvgPool2dNonUnitOutputSizeDynamicModule_basic",
Contributor:

Any triage on why this could be failing?

If we are running the JIT graph directly, I would assume that everything "Just Works".

Member Author:

We don't have support for all ops yet. Some are missing some shape inference functions, some ops are blacklisted, etc.

@silvasean (Contributor) commented Jun 9, 2022:

Ah, got it. It's a little annoying that we are duplicating the shape functions for LTC. Could you collaborate with @eellison to reuse their shape stuff?

We have been upstreaming all our shape functions from https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/dialects/torch/importer/jit_ir/build_tools/shape_lib_gen.py, so this should all be part of a big source of truth upstream. E.g. pytorch/pytorch#76889
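For context, the shape functions generated via shape_lib_gen.py are written as ordinary Python functions from input shapes to output shapes. The example below illustrates that style with the standard broadcasting rule; the function name and signature are illustrative, not entries from the actual shape library:

```python
from typing import List

# Illustration of the shape-function style: a pure function mapping input
# shapes to the output shape. This implements standard right-aligned
# broadcasting, as used by elementwise ops.
def broadcast_shapes(a: List[int], b: List[int]) -> List[int]:
    result = []
    for i in range(max(len(a), len(b))):
        # Missing leading dimensions are treated as size 1.
        da = a[-1 - i] if i < len(a) else 1
        db = b[-1 - i] if i < len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        result.append(max(da, db))
    return list(reversed(result))
```

Because these are plain functions over shape lists, they can be scripted to JIT IR and reused from C++, which is the reuse path discussed below.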

Member Author:

Hm, our shape inference happens in C++, so I don't think there's much we can do as far as reusing that code 😢


All of the shape functions are also stored in C++ FWIW, so I don't think that's a fundamental blocker, but I don't know all the details of what's going on here.

Member Author:

@eellison Can you point me to where that is in the PyTorch repo? I just looked at the PR that Sean linked earlier, which had C++ with some long strings.

Contributor:

It should turn into JIT IR (that's what you see in the strings) that we can then load up and execute from C++.

Member Author:

@silvasean are you suggesting that we apply this at the JIT graph level, or port this over to work with LTC generally?

Member Author:

If it's the former, I don't think we can use it, because we require shape information at the LTC layer before generating a JIT graph.

Contributor:

Oh, interesting. Yeah, that makes things more tricky to reuse. @eellison is the shape inference infra tied specifically to JIT graphs?

I suppose we could create a shapeless jit graph just for the purpose of reusing the shape inference infrastructure, but that's quite a few hoops to jump through.

@henrytwo henrytwo merged commit cfaf125 into llvm:torch_mlir_ltc_backend Jun 9, 2022
@henrytwo henrytwo deleted the henrytu/ltc_tests branch June 9, 2022 19:56
antoniojkim pushed a commit that referenced this pull request Jun 30, 2022
* Added e2e LTC Torch MLIR tests

* Fix seed for reproducibility

* Check if computation is None before getting debug string

* Updated unit tests, and added numeric tests

* Print name of the model layer that fails numeric validation

* Run LTC e2e test with CI/CD

* Set seed in main function, instead of beginning of execution

* Add comment to specify number of digits of precision

* Fixed typo

* Remove tests for LTC example models

* Added LTC option to torchscript e2e

* Implement compile and run for LTC e2e test

* xfail all tests that use ops that aren't currently supported
antoniojkim pushed a commit that referenced this pull request Jun 30, 2022
antoniojkim pushed a commit that referenced this pull request Jul 5, 2022
antoniojkim pushed a commit that referenced this pull request Jul 7, 2022
henrytwo added a commit that referenced this pull request Jul 8, 2022
henrytwo added a commit that referenced this pull request Jul 8, 2022
henrytwo added a commit that referenced this pull request Jul 12, 2022
antoniojkim pushed a commit that referenced this pull request Jul 15, 2022
antoniojkim pushed a commit that referenced this pull request Jul 19, 2022
antoniojkim pushed a commit that referenced this pull request Jul 22, 2022
henrytwo added a commit that referenced this pull request Jul 29, 2022
henrytwo added a commit that referenced this pull request Jul 29, 2022
henrytwo added a commit that referenced this pull request Jul 30, 2022