Background
BladeDISC is an end-to-end compiler that supports dynamic shapes, which are widely used in training scenarios. This issue describes how to improve PyTorch training performance with DISC, based on the LazyTensorCore (LTC) mechanism.
Design Overview
1. Mark step: according to LTC, a MARK API should be called manually at the end of each iteration to sync and execute a graph on a physical device.
2. Lowering to TorchScript: LTC uses TorchScript as the backend engine (ref TSBackendImpl), so we can use it to lower Lazy IR to TorchScript IR.
3. Cluster DISC SubGraph.
4. DISC Compilation Stage
   a. mhlo conversion: DISC uses MLIR::mhlo as the front-end, so we should convert TorchScript IR to mhlo before compilation.
   b. Compiling to an executable program: call the DISC entry function to compile mhlo IR to an executable file (a dynamic library file).
   c. DISC execution: call DISC RAL to execute the executable program with input Tensors.
5. TorchScript Execution: finally, call torch::jit::GraphExecutor to execute the TorchScript IR and return the result Tensors.
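The clustering step can be pictured with a small, library-free sketch. It is not the BladeDISC implementation. The op names, the DISC_SUPPORTED set, and the cluster helper are all hypothetical, and a real graph is a DAG rather than a flat list. It only shows the idea of grouping consecutive ops that DISC could compile, and leaving the rest to TorchScript execution.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical set of ops that the DISC path can compile (assumption for illustration).
DISC_SUPPORTED = {"add", "mul", "relu"}


@dataclass
class Node:
    op: str  # op name only; real IR nodes also carry inputs and outputs


def cluster(nodes: List[Node]) -> List[Tuple[str, List[Node]]]:
    """Group consecutive nodes into ("disc", [...]) or ("torchscript", [...]) segments."""
    segments: List[Tuple[str, List[Node]]] = []
    for node in nodes:
        backend = "disc" if node.op in DISC_SUPPORTED else "torchscript"
        if segments and segments[-1][0] == backend:
            # Same backend as the previous node: extend the current segment.
            segments[-1][1].append(node)
        else:
            # Backend changed: start a new segment.
            segments.append((backend, [node]))
    return segments


graph = [Node("add"), Node("mul"), Node("custom_op"), Node("relu")]
segments = cluster(graph)
for backend, members in segments:
    print(backend, [n.op for n in members])
```

In this toy graph the first two ops form one DISC cluster, the unsupported op falls back to TorchScript, and the last op forms a second DISC cluster.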
Implementation and TODO Actions
To implement the above features, we should build a Pybind library, _torch_disc.so, that exposes the step_mark API along with some important C++ functions. The TODO actions are listed below.
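As a rough picture of how a step_mark call fits into a training loop, here is a toy sketch in plain Python. The LazyTrace class and its methods are hypothetical stand-ins, not the _torch_disc API. It assumes only what is described above: ops are recorded lazily, and the mark at the end of each iteration flushes the recorded graph for execution.

```python
class LazyTrace:
    """Toy stand-in for LTC: records ops instead of running them eagerly."""

    def __init__(self):
        self.pending = []   # ops recorded since the last mark
        self.executed = []  # one entry per flushed graph

    def record(self, op_name):
        self.pending.append(op_name)

    def step_mark(self):
        """Flush the recorded graph to the backend and reset the trace."""
        if self.pending:
            # A real backend would lower, cluster, compile, and execute here.
            self.executed.append(tuple(self.pending))
            self.pending = []


trace = LazyTrace()
for step in range(3):
    trace.record("forward")
    trace.record("backward")
    trace.record("optimizer_step")
    trace.step_mark()  # called manually at the end of each iteration

print(len(trace.executed), "graphs executed")
```

Each iteration produces exactly one flushed graph, which is the unit that the lowering and compilation steps would operate on.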
Yancey1989 changed the title from "[PoC] TorchDISC to improve PyTorch training workload" to "[PoC] TorchDisc: accelerating PyTorch training via LTC + BladeDISC" on Apr 6, 2022.
feature branch: https://github.com/alibaba/BladeDISC/tree/features/torch_disc_devel
TODO: build _torch_disc.so with Torch LTC, Mhlo Builder, and DISC. Tracked in "[TorchDISC] setup torch-disc building environment and CI job" (#158).