[ROCm] Triton in XLA for ROCm - ir_emitter_triton related changes. #10649

zoranjovanovic-ns · 2024-03-18T15:28:27Z

Second commit of the series for enabling Triton in XLA for ROCm.

ddunl · 2024-03-19T20:08:39Z

xla/service/gpu/ir_emitter_triton.cc

@@ -122,14 +126,21 @@ limitations under the License.
 #include "tsl/platform/path.h"
 #include "tsl/platform/status.h"
 #include "tsl/platform/statusor.h"
+#ifdef TENSORFLOW_USE_ROCM


I think it would be good to minimize the number of ifdefs where possible. Maybe keep the includes that are used everywhere at the top, then add

#ifdef GOOGLE_CUDA
<includes>
#endif
#ifdef TENSORFLOW_USE_ROCM
<includes>
#endif

or similar at the bottom of the includes.

tdanyluk · 2024-03-20T15:31:28Z

xla/service/gpu/ir_emitter_triton.cc

+#ifndef TENSORFLOW_USE_ROCM
 absl::Status CreateTritonPipeline(
-    mlir::OpPassManager& pm, const se::CudaComputeCapability& cc,
+    mlir::OpPassManager& pm, const se::GpuComputeCapability& cc,
    const TritonGemmConfig& config,
    mt::nvidia_gpu::ClusterInfo& out_cluster_info) {
-  const int ccAsInt = cc.major * 10 + cc.minor;
+#else
+absl::Status CreateTritonPipeline(
+    mlir::OpPassManager& pm, const se::GpuComputeCapability& cc,
+    const TritonGemmConfig& config) {
+#endif


In general I would recommend reducing the number of ifdefs inside the code to close to 0.
Could we have some kind of abstraction, such as a Something class with derived CudaSomething and RocmSomething classes, or just use a runtime if based on whether the se::GpuComputeCapability is CUDA or ROCm?
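
To make the runtime-if option concrete, here is a minimal sketch. It assumes se::GpuComputeCapability can be treated as a std::variant over a CUDA and a ROCm capability type; the struct names, fields, and the DescribePipeline function below are hypothetical stand-ins, not the real XLA definitions.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for se::CudaComputeCapability and
// se::RocmComputeCapability; the real XLA types carry more state.
struct CudaCapability {
  int major_version;
  int minor_version;
};
struct RocmCapability {
  std::string gcn_arch_name;
};

// Stand-in for se::GpuComputeCapability, modeled as a variant (assumption).
using GpuCapability = std::variant<CudaCapability, RocmCapability>;

// Chooses the vendor-specific path at runtime instead of with #ifdef.
std::string DescribePipeline(const GpuCapability& cc) {
  if (const auto* cuda = std::get_if<CudaCapability>(&cc)) {
    // Mirrors the `cc.major * 10 + cc.minor` arithmetic from the diff.
    const int cc_as_int = cuda->major_version * 10 + cuda->minor_version;
    return "cuda:sm_" + std::to_string(cc_as_int);
  }
  // Only two alternatives exist, so the remaining case is ROCm.
  return "rocm:" + std::get<RocmCapability>(cc).gcn_arch_name;
}
```

With this shape a single function signature serves both vendors, so the duplicated CreateTritonPipeline declarations guarded by #ifndef would not be needed. The catch is that both code paths must compile in every build, which may not hold if one path needs vendor-only headers.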

I talked with my colleagues and a class is not really needed.
We could use something like this:

something.h:

int something();

something_cuda.cc:

int something() {
  // implemented with cuda
}

something_rocm.cc:

int something() {
  // implemented with rocm
}

And then build either something_rocm.cc or something_cuda.cc with a select.
So ir_emitter_triton.cc would remain mostly device-vendor independent and just include the shared "something.h".
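
A slightly fuller sketch of that layout, collapsed into one listing for readability. The function names TritonBackendName and DescribeEmitter are hypothetical and only illustrate the pattern; in a real build each section would be its own file and the build rule's select would pick exactly one of the two .cc files.

```cpp
#include <string>

// ---- something.h (shared; the only header the emitter includes) ----
// Hypothetical declaration; name and return type are illustrative.
std::string TritonBackendName();

// ---- something_cuda.cc (compiled only for CUDA builds) ----
std::string TritonBackendName() { return "cuda"; }

// ---- something_rocm.cc (compiled only for ROCm builds) ----
// Shown as a comment because two definitions of the same function
// cannot coexist in a single translation unit:
//   std::string TritonBackendName() { return "rocm"; }

// ---- caller (vendor independent, no #ifdef required) ----
std::string DescribeEmitter() {
  return "triton emitter for " + TritonBackendName();
}
```

The caller only sees the declaration, so the vendor choice moves from the preprocessor to the build graph, and each implementation file can include vendor-only headers freely.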

Thank you for the proposal and clarification. I will change the implementation based on the proposal and update the PR.

zoranjovanovic-ns · 2024-04-03T15:19:58Z

@penpornk rebased and pushed.

…changes. Imported from GitHub PR openxla/xla#10649 Second commit of the series for enabling Triton in XLA for ROCm. Copybara import of the project: -- 23d442f83c731cd86131bcd1d91c4e3d7cc42468 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Triton in XLA for ROCm - ir_emitter_triton related changes. Merging this change closes #10649 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10649 from ROCm:rocm_triton_backend_3 23d442f83c731cd86131bcd1d91c4e3d7cc42468 PiperOrigin-RevId: 621830985

zoranjovanovic-ns · 2024-04-05T14:14:58Z

Hi @xla-rotation, I have rebased and pushed, do I need to do something more?

penpornk · 2024-04-05T14:25:32Z

@zoranjovanovic-ns Thank you for checking and for merging conflicts! We are applying some more fixes to pass internal tests. I'll let you know if we need more help. :)

…changes. Imported from GitHub PR openxla/xla#10649 Second commit of the series for enabling Triton in XLA for ROCm. Copybara import of the project: -- 23d442f83c731cd86131bcd1d91c4e3d7cc42468 by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Triton in XLA for ROCm - ir_emitter_triton related changes. Merging this change closes #10649 PiperOrigin-RevId: 622202797

FUTURE_COPYBARA_INTEGRATE_REVIEW=#10649 from ROCm:rocm_triton_backend_3 23d442f PiperOrigin-RevId: 622213543

Reverts 3d6326c FUTURE_COPYBARA_INTEGRATE_REVIEW=#10649 from ROCm:rocm_triton_backend_3 23d442f PiperOrigin-RevId: 622186078

Ubuntu image used in TF SIG Build Dockerfile upgraded from 20.04 to 22.04 (LTS). Reverts 3d6326c FUTURE_COPYBARA_INTEGRATE_REVIEW=#10649 from ROCm:rocm_triton_backend_3 23d442f PiperOrigin-RevId: 615889505

…eters. Use TiledHloInstructions in the Triton emitter. Reverts 3d6326c FUTURE_COPYBARA_INTEGRATE_REVIEW=#10649 from ROCm:rocm_triton_backend_3 23d442f PiperOrigin-RevId: 621553019

github-actions bot added the kokoro:force-run Forces CI to rerun label Mar 18, 2024

github-actions bot assigned kamaljeeti and xla-rotation Mar 18, 2024