Native CUDA support #6578
This is a problem that straddles multiple domains. While it's relatively easy to solve for any given project, solving it once and for all is a bit more complicated. It's probably doable, but it will take some work and may need to wait a bit until Bazel's configurability features mature. To make things work for the end user, someone will need to own, develop, and maintain something very similar to what's been done in TensorFlow. Chances are that that something would end up being more complicated than its TensorFlow counterpart, as it would potentially need to handle more diverse use cases.

Off the top of my head, Bazel needs to deal with the following:

Compiler used for CUDA compilation. Bazel is making steady progress (#5574) on improving build configurability, but it's not quite there yet. Ideally we want some sort of cuda_library() rule which does the right thing under the hood. Currently we do that via a global configuration plus a fair amount of hacking in crosstool_wrapper_driver_is_not_gcc in the case of NVCC. This brings us to the question of what exactly the right thing to do is.

In the case of clang the answer is somewhat simpler, as the same compiler can compile and link both C++ and CUDA sources using the same compiler options, so the job of cuda_library() would be to adjust a few compiler options so we can target the right GPUs. Otherwise the job is done by the same crosstool that builds the rest of the tree, and things should work more or less out of the box. #5574 will make some things cleaner (we would not need a global flag to enable/disable CUDA support), but it's not required to make things work.

In the case of NVCC (which is probably the most common use case at the moment) things are a bit more complicated. If clang is the host compiler, one can't always use nvcc, as nvcc only supports particular clang versions on macOS. If the host uses gcc, we run into the problem that NVCC uses different command-line options, so we need to get clever and attempt to filter/translate them, as we currently do in crosstool_wrapper_driver_is_not_gcc. My guess is that it will take quite a bit more work to properly integrate this into Bazel itself. At the extreme it may take a custom crosstool, plus #5574, plus some way to enforce that the NVCC crosstool compiles in a way that's compatible with the crosstool used to compile regular C++ files.

Making the whole thing work on Windows will probably be even more interesting. CUDA support in clang on Windows is currently broken, with no volunteers to investigate and fix it. I have no idea what it would take to make CUDA compilation work with Bazel on Windows.

That should cover basic compilation. Eventually people will want to do fancier things like compiling GPU code with -rdc=true and then linking GPU code. @chsigg has been working on Bazel rules to make that work in TensorFlow, but there are still some sharp corners.

In general, I'd probably start by extracting the existing CUDA build logic from TensorFlow. It's not perfect, but it's a reasonable starting point.
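For illustration only, a minimal sketch of what such a cuda_library() rule could boil down to in the clang case; the macro name, the gpu_archs parameter, and the CUDA install path are assumptions for this example, not an existing Bazel API (the flags themselves are clang's real CUDA options):

```python
# cuda.bzl (hypothetical): a thin cuda_library() wrapper over cc_library
# for the clang-as-CUDA-compiler case. Not an existing Bazel API; the
# macro name, gpu_archs parameter, and library path are illustrative.
def cuda_library(name, srcs, deps = [], gpu_archs = ["sm_60"], **kwargs):
    native.cc_library(
        name = name,
        srcs = srcs,
        deps = deps,
        # Tell clang to treat the sources as CUDA and pick GPU targets.
        copts = ["-x", "cuda"] + [
            "--cuda-gpu-arch=" + arch
            for arch in gpu_archs
        ],
        # Link against the CUDA runtime; the install path is an assumption.
        linkopts = ["-L/usr/local/cuda/lib64", "-lcudart"],
        **kwargs,
    )
```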
Thanks for the quick feedback and insights into the challenges. I think your suggestion makes sense, which I interpret as: initially, migrate toward projects using a common git submodule that encapsulates the logic necessary to use CUDA in Bazel with only a WORKSPACE change.
That sounds reasonable. I'm mostly concerned about how to get Bazel to do the things I want.
Allow me to get the ball rolling again. Using clang as the justification: you can build CUDA and OpenCL files directly. Users must manage the flags themselves, but this does work right now (see the command below). I really would like to see Bazel support CUDA as a first-class citizen and not lean on the wrapper from TensorFlow to dispatch nvcc in a roundabout way. It works, but it is hard to maintain, being detached from yet coupled to Bazel development.
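A minimal sketch of such a direct invocation, assuming a CUDA installation under /usr/local/cuda and an sm_60 GPU (both assumptions for this example; the flags are clang's real CUDA options):

```sh
# Compile a CUDA source with clang directly; no nvcc or wrapper involved.
# --cuda-path and --cuda-gpu-arch are clang flags; the paths are examples.
clang++ -x cuda --cuda-path=/usr/local/cuda --cuda-gpu-arch=sm_60 \
    main.cu -o main -L/usr/local/cuda/lib64 -lcudart
```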
@philwo ^ just wanted to pull in another pair of eyes and see if this helps improve the situation.
So our plan is to fix #6926 in a robust and extensible way, and then expand the support to CUDA, taking inspiration from what TensorFlow has developed. We will not have any cycles to work on CUDA in the first half of 2019. In the second half of 2019 we may, but this is not an official promise :) If the community develops a high-quality custom CUDA toolchain in the meantime, we'll try our best to integrate it.
@hlopko what about file extensions? To get around this, the TensorFlow script copies/symlinks .cu files to .cpp so the "C++" wrapper will compile them (a sketch of that workaround follows). Recognizing the extensions natively gets rid of that step and, as mentioned, clang supports these files natively... can we have some of these other file extensions? I lost patience waiting for a pure Starlark way of doing this.
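For concreteness, the workaround being eliminated amounts to something like this (hypothetical target and file names; TensorFlow's actual implementation is more involved):

```python
# Because cc_library won't accept a .cu source, first copy it to a .cc
# name, then force clang to compile it as CUDA anyway via -x cuda.
genrule(
    name = "rename_kernel",
    srcs = ["kernel.cu"],
    outs = ["kernel_cu.cc"],
    cmd = "cp $< $@",
)

cc_library(
    name = "kernel",
    srcs = ["kernel_cu.cc"],
    copts = ["-x", "cuda", "--cuda-gpu-arch=sm_60"],
)
```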
Is there any update on first-class CUDA support? Or even .bzl rules that are well supported? I found a GitHub project that provides some rules (https://github.com/llvim/bazel_rule_cuda, allegedly code pulled from TensorFlow), but I'm pretty wary, considering one of the first things the landing page says is "use at risk" and the project hasn't been updated in 3 years.
Bazel won't be providing first-class CUDA support for the time being. We will, however, work on any blockers in Bazel that might prevent someone else from providing that kind of support.
@oquenchil what about allowing .cl and .cu files to be compiled with clang? This seems to only need the file extensions to be recognized, per #6578 (comment). Does it make sense not to support these file extensions when clang already provides straightforward compilation?
If adding those file extensions is all that is needed, then we can accept a PR for that.
Recognize CUDA and OpenCL file types (.cu, .cl, .inl, .cuh) and compile them with a C++ compiler. This will allow CUDA and OpenCL files to be compiled with clang, which supports both natively. Change taken from #6578 (comment)
RELNOTES: Treat .cu and .cl files as C++ source. CUDA or OpenCL are not natively supported and will require custom flags to compile with e.g. clang.
PiperOrigin-RevId: 293200984
*** Reason for rollback ***
Broke many targets in nightly TGP.
*** Original change description ***
Recognize CUDA and OpenCL file types (.cu, .cl, .inl, .cuh) and compile them with a C++ compiler. This will allow CUDA and OpenCL files to be compiled with clang, which supports both natively. Change taken from #6578 (comment)
RELNOTES: Treat .cu and .cl files as C++ source. CUDA or OpenCL are not natively supported and will require custom flags to compile with e.g. clang.
PiperOrigin-RevId: 293324316
@oquenchil / @chsigg what happened? It looks like this got rolled back.
Yes, it got rolled back because, with the header extensions, it tried to build header modules from CUDA files :-( I will try again on Monday with just the source extensions.

I gave this a spin with a minimal Bazel project to build a CUDA app, consisting of a WORKSPACE, a BUILD.cuda, a BUILD, and a main.cu (sketched below).
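A minimal sketch of such a four-file project, assuming clang as the compiler and a CUDA installation under /usr/local/cuda (the repository name, paths, and GPU arch are assumptions for this example):

WORKSPACE:

```python
# Map the local CUDA installation as an external repository (sketch).
new_local_repository(
    name = "cuda",
    path = "/usr/local/cuda",
    build_file = "//:BUILD.cuda",
)
```

BUILD.cuda:

```python
# Expose the CUDA runtime headers and library to the build (sketch).
cc_library(
    name = "cudart",
    srcs = ["lib64/libcudart.so"],
    hdrs = glob(["include/**/*.h"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)
```

BUILD:

```python
# Compile the .cu file with clang's native CUDA support (sketch).
cc_binary(
    name = "main",
    srcs = ["main.cu"],
    copts = ["-x", "cuda", "--cuda-gpu-arch=sm_60"],
    deps = ["@cuda//:cudart"],
)
```

main.cu:

```cuda
// Trivial kernel launch, just to verify the toolchain works (sketch).
#include <cstdio>

__global__ void hello() { printf("Hello from the GPU\n"); }

int main() {
  hello<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}
```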
Hm, just to double check: this isn't splash damage from the .inl extension, right? Is this about precompiling headers or C++ modules? Which build failed? https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/1377#d76ef5a8-53d2-4c4d-aa0e-c8c5d8dca4dc ?
If this breaks builds internally, consider gating it with a flag.
Recognize CUDA and OpenCL file types (.cu, .cl) and compile them with the C++ compiler. This will allow CUDA and OpenCL files to be compiled with clang, which supports both natively. Change taken from #6578 (comment)
RELNOTES: Treat .cu and .cl files as C++ source. CUDA or OpenCL are not natively supported and will require custom flags to compile with e.g. clang.
PiperOrigin-RevId: 295733167
I just submitted it again without the header extensions. The issue was internal builds that tried to turn CUDA headers into our variant of C++ modules.
@chsigg Hi, I am wondering if .cuh could be recognized as well? It is the standard extension for CUDA header files.
@chsigg, @louchenyao is correct: .cuh is the standard extension for CUDA headers. I wanted to make sure not to bite the hand that feeds and to get the support in officially, but I do think it's time to give users outside Google the opportunity to compile with the standard extensions. Perhaps these could be user-specifiable extensions? That would give Google or external users the opportunity to override the defaults. Or perhaps Google can patch out the usage of .cuh for internal builds, since the conflict was only with an internal variant of C++ modules? I'd hate to have to continue patching Bazel builds for this one thing that is defined on the Java side and passes all public unit tests.
Started to work on this last night. I modified TensorFlow's Bazel CUDA rules, combined them with Joe Toth's example, and made them available at https://github.com/liuliu/rules_cuda as a remote git repository. It uses the most up-to-date TensorFlow rules so far.
This PR mirrors the fix in bazelbuild@8804128 and therefore shouldn't be considered a standalone PR, but rather a follow-up bugfix for the said commit. .cuh is widely used as a suffix for CUDA headers. It has become more prominent since "cub" (https://nvlabs.github.io/cub/) is distributed with CUDA Toolkit 11; hence, .cuh headers are now in every CUDA Toolkit distribution. From that lens, this PR simply fixes the said commit so that newer CUDA Toolkits can be natively supported in Bazel. I also believe that if this PR makes its way in, we can close this issue: bazelbuild#6578
FYI, the TensorFlow Runtime has a self-contained set of Bazel CUDA rules that currently works with clang. Some more work is required to make it work with gcc and nvcc, but hopefully I will get to that in the coming months.
Opened an issue in tensorflow/runtime for reference: tensorflow/runtime#56
Here is my pure Starlark implementation: https://github.com/cloudhan/rules_cuda. See real-world (nccl and nvbench) examples at https://github.com/cloudhan/rules_cuda_examples
For anyone following this issue: @cloudhan's rules have now been migrated to https://github.com/bazel-contrib/rules_cuda. Any contributions are most welcome; also feel free to come chat in the #cuda channel on the Bazel Slack, particularly if you have any requirements that aren't met by the current rules.
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant.
This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please post @bazelbuild/triage in a comment here and we'll take a look.
Description of the problem / feature request:
Add native toolchain support for CUDA.
Feature requests: what underlying problem are you trying to solve with this feature?
Reduce the overhead of adopting Bazel for CUDA software projects.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Compiling the saxpy example from https://devblogs.nvidia.com/easy-introduction-cuda-c-and-c/ without duplicating all the toolchain work done for something like TensorFlow.
If you name the file with a ".cu" suffix, as is typical for CUDA programs, I do not believe the TensorFlow toolchain works. Also note that CUDA headers are often given a ".cuh" suffix. (A condensed version of the saxpy example is sketched below.)
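For reference, the saxpy example from the linked post is a single .cu file along these lines (condensed here for brevity; the blog's exact code may differ, e.g. it uses explicit cudaMalloc/cudaMemcpy rather than unified memory):

```cuda
// saxpy.cu: computes y = a*x + y on the GPU (condensed sketch).
#include <cstdio>

__global__ void saxpy(int n, float a, const float *x, float *y) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
  const int n = 1 << 20;
  float *x, *y;
  // Unified memory keeps the example short.
  cudaMallocManaged(&x, n * sizeof(float));
  cudaMallocManaged(&y, n * sizeof(float));
  for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }
  saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
  cudaDeviceSynchronize();
  printf("y[0] = %f (expected 4.0)\n", y[0]);
  cudaFree(x);
  cudaFree(y);
  return 0;
}
```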
What operating system are you running Bazel on?
Linux, Windows
What's the output of bazel info release?
0.17.2
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
N/A
What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD?
N/A
Have you found anything relevant by searching the web?
https://groups.google.com/forum/#!topic/bazel-discuss/JF8BUqH-hzE
https://stackoverflow.com/questions/46354467/bazel-cuda-add-executable-equivalent
https://github.com/tensorflow/tensorflow/blob/master/third_party/gpus/cuda_configure.bzl
https://github.com/Hibbert-pku/bazel_nvcc
Any other information, logs, or outputs that you want to share?
N/A