Can't link a shared library using clang #162

Closed
rnburn opened this issue Sep 11, 2023 · 2 comments

rnburn commented Sep 11, 2023

I get duplicate symbol errors when trying to link a shared library.

Here's an example

C++ code

// mul.h
#pragma once

void mul(double* res, const double* a, const double* b, int n) noexcept;

// mul.cc
#include "mul.h"

__global__
void mul_impl(double* res, const double *a, const double* b, int n) {
  for (int i = 0; i < n; i++)
    res[i] = a[i] * b[i];
}

void mul(double* res, const double* a, const double* b, int n) noexcept {
  double* a_dev;
  cudaMalloc(&a_dev, sizeof(double)*n);
  cudaMemcpy(a_dev, a, sizeof(double)*n, cudaMemcpyHostToDevice);

  double* b_dev;
  cudaMalloc(&b_dev, sizeof(double)*n);
  cudaMemcpy(b_dev, b, sizeof(double)*n, cudaMemcpyHostToDevice);

  double* res_dev;
  cudaMalloc(&res_dev, sizeof(double)*n);

  mul_impl<<<1, 1>>>(res_dev, a_dev, b_dev, n);
  cudaMemcpy(res, res_dev, sizeof(double)*n, cudaMemcpyDeviceToHost);

  cudaFree(a_dev);
  cudaFree(b_dev);
  cudaFree(res_dev);
}

// muladd.h
#pragma once

void muladd(double* res, const double* a, const double* b, const double* c, int n) noexcept;

// muladd.cc
#include "mul.h"
#include "muladd.h"

__global__
void muladd_impl(double* res, const double *a, int n) {
  for (int i = 0; i < n; i++)
    res[i] += a[i];
}

void muladd(double* res, const double* a, const double* b, const double* c, int n) noexcept {
  // res = a * b (computed on the device, copied back to the host)
  mul(res, a, b, n);

  // Upload the addend c
  double* c_dev;
  cudaMalloc(&c_dev, sizeof(double)*n);
  cudaMemcpy(c_dev, c, sizeof(double)*n, cudaMemcpyHostToDevice);

  double* res_dev;
  cudaMalloc(&res_dev, sizeof(double)*n);
  cudaMemcpy(res_dev, res, sizeof(double)*n, cudaMemcpyHostToDevice);

  // res += c, then copy the result back after the kernel has run
  muladd_impl<<<1, 1>>>(res_dev, c_dev, n);
  cudaMemcpy(res, res_dev, sizeof(double)*n, cudaMemcpyDeviceToHost);

  cudaFree(c_dev);
  cudaFree(res_dev);
}

BUILD file

cuda_library(
  name = "mul",
  hdrs = [
    "mul.h",
  ],
  srcs = [
    "mul.cc",
  ],
  rdc = True,
)

cuda_library(
  name = "muladd",
  hdrs = [
    "muladd.h",
  ],
  srcs = [
    "muladd.cc",
  ],
  deps = [
    ":mul",
  ],
  rdc = True,
)

cc_shared_library(
  name = "t",
  deps = [
    ":mul",
    ":muladd",
  ],
)

Trying to link gives me the error

ERROR: /src/test/BUILD:65:18: Linking test/libt.so failed: (Exit 1): clang failed: error executing command (from target //test:t) /usr/lib/llvm-18/bin/clang @bazel-out/k8-fastbuild/bin/test/libt.so-2.params
ld.lld: error: duplicate symbol: __device_stub__mul_impl(double*, double const*, double const*, int)
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(__device_stub__mul_impl(double*, double const*, double const*, int))
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(.text+0x0)

ld.lld: error: duplicate symbol: mul(double*, double const*, double const*, int)
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(mul(double*, double const*, double const*, int))
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(.text+0xB0)

ld.lld: error: duplicate symbol: __fatbinwrap__nv_cbca4bafd4699d5e
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(__cuda_fatbin_wrapper)
>>> defined at mul.cc
>>>            bazel-out/k8-fastbuild/bin/test/_objs/mul/mul.rdc.pic.o:(__cuda_fatbin_wrapper)

ld.lld: error: duplicate symbol: __cudaRegisterLinkedBinary__nv_cbca4bafd4699d5e
>>> defined at muladd.cu
>>>            bazel-out/k8-fastbuild/bin/test/_objs/_dlink/muladd/muladd.rdc.pic.o:(__cudaRegisterLinkedBinary__nv_cbca4bafd4699d5e)
>>> defined at mul.cu
>>>            bazel-out/k8-fastbuild/bin/test/_objs/_dlink/mul/mul.rdc.pic.o:(.text+0x0)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Target //test:t failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.324s, Critical Path: 0.06s
INFO: 3 processes: 3 internal.
FAILED: Build did NOT complete successfully

Doing something similar with alwayslink and cc_binary instead of cc_shared_library gives a similar result.

rnburn changed the title from "Can't link a shared library." to "Can't link a shared library using clang" on Sep 11, 2023
cloudhan (Collaborator) commented

I can reproduce your issue.

Could you please change the :mul target from cuda_library to cuda_objects as a workaround?

cuda_objects(
    name = "mul",
    srcs = ["mul.cc"],
    hdrs = ["mul.h"],
)

#125 introduced the ability to do a transitive device link on CUDA object files from deps. At the time it seemed harmless, but I am not quite sure about it. It seems some cases were not well considered in #125 (comment). I also observed similar behavior in the nccl example, but I haven't had time to figure it out, so for now I recommend sticking to cuda_objects only.

If that PR turns out to cause some unresolvable condition, I think we had better revert it at some point and rely only on the old behavior.
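
For readers applying the workaround, the full BUILD file might end up looking roughly like the sketch below. It is untested and rests on several assumptions not stated in this thread: that the rules are loaded from `@rules_cuda//cuda:defs.bzl`, that `cc_shared_library` is available in your Bazel version, that a `cuda_library` with `rdc = True` picks up the objects of a `cuda_objects` dep for its device link, and that `:mul` should then be dropped from the `cc_shared_library` deps so its objects reach the link only once through `:muladd`.

```starlark
# Sketch only; the load path and the deps layout are assumptions, see above.
load("@rules_cuda//cuda:defs.bzl", "cuda_library", "cuda_objects")

# Workaround from this comment: compile mul.cc to objects only,
# without a device link step of its own.
cuda_objects(
    name = "mul",
    srcs = ["mul.cc"],
    hdrs = ["mul.h"],
)

# Unchanged from the original report: muladd is device-linked (rdc = True)
# together with the objects it gets from :mul.
cuda_library(
    name = "muladd",
    srcs = ["muladd.cc"],
    hdrs = ["muladd.h"],
    rdc = True,
    deps = [":mul"],
)

# Assumption: depend only on :muladd so the mul objects are not listed twice.
cc_shared_library(
    name = "t",
    deps = [":muladd"],
)
```

If the duplicate symbol errors persist, running the build with `--verbose_failures` (as the log above suggests) shows the full link command, which makes it easier to see whether `mul.rdc.pic.o` still appears more than once.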

cloudhan (Collaborator) commented Oct 6, 2023

Fixed in #167, but note that transitive dlink will not work as expected (so you should stick with cuda_objects for the moment). This is an nvlink limitation; see #167 (comment) for details.

cloudhan closed this as completed on Oct 6, 2023