
Check matmul types and error at compile-time if the backend doesn't support them #540

Merged
merged 1 commit into main from gemm_type_guards on Dec 18, 2023

Conversation

@cliffburdick (Collaborator) commented Dec 17, 2023

Fixes #538
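
For context, the change amounts to rejecting unsupported scalar types at compile time instead of at the cublasLt heuristic stage. A minimal sketch of the idea; the names is_supported_gemm_type_v and matmul_impl are made up for illustration and are not the identifiers merged in this PR:

#include <cstdint>
#include <type_traits>

// Hypothetical whitelist of scalar types the GEMM backend can handle;
// the real list in the PR may differ.
template <typename T>
inline constexpr bool is_supported_gemm_type_v =
    std::is_same_v<T, float> || std::is_same_v<T, double> ||
    std::is_same_v<T, int8_t>;

template <typename T>
void matmul_impl() {
  // Turn an unsupported type into a build error instead of a
  // runtime matxMatMulError thrown from the cublasLt call.
  static_assert(is_supported_gemm_type_v<T>,
                "matmul: scalar type not supported by this backend");
  // ... dispatch to the backend ...
}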

@cliffburdick (Collaborator, Author):

/blossom-ci

@cliffburdick (Collaborator, Author):

/blossom-ci

@AtomicVar (Contributor) commented Dec 17, 2023

@cliffburdick Great work! But I wonder why int8_t is listed as a supported GEMM type? I tried an int8_t GEMM and it failed: cublasLtMatmulAlgoGetHeuristic() in ConfigureCublasLt() returns CUBLAS_STATUS_NOT_SUPPORTED if we use int8_t.

Code to reproduce:

#include "matx.h"
#include <cassert>
#include <cstdio>

using namespace matx;

#define TYPE int8_t

int main() {
  MATX_ENTER_HANDLER();

  index_t M = 2;
  index_t N = 3;

  auto m = make_tensor<TYPE>({M, N});
  auto v = make_tensor<TYPE>({N, 1});

  m.SetVals({{1, 2, 3},
             {4, 5, 6}});
  v.SetVals({{1}, {2}, {3}});  // column vector (N x 1)

  auto out = make_tensor<TYPE>({M, 1});

  (out = matmul(m, v)).run();

  cudaStreamSynchronize(0);

  printf("m:\n");
  print(m);
  printf("v:\n");
  print(v);
  printf("out:\n");
  print(out);

  CUDA_CHECK_LAST_ERROR();
  MATX_EXIT_HANDLER();
}

Output:

matxException (matxMatMulError: ret == CUBLAS_STATUS_SUCCESS) - /workspaces/MatX/include/matx/transforms/matmul.h:714

Stack Trace:
 examples/matmul_int : ()+0x8e4d
 examples/matmul_int : ()+0x1a3ee
 examples/matmul_int : ()+0x17ff0
 examples/matmul_int : ()+0x143f5
 examples/matmul_int : ()+0x10e16
 examples/matmul_int : ()+0x6664
 /lib/x86_64-linux-gnu/libc.so.6 : ()+0x29d90
 /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0x80
 examples/matmul_int : ()+0x4ea5
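
For context, the same NOT_SUPPORTED outcome can be probed outside MatX with bare cublasLt calls along these lines (a sketch only, error handling omitted; build with -lcublasLt). Per the cuBLAS docs, int8 GEMMs also carry alignment and layout restrictions (e.g. leading dimensions in multiples of 4), which the tiny 2x3 repro shape does not satisfy:

#include <cublasLt.h>
#include <cstdint>
#include <cstdio>

int main() {
  cublasLtHandle_t lt;
  cublasLtCreate(&lt);

  // int8 inputs accumulate in int32, so the compute type and the
  // scale type are both 32-bit integer per the cuBLAS support matrix.
  cublasLtMatmulDesc_t op;
  cublasLtMatmulDescCreate(&op, CUBLAS_COMPUTE_32I, CUDA_R_32I);

  // Same shape as the repro above: (2x3) * (3x1), column-major.
  const uint64_t m = 2, n = 1, k = 3;
  cublasLtMatrixLayout_t A, B, C;
  cublasLtMatrixLayoutCreate(&A, CUDA_R_8I, m, k, m);
  cublasLtMatrixLayoutCreate(&B, CUDA_R_8I, k, n, k);
  cublasLtMatrixLayoutCreate(&C, CUDA_R_32I, m, n, m);

  cublasLtMatmulPreference_t pref;
  cublasLtMatmulPreferenceCreate(&pref);

  // Expect zero algorithms (or CUBLAS_STATUS_NOT_SUPPORTED) for this
  // int8 configuration on affected setups.
  cublasLtMatmulHeuristicResult_t result;
  int found = 0;
  cublasStatus_t ret = cublasLtMatmulAlgoGetHeuristic(
      lt, op, A, B, C, C, pref, 1, &result, &found);
  printf("status = %d, algorithms found = %d\n", (int)ret, found);

  cublasLtMatmulPreferenceDestroy(pref);
  cublasLtMatrixLayoutDestroy(C);
  cublasLtMatrixLayoutDestroy(B);
  cublasLtMatrixLayoutDestroy(A);
  cublasLtMatmulDescDestroy(op);
  cublasLtDestroy(lt);
  return 0;
}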

@cliffburdick (Collaborator, Author):

> @cliffburdick Great work! But I wonder why int8_t is listed as a supported GEMM type? I tried an int8_t GEMM and it failed: cublasLtMatmulAlgoGetHeuristic() in ConfigureCublasLt() returns CUBLAS_STATUS_NOT_SUPPORTED if we use int8_t.

I'm going by the support matrix here:
https://docs.nvidia.com/cuda/cublas/#id98

I will try it out

@cliffburdick (Collaborator, Author):

@AtomicVar the reason it's not working is that the compute and scale types are wrong. Looking into it.

@cliffburdick (Collaborator, Author):

> @AtomicVar the reason it's not working is that the compute and scale types are wrong. Looking into it.

Even with the correct scalar and compute types it's failing the heuristic check. Still investigating.
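
For reference, the combination the support matrix pairs with CUDA_R_8I inputs is a CUDA_R_32I output with CUBLAS_COMPUTE_32I compute and CUDA_R_32I scale types; a sketch of that descriptor setup follows (make_int8_desc is a hypothetical helper, not MatX code). Older cuBLAS releases additionally limit int8 to the "TN" transpose combination, which is another way the heuristic can reject an otherwise correctly typed plan:

#include <cublasLt.h>

// Hypothetical helper sketching the int8 descriptor configuration.
cublasLtMatmulDesc_t make_int8_desc() {
  cublasLtMatmulDesc_t op;
  cublasLtMatmulDescCreate(&op, CUBLAS_COMPUTE_32I, CUDA_R_32I);

  // "TN": A transposed, B not transposed -- the layout older cuBLAS
  // versions require for int8 IMMA kernels.
  cublasOperation_t transA = CUBLAS_OP_T;
  cublasOperation_t transB = CUBLAS_OP_N;
  cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_TRANSA,
                                 &transA, sizeof(transA));
  cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_TRANSB,
                                 &transB, sizeof(transB));
  return op;
}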

@AtomicVar (Contributor):

> Even with the correct scalar and compute types it's failing the heuristic check. Still investigating.

I'm also hitting this weird issue; I don't know where the problem is.

@cliffburdick (Collaborator, Author):

/blossom-ci

@cliffburdick merged commit bfe279e into main on Dec 18, 2023
1 check passed
@cliffburdick deleted the gemm_type_guards branch on December 18, 2023 18:06
Development

Successfully merging this pull request may close these issues.

[BUG] matmul do not support int32 tensors.