Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

thrust::system::cuda::detail::deallocation_fn type mismatch with cudaFree #1458

Closed
Mochimazui opened this issue Jun 9, 2021 · 5 comments · Fixed by #1582
Closed

thrust::system::cuda::detail::deallocation_fn type mismatch with cudaFree #1458

Mochimazui opened this issue Jun 9, 2021 · 5 comments · Fixed by #1582
Assignees
Labels
P1: should have Necessary, but not critical. repro: verified The provided repro has been validated. type: bug: functional Does not work as intended.
Milestone

Comments

@Mochimazui
Copy link

Using VS2019 Version 16.10.1 and thurst code from CUDA 11.3.1 installation.

Build failes on "thrust/system/cuda/memory_resource.h:87" with error a value of type "cudaError_t (*)(void *)" cannot be used to initialize an entity of type "deallocation_fn"

typedef detail::cuda_memory_resource<cudaMalloc, cudaFree,
        thrust::cuda::pointer<void> >
        device_memory_resource;

cudaFree type is enum cudaError (__stdcall*)(void *) but deallocation_fn type is enum cudaError (__cdecl*)(void *).

Seems we need add __stdcall to deallocation_fn, or modify cuda_memory_resource template to allow both __cdecl and __stcall function pointer here.

Code:

#include <cstdio>
#include <typeinfo>

#include <cuda.h>
#include <cuda_runtime.h>
#include <driver_types.h>

#include <thrust/device_vector.h>

int main()
{
    thrust::device_vector<int> a;

    printf("cudaFree: %s\n", typeid(cudaFree).name());
    printf("&cudaFree: %s\n", typeid(&cudaFree).name());

    typedef cudaError_t(* deallocation_fn_0)(void*);
    printf("deallocation_fn: %s\n", typeid(deallocation_fn_0).name());

    typedef cudaError_t(__stdcall * deallocation_fn_1)(void*);
    printf("fixed deallocation_fn: %s\n", typeid(deallocation_fn_1).name());
    
    return 0;
}
@alliepiper alliepiper added type: bug: functional Does not work as intended. repro: unverified A complete repro has been provided, but needs validation and triage. labels Jun 17, 2021
@alliepiper alliepiper added this to the 1.14.0 milestone Jun 17, 2021
@alliepiper alliepiper self-assigned this Jun 17, 2021
@alliepiper
Copy link
Collaborator

@Mochimazui What flags are you using? I built and ran the snippet above, but everything looks ok with our build system:

cudaFree: enum cudaError __cdecl(void * __ptr64)
&cudaFree: enum cudaError (__cdecl*)(void * __ptr64)
deallocation_fn: enum cudaError (__cdecl*)(void * __ptr64)
fixed deallocation_fn: enum cudaError (__cdecl*)(void * __ptr64)

@alliepiper alliepiper assigned Mochimazui and unassigned alliepiper Jun 18, 2021
@alliepiper alliepiper added the info needed Cannot make progress without more information. label Jun 18, 2021
@alliepiper
Copy link
Collaborator

Closing, cannot reproduce. Please reopen with more info if this is still needed.

@alliepiper alliepiper removed this from the 1.14.0 milestone Aug 17, 2021
@ojakubcik
Copy link

This happens when CUDA is used by 32-bit build. In 64-bit build, __stdcall is accepted and ignored by compiler, but in 32-bit it is enforced. cudaMalloc and cudaFree are defined with CUDARTAPI macro, which in MSVC environment expands to __stdcall.

Thus, the fix (for 32-bit compatible usage) should be:

typedef cudaError_t (CUDARTAPI *allocation_fn)(void**, std::size_t);
typedef cudaError_t (CUDARTAPI *deallocation_fn)(void *);

@alliepiper alliepiper removed the info needed Cannot make progress without more information. label Sep 13, 2021
@alliepiper
Copy link
Collaborator

Thanks for the info -- reopening.

@alliepiper alliepiper reopened this Sep 13, 2021
@alliepiper alliepiper assigned alliepiper and unassigned Mochimazui Dec 16, 2021
@alliepiper alliepiper added this to the 1.16.0 milestone Dec 16, 2021
@alliepiper alliepiper added P1: should have Necessary, but not critical. repro: verified The provided repro has been validated. and removed repro: unverified A complete repro has been provided, but needs validation and triage. labels Dec 16, 2021
alliepiper added a commit to alliepiper/thrust that referenced this issue Dec 16, 2021
CUDA runtime functions are annotated with CUDARTAPI, which expands to
__stdcall on MSVC.

`thrust/system/cuda/memory_resource.h` defines some
function pointer types intended for use with CUDART memory
allocation/deallocation functions, but they lack the CUDARTAPI markup.

Added this markup to fix 32-bit builds. Repro'd issue and validated
fix on MSVC 2019.

Fixes NVIDIA#1458.
@alliepiper
Copy link
Collaborator

@Mochimazui @ojakubcik #1582 should fix this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
P1: should have Necessary, but not critical. repro: verified The provided repro has been validated. type: bug: functional Does not work as intended.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants