Restore CUDA 9.x compatibility #304
We're currently not compatible with CUDA 9.x; let's fix that.
eyalroz added a commit that referenced this issue on Mar 24, 2022

* Corrected a wrong version check in `nvtx.hpp` (it needed 10000, but was using 1000).
* Dropped the `unregistered_` memory type from the `memory::type_t` enum (I've forgotten why we had it there in the first place).
* Defined some `kernel_t` and `context_t` methods conditionally, since they're not supported in CUDA 9.2 and I'd rather they not fail at runtime.
* Made a mode of operation of the "p2p bandwidth latency test" example program unavailable when the CUDA version is under 10.0: it uses a CUDA 10 API call and was not implemented in the CUDA 9.x version of that example.
* In CUDA 9.2's NVRTC, you can't get the address of a `__constant__` symbol, only of a kernel; so we disable the tests involving `__constant__` symbols.

Caveats:

* Some tests still fail; it remains to be determined why.
* These changes target 9.2 compatibility, not 9.0.
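For context, a brief sketch of the kind of version guard these changes revolve around. The CUDA version macros encode the version as major × 1000 + minor × 10, so CUDA 10.0 is 10000 and CUDA 9.2 is 9020; a check against 1000 would therefore accept every 9.x toolkit as well. The guarded declaration below is an illustrative placeholder, not the library's actual code.

```cpp
// Sketch only: compile CUDA-10-dependent functionality out under CUDA 9.x,
// so that it is absent at build time rather than failing at runtime.
#include <cuda_runtime_api.h>  // defines CUDART_VERSION: 10000 for CUDA 10.0, 9020 for 9.2

#if CUDART_VERSION >= 10000
// Declarations relying on CUDA 10.0+ features go here; under a 9.x toolkit,
// this block is simply not compiled.
void feature_requiring_cuda_10();  // illustrative placeholder
#endif
```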
eyalroz added a commit that referenced this issue on Mar 24, 2022

eyalroz added a commit that referenced this issue on Mar 24, 2022

eyalroz added a commit that referenced this issue on Mar 24, 2022

eyalroz added a commit that referenced this issue on Mar 24, 2022

eyalroz added a commit that referenced this issue on Mar 24, 2022
* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
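A minimal sketch of this approach, using names of my own choosing rather than the library's actual declarations: a single named error enum whose enumerators reuse the driver API's `CUresult` values, so that wrapper error codes and raw driver error codes compare directly and convert without a lookup table.

```cpp
// Sketch only: a named error enum backed by the driver API's CUresult values.
// The enum and enumerator names here are illustrative, not the library's.
#include <cuda.h>
#include <type_traits>

namespace cuda {

enum class status_t : std::underlying_type<CUresult>::type {
    success             = CUDA_SUCCESS,
    invalid_value       = CUDA_ERROR_INVALID_VALUE,
    out_of_memory       = CUDA_ERROR_OUT_OF_MEMORY,
    not_yet_initialized = CUDA_ERROR_NOT_INITIALIZED,
    // ... the remaining driver and runtime error codes follow the same pattern
};

// Since the numeric values are shared, conversion is a plain cast.
inline status_t to_status(CUresult result) noexcept
{
    return static_cast<status_t>(result);
}

} // namespace cuda
```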
eyalroz added a commit that referenced this issue on Mar 24, 2022

* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
eyalroz added a commit that referenced this issue on Mar 24, 2022

* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
eyalroz added a commit that referenced this issue on Jun 20, 2022

* Corrected a wrong version check in `nvtx.hpp` (it needed 10000, but was using 1000).
* Dropped the `unregistered_` memory type from the `memory::type_t` enum (I've forgotten why we had it there in the first place).
* Defined some `kernel_t` and `context_t` methods conditionally, since they're not supported in CUDA 9.2 and I'd rather they not fail at runtime.
* Made a mode of operation of the "p2p bandwidth latency test" example program unavailable when the CUDA version is under 10.0: it uses a CUDA 10 API call and was not implemented in the CUDA 9.x version of that example.
* In CUDA 9.2's NVRTC, you can't get the address of a `__constant__` symbol, only of a kernel; so we disable the tests involving `__constant__` symbols.

Caveats:

* Some tests still fail; it remains to be determined why.
* These changes target 9.2 compatibility, not 9.0.
eyalroz added a commit that referenced this issue on Jun 20, 2022

* Now using CUDA driver API error enum values wherever possible.
* Added all missing CUDA driver and runtime API error enum values to our own named error enum.
* Renamed some named errors for clarity, based on their Doxygen comments in the CUDA headers.
* Now (apparently) fully compatible with CUDA 9.2.
eyalroz added a commit that referenced this issue on Aug 6, 2022
…_grid_params_for_max_occupancy` in CUDA version 10.0 - which does not yet support it.
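The commit message above is truncated, but the gist is guarding occupancy-related functionality behind a CUDA version check. As an illustration only (the specific call and threshold below are my assumptions, not necessarily what the commit actually guards): `cuOccupancyAvailableDynamicSMemPerBlock` is an occupancy query that, to my knowledge, only appeared in CUDA 10.2, so a helper built on it has to be compiled out on older toolkits.

```cpp
// Sketch only: compile an occupancy-related helper out entirely on toolkits
// that lack the underlying driver call. Names and threshold are illustrative.
#include <cuda.h>
#include <cstddef>

#if CUDA_VERSION >= 10020  // assumption: cuOccupancyAvailableDynamicSMemPerBlock needs CUDA 10.2+
// Dynamic shared memory still available per block for `kernel`, given a number
// of blocks per multiprocessor and a block size.
inline size_t available_dynamic_shared_memory(CUfunction kernel, int blocks_per_sm, int block_size)
{
    size_t result = 0;
    cuOccupancyAvailableDynamicSMemPerBlock(&result, kernel, blocks_per_sm, block_size);
    return result;
}
#endif
```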
eyalroz added a commit that referenced this issue on Aug 6, 2022

eyalroz added a commit that referenced this issue on Feb 23, 2023
* No longer allocating heap memory on enqueue and releasing it during launch; only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the methods for enqueuing no-argument callables and for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (CUDA has moved away from this convention, and we can't pass one along without the heap-allocation scheme we had before).
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.
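As a rough sketch of the enqueue shape this describes, with illustrative names rather than the library's actual interface: CUDA 10.0's `cudaLaunchHostFunc` takes a `void (*)(void*)` plus a single user-supplied pointer, so the wrapper does not need to heap-allocate anything of its own, and the callable never receives the stream.

```cpp
// Sketch only: enqueue a host function that receives one user-provided pointer,
// with no intermediate heap allocation by the wrapper. Names are illustrative.
#include <cuda_runtime_api.h>
#include <cstdio>

// The enqueued callable takes only the user's pointer -- no stream argument.
void report_completion(void* user_data)
{
    std::printf("batch %d finished\n", *static_cast<int*>(user_data));
}

void enqueue_completion_report(cudaStream_t stream, int& batch_id)
{
    // cudaLaunchHostFunc (CUDA 10.0+) passes &batch_id through verbatim once the
    // preceding work on `stream` completes; no wrapper-owned copy is made.
    // (The caller must keep batch_id alive until the callback has run.)
    cudaLaunchHostFunc(stream, report_completion, &batch_id);
}
```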
eyalroz added a commit that referenced this issue on Feb 23, 2023

* No longer allocating heap memory on enqueue and releasing it during launch; only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the methods for enqueuing no-argument callables and for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (CUDA has moved away from this convention, and we can't pass one along without the heap-allocation scheme we had before).
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.
eyalroz added a commit that referenced this issue on Mar 9, 2023

* No longer allocating heap memory on enqueue and releasing it during launch; only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the methods for enqueuing no-argument callables and for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (CUDA has moved away from this convention, and we can't pass one along without the heap-allocation scheme we had before).
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.
eyalroz added a commit that referenced this issue on Mar 13, 2023

* No longer allocating heap memory on enqueue and releasing it during launch; only passing pointers the user has provided. Part of the motivation for this is enabling stream capture and re-execution of the launch.
* Separated the methods for enqueuing no-argument callables and for enqueuing functions which take a single (pointer) argument.
* Enqueued callables no longer receive a stream (CUDA has moved away from this convention, and we can't pass one along without the heap-allocation scheme we had before).
* `#ifdef`'ed out parts of `launch_config_builder.hpp` which require CUDA 10.0 to run (essentially, obtaining minimum dimensions for maximum occupancy).
* Dropped some redundant comments in `stream.hpp` about the choice of API functions.