Warn the user when choice of grid size can cause a CUDA error #340
This PR adds warnings for two situations where the choice of grid size could result in a CUDA error.
Simulations failing to run: panic: CUDA_ERROR_INVALID_VALUE #284
When the number of cells along an axis has a prime factor >127, the CUDA_ERROR_INVALID_VALUE error occurs because of the inner workings of the cuFFT algorithm (see @jplauzie's reply).
The new warning (example below) is already raised when the grid is not 7-smooth, i.e. when there is a prime factor greater than 7. This includes the >127 case, while also raising awareness about the recommendation to use a 7-smooth grid.
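For illustration only, the smoothness check described above could look roughly like the sketch below. The names largestPrimeFactor and gridSizeWarning are hypothetical and are not taken from this PR, and the message strings are placeholders rather than the actual warning text.

```go
package main

import "fmt"

// largestPrimeFactor returns the largest prime factor of n,
// or 1 when n has no prime factors (n <= 1).
// Hypothetical helper, not the function used in this PR.
func largestPrimeFactor(n int) int {
	largest := 1
	for p := 2; p*p <= n; p++ {
		for n%p == 0 {
			largest = p
			n /= p
		}
	}
	if n > 1 {
		// Whatever remains after trial division is itself prime.
		largest = n
	}
	return largest
}

// gridSizeWarning returns a placeholder warning for one axis,
// or an empty string when the cell count is 7-smooth.
func gridSizeWarning(axis string, cells int) string {
	lpf := largestPrimeFactor(cells)
	switch {
	case lpf > 127:
		return fmt.Sprintf("N%s=%d has prime factor %d > 127: expect CUDA_ERROR_INVALID_VALUE", axis, cells, lpf)
	case lpf > 7:
		return fmt.Sprintf("N%s=%d is not 7-smooth (prime factor %d)", axis, cells, lpf)
	}
	return ""
}

func main() {
	// Example grid sizes: 7-smooth, not 7-smooth, and prime factor > 127.
	for _, n := range []int{128, 100, 22, 262} {
		if w := gridSizeWarning("x", n); w != "" {
			fmt.Println("WARNING:", w)
		}
	}
}
```

A single threshold at 7 covers both situations, which is why the >127 case does not need a separate check unless a stronger message is wanted.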
panic: CURAND_STATUS_LENGTH_NOT_MULTIPLE issue when grid size odd and temperature finite #314
When the temperature is nonzero and the grid contains an odd number of cells, the CURAND_STATUS_LENGTH_NOT_MULTIPLE error occurs. This is explained in the curandGenerateNormal documentation.
The new warning (example below) is raised when the random thermal field is updated for the first time, if the grid contains an odd number of cells.
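As a rough sketch of this condition: for pseudorandom generators, curandGenerateNormal requires the number of requested values to be even, so the check comes down to the parity of the total cell count. The names oddCellCount and thermalWarning below are hypothetical, and the sketch assumes that one random number is requested per cell in each call, which is not confirmed by this PR.

```go
package main

import "fmt"

// oddCellCount reports whether the total number of cells is odd.
// A product of integers is odd only if every factor is odd.
// Hypothetical helper, not the function used in this PR.
func oddCellCount(size [3]int) bool {
	return (size[0]*size[1]*size[2])%2 != 0
}

// thermalWarning returns a placeholder warning when a finite temperature
// is combined with an odd cell count, and an empty string otherwise.
func thermalWarning(size [3]int, temperature float64) string {
	if temperature != 0 && oddCellCount(size) {
		return fmt.Sprintf("grid %dx%dx%d has an odd number of cells: expect CURAND_STATUS_LENGTH_NOT_MULTIPLE",
			size[0], size[1], size[2])
	}
	return ""
}

func main() {
	// Example grids at a finite temperature of 300 K.
	for _, size := range [][3]int{{64, 64, 1}, {63, 63, 1}, {63, 64, 1}} {
		if w := thermalWarning(size, 300); w != "" {
			fmt.Println("WARNING:", w)
		}
	}
}
```

Only a grid whose cell counts are odd along all three axes is affected, so making any one axis even is enough to avoid the error.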
These warnings are printed during program execution, so they may be buried within the output. One alternative would be to raise an error instead, but that seems premature if the CUDA error has not yet occurred. Another would be to print the warning at the very end of the output, but that seems hard to implement.