Is your feature request related to a problem? Please describe.
Today the plugin is conservative: floating point operations are off by default, and users have to enable them explicitly. There will always be some differences in floating point calculations between the CPU and GPU. We should enable floating point by default by setting these configurations to true:
spark.rapids.sql.castDecimalToFloat.enabled
spark.rapids.sql.castFloatToIntegralTypes.enabled
spark.rapids.sql.castFloatToString.enabled
spark.rapids.sql.castStringToFloat.enabled
spark.rapids.sql.csv.read.float.enabled
spark.rapids.sql.json.read.float.enabled
spark.rapids.sql.improvedFloatOps.enabled
spark.rapids.sql.variableFloatAgg.enabled
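Until the defaults change, a user can opt in to all of these at once. The sketch below only builds the key/value pairs and turns them into `--conf` arguments for `spark-submit`; the helper names `float_conf` and `to_spark_submit_args` are made up for illustration and are not part of the plugin.

```python
# Config keys copied from the list in this issue.
FLOAT_CONFIGS = [
    "spark.rapids.sql.castDecimalToFloat.enabled",
    "spark.rapids.sql.castFloatToIntegralTypes.enabled",
    "spark.rapids.sql.castFloatToString.enabled",
    "spark.rapids.sql.castStringToFloat.enabled",
    "spark.rapids.sql.csv.read.float.enabled",
    "spark.rapids.sql.json.read.float.enabled",
    "spark.rapids.sql.improvedFloatOps.enabled",
    "spark.rapids.sql.variableFloatAgg.enabled",
]


def float_conf(enabled=True):
    """Hypothetical helper: map every float-related key to "true" or "false"."""
    value = "true" if enabled else "false"
    return {key: value for key in FLOAT_CONFIGS}


def to_spark_submit_args(conf):
    """Hypothetical helper: render a conf dict as spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the flags that would enable every float config explicitly.
    print(" ".join(to_spark_submit_args(float_conf(True))))
```

Passing `float_conf(False)` instead would give the flags a user needs to fall back to the CPU behavior once the defaults flip.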
Describe the solution you'd like
Enable floating point by default. Allow users to disable float if they need to fall back to the CPU behavior. Add a warning in the driver log to let users know when floating point operations are used and that there may be differences between CPU and GPU calculations of floats, and point them to the documentation related to floats.
We should also turn on spark.rapids.sql.incompatibleOps.enabled as part of this, and log warnings about incompatibleOps being on by default.
If a user has explicitly set a config to enabled then we should disable the warning.
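The warning rule described here could look roughly like the following. This is a sketch of the decision logic only, written in Python for brevity; the names `resolve` and `check_and_warn`, the logger name, and the message text are assumptions, not the plugin's actual code. It also assumes the new default is "true" and that the warning is emitted when the config is resolved, whereas the real implementation might warn only when a float operation is actually planned on the GPU.

```python
import logging

logger = logging.getLogger("float_defaults_sketch")  # hypothetical logger name

INCOMPAT_KEY = "spark.rapids.sql.incompatibleOps.enabled"


def resolve(user_conf, key, default="true"):
    """Return (enabled, warn) for one config key.

    user_conf holds only the settings the user supplied explicitly.
    warn is True only when the feature is on because of the default,
    not because the user asked for it.
    """
    explicit = key in user_conf
    enabled = str(user_conf.get(key, default)).lower() == "true"
    return enabled, enabled and not explicit


def check_and_warn(user_conf, key):
    """Log a driver-side warning if the key is enabled by default."""
    enabled, warn = resolve(user_conf, key)
    if warn:
        logger.warning(
            "%s is on by default; GPU results may differ from the CPU. "
            "See the floating point documentation.",
            key,
        )
    return enabled
```

With this rule, a user who sets the key to true explicitly gets no warning, and a user who sets it to false gets the CPU behavior with no warning either.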
Documentation will need to be updated as well, and this should be flagged in the release notes.
Describe alternatives you've considered
We could fix as many of the differences in floating point calculations between the CPU and GPU as possible. That will take quite a bit of time and may not get us all the way there.
Additional context
We've had floating point off by default because of concerns about users doing things with floating point values such as aggregations, joins on floats, or unions on floats. However, without floating point operations on, there is a performance hit, as many operations will fall back to the CPU.
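For context, one common source of such differences is that floating point addition is not associative, so a parallel aggregation that adds values in a different order can give a slightly different answer. The snippet below is a plain Python illustration of that general fact and does not involve the plugin; whether it is the cause of any particular CPU/GPU mismatch is not something this issue establishes.

```python
import math

values = [0.1, 0.2, 0.3]

# Same three numbers, two different groupings of the additions.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The results are not bit-identical...
print(left_to_right == right_to_left)

# ...but they agree within a small relative tolerance.
print(math.isclose(left_to_right, right_to_left))
```

This is also why an exact-equality join or union on float columns can behave differently once the values have been computed on different hardware.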