Release v24.08.0 · NVIDIA/spark-rapids-tools

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.08.0/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.08.0/

Changes

User Tools

Remove calculation of GPU cluster recommendation from Python tool when cluster argument is passed (#1278)
Remove unused argument --target_platform in Python Tool (#1279)
Qualification tool: Add output stats file for Execs (operators) (#1225)
Include GPU information in the cluster recommendation for Dataproc and OnPrem (#1265)
Remove speedup-based recommendation column from qual_summary csv (#1268)
Fix prediction CSV files for multiple qual directories (#1267)
Clean up tools after removing CLI dependency (#1256)
Rename cluster shape columns to use 'worker' prefix in the output files and rename metadata file (#1258)
Remove CLI dependency in Dataproc _pull_gpu_hw_info implementation (#1245)
Replace split_nds with split_train_val (#1252)
Update xgboost models and metrics (#1244)
Add footnotes for config recommendations and speedup category in top candidate view (#1243)
[BUG] Update Dataproc instance catalog for n1 series GPU info (#1242)
Improvements in Cluster Config Recommender (#1241)
Improve console output from Python tool for failed/GPU/Photon event logs (#1235)
[FEA] Generate and use instance description file for Databricks-Azure platform (#1232)
Remove arguments related to cost-savings (#1230)
Updated models for latest databricks-aws datasets (#1231)
Refactor QualX for Linter and Test Compatibility (#1228)
Generate summary metadata file and fix node recommendation in Python (#1216)
[FEA] Remove gcloud CLI dependency for Dataproc platform (#1223)
Updated models for latest dataproc eventlogs (#1226)
Remove estimation-model column from qualification summary (#1220)
Add option to add features.csv files to training set (#1212)
Disable cost saving functionality (#1218)
[FEA] Remove CLI dependency for EMR and Databricks-AWS platforms in user tool (#1196)
Fix some basic pylint errors in qualx code (#1210)
Qual tool tuning recommendations based on CPU event log: coherently recommend tunings and node setup, and infer cluster from event log (#1188)
Add shap command to internal CLI for debugging (#1197)
Add internal CLI to generate instance descriptions for CSPs (#1137)
[FEA] Support custom XGBoost model file via user tools CLI (#1184)
Updated models for new training data (#1186)
Add evaluate_summary command to internal CLI (#1185)
[DOC] Fix broken link to qualX docs and update python prerequisites (#1180)
Bump to certifi-2024.7.4 and urllib3-1.26.19 (#1173)
Disable UI-HTML report by default in Qualification tool (#1168)
Fix parsing App IDs inside metrics directory in QualX (#1167)
Refactor Databricks-AWS Qual tool to cache and process pricing info from DB website (#1141)
Add plugin mechanism for dataset-specific preprocessing in qualx (#1148)
Unsupported op logic should read action column from qual's output (#1150)
Update qualx readme for training (#1140)
Disable pylint-unreachable code in tox.ini (#1145)

Core

Include GPU information in the cluster recommendation for Dataproc and OnPrem (#1265)
[TASK] Optimize the storage of accumulables in core tools (#1263)
Sync GetJsonObject support with Rapids-Plugin (#1266)
Do not create new StageInfo object (#1261)
[FEA] Add support for map_from_arrays in qualification tools (#1248)
Rename cluster shape columns to use 'worker' prefix in the output files and rename metadata file (#1258)
Fix stage level metrics output csv file (#1251)
Handle event logs with wildcards in status report generation (#1237)
Fix duplicate records in DataSourceInfo report (#1227)
Reduce memory footprint of stageInfo (#1222)
Ensure UTF-8 encoding for reading non-English characters (#1211)
Sync plugin support for hash-hive and shift operators (#1198)
Sync-up the support of parse_url in qualification tool (#1195)
Include status information for failed event logs in core tool (#1187)
[FEA] Adding Benchmarking classes to evaluate core tools performance (#1169)
[BUG] Fix handling of non-English characters in tools output files (#1189)
[BUG] Fix Java Qual tool handling of --platform argument (#1161)
Add all stage metrics to tools output (#1151)
Follow-up 1142: remove TODO line (#1146)
Mark wholestageCodeGen as shouldRemove when child nodes are removed (#1142)
[FEA] Display full failure messages in failed CSV files (#1135)

Miscellaneous

Qualification tool: Add option to filter event logs for a maximum file system size (#1275)
Qualification tool should print Kryo related recommendations (#1204)
Fix header check script to exclude files (#1224)
Update header check script for pre-commit hooks (#1219)
Follow-up 1189: handle non-English characters in data-output.js (#1208)
Update pre-commit hooks to check for headers and white-spaces (#1205)
user-tools: Update --help for cluster argument (#1178)
Support fine-tuning models (#1174)
