v0.2.0
v0.2.0 introduces numerous optimizations that reduce Tabular average inference time by 4x and average disk usage by 10x compared to v0.1.0, as well as a refactored ImagePredictor API to better align with the other tasks and a 20x inference speedup in Vision tasks. This release contains 42 commits from 9 contributors.
This release is non-breaking when upgrading from v0.1.0, with four exceptions:
- `ImagePredictor.predict` and `ImagePredictor.predict_proba` have different output formats.
- `TabularPredictor.evaluate` and `TabularPredictor.evaluate_predictions` have different output formats.
- Custom dictionary inputs to `TabularPredictor.fit`'s `hyperparameter_tune_kwargs` argument now have a different format.
- Models trained in v0.1.0 should only be loaded with v0.1.0. Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: v0.1.0...v0.2.0
Thanks to the 9 contributors that contributed to the v0.2.0 release!
Special thanks to the 3 first-time contributors! @taesup-aws, @ValerioPerrone, @lukemorrill
Full Contributor List (ordered by # of commits):
@Innixma, @zhreshold, @gradientsky, @jwmueller, @mseeger, @sxjscience, @taesup-aws, @ValerioPerrone, @lukemorrill
Major Changes
Tabular
- Reduced overall inference time on the `best_quality` preset by 4x (and 2x on others). @Innixma, @gradientsky
- Reduced overall disk usage on the `best_quality` preset by 10x. @Innixma
- Reduced training time and inference time of K-Nearest-Neighbor models by 250x, and reduced disk usage by 10x via:
  - Efficient out-of-fold implementation (10x training & inference speedup, 10x reduced disk usage) on the `best_quality` preset. @Innixma (#1022)
  - [Experimental] Integration of the scikit-learn-intelex package (25x training & inference speedup). @Innixma (#1049)
    - This is currently not installed by default. Try it via `pip install autogluon.tabular[all,skex]` or `pip install "scikit-learn-intelex<2021.3"`. Once installed, AutoGluon will automatically use it.
- Reduced training time, inference time, and disk usage of RandomForest and ExtraTrees models by 10x via an efficient out-of-fold implementation. @Innixma (#1066, #1082)
- Reduced training time by 30% and inference time by 75% on the FastAI neural network model. @gradientsky (#977)
- Added `quantile` as a new `problem_type` to support quantile regression problems. @taesup-aws, @jwmueller (#1005, #1040)
  - Try it out with the quantile regression example script, or see the sketch after this list!
- [Experimental] Added GPU-accelerated RandomForest, K-Nearest-Neighbors, and Linear models via integration with NVIDIA RAPIDS. @Innixma (#995, #997, #1000)
  - This is not enabled by default. Try it out by first installing RAPIDS and then installing AutoGluon.
  - Currently, the models need to be specially passed to the `.fit` hyperparameters argument. Refer to the kaggle kernel below or the official RAPIDS AutoGluon example; a sketch also follows this list.
  - See how to use AutoGluon + RAPIDS to get top 1% on the Otto kaggle competition with an interactive kaggle kernel!
- [Experimental] Added an option to specify early stopping rounds for the LightGBM, CatBoost, and XGBoost models via a new model parameter `ag.early_stop`. @Innixma (#1037)
  - Try it out via `hyperparameters={'XGB': {'ag.early_stop': 500}}`, shown in context in the sketch after this list.
  - The API for this may change in future releases as we try to optimize usage of early stopping in AutoGluon.
- [Experimental] Added adaptive early stopping to LightGBM. This attempts to choose when to stop training more intelligently than a fixed early stopping rounds value. @Innixma (#1042)
- Re-ordered model training priority to perform better when `time_limit` is small. For `time_limit=3600` on datasets with over 100,000 rows, v0.2.0 has a 65% win-rate over v0.1.0. @Innixma (#1059, #1084)
- Adjusted time allocation to stack layers when performing multi-layer stacking to allow for longer training on earlier layers. @Innixma (#1075)
- Updated CatBoost to v0.25. @Innixma (#1064)
- Added an `extra_metrics` argument to `.leaderboard`. @Innixma (#1058)
  - See the sketch after this list for an example.
- Added feature group importance support to `.feature_importance`. @Innixma (#989)
  - Now, users can get the combined importance of a group of features: `predictor.feature_importance(test_data, features=['A', 'B', 'C', ('AB', ['A', 'B'])])`
- [BREAKING] Refactored `.evaluate` and `.evaluate_predictions` to be easier to use and share the same code logic. @Innixma (#1080)
  - The output type has changed, and the sign of the metric score has been flipped in some circumstances.
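
A minimal sketch of the new `quantile` problem type. The dataset, the `target` label, and the `quantile_levels` values below are illustrative; see the quantile regression example script for the authoritative usage:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset('train.csv')  # illustrative dataset with a numeric 'target' column

# Train models that jointly predict the 10th, 50th, and 90th percentiles.
predictor = TabularPredictor(
    label='target',
    problem_type='quantile',
    quantile_levels=[0.1, 0.5, 0.9],
).fit(train_data)

# Predictions contain one column per requested quantile level.
quantile_predictions = predictor.predict(train_data)
```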
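A sketch of passing the RAPIDS-accelerated models to `.fit`, assuming RAPIDS is installed first. The import paths below follow the RAPIDS AutoGluon example and are assumptions; defer to the kaggle kernel for the authoritative version:

```python
from autogluon.tabular import TabularDataset, TabularPredictor
# Assumed import paths, following the RAPIDS AutoGluon example:
from autogluon.tabular.models.knn.knn_rapids_model import KNNRapidsModel
from autogluon.tabular.models.lr.lr_rapids_model import LRRapidsModel
from autogluon.tabular.models.rf.rf_rapids_model import RFRapidsModel

train_data = TabularDataset('train.csv')  # illustrative dataset

# RAPIDS models are not part of the default presets, so they must be
# passed explicitly through the hyperparameters argument.
predictor = TabularPredictor(label='target').fit(
    train_data,
    hyperparameters={
        KNNRapidsModel: {},
        LRRapidsModel: {},
        RFRapidsModel: {},
    },
)
```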
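The `ag.early_stop` parameter in context, using the `'XGB'` snippet from above; the dataset and label name are illustrative:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset('train.csv')  # illustrative dataset

# Stop XGBoost training after 500 rounds without validation improvement.
# The 'GBM' (LightGBM) and 'CAT' (CatBoost) model keys accept the same parameter.
predictor = TabularPredictor(label='target').fit(
    train_data,
    hyperparameters={'XGB': {'ag.early_stop': 500}},
)
```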
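A sketch of the new `extra_metrics` argument on `.leaderboard`; the metric names assume a classification predictor and are illustrative:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset('train.csv')  # illustrative datasets
test_data = TabularDataset('test.csv')

predictor = TabularPredictor(label='target').fit(train_data)

# One extra score column is appended to the leaderboard per requested metric,
# without changing the eval_metric used for model selection.
leaderboard = predictor.leaderboard(test_data, extra_metrics=['accuracy', 'balanced_accuracy'])
```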
Vision
- Reduced inference time by 20x via various optimizations in inference batching. @zhreshold
- Fixed a problem with loading models trained on GPU onto CPU-only machines. @zhreshold
- Improved model fitting performance by up to 10% for ObjectDetector when `presets` is empty. @zhreshold
- [BREAKING] Refactored the `predict` and `predict_proba` methods in `ImagePredictor` to have the same output formats as `TabularPredictor` and `TextPredictor`. @zhreshold (#1044)
  - This change is BREAKING. Previous users of v0.1.0 should ensure they update to use the new formats if they made use of the old `predict` and `predict_proba` when switching to v0.2.0. See the sketch after this list.
- Added improved support for CSV and pandas DataFrame input to `ImagePredictor`. @zhreshold (#1010)
  - See our new data preparation tutorial to give it a try!
- Added early stopping strategies that significantly improve training efficiency. @zhreshold (#1039)
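
A minimal sketch of the refactored `ImagePredictor` API with DataFrame input. The `image`/`label` column names follow the data preparation tutorial; the file paths and labels are illustrative:

```python
import pandas as pd
from autogluon.vision import ImagePredictor

# DataFrame input is now supported directly; 'image' holds file paths.
train_data = pd.DataFrame({
    'image': ['img/cat1.jpg', 'img/dog1.jpg'],  # illustrative paths
    'label': ['cat', 'dog'],
})

predictor = ImagePredictor()
predictor.fit(train_data)

# v0.2.0 aligns the output formats with TabularPredictor and TextPredictor:
# predict returns the predicted labels, predict_proba the class probabilities.
labels = predictor.predict(train_data)
probabilities = predictor.predict_proba(train_data)
```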
General
- [Experimental] Added a new hyperparameter tuning method: constrained Bayesian optimization. @ValerioPerrone (#1034)
- General HPO code improvement / cleanup. @mseeger, @gradientsky (#971, #1002, #1050)
- Fixed ENAS issue when passing in custom datasets. @lukemorrill (#1015)
- Fixed an incorrect dependency link between `autogluon.mxnet` and `autogluon.extra` causing a crash on import. @Innixma (#1032)
- Various minor updates and fixes. @Innixma, @jwmueller, @zhreshold, @sxjscience (#990, #996, #998, #1007, #1035, #1052, #1055, #1057, #1072, #1081, #1088)