v0.2.0

@Innixma released this on 28 Apr 22:24

v0.2.0 introduces numerous optimizations that reduce Tabular average inference time by 4x and average disk usage by 10x compared to v0.1.0, as well as a refactored ImagePredictor API to better align with the other tasks and a 20x inference speedup in Vision tasks. This release contains 42 commits from 9 contributors.

This release is non-breaking when upgrading from v0.1.0, with four exceptions:

  1. ImagePredictor.predict and ImagePredictor.predict_proba have different output formats.
  2. TabularPredictor.evaluate and TabularPredictor.evaluate_predictions have different output formats.
  3. Custom dictionary inputs to TabularPredictor.fit's hyperparameter_tune_kwargs argument now have a different format.
  4. Models trained in v0.1.0 should only be loaded with v0.1.0. Loading models trained in different versions of AutoGluon is not supported.

See the full commit change-log here: v0.1.0...v0.2.0

Thanks to the 9 contributors who contributed to the v0.2.0 release!

Special thanks to the 3 first-time contributors! @taesup-aws, @ValerioPerrone, @lukemorrill

Full Contributor List (ordered by # of commits):

@Innixma, @zhreshold, @gradientsky, @jwmueller, @mseeger, @sxjscience, @taesup-aws, @ValerioPerrone, @lukemorrill

Major Changes

Tabular

  • Reduced overall inference time on best_quality preset by 4x (and 2x on others). @Innixma, @gradientsky
  • Reduced overall disk usage on best_quality preset by 10x. @Innixma
  • Reduced training time and inference time of K-Nearest-Neighbor models by 250x, and reduced disk usage by 10x via:
    • Efficient out-of-fold implementation (10x training & inference speedup, 10x reduced disk usage) on best_quality preset. @Innixma (#1022)
    • [Experimental] Integration of the scikit-learn-intelex package (25x training & inference speedup). @Innixma (#1049)
      • This is currently not installed by default. Try it via pip install autogluon.tabular[all,skex] or pip install "scikit-learn-intelex<2021.3". Once installed, AutoGluon will automatically use it.
  • Reduced training time, inference time, and disk usage of RandomForest and ExtraTrees models by 10x via efficient out-of-fold implementation. @Innixma (#1066, #1082)
  • Reduced training time by 30% and inference time by 75% on the FastAI neural network model. @gradientsky (#977)
  • Added quantile as a new problem_type to support quantile regression problems (a usage sketch appears after this list). @taesup-aws, @jwmueller (#1005, #1040)
  • [Experimental] Added GPU accelerated RandomForest, K-Nearest-Neighbors and Linear models via integration with NVIDIA RAPIDS. @Innixma (#995, #997, #1000)
    • This is not enabled by default. Try it out by first installing RAPIDS and then installing AutoGluon.
      • Currently, the models need to be specially passed to the .fit hyperparameters argument. Refer to the Kaggle kernel below for an example, or check out the official RAPIDS AutoGluon example.
    • See how to use AutoGluon + RAPIDS to reach the top 1% in the Otto Kaggle competition with an interactive Kaggle kernel!
  • [Experimental] Added an option to specify early stopping rounds for the LightGBM, CatBoost, and XGBoost models via a new model parameter, ag.early_stop. @Innixma (#1037)
    • Try it out via hyperparameters={'XGB': {'ag.early_stop': 500}} (a fuller sketch appears after this list).
    • The API for this may change in future releases as we try to optimize usage of early stopping in AutoGluon.
  • [Experimental] Added adaptive early stopping to LightGBM. This attempts to choose when to stop training more intelligently than a fixed early stopping rounds value. @Innixma (#1042)
  • Re-ordered model training priority to perform better when time_limit is small. For time_limit=3600 on datasets with over 100,000 rows, v0.2.0 has a 65% win-rate over v0.1.0. @Innixma (#1059, #1084)
  • Adjusted time allocation to stack layers when performing multi-layer stacking to allow for longer training on earlier layers. @Innixma (#1075)
  • Updated CatBoost to v0.25. @Innixma (#1064)
  • Added extra_metrics argument to .leaderboard (demonstrated in the sketch after this list). @Innixma (#1058)
  • Added feature group importance support to .feature_importance. @Innixma (#989)
    • Now, users can get the combined importance of a group of features.
    • predictor.feature_importance(test_data, features=['A', 'B', 'C', ('AB', ['A', 'B'])])
  • [BREAKING] Refactored .evaluate and .evaluate_predictions to be easier to use and to share the same code logic. @Innixma (#1080)
    • The output type has changed, and the sign of the metric score has been flipped in some circumstances (a sketch of the new usage appears after this list).
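
For the new quantile problem_type, a minimal sketch on synthetic data is shown below. The column names and quantile_levels values are illustrative; see the TabularPredictor documentation for the full signature.

```python
import numpy as np
import pandas as pd
from autogluon.tabular import TabularPredictor

# Synthetic regression data (column names are placeholders for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=1000)
train_data = pd.DataFrame({'x': x, 'y': 3 * x + rng.normal(scale=2, size=1000)})

# quantile_levels lists the quantiles of the target distribution to predict.
predictor = TabularPredictor(
    label='y',
    problem_type='quantile',
    quantile_levels=[0.1, 0.5, 0.9],
).fit(train_data, time_limit=60)

# predict returns one column per requested quantile for each input row.
print(predictor.predict(train_data).head())
```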
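
The experimental ag.early_stop parameter is passed per model through fit's hyperparameters argument. A minimal sketch on synthetic binary classification data follows; the feature and label names are placeholders, and the parameter itself may change in future releases.

```python
import numpy as np
import pandas as pd
from autogluon.tabular import TabularPredictor

# Synthetic binary classification data (column names are placeholders).
rng = np.random.default_rng(0)
train_data = pd.DataFrame({'f1': rng.normal(size=2000), 'f2': rng.normal(size=2000)})
train_data['class'] = (train_data['f1'] + train_data['f2'] > 0).astype(int)

# 'XGB' selects the XGBoost model; 'ag.early_stop' sets its early stopping rounds.
predictor = TabularPredictor(label='class').fit(
    train_data,
    hyperparameters={'XGB': {'ag.early_stop': 500}},
    time_limit=300,
)
```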
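
Continuing the previous sketch, the new extra_metrics argument to .leaderboard and the feature-group syntax for .feature_importance can be used as follows. The metric names and the group name are illustrative, and a held-out test set would normally be used in place of train_data.

```python
# Score extra metrics alongside the predictor's primary eval metric.
leaderboard = predictor.leaderboard(train_data, extra_metrics=['accuracy', 'f1'])
print(leaderboard)

# A (name, [features]) tuple reports the combined importance of a feature group
# in addition to the individual feature importances.
importance = predictor.feature_importance(
    train_data,
    features=['f1', 'f2', ('f1_and_f2', ['f1', 'f2'])],
)
print(importance)
```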
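
For the refactored .evaluate and .evaluate_predictions, a sketch of the new usage is below, still continuing the classification sketch above. The exact output structure depends on the problem type, and, per the item above, the sign of some metric scores may be flipped relative to v0.1.0.

```python
# Evaluate the predictor directly on labeled data.
results = predictor.evaluate(train_data)
print(results)

# evaluate_predictions shares the same logic but operates on precomputed predictions.
y_pred = predictor.predict(train_data)
print(predictor.evaluate_predictions(y_true=train_data['class'], y_pred=y_pred))
```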

Vision

  • Reduced inference time by 20x via various optimizations in inference batching. @zhreshold
  • Fixed an issue where models trained on GPU could not be loaded on CPU-only machines. @zhreshold
  • Improved model fitting performance by up to 10% for ObjectDetector when the presets argument is empty. @zhreshold
  • [BREAKING] Refactored predict and predict_proba methods in ImagePredictor to have the same output formats as TabularPredictor and TextPredictor. @zhreshold (#1044)
    • This change is BREAKING. Users upgrading from v0.1.0 who relied on the old predict and predict_proba outputs should update their code to handle the new formats (a sketch of the new usage appears after this list).
  • Added improved support for CSV and pandas DataFrame input to ImagePredictor. @zhreshold (#1010)
  • Added early stopping strategies that significantly improve training efficiency. @zhreshold (#1039)
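
A minimal sketch of the refactored ImagePredictor workflow with DataFrame input and the new output formats is shown below. The 'image' and 'label' column names and the file paths are assumptions for illustration; predictions are returned as pandas objects aligned with TabularPredictor and TextPredictor.

```python
import pandas as pd
from autogluon.vision import ImagePredictor

# DataFrame input: one row per image, with a path column and a label column
# (column names and paths here are placeholders).
train_data = pd.DataFrame({
    'image': ['/path/to/img_0.jpg', '/path/to/img_1.jpg'],
    'label': ['cat', 'dog'],
})

predictor = ImagePredictor()
predictor.fit(train_data, time_limit=600)

# predict and predict_proba now return pandas objects indexed like the input,
# matching the output formats of TabularPredictor and TextPredictor.
predictions = predictor.predict(train_data)
probabilities = predictor.predict_proba(train_data)
```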

General