Relax dependency on lightgbm #46

Merged 1 commit on Apr 29, 2024

Conversation

jlopezpena
Contributor

The latest version of lightgbm is 4.3, but arfs hard-pins version 3.3.1. This PR relaxes the dependency to allow more recent lightgbm versions.
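
To illustrate the difference between a hard pin and a relaxed requirement, here is a minimal sketch using only the standard library. The exact specifier adopted by this PR is not shown in the conversation, so the `>= 3.3.1` lower bound below is an assumption, and the helper names are hypothetical.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric part of a version string, e.g. '4.3' -> (4, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    parts = [int(piece) for piece in match.group(0).split(".")]
    # Pad to three components so '4.3' and '4.3.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def allowed_by_hard_pin(installed: str, pinned: str = "3.3.1") -> bool:
    # A hard pin (==) accepts exactly one release.
    return parse_version(installed) == parse_version(pinned)


def allowed_by_lower_bound(installed: str, minimum: str = "3.3.1") -> bool:
    # A lower bound (>=) accepts the old release and anything newer.
    # The value 3.3.1 is an assumed bound, not taken from the PR diff.
    return parse_version(installed) >= parse_version(minimum)
```

Under the hard pin, an environment with lightgbm 4.3.0 is rejected; under the assumed lower bound, both 3.3.1 and 4.3.0 are accepted.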

@ThomasBury
Owner

Hi @jlopezpena, upgrading to LightGBM v4 won't be a straightforward process. Here's why:

Output Shape Changes: The output shapes of LightGBM models have changed across versions. We'll need to verify compatibility with our current code.

Optuna Compatibility: Compatibility testing is required to ensure Optuna works seamlessly with LightGBM v4.

Downgrading Context: ARFS was downgraded to LightGBM v3.3.1 because of compatibility issues with private packages I'm developing that use ARFS (as referenced in the changelog). See also #32.

I'll try to check in the near future whether I can work on it.
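
Because the concerns above depend on which major versions are actually installed, a compatibility check can start by recording them. The following is a minimal sketch using `importlib.metadata` from the standard library; the function name and the list of packages are illustrative assumptions, not part of ARFS.

```python
import re
from importlib import metadata
from typing import Optional


def installed_major(package: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # Take the leading digits of the version string as the major version.
    match = re.match(r"\d+", version)
    return int(match.group(0)) if match else None


if __name__ == "__main__":
    # Hypothetical usage: report the majors relevant to this discussion.
    for name in ("lightgbm", "optuna"):
        print(name, installed_major(name))
```

A test run could log these values so that any failure is tied to a specific LightGBM major version.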

@jlopezpena
Contributor Author

Optuna has worked with lightgbm >= 4 since last summer: optuna/optuna#4844

Is there any test suite I can run to check if anything breaks?

@jlopezpena
Contributor Author

For reference, I just ran my modelling pipeline on my branch, with LightGBM 4.3.0, and it ran successfully without any errors. I am obviously not using every method in ARFS, but I did use the following:

  • CollinearityThreshold
  • VariableImportance with lgb_kwargs = {"objective": "xentropy", "zero_as_missing": False} (regression task)
  • GrootCV with cutoff=1, objective="xentropy", nfolds=10, n_iter=10
    I always use fastshap=False because FastTreeShap hasn't been updated in a very long time and hasn't kept up with the original SHAP.

@ThomasBury self-requested a review on April 10, 2024 11:30
@ThomasBury added the "enhancement" (New feature or request) label on Apr 10, 2024
@ThomasBury self-assigned this on Apr 10, 2024
@ThomasBury
Owner

ThomasBury commented Apr 10, 2024

> For reference, I just ran my modelling pipeline on my branch, with LightGBM 4.3.0, and it ran successfully without any errors. I am obviously not using every method in ARFS, but I did use the following:
>
>   • CollinearityThreshold
>   • VariableImportance with lgb_kwargs = {"objective": "xentropy", "zero_as_missing": False} (regression task)
>   • GrootCV with cutoff=1, objective="xentropy", nfolds=10, n_iter=10
>     I am always using fastshap=False because the FastTreeShap hasn't been updated since a very long time ago, and hasn't kept up with original SHAP

Thanks for reviewing!

I don't currently have a working unit test suite; I'm still adapting the existing one to the new ARFS API.

In the meantime, the provided tutorial notebooks act as an integration test suite. I'll be running these tests with the relaxed version of LightGBM. Expect to see an update on the unit test suite by the end of the month.
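
One way to use notebooks as an integration check is to execute each one headlessly and treat a non-zero exit code as a failure. The sketch below assumes the `jupyter nbconvert` command-line tool is installed; the function names and the notebook directory in the usage note are hypothetical, not taken from the ARFS repository.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def nbconvert_command(notebook: Path, output_dir: Path) -> List[str]:
    """Build the command that executes one notebook and writes the result to output_dir."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", str(notebook),
        "--output-dir", str(output_dir),
    ]


def run_notebooks(notebook_dir: Path, output_dir: Path) -> Dict[str, str]:
    """Execute every *.ipynb in notebook_dir; return {notebook name: tail of stderr} for failures."""
    failures: Dict[str, str] = {}
    for notebook in sorted(notebook_dir.glob("*.ipynb")):
        # Raises FileNotFoundError if the jupyter executable is not on PATH.
        result = subprocess.run(
            nbconvert_command(notebook, output_dir),
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # Keep only the end of the traceback to stay readable.
            failures[notebook.name] = result.stderr[-2000:]
    return failures
```

For example, `run_notebooks(Path("notebooks"), Path("executed"))` (with a hypothetical `notebooks` directory) returns an empty dictionary when every notebook runs to completion against the installed LightGBM.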

You're right, fastshap is a nice project but isn't yet ready for production purposes.

@ThomasBury merged commit fac0f34 into ThomasBury:main on Apr 29, 2024