[bitnami/mlflow] support google-cloud-storage #67246
Signed-off-by: Thatcher Peskens <thatcher@t2studio.nl>
I'm chasing one odd issue: in some server configurations, the mlflow client fails with a permission error when downloading an artifact directly from Google Cloud Storage, even though the server can access the artifacts just fine and uploads also work.
Hi @dhrp, thanks so much for this contribution! We're currently evaluating the impact of including this Python module, which seems to increase the image size by 16MB. We need to decide whether it's widely used before including it, since we want the image to include only the most important modules and ask users to extend the image with their own modules for less common use cases. In case we decide to accept it, please note it won't be included in the image using the "pip install" directive you proposed, but as part of the We'll keep you updated about any decision we take.
Hi @juan131, OK, thanks for your message. At least two other people have commented on my issue on the MLFlow Helm chart saying they would like to have this feature, and it has 4 👍. See: bitnami/charts#22720. Also, is there any way to see or contribute to how these tarballs are created?
Hi @dhrp Thanks for the insights, I'll share with the team.
I'm afraid the compilation recipes we use to build Bitnami assets are internal. We may consider moving them to a public repo, since there's nothing to hide in them.
This Pull Request has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thank you for your contribution.
not stale
Can we move this forward? I personally think the number of 👍 (now 6) on the MLFlow chart issue that depends on this change is sufficient to justify it, if it's only a 16MB size increase. As I see it, storing models on durable storage really is a primary feature of MLFlow.
Hi @dhrp, I'm glad to confirm we got the "green light" to include this module by default in the image. I'm applying the required changes right now and I'll ping you once we have released a new container image version that includes it.
Hi @dhrp Please give it a try using the image tag
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Pull Request. Do not hesitate to reopen it later if necessary.
I'm closing this PR given we included the missing pip module in
Hi @juan131, somehow I didn't see this until now! Thanks so much for moving this forward. Very happy about it!
Description of the change
Updates the Bitnami MLFlow image to contain the google-cloud-storage pip module.
Benefits
This allows MLFlow to use its built-in support for storing and retrieving artifacts in Google Cloud Storage. This matters most when running the MLFlow tracking server with Google Cloud Storage as the artifact store, but it is also useful when using the MLFlow container in client mode.
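As a rough illustration (not part of this PR), the check below sketches how one might confirm that an environment can handle `gs://` artifact URIs before pointing MLFlow at them. The helper names `has_module` and `check_gcs_support` and the bucket name are hypothetical; the only assumption about the library is that the `google-cloud-storage` pip package provides the `google.cloud.storage` module.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the module `name` can be found on the import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. no `google` package at all)
        # raises ModuleNotFoundError, a subclass of ImportError.
        return False


def check_gcs_support(artifact_uri: str) -> str:
    """Describe whether a gs:// artifact URI is usable in this environment."""
    if not artifact_uri.startswith("gs://"):
        return "not a GCS URI; google-cloud-storage is not needed"
    if has_module("google.cloud.storage"):
        return "google-cloud-storage is installed"
    return "google-cloud-storage is missing; try: pip install google-cloud-storage"


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(check_gcs_support("gs://example-bucket/mlflow-artifacts"))
```

Running this inside both the server and the client container would show whether each side has the module available; it does not verify credentials or bucket permissions.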
Possible drawbacks
None that I know of, though I don't know whether adding a pip install at the end is the approach Bitnami prefers. An alternative approach would be to add it to the mlflow stacksmith tarball, but AFAIK I cannot contribute to that. I will leave that to the maintainers.
Applicable issues
Add support to the container: fixes #65108
Add support for Google Cloud Storage in the MLFlow chart: relates to bitnami/charts#22720
Additional information
I'm happy to change the approach if someone can point me to how.