[bitnami/mlflow] MLFlow missing package: google-cloud-storage #65108

Closed
RussellSB opened this issue Apr 12, 2024 · 6 comments
Labels
mlflow solved tech-issues The user has a technical issue about an application

Comments

@RussellSB

Name and Version

bitnami/mlflow

What architecture are you using?

amd64

What steps will reproduce the bug?

We currently run MLflow on GCP using the Bitnami image. Whenever we try to log ML models to the tracking server, it tries to save them to Google Cloud Storage under the hood, but fails because the google-cloud-storage package and its dependencies (google.auth included) are missing. To reproduce this simply, without having to set up the whole GCP server:

  1. Load the bitnami/mlflow image.
  2. Start Python.
  3. Run from google.auth.exceptions import DefaultCredentialsError (as per https://github.com/mlflow/mlflow/blob/master/mlflow/store/artifact/gcs_artifact_repo.py)
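
For anyone who wants to script the reproduction steps above, here is a minimal sketch of an importability check. The helper name missing_modules is hypothetical (it is not part of MLflow or the image), and the list of module names is only an assumption based on the import that fails in the traceback; it relies solely on the standard library's importlib.util.find_spec.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            # find_spec imports parent packages, so a missing parent
            # (e.g. no "google" package at all) raises instead of returning None.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names assumed from the failing import in gcs_artifact_repo.py.
    wanted = ["google.auth", "google.cloud.storage"]
    absent = missing_modules(wanted)
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All required modules are importable.")
```

Running this inside the container should list google.auth as missing on an affected image; on an image that ships the package it should report that everything is importable.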

What is the expected behavior?

It imports correctly.

What do you see instead?

Traceback (most recent call last):
  File "/opt/bitnami/python/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/bitnami/python/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/bitnami/python/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/bitnami/python/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 497, in wrapper
    return func(*args, **kwargs)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 538, in wrapper
    return func(*args, **kwargs)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 951, in _list_artifacts
    artifact_entities = _list_artifacts_for_proxied_run_artifact_root(
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 497, in wrapper
    return func(*args, **kwargs)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 981, in _list_artifacts_for_proxied_run_artifact_root
    artifact_destination_repo = _get_artifact_repo_mlflow_artifacts()
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/server/handlers.py", line 175, in _get_artifact_repo_mlflow_artifacts
    _artifact_repo = get_artifact_repository(os.environ[ARTIFACTS_DESTINATION_ENV_VAR])
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 117, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 74, in get_artifact_repository
    return repository(artifact_uri)
  File "/opt/bitnami/python/lib/python3.10/site-packages/mlflow/store/artifact/gcs_artifact_repo.py", line 40, in __init__
    from google.auth.exceptions import DefaultCredentialsError
ModuleNotFoundError: No module named 'google.auth' 

Additional information

As a workaround, we install google-cloud-storage on top of the image every time the server is connected to, but it would be good to have it built into the image since it is core functionality. I would open a PR, but I am not sure where in the repo this missing package should be installed.

This also seems related: bitnami/charts#22720

@RussellSB RussellSB added the tech-issues The user has a technical issue about an application label Apr 12, 2024
@github-actions github-actions bot added the triage Triage is needed label Apr 12, 2024
@javsalgar javsalgar changed the title from "MLFlow missing package: google-cloud-storage" to "[bitnami/mlflow] MLFlow missing package: google-cloud-storage" Apr 18, 2024
@javsalgar
Contributor

Hi!

Thank you so much for reporting. Indeed, these packages are missing. I created a task in our backlog to add these missing pip modules.

@javsalgar javsalgar added the on-hold Issues or Pull Requests with this label will never be considered stale label Apr 18, 2024
@github-actions github-actions bot removed the triage Triage is needed label Apr 18, 2024
@RussellSB
Author

Great, thank you! Looking forward to this.

@RussellSB
Author

Hey, are there any updates? Would be great to know how far down the roadmap this issue could be tackled.

@dhrp

dhrp commented May 24, 2024

I've been trying to see if I can add this dependency, but I'm running into a wall... I don't have any way to see how the stacksmith dependencies are built, or to make changes to them.

I tried adding the google-cloud-sdk dependency, but to no avail. From what I can see, pip install is not actually used anywhere in this repository, so it must be enforced. Perhaps a maintainer can help me understand how to do this?

I tried the following:
https://gist.github.com/dhrp/f5ad291ab9ab583e85da1bf930326d33

but it doesn't install the Python SDK / the import does not work.

[edit]
Actually, simply adding:

RUN pip install google-cloud-storage

to the end of the Dockerfile works. Would you be interested in a contribution like this, or should it really go into the stacksmith part?

Pinging @javsalgar. I'm also planning to pick up bitnami/charts#22720, but it depends on this issue.

@juan131
Contributor

juan131 commented Jun 28, 2024

Hi everyone

Could you please give it a try using the image tag 2.14.1-debian-12-r1? We included the missing Python module on this image revision.
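
One way to confirm this from inside the container is to ask Python which distribution versions are installed. This is only a sketch: the helper name installed_version is hypothetical, and the two distribution names checked (google-cloud-storage and google-auth, the PyPI distribution that provides google.auth) are an assumption about what the fix requires. It uses the standard library's importlib.metadata.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust to whatever the image should ship.
    for dist in ("google-cloud-storage", "google-auth"):
        print(dist, installed_version(dist) or "NOT INSTALLED")
```

If both lines print a version number rather than NOT INSTALLED, the import from the original traceback should no longer fail.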

@carrodher carrodher added in-progress and removed on-hold Issues or Pull Requests with this label will never be considered stale labels Jul 1, 2024
@github-actions github-actions bot assigned andresbono and unassigned carrodher Jul 1, 2024
@carrodher carrodher assigned juan131 and unassigned andresbono Jul 1, 2024
@juan131
Contributor

juan131 commented Jul 15, 2024

I'm closing this issue given that we included the missing pip module in the 2.14.1-debian-12-r1 revision. Please reopen it if you require further assistance.

6 participants