Requester pays access isn't working #6179

Closed
ldgauthier opened this issue Sep 24, 2019 · 7 comments · Fixed by #7700

@ldgauthier
Contributor

Using the --gcs-project-for-requester-pays argument to access a requester-pays bucket, I tried passing broad-dsde-methods, "broad-dsde-methods", and 222581509023 as the value, but no dice. The log shows that the engine is reading the argument, but it doesn't seem to be passed to the cloud utils correctly.

14:23:16.753 INFO  PrintReads - GCS max retries/reopens: 20
14:23:16.753 INFO  PrintReads - Requester pays: enabled. Billed to: broad-dsde-methods
14:23:16.753 INFO  PrintReads - Initializing engine
14:23:18.501 INFO  PrintReads - Shutting down engine
[September 23, 2019 2:23:18 PM EDT] org.broadinstitute.hellbender.tools.PrintReads done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=375914496
code:      400
message:   Bucket is requester pays bucket but no user project provided.
reason:    required
location:  null
retryable: false
com.google.cloud.storage.StorageException: Bucket is requester pays bucket but no user project provided.

gsutil -u 222581509023 stat gs://fc-secure-2011b97c-a9c9-4a13-8911-f3833be31253/CCDG_WashU_CVD_EOCAD_METSIM_WGS_all/2893803451.cram works and gsutil stat gs://fc-secure-2011b97c-a9c9-4a13-8911-f3833be31253/CCDG_WashU_CVD_EOCAD_METSIM_WGS_all/2893803451.cram produces

BadRequestException: 400 Bucket is requester pays bucket but no user project provided.
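
For context, the -u flag names the project to bill for the request. At the Cloud Storage JSON API level this corresponds to the userProject query parameter, so the two commands above differ roughly as the two URLs below do. This is a minimal Python sketch: object_metadata_url is a hypothetical helper written for illustration (it is not part of gsutil or GATK), and the bucket and object names are placeholders.

```python
from urllib.parse import quote, urlencode

# Root of the Cloud Storage JSON API (v1).
API_ROOT = "https://storage.googleapis.com/storage/v1"


def object_metadata_url(bucket, obj, user_project=None):
    """Build an object-metadata URL, optionally naming a project to bill.

    Hypothetical helper for illustration only.
    """
    # Object names are a single path segment, so "/" must be percent-encoded.
    url = f"{API_ROOT}/b/{quote(bucket, safe='')}/o/{quote(obj, safe='')}"
    if user_project:
        # Requester-pays buckets reject requests that omit this parameter.
        url += "?" + urlencode({"userProject": user_project})
    return url


# Placeholder names; the project number is the one tried in this issue.
with_project = object_metadata_url("example-bucket", "dir/sample.cram", "222581509023")
without_project = object_metadata_url("example-bucket", "dir/sample.cram")
print(with_project)
print(without_project)
```

Only the first URL carries a billing project, which matches the behavior reported above: the request without one is rejected with a 400.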

I tried the above variations on export GOOGLE_CLOUD_PROJECT= in the shell, but that didn't change things. It's possible I missed some combination of the above, but at the very least our docs need clarification.
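
One note on the variations tried: in a POSIX shell, the quoted and unquoted forms of the project name reach the program as the same argument, so those two attempts were effectively identical (consistent with the log line "Billed to: broad-dsde-methods"). A small Python sketch using shlex, which splits a line using shell-like rules, illustrates this. The command lines are abbreviated (input and output arguments omitted), and project_arg is a hypothetical helper.

```python
import shlex

FLAG = "--gcs-project-for-requester-pays"


def project_arg(command_line, flag=FLAG):
    """Return the value that follows `flag` after shell-style splitting.

    Hypothetical helper for illustration only.
    """
    argv = shlex.split(command_line)  # quotes are consumed here, as in a shell
    return argv[argv.index(flag) + 1]


unquoted = project_arg("gatk PrintReads --gcs-project-for-requester-pays broad-dsde-methods")
quoted = project_arg('gatk PrintReads --gcs-project-for-requester-pays "broad-dsde-methods"')
numeric = project_arg("gatk PrintReads --gcs-project-for-requester-pays 222581509023")

print(unquoted == quoted)   # both forms yield the same argument value
print(unquoted == numeric)  # the project number is a genuinely different value
```

So only the project name versus the project number are distinct inputs from the tool's point of view.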

@droazen droazen added this to the GATK-Priority-Backlog milestone Oct 30, 2019
@droazen droazen self-assigned this Nov 25, 2019
@droazen droazen modified the milestones: GATK-Priority-Backlog, GATK-Triaged-Issues Nov 25, 2019
@droazen
Collaborator

droazen commented Jan 8, 2020

Since we can't replicate this issue in newer versions of the GATK, I'm going to close this for now. Feel free to re-open if you encounter this again @ldgauthier

@droazen droazen closed this as completed Jan 8, 2020
@rahulg603

rahulg603 commented Feb 21, 2022

Hi all -- I have recently started observing this exact issue with PrintReads. I have been running it via Cromwell jobs in Terra, and on Saturday this issue cropped up. I was able to recreate it locally by launching the latest GATK Docker image (4.2.5.0) and running gcloud auth login and gcloud auth application-default login. As @ldgauthier found, adding export GOOGLE_CLOUD_PROJECT= does nothing locally.

Did this ever get resolved? Let me know if I'm missing something here.

@droazen
Collaborator

droazen commented Feb 22, 2022

@rahulg603 The last time this was reported, we were unable to reproduce it on our end, and the issue mysteriously "went away" on its own for @ldgauthier. Could you please report whether you're still getting the same error today? Are you able to access the same bucket using gsutil?

@rahulg603

rahulg603 commented Feb 22, 2022

Thanks for your response @droazen -- just tried it and it still does not work for me. I am able to access the bucket fine via gsutil -u {project} as expected.

I wonder if this is some GCP issue, because I am also unable to get Cromwell to pull down files from requester-pays buckets (via Terra, so this should be handled in theory). A colleague hit the same issue when I asked her to run this command on a different r/p bucket using her own billing project and account.

Also, for a different project and bucket, my usual workflow for getting Hail to read from r/p buckets now fails with this same error. Very confused.

EDIT: https://support.terra.bio/hc/en-us/articles/4447388269851 seems to provide the most parsimonious explanation:

It was determined that Google tweaked an error message causing Cromwell not to recognize buckets as requestor pays.

Wonder if something similar is going on with GATK?
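
To illustrate how a reworded error message could break requester-pays detection, here is a minimal Python sketch. It is not Cromwell's or GATK's actual logic. KNOWN_MESSAGE is the message from the log earlier in this thread. HYPOTHETICAL_REWORDED is an invented variant used only for demonstration, since the actual new wording is not quoted here.

```python
# Message text taken from the StorageException in the original report.
KNOWN_MESSAGE = "Bucket is requester pays bucket but no user project provided."

# Invented rewording, for demonstration only (NOT the real changed message).
HYPOTHETICAL_REWORDED = "Bucket is a requester pays bucket but no user project provided."


def is_requester_pays_exact(code, message):
    """Brittle check: requires the exact historical wording."""
    return code == 400 and message == KNOWN_MESSAGE


def is_requester_pays_tolerant(code, message):
    """Looser check: HTTP 400 plus the two key phrases, case-insensitive."""
    text = message.lower()
    return code == 400 and "requester pays" in text and "user project" in text


for msg in (KNOWN_MESSAGE, HYPOTHETICAL_REWORDED):
    print(is_requester_pays_exact(400, msg), is_requester_pays_tolerant(400, msg))
```

The exact-match check stops recognizing the bucket as requester pays as soon as the wording shifts, while the looser check still matches this particular variant. Any check based on message text remains sensitive to further rewording.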

@droazen
Collaborator

droazen commented Feb 24, 2022

Reopening this ticket, as others have encountered this error recently as well

@droazen droazen reopened this Feb 24, 2022
@droazen
Collaborator

droazen commented Feb 24, 2022

@rahulg603 We've confirmed this issue on our end, and have a proposed fix in a branch that we're currently testing. Should be resolved in the next GATK release, which will come out within the next ~week.

@droazen
Collaborator

droazen commented Mar 2, 2022

Update on this: we've submitted a patch to the java-storage-nio library that fixes this issue (googleapis/java-storage-nio#832), and are waiting for Google to merge / release it. Once Google has released a new version of their library we'll upgrade GATK to it and this should be resolved. We still anticipate that this will happen in time for the next GATK release.

droazen added a commit that referenced this issue Mar 9, 2022
… the latest release (#7700)

Update our google-cloud-nio dependency to fix a regression in support for requester pays GCS buckets, and update related Google dependencies as necessary. Added a regression test for requester pays access that would have caught the original issue.

Resolves #6179

Co-authored-by: Louis Bergelson <louisb@broadinstitute.org>