API methods use overall connection project instead of object's project #1133
Comments
Hi @MasterOdin - A clarification: You say that you need to pass in the …
I'm not sure you read my code sample correctly? I am very much passing:

```js
const [datasets] = await conn.getDatasets({ projectId: 'bigquery-public-data' });
```

The part on the dataset object I'm highlighting is its `projectId`. The problem is that for each dataset that is returned, any client method I run on that dataset (e.g. `getTables`) uses the client's project rather than the dataset's. For my above example, if I wanted to get all tables for this dataset in a different project than my connection, I would need to do the following:

```js
const [datasets] = await conn.getDatasets({ projectId: 'bigquery-public-data' });
const dataset = datasets[0];
const [tables] = await dataset.getTables({ projectId: 'bigquery-public-data' });
```

This is true then for the API methods for the objects in …
Thanks for following up! I looked into this more and here's my explanation: you can in fact pass in `projectId`, as you've noted, which then overrides the `projectId` in the request object for that particular call. Because the code is in NodeJS, not TypeScript, this comes up as a type error but still works. I'm writing a quick sample to test what happens when you create a dataset that points to a different project than the client, which may provide a quick fix. However, I do think this has the potential to be a smoother process, and I'm going to look into updating the method. I'm changing this from a bug to a feature request, and will combine it with the other issue.
For a further example, I've got something like the following in my code (though wrapped using …):

```js
const projectId = 'bigquery-public-data';
const client = new BigQuery({ credentials, projectId: 'foo' });
const [datasets] = await client.getDatasets({ projectId });
for (const dataset of datasets) {
  const [tables] = await dataset.getTables({ projectId });
  for (const table of tables) {
    await table.getMetadata({ projectId });
    /* do something with table */
  }
}
```

So the initial `getDatasets` call takes the `projectId` option. For the subsequent calls to `getTables` and `getMetadata`, the same override has to be repeated on every call. From my perspective, …
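Pending a library-level fix, the repetition in the loop above can be cut down with a small wrapper that injects the `projectId` into every call. This is only a sketch, not part of `@google-cloud/bigquery`: `withProjectId` is a hypothetical helper, and it assumes the wrapped methods take an options object as their first parameter.

```js
// Hypothetical helper (not part of @google-cloud/bigquery): wraps an object so
// every method call gets a fixed projectId merged into its options argument.
// An explicit per-call projectId still wins over the injected default.
function withProjectId(obj, projectId) {
  return new Proxy(obj, {
    get(target, prop) {
      const value = target[prop];
      // Pass non-function properties (id, metadata, ...) through unchanged.
      if (typeof value !== 'function') return value;
      return (options = {}, ...rest) =>
        value.call(target, { projectId, ...options }, ...rest);
    },
  });
}
```

With this, each dataset in the loop could be wrapped once (`const ds = withProjectId(dataset, projectId)`) and the repeated option disappears from the `getTables`/`getMetadata` calls; again, a sketch under the stated assumptions only.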
Hi @MasterOdin, I agree with your assessment that the current library operates such that the client's `projectId` is used instead of the object's.

A fairly straightforward way people have handled this is authenticating with a service account that has permissions on more than one project. If this is passed into the client as a credentials object, methods on the client instance should be able to access whatever the service account can access.

This is less relevant in the case of multitenancy with externally hosted data. In that case, there are other approaches through GCP to help coordinate access and data isolation (particularly federated workloads).

If you use Dataflow and are open to TypeScript, you can take advantage of the BigQuery I/O connector in the Apache Beam TypeScript SDK. We have a guide for Java, but not for Node. If that is more your use case, I'd like to know about any blockers using that TypeScript BQ I/O connector. There are some ways to optimize the I/O experience in our own SDK (and we've done this in Java, Go, etc.), but so far we haven't had many requests for the same in NodeJS.
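The "one service account, many projects" approach above can be sketched as follows. `clientsForProjects` is a hypothetical helper, not a library API, and it assumes the service account actually has BigQuery permissions on each listed project; the `BigQuery` constructor is passed in as a parameter so the mapping logic stands on its own.

```js
// Sketch: build one client per project from a single shared service-account
// credential, so each client's default project matches the data it reads.
function clientsForProjects(BigQueryCtor, credentials, projectIds) {
  const clients = {};
  for (const projectId of projectIds) {
    // Same credentials each time; only the default project differs.
    clients[projectId] = new BigQueryCtor({ credentials, projectId });
  }
  return clients;
}
```

Usage would look something like `clientsForProjects(BigQuery, serviceAccountKey, ['foo', 'bigquery-public-data'])`, after which each per-project client resolves datasets against its own project without per-call overrides.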
We have a setup already that allows us to select stuff from other projects, and it is not really related to this issue? The issue remains that fetching schema objects of a dataset from a project that is different from the one used in the client constructor requires passing a `projectId` override to every method call.

I'm not sure how this is relevant, unless you're saying that this SDK will always have this bug and that I'd be better off finding an alternative library to fit my needs.
Thanks for getting back to me - sorry for the misunderstanding! I definitely think this behavior should be addressed, and I appreciate the clarification. I'll be looking into this feature request further to see what the timeline might be for a fix. |
Environment details

- `@google-cloud/bigquery` version: 6.0.0

Steps to reproduce

…

This will throw an error when I run `getTables`, with message `ApiError: Not found: Dataset foo:austin_311`. To get this to work, I need to pass `projectId` to every method call, be it for fetching tables, routines, metadata, etc. This was surprising to me and coworkers, where we had assumed that the methods would use the object's `projectId` versus the overall client's. The console log on the dataset gives me:

…

So the correct `projectId` is recorded as part of `dataset.metadata.id` and in `dataset.metadata.datasetReference.projectId`, but I'm guessing that for the API calls it uses `dataset.bigQuery.projectId` or something, which is different.