
google_bigquery_data_transfer_config, change destination_dataset_id to **Optional** #9450

Closed

@MarcinKowalski101

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Terraform v0.14.6
provider registry.terraform.io/hashicorp/google v3.73.0

Affected Resource(s)

  • google_bigquery_data_transfer_config

Expected Behavior

destination_dataset_id - (Optional) The BigQuery target dataset id.

query_configs = [
  {
    data_source_id         = "scheduled_query"
    location               = "europe-west3"
    display_name           = "DEV call stored procedure without destination"
    schedule               = "every 20 minutes"
    destination_dataset_id = null
    params = {
      query = "call DEV_BQ.merge()"
    }
  }
]

because the flag --target_dataset requires the query to be either DDL or DML (a CALL statement is neither).

If we don't specify a destination table, the query results are saved to a temporary table, so the flag --target_dataset isn't required.

bq query \
  --use_legacy_sql=false \
  --location=europe-west3 \
  --display_name='DEV call stored procedure without destination' \
  --schedule='every 20 minutes' \
  'call DEV_BQ.merge()'
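The query_configs list above presumably feeds a resource via for_each; a minimal sketch of that wiring (the local name and the for_each keying are assumptions, not part of the original report):

```hcl
# Hypothetical wiring for the query_configs list shown above,
# keyed by display_name purely for illustration.
resource "google_bigquery_data_transfer_config" "scheduled_query" {
  for_each = { for c in local.query_configs : c.display_name => c }

  data_source_id         = each.value.data_source_id
  location               = each.value.location
  display_name           = each.value.display_name
  schedule               = each.value.schedule
  destination_dataset_id = each.value.destination_dataset_id
  params                 = each.value.params
}
```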

Actual Behavior

destination_dataset_id - (Required) The BigQuery target dataset id.

query_configs = [
  {
    data_source_id         = "scheduled_query"
    location               = "europe-west3"
    display_name           = "DEV call stored procedure without destination"
    schedule               = "every 20 minutes"
    destination_dataset_id = "DEV_BQ"
    params = {
      query = "call DEV_BQ.merge()"
    }
  }
]

We got the error message:
Dataset specified in the query ('') is not consistent with Destination dataset 'DEV_BQ'.

bq query \
  --use_legacy_sql=false \
  --location=europe-west3 \
  --display_name='DEV call stored procedure without destination' \
  --schedule='every 20 minutes' \
  --target_dataset='DEV_BQ' \
  'call DEV_BQ.merge()'

@edwardmedia edwardmedia self-assigned this Jun 29, 2021
@edwardmedia
Contributor

@MarcinKowalski101 help me understand. How does it work if you don't specify the destination_dataset_id? In which dataset will the temporary table reside?

If we don't specify a destination table, the query results are saved to a temporary table, so the flag --target_dataset isn't required.

@MarcinKowalski101
Author

Hi,
According to the GCP documentation (https://cloud.google.com/bigquery/docs/reference/bq-cli-reference#bq_query), under "Flags and arguments":

The bq query command uses the following flags and arguments:
--destination_table=TABLE
When specified, the query results are saved to TABLE. Specify TABLE in the following format: PROJECT:DATASET.TABLE. If PROJECT is not specified, then the current project is assumed. If the --destination_table flag is not specified, then the query results are saved to a temporary table.

So this means the flag --target_dataset isn't required.

@edwardmedia
Contributor

edwardmedia commented Jul 1, 2021

@MarcinKowalski101 did you run the bq without providing the target_dataset? What was the result?

@MarcinKowalski101
Author

MarcinKowalski101 commented Jul 2, 2021

Hi,
Yes, as I described in the case description:

bq query \
  --use_legacy_sql=false \
  --location=europe-west3 \
  --display_name='DEV call stored procedure without destination' \
  --schedule='every 20 minutes' \
  'call DEV_BQ.merge()'

As you can see, there is no destination_dataset_id = "DEV_BQ". I should also mention that this is a real production use case: I have a few running scheduled queries, created with commands similar to the example above. What I want to achieve is to rewrite my solution in Terraform, but I can't, because this one required parameter would break my production environment.

@edwardmedia
Contributor

edwardmedia commented Jul 2, 2021

@MarcinKowalski101 I'd like to see where the temp table resides when no dataset is specified. Can you post the temp table id?

@MarcinKowalski101
Author

Hi,
Please explain how "Can you post the temp table id?" will solve the bug or bring us closer to solving it.
Please also show where you have already looked to answer your question "I'd like to see where the temp table resides without dataset specified", because I don't want to send you a documentation page you have already verified.

@edwardmedia
Contributor

edwardmedia commented Jul 7, 2021

@MarcinKowalski101 I am trying to understand what you wrote below. I have never seen a table that resides outside a dataset; that is why I wanted to see the table id. Usually it would show us which dataset the table is under. To your request (change destination_dataset_id to Optional): without specifying the dataset, where is the table?

so this means the flag --target_dataset isn't required

@MarcinKowalski101
Author

Ok, we can continue this academic discussion, or we can get back to the main subject and answer the main question: can you fix the bug or not?

@edwardmedia
Contributor

@MarcinKowalski101 to me this is not a bug, and destination_dataset_id cannot be optional. Without the dataset specified, where does the table reside? I could be wrong; if that is the case, please provide the docs and/or steps that show creating a table without a dataset. Does that make sense?

@MarcinKowalski101
Author

I provided all the necessary documentation, and I also consulted this case with GCP support to be 100% sure that this field is not mandatory, meaning it is optional. All the steps to recreate this bug are in the example. So if you don't know how to handle it, please just state in the documentation that Terraform has a limitation in this use case: anyone who wants to call a procedure in a scheduled query can't use Terraform and instead needs to implement it directly from the GCP console.

@edwardmedia
Contributor

@megan07 what do you think about this issue?

@edwardmedia edwardmedia removed their assignment Jul 15, 2021
@megan07
Contributor

megan07 commented Jul 16, 2021

@edwardmedia - thanks for taking a look into this!

@MarcinKowalski101, sorry you're experiencing this issue. We work on numerous different resources each day, and it's sometimes hard to find the exact documentation we need, so the sooner you can provide us with that documentation, the quicker we can help you. After some digging, this looks specific to data transfers of data source type scheduled queries: https://cloud.google.com/bigquery/docs/scheduling-queries. I can make this optional.
Thanks!
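Once the field is made optional, the Expected Behavior config should work as written. A standalone sketch of such a resource, assuming the change ships in a future provider release (the resource name is hypothetical):

```hcl
# Sketch of a scheduled CALL with no destination dataset, assuming
# destination_dataset_id has been made optional in the provider.
resource "google_bigquery_data_transfer_config" "call_proc" {
  data_source_id = "scheduled_query"
  location       = "europe-west3"
  display_name   = "DEV call stored procedure without destination"
  schedule       = "every 20 minutes"

  params = {
    query = "call DEV_BQ.merge()"
  }
}
```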

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 20, 2021