Allow defining a bigquery_table schema directly #910

Closed
dasch opened this issue Jan 3, 2018 · 18 comments

Comments

@dasch
Contributor

dasch commented Jan 3, 2018

Currently, one must write BigQuery table schemas as JSON, typically in a separate file that's loaded with file(). This is a rather odd user experience, as the rest of the table and most other resources in Terraform are defined using HCL.
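For reference, a minimal sketch of the setup described above, written in current Terraform syntax. The dataset ID, table ID, and the schema.json file name are placeholders assumed for illustration, not taken from this issue:

```hcl
resource "google_bigquery_table" "sales" {
  dataset_id = "my_dataset" # placeholder dataset ID
  table_id   = "sales"      # placeholder table ID

  # The schema lives outside the configuration as a JSON array of field
  # definitions. "schema.json" is an assumed file name.
  schema = file("${path.module}/schema.json")
}
```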

I'd like to suggest that some DSL be added to the resource that allows defining the schema of the table. I'm not terribly opinionated about how that DSL should look, as long as I don't have to write JSON.

Affected Resource(s)

  • google_bigquery_table
@danawillow
Contributor

Hey @dasch, the reason behind this is that BigQuery schemas can have arbitrary depth (i.e. a schema can contain another schema: https://godoc.org/google.golang.org/api/bigquery/v2#TableFieldSchema), which isn't something that HCL supports. See hashicorp/terraform#13743 for the original discussion about this.

Defining another language to support just this use case isn't really a sustainable option when JSON works perfectly well, so I'm going to go ahead and close this but thanks @dasch for the suggestion and feel free to comment if there's something I've missed :)

@dasch
Contributor Author

dasch commented Jan 5, 2018

Do we need to support all schemas if there's still a JSON fallback option? It would be a much better UX if there was at least support for schemas that can be represented in HCL...

@danawillow
Contributor

Are you saying that BigQuery schemas can be represented in HCL? HCL doesn't parse arbitrary text, it has to be part of a specific schema that we specify in the resource definition. Since this field is self-referential (a TableFieldSchema contains a list of other TableFieldSchemas), there's no way to represent it in the resource definition. If you have a workaround I'd be happy to hear it though!

@dasch
Contributor Author

dasch commented Jan 8, 2018

If the HCL version simply disallows nested schemas it wouldn't be a problem, right? As long as there's a JSON fallback you'd be able to cover most use cases with a nice DSL instead of forcing everyone to use JSON.

@danawillow
Contributor

If you have a proposal for a "nice DSL" I'd be happy to hear it. It's not really in our best interest to come up with a new language, along with a parser/compiler for it, just to support one resource for people who don't wish to use JSON.

One thing we could do is add a schema field that lets you define a flat list of TableFieldSchemas with no nesting, meaning people only need to fall back to JSON if they need nested fields. I'll reopen this issue in case anybody wants to take that on, but at least right now it won't be a high priority for us since there's a perfectly usable workaround.

@danawillow danawillow reopened this Jan 8, 2018
@dasch
Contributor Author

dasch commented Jan 9, 2018

My preference would be for something like this:

resource "google_bigquery_table" "sales" {
  dataset_id = "${google_bigquery_dataset.default.dataset_id}"
  table_id   = "sales"

  field {
    name = "product_id"
    type = "INT64"
    mode = "REQUIRED"

    description = "The id of the product that was sold."
  }

  field {
    name = "qty"
    type = "INT64"
  }
}

@dasch
Contributor Author

dasch commented Jan 9, 2018

Note that I'm not suggesting a new language that needs to be parsed, just HCL support for defining simple, non-nested fields.

@paddycarver
Contributor

I think my concern would be that allowing for inline definitions of the simple case inevitably will lead to bug reports where people want to define nested schemas in HCL.

@dasch
Contributor Author

dasch commented Mar 1, 2018

Sure, that's probably inevitable – but my guess would be that most TF resources are less powerful than tooling specific to the provider. It's not reasonable to expect TF to be able to handle every single nuance, and since there's a fallback that offers the functionality I think it's fine.

I think TF is all about a unified, great user experience. Currently, my plan output is just a long line of escaped JSON. It's worthless as a way to build confidence in the change I've made. Seeing each field separately would be a huge benefit.

@dasch
Contributor Author

dasch commented May 7, 2018

Any update on this?

@danawillow
Contributor

Hi @dasch, are you looking for feedback on your idea or on #1113? In that PR I see a bunch of checkboxes, only a few of which were checked. Were you planning on putting the rest of the checkboxes into that PR, or following up with a separate one? I think we just didn't realize you were waiting on us for something.

@dasch
Contributor Author

dasch commented May 8, 2018

@danawillow I was unsure whether the PR would be accepted, so I paused work.

@paddycarver
Contributor

In theory, I suppose you could make this a data source with an as_json computed property that returns the TableFieldSchema as JSON, and then each TableFieldSchema could have a fields array that accepts a list of that JSON... And that may be a hack around the problem?
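One way to picture this idea, purely as a hypothetical sketch: the data source name google_bigquery_table_field_schema and its name and type arguments do not exist in the Google provider and are invented here to show the shape of the proposal. Only as_json and fields come from the comment above.

```hcl
# HYPOTHETICAL: this data source is not part of the Google provider.
data "google_bigquery_table_field_schema" "city" {
  name = "city"
  type = "STRING"
}

data "google_bigquery_table_field_schema" "address" {
  name = "address"
  type = "RECORD"

  # Nesting is expressed by passing the rendered JSON of the child fields,
  # so the resource schema itself never has to be self-referential.
  fields = [
    data.google_bigquery_table_field_schema.city.as_json,
  ]
}

resource "google_bigquery_table" "customers" {
  dataset_id = "my_dataset" # placeholder dataset ID
  table_id   = "customers"  # placeholder table ID

  # Wrap the top-level field definition in a JSON array.
  schema = "[${data.google_bigquery_table_field_schema.address.as_json}]"
}
```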

modular-magician pushed a commit to modular-magician/terraform-provider-google that referenced this issue Sep 27, 2019
@umairidris
Contributor

umairidris commented Mar 12, 2020

FYI for those looking to solve this today, there is a way to define it mostly in HCL:

resource "google_bigquery_table" "foo" {
  dataset_id = "foo"
  table_id   = "foo"

  schema = jsonencode({
      foo = "bar"
  })
}

@AlfatahB
Contributor

b/262200574

@AlfatahB
Contributor

I've looked into this issue, and it seems @dasch wanted support for a new "field" field that defines the schema of google_bigquery_table directly in the Terraform config, without using a JSON file in the schema field. Here are the details I found:

  • Currently, the schema of google_bigquery_table can be defined as a JSON file using the schema field.
  • Apart from this, the jsonencode function can also be used to define the schema of google_bigquery_table. I tried the following config file:
resource "google_bigquery_dataset" "default" {
  dataset_id    = "my_dataset"
  friendly_name = "foo"
  description   = "bar"
  location      = "asia-northeast1"
}

resource "google_bigquery_table" "default" {
  dataset_id          = google_bigquery_dataset.default.dataset_id
  table_id            = "bar"
  deletion_protection = false

  schema = jsonencode([
    {
      name = "field1"
      type = "STRING"
      mode = "NULLABLE"
      description = "The Permalink"
    },
    {
      name = "field2"
      type = "STRING"
      mode = "NULLABLE"
      description = "The second field"
    }
  ])

}
  • This issue appears to aim at adding a new "field" field for defining the schema in the Terraform config, as described in this comment. The problem is that it might not be able to express nested schemas, so anyone who needs a nested schema would still have to use the schema field.

As I see it, this issue has two possible solutions:

  1. Not introducing the new "field" field in google_bigquery_table. The jsonencode function, as mentioned in the points above, already gives a result similar to what a new "field" field would provide.
  2. Adding a new "field" field to google_bigquery_table that conflictsWith the schema field. This could help define relatively simple schemas (without any nested schema), but that restriction is the limitation of this solution.

In my opinion, we should go with the jsonencode function, as it works fine (including with nested fields) and provides a way to define the schema of google_bigquery_table in the Terraform config file.
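As an illustration of the nested case, here is a minimal sketch of a jsonencode schema with one RECORD field. The dataset, table, and field names are placeholders assumed for the example, not taken from this thread:

```hcl
resource "google_bigquery_table" "orders" {
  dataset_id          = "my_dataset" # placeholder dataset ID
  table_id            = "orders"     # placeholder table ID
  deletion_protection = false

  schema = jsonencode([
    {
      name = "order_id"
      type = "STRING"
      mode = "REQUIRED"
    },
    {
      # A RECORD field carries its own list of child fields, which is the
      # self-referential part that a fixed resource schema cannot express.
      name = "customer"
      type = "RECORD"
      mode = "NULLABLE"
      fields = [
        { name = "name", type = "STRING", mode = "NULLABLE" },
        { name = "email", type = "STRING", mode = "NULLABLE" },
      ]
    },
  ])
}
```

Because jsonencode takes an ordinary HCL value, the nesting depth is limited only by what is written in the expression.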

@melinath
Collaborator

Closing this issue due to the presence of a clean workaround with jsonencode. If there is a need for a different solution please open a new issue describing the problem & proposed solution in detail. Thanks!

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 13, 2023