cannot create dataflow jobs with the enableStreamingEngine boolean set #8649

Closed
n-oden opened this issue Mar 10, 2021 · 1 comment · Fixed by GoogleCloudPlatform/magic-modules#4585, #8670 or hashicorp/terraform-provider-google-beta#3049
Labels: enhancement, forward/review, service/dataflow


n-oden commented Mar 10, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

$ terraform -v
Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/google v3.59.0
+ provider registry.terraform.io/hashicorp/google-beta v3.59.0

Affected Resource(s)

  • google_dataflow_job

Terraform Configuration Files

resource "google_pubsub_topic" "test-topic" {
  name = "test-metrics-sink"
}

resource "google_pubsub_subscription" "test-sub" {
  name  = "test-metrics-source"
  topic = "projects/myproject/topics/sourcetopic" // this should be a topic with some traffic
  expiration_policy { ttl = "86400s" }
  message_retention_duration = "600s"
}

resource "google_dataflow_job" "test_job" {
  name              = "test-ps2ps-tf"
  template_gcs_path = "gs://dataflow-templates/2021-02-15-00_RC00/Cloud_PubSub_to_Cloud_PubSub"
  temp_gcs_location = "gs://mybucket/temp"
  zone              = "us-east1-b"
  max_workers       = 2
  machine_type      = "n1-standard-2"
  on_delete         = "drain"
  additional_experiments = [
    "enable_windmill_service",
    "enable_streaming_engine",
  ]

  labels = {
    # Dataflow sets these labels automatically when it detects that the job uses a
    # Google-provided template. If you don't specify them manually, Terraform thinks you've
    # removed them and redeploys the job on every apply, regardless of whether anything changed.
    goog-dataflow-provided-template-name    = "cloud_pubsub_to_cloud_pubsub"
    goog-dataflow-provided-template-version = "2021-02-15-00_rc00"
  }

  parameters = {
    inputSubscription = google_pubsub_subscription.test-sub.id
    outputTopic       = google_pubsub_topic.test-topic.id
  }
}

Debug Output

https://gist.github.com/n-oden/d5fd36c7b54fb68a50afce095a9a591b

Expected Behavior

Terraform should launch a job using the google pubsub-to-pubsub template, and the streaming engine feature should be enabled for the job.

Terraform is not really misbehaving here: the API request it makes to dataflow.googleapis.com is correct for the configuration above. The problem is that the provider offers no way to set an important boolean in the JSON document that gets posted to /v1b3/projects/myproject/locations/us-east1/templates. Details follow below.

Actual Behavior

The job created by terraform does not have streaming engine enabled, and worse yet does not actually process any data.

The issue here appears to be that streaming engine can no longer be enabled via the additional_experiments list: there is now a first-class configuration option in the environment section of the JSON document that is posted to Google to create a new job.

If you create a Dataflow job from a Google-provided template with the gcloud CLI, the --enable-streaming-engine flag causes the key "enableStreamingEngine": true to be added to the environment object in the POST data.

There is currently no way to do this with Terraform: the google_dataflow_job resource has no enable_streaming_engine argument, and passing "enable_streaming_engine" as a string in the additional_experiments list, as in the configuration above, produces a broken job.
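For illustration, here is a minimal sketch of the configuration this request is asking for. It assumes the resource gains a boolean argument named enable_streaming_engine (the name used in the commits later referenced on this issue) and that the provider maps it to environment.enableStreamingEngine in the request body. The project, bucket, topic, and subscription names are the same placeholders used elsewhere in this report.

```hcl
# Sketch only: enable_streaming_engine is an assumed argument name, expected to be
# sent as environment.enableStreamingEngine in the templates POST body.
resource "google_dataflow_job" "test_job" {
  name              = "test-ps2ps-tf"
  template_gcs_path = "gs://dataflow-templates/2021-02-15-00_RC00/Cloud_PubSub_to_Cloud_PubSub"
  temp_gcs_location = "gs://mybucket/temp"
  zone              = "us-east1-b"
  max_workers       = 2
  machine_type      = "n1-standard-2"
  on_delete         = "drain"

  # Assumed first-class boolean, replacing the additional_experiments entries.
  enable_streaming_engine = true

  labels = {
    goog-dataflow-provided-template-name    = "cloud_pubsub_to_cloud_pubsub"
    goog-dataflow-provided-template-version = "2021-02-15-00_rc00"
  }

  parameters = {
    inputSubscription = "projects/myproject/subscriptions/test-metrics-source"
    outputTopic       = "projects/myproject/topics/test-metrics-sink"
  }
}
```

With an argument like this, the additional_experiments entries for streaming engine would presumably no longer be needed, since the setting would travel in the environment object the same way the gcloud flag sends it.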

Steps to Reproduce

  1. terraform apply

To see what should happen, you can use the gcloud cli tool:

gcloud --log-http dataflow jobs run test-ps2ps \
  --enable-streaming-engine \
  --gcs-location gs://dataflow-templates/latest/Cloud_PubSub_to_Cloud_PubSub \
  --parameters=inputSubscription=projects/myproject/subscriptions/test-metrics-source,outputTopic=projects/myproject/topics/test-metrics-sink \
  --staging-location=gs://mybucket/staging/

You'll see in the --log-http output that the CLI makes the following API call:

==== request start ====
uri: https://dataflow.googleapis.com/v1b3/projects/myproject/locations/us-central1/templates?alt=json
method: POST
== headers start ==
accept: application/json
accept-encoding: gzip, deflate
authorization: --- Token Redacted ---
content-length: 385
content-type: application/json
== headers end ==
== body start ==
{
  "environment": {
    "enableStreamingEngine": true,
    "tempLocation": "gs://mybucket/staging/"
  },
  "gcsPath": "gs://dataflow-templates/latest/Cloud_PubSub_to_Cloud_PubSub",
  "jobName": "test-ps2ps2",
  "location": "us-central1",
  "parameters": {
    "inputSubscription": "projects/myproject/subscriptions/test-metrics-source",
    "outputTopic": "projects/myproject/topics/test-metrics-sink"
  }
}
== body end ==

Important Factoids

To my intense aggravation, the enableStreamingEngine key is not documented in Google's official docs for the environment object (https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs#environment), but the gcloud tool is absolutely using it. :(

@ghost ghost added the bug label Mar 10, 2021
@venkykuberan venkykuberan self-assigned this Mar 10, 2021
@venkykuberan venkykuberan removed their assignment Mar 10, 2021
n-oden added a commit to odenio/magic-modules that referenced this issue Mar 11, 2021
c2thorn pushed a commit to GoogleCloudPlatform/magic-modules that referenced this issue Mar 11, 2021
* Add enable_streaming_engine argument to google_dataflow_job

This should address hashicorp/terraform-provider-google#8649

* address PR feedback
modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Mar 11, 2021
…p#4585)

* Add enable_streaming_engine argument to google_dataflow_job

This should address hashicorp#8649

* address PR feedback

Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit to modular-magician/terraform-provider-google-beta that referenced this issue Mar 11, 2021
…p#4585)

* Add enable_streaming_engine argument to google_dataflow_job

This should address hashicorp/terraform-provider-google#8649

* address PR feedback

Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit that referenced this issue Mar 11, 2021
…8670)

* Add enable_streaming_engine argument to google_dataflow_job

This should address #8649

* address PR feedback

Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit to hashicorp/terraform-provider-google-beta that referenced this issue Mar 11, 2021
…3049)

* Add enable_streaming_engine argument to google_dataflow_job

This should address hashicorp/terraform-provider-google#8649

* address PR feedback

Signed-off-by: Modular Magician <magic-modules@google.com>

ghost commented Apr 11, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Apr 11, 2021
@github-actions github-actions bot added forward/review In review; remove label to forward service/dataflow labels Jan 14, 2025