
google_dataflow_flex_template_job cannot read existing pipeline's machine_type and subnetwork fields #19909

Open
yeweidaniel opened this issue Oct 17, 2024 · 3 comments
Labels: bug, forward/review, service/dataflow

yeweidaniel commented Oct 17, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version & Provider Version(s)

Terraform v0.14.9

  • provider registry.terraform.io/hashicorp/google v4.78.0
  • provider registry.terraform.io/hashicorp/google-beta v4.78.0

Affected Resource(s)

google_dataflow_flex_template_job

Terraform Configuration

# Generates random UUID to be used as unique identifier for the dataflow job.
# 'keepers' consists of values that, when changed, will trigger recreation of the resource.
resource "random_uuid" "pipeline_name_suffix_{{$source_resource}}" {
  keepers = {
    mapping_config           = "{{.mapping_config}}"
    import_root              = "{{.import_root}}"
    container_spec_gcs_path  = "{{.template_path}}"
    auto_retry_feature_flags = "{{index .feature_flags "enable_auto_retry_error_handling"}}"
  }
}

resource "google_dataflow_flex_template_job" "streaming_pipeline_{{$source_resource}}" {
    provider                = google-beta
    project                 = module.project.project_id
    name                    = "..."
    container_spec_gcs_path = "{{.template_path}}"
    parameters = {
        experiments           = "{{$merged_experiments}}"
        mappingConfig         = "{{.mapping_config}}"
        importRoot            = "{{.import_root}}"
        serviceAccount        = google_service_account.dataflow_runner.email
        subnetwork            = "{{.subnetwork}}"
        usePublicIps          = false
        maxNumWorkers         = "{{.max_num_workers}}"
        enableStreamingEngine = true
        {{if eq $prime_enabled false}}
          workerMachineType   = "{{.worker_machine_type}}"
        {{end}}
        featureFlags          = "{{$feature_flags_json}}"
        environmentVariables  = jsonencode({
            "MAPPING_VERSION":                      "{{.mapping_version}}",
            "DATA_PROJECT_ID":                      "{{.prefix}}-{{.env}}-data",
            "ENV":                                  "{{.env}}",
            "PIPELINE_TYPE":                        "harmonization",
            "DATA_SOURCE_TYPE":                     "hl7v2",
            "DATAFLOW_JOB_TYPE":                    "streaming",
        })
    }
    on_delete = "drain"
    region = "{{.dataflow_location}}"
    depends_on = [
      random_uuid.pipeline_name_suffix_{{$source_resource}}
    ]
}

Debug Output

No response

Expected Behavior

The pipeline should not restart because we haven't changed it. However, it seems Terraform can't read back the machine_type and subnetwork fields from the existing pipeline it created.

Actual Behavior

  # google_dataflow_flex_template_job.streaming_pipeline will be updated in-place
  ~ resource "google_dataflow_flex_template_job" "streaming_pipeline" {
        ...
        machine_type = "n1-highmem-4" -> null
        subnetwork   = "https://www.googleapis.com/compute/v1/projects/.../subnetworks/streaming-subnet" -> null
        # (16 unchanged attributes hidden)
    }

Steps to reproduce

Run terraform apply twice; the second apply restarts the pipeline despite no configuration changes, as shown below.
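
A minimal sequence, assuming the configuration above has already been rendered from its Go template:

terraform apply   # first run: creates the Dataflow Flex Template job
terraform apply   # second run: plans the in-place update shown above and restarts the pipeline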

Important Factoids

No response

References

No response

ggtisc (Collaborator) commented Oct 29, 2024

Hi @yeweidaniel!

We are trying to replicate this issue, but it is not possible due to the lack of information. Please share examples, without sensitive information, that we can use to create these resources.

damondouglas commented

Good day, @yeweidaniel. The parameters argument in the google_dataflow_flex_template_job resource block is reserved for template-specific parameters, not for Dataflow pipeline options such as subnetwork, service account, etc.
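
For illustration, a minimal sketch of that separation, using the top-level arguments the google-beta provider exposes for these pipeline options (machine_type, subnetwork, service_account_email, ip_configuration, max_workers, enable_streaming_engine). The gs:// paths, region, and subnetwork values below are placeholders, not taken from this issue:

# Hypothetical sketch: pipeline options moved out of `parameters` and onto
# the resource's top-level arguments. All values are placeholders.
resource "google_dataflow_flex_template_job" "streaming_pipeline" {
  provider                = google-beta
  project                 = module.project.project_id
  name                    = "streaming-pipeline"
  region                  = "us-central1"
  container_spec_gcs_path = "gs://my-bucket/templates/template.json"

  # Dataflow pipeline options: top-level arguments, not `parameters`.
  service_account_email   = google_service_account.dataflow_runner.email
  subnetwork              = "regions/us-central1/subnetworks/streaming-subnet"
  ip_configuration        = "WORKER_IP_PRIVATE"
  machine_type            = "n1-highmem-4"
  max_workers             = 10
  enable_streaming_engine = true

  # Template-specific parameters only.
  parameters = {
    mappingConfig = "gs://my-bucket/mapping/config.textproto"
    importRoot    = "gs://my-bucket/import"
  }

  on_delete = "drain"
}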

yeweidaniel (Author) commented

> The parameters argument in the google_dataflow_flex_template_job resource block is reserved for template-specific parameters, not for Dataflow pipeline options such as subnetwork, service account, etc.

Setting those fields in the parameters argument used to work for us and our customers. We recently started getting errors like:

Error: googleapi: Error 400: The template parameters are invalid. Details:
workerMachineType: Runtime parameter workerMachineType should not be specified in both parameters field and environment field. Specifying runtime parameters in environment field is recommended.

IIUC this is a breaking change. Our customers depend on Terraform configs specified this way. Given this is a GA service and there were no MSAs around it, can we roll back this change?
