Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataflow flex-templates resource #3520

Closed
wants to merge 11 commits into from

Conversation

geojaz
Copy link
Contributor

@geojaz geojaz commented May 15, 2020

Release Note Template for Downstream PRs (will be copied)

`google_dataflow_flex_template_job`

We'd like to get the beta provider for terraform to support Dataflow Flex-Templates. This is based on the hand-written dataflow_job resource and appears to work for flex-template CRUD activities.

I will continue to iterate on this next week with some tests and to finish off the docs. I'll ping whn it's good to be reviewed.

FYI @pupamanyu @WillBeebe @cxhercules

[drive-by edit by danawillow:] Fixes hashicorp/terraform-provider-google#6656

@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I have detected that you are a community contributor, so your PR will be assigned to someone with a commit-bit on this repo for initial review.

Thanks for your contribution! A human will be with you soon.

@rileykarson, please review this PR or find an appropriate assignee.

@geojaz geojaz force-pushed the dataflex branch 2 times, most recently from a8e3101 to de0fd40 Compare May 18, 2020 17:12
@geojaz
Copy link
Contributor Author

geojaz commented May 18, 2020

Upon further thinking, I believe it makes most sense to combine resource_dataflow_job and resource_dataflow_flex_template_job, since they hit the same GCP endpoint (with different params).... I'll refactor the new logic to pull it into resource_dataflow_job.

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 144 insertions(+), 19 deletions(-))
Terraform Beta: Diff ( 4 files changed, 144 insertions(+), 19 deletions(-))

@geojaz geojaz force-pushed the dataflex branch 2 times, most recently from 5014a12 to a99aacb Compare May 18, 2020 20:48
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))
Terraform Beta: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))

1 similar comment
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))
Terraform Beta: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))

Copy link
Member

@rileykarson rileykarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind showing me how Flex Templates end up using the same endpoint? It seems they've got a different gcloud resource (which are normally strongly linked to endpoints) as well as this CURL example which appears to hit a flex-specific endpoint: https://dataflow.googleapis.com/v1b3/projects/$PROJECT/locations/$LOCATION/flexTemplates

@geojaz geojaz closed this May 18, 2020
@geojaz geojaz reopened this May 18, 2020
@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I have detected that you are a community contributor, so your PR will be assigned to someone with a commit-bit on this repo for initial review.

Thanks for your contribution! A human will be with you soon.

@emilymye, please review this PR or find an appropriate assignee.

@geojaz
Copy link
Contributor Author

geojaz commented May 18, 2020

@rileykarson You're right, I was having a brain fart. I've retracted my proposal to combine these resources and will finish off resource_dataflex_flex_template_job.

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))
Terraform Beta: Diff ( 4 files changed, 163 insertions(+), 19 deletions(-))

@geojaz geojaz changed the title [WIP] Dataflow flex-templates resource Dataflow flex-templates resource May 19, 2020
@geojaz
Copy link
Contributor Author

geojaz commented May 19, 2020

@rileykarson @emilymye I'd love to get some feedback on the PR now. I can see that I have a failure in the cloudbuild trigger, but I'm unable to see what it is.

Copy link
Member

@rileykarson rileykarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking strong- I left some nitpicks and comments about where we'll need version guards. I think we need to pass generation for the downstream to update and for me to run the tests.

if err != nil {
return err
}
d.SetId(job.Id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious- is the id of the job expected to change here?

third_party/terraform/utils/provider.go.erb Outdated Show resolved Hide resolved
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 64 insertions(+))
Terraform Beta: Diff ( 5 files changed, 322 insertions(+))

@modular-magician
Copy link
Collaborator

Oops! It looks like you're using an unknown release-note type in your changelog entries:

  • REPLACEME

Please only use the types listed in https://github.com/GoogleCloudPlatform/magic-modules/blob/master/.ci/RELEASE_NOTES_GUIDE.md.

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 65 insertions(+))
Terraform Beta: Diff ( 5 files changed, 323 insertions(+))

@modular-magician
Copy link
Collaborator

Oops! It looks like no changelog entry is attached to this PR. Please include a release note block in the PR body, as described in https://github.com/GoogleCloudPlatform/magic-modules/blob/master/.ci/RELEASE_NOTES_GUIDE.md:

```release-note:TYPE
Release note
```

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 65 insertions(+))
Terraform Beta: Diff ( 5 files changed, 294 insertions(+))

@aaltay
Copy link

aaltay commented May 21, 2020

Thank you for adding me. Do you want me to review something specific?

Adding, people who directly work on templates: @azurezyq @arvindram03

Copy link
Member

@rileykarson rileykarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple schema issues right now. They're the reason for the failed trigger.

--- FAIL: TestProvider (0.04s)
    provider_test.go:377: err: 1 error occurred:
        	* resource google_dataflow_flex_template_job: No Update defined, must set ForceNew on: []string{"on_delete", "labels"}

ForceNew: true,
},

"on_delete": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we want a blank update method defined. That will make Terraform persist the value for on_delete into state. If there's no method at all, it requires this field to recreate the resource.

@emilymye emilymye removed their request for review May 29, 2020 17:33
Copy link
Member

@rileykarson rileykarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind rebasing on top of master? Something changed between the commit this was created at and now that causes our CI to fail, probably a setting that's encoded in a file inside the repo. I've had it happen on a few of my PRs as well.

@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I have detected that you are a community contributor, so your PR will be assigned to someone with a commit-bit on this repo for initial review.

Thanks for your contribution! A human will be with you soon.

@danawillow, please review this PR or find an appropriate assignee.

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform GA: Diff ( 4 files changed, 65 insertions(+))
Terraform Beta: Diff ( 5 files changed, 295 insertions(+))

@modular-magician
Copy link
Collaborator

I have triggered VCR tests based on this PR's diffs. See the results here: "https://ci-oss.hashicorp.engineering/viewQueued.html?itemId=122436"

@danawillow danawillow removed their request for review June 24, 2020 17:05
return `
<<EOF
{
"name": "Streaming Beam SQL",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a complete flex template? I get the following error after making the changes noted above: Error: googleapi: Error 400: (165707124cd2c54b): Docker image not specified in the template file., badRequest


The Dataflow resource is considered 'existing' while it is in a nonterminal state. If it reaches a terminal state (e.g. 'FAILED', 'COMPLETE', 'CANCELLED'), it will be recreated on the next 'apply'. This is as expected for jobs which run continuously, but may surprise users who use this resource for other kinds of Dataflow jobs.

A Dataflow job which is 'destroyed' may be "cancelled" or "drained". If "cancelled", the job terminates - any data written remains where it is, but no new data will be processed. If "drained", no new data will enter the pipeline, but any data currently in the pipeline will finish being processed. The default is "cancelled", but if a user sets `on_delete` to `"drain"` in the configuration, you may experience a long wait for your `terraform destroy` to complete.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could link to : https://cloud.google.com/dataflow/docs/guides/stopping-a-pipeline for effects of cancel vs drain.

I do not know if destroyed is TF terminology. Dataflow term for this is stopping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.


[ To Come ...]
## Note on "destroy" / "apply"
There are many types of Dataflow jobs. Some Dataflow jobs run constantly, getting new data from (e.g.) a GCS bucket, and outputting data continuously. Some jobs process a set amount of data then terminate. All jobs can fail while running due to programming errors or other issues. In this way, Dataflow jobs are different from most other Terraform / Google resources.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dataflow terminology for these two types of jobs are streaming and batch jobs respectively.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to keep this message as-is for consistency with the existing resource at https://www.terraform.io/docs/providers/google/r/dataflow_job.html. If you're able to PR a fix (or just file an issue that would be a big help!

## Note on "destroy" / "apply"
There are many types of Dataflow jobs. Some Dataflow jobs run constantly, getting new data from (e.g.) a GCS bucket, and outputting data continuously. Some jobs process a set amount of data then terminate. All jobs can fail while running due to programming errors or other issues. In this way, Dataflow jobs are different from most other Terraform / Google resources.

The Dataflow resource is considered 'existing' while it is in a nonterminal state. If it reaches a terminal state (e.g. 'FAILED', 'COMPLETE', 'CANCELLED'), it will be recreated on the next 'apply'. This is as expected for jobs which run continuously, but may surprise users who use this resource for other kinds of Dataflow jobs.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Replace for other kinds of Dataflow jobs. with for Dataflow batch jobs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Type: schema.TypeString,
ValidateFunc: validation.StringInSlice([]string{"cancel", "drain"}, false),
Optional: true,
Default: "drain",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc below says the default is cancel. (The default is "cancelled",)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to cancel.

d.SetId("")
return nil
}
d.SetId(job.Id)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this line needed L127, read this info from this field already?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't hurt, but doesn't do anything- removed.

@rileykarson
Copy link
Member

I copied the changes from this PR into #3772, superseding this one. @geojaz do you mind responding to the googlebot comment when you have the chance there? I squashed your commits into a single one, for ease of reviewability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dataflow flex templates
7 participants