Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Pub/Sub Schema support for changing definitions via revisions #8079

Merged
merged 10 commits into from
Jun 8, 2023
Merged

feat: Add Pub/Sub Schema support for changing definitions via revisions #8079

merged 10 commits into from
Jun 8, 2023

Conversation

kamalaboulhosn
Copy link
Contributor

Allows schema definition to be changed by committing schema revisions. Fixes hashicorp/terraform-provider-google#13997. This replaces #7981. @SarahFrench and @melinath have the most context on this.

If this PR is for Terraform, I acknowledge that I have:

  • Searched through the issue tracker for an open issue that this either resolves or contributes to, commented on it to claim it, and written "fixes {url}" or "part of {url}" in this PR description. If there were no relevant open issues, I opened one and commented that I would like to work on it (not necessary for very small changes).
  • Ensured that all new fields I added that can be set by a user appear in at least one example (for generated resources) or third_party test (for handwritten resources or update tests).
  • Generated Terraform providers, and ran make test and make lint in the generated providers to ensure it passes unit and linter tests.
  • Ran relevant acceptance tests using my own Google Cloud project and credentials (If the acceptance tests do not yet pass or you are unable to run them, please let your reviewer know).
  • Read the Release Notes Guide before writing my release note below.

Release Note Template for Downstream PRs (will be copied)

pubsub: Allowed `definition` of `google_pubsub_schema` to change without deleting and recreating the resource by using schema revisions (https://cloud.google.com/pubsub/docs/schemas#commit-schema-revision)

@modular-magician modular-magician requested a review from shuyama1 June 2, 2023 20:11
@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I've detected that you're a community contributor. @shuyama1, a repository maintainer, has been assigned to assist you and help review your changes.

❓ First time contributing? Click here for more details

Your assigned reviewer will help review your code by:

  • Ensuring it's backwards compatible, covers common error cases, etc.
  • Summarizing the change into a user-facing changelog note.
  • Passes tests, either our "VCR" suite, a set of presubmit tests, or with manual test runs.

You can help make sure that review is quick by running local tests and ensuring they're passing in between each push you make to your PR's branch. Also, try to leave a comment with each push you make, as pushes generally don't generate emails.

If your reviewer doesn't get back to you within a week after your most recent change, please feel free to leave a comment on the issue asking them to take a look! In the absence of a dedicated review dashboard most maintainers manage their pending reviews through email, and those will sometimes get lost in their inbox.


@melinath melinath requested review from melinath and SarahFrench and removed request for shuyama1 June 2, 2023 20:11
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 2 files changed, 92 insertions(+), 2 deletions(-))
Terraform Beta: Diff ( 2 files changed, 92 insertions(+), 2 deletions(-))
TF Conversion: Diff ( 3 files changed, 8 insertions(+), 3 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2767
Passed tests 2473
Skipped tests: 283
Affected tests: 11

Action taken

Found 11 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
TestAccPubsubTopic_pubsubTopicSchemaSettingsExample|TestAccPubsubSchema_pubsubSchemaProtobufExample|TestAccPubsubSchema_pubsubSchemaBasicExample|TestAccBillingSubaccount_basic|TestAccBillingSubaccount_renameOnDestroy|TestAccComputeFirewallPolicyRule_multipleRules|TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example|TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample|TestAccDataSourceGoogleFirebaseAndroidAppConfig|TestAccDataSourceGoogleServiceAccountJwt|TestAccDataSourceAlloydbLocations_basic

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

Tests passed during RECORDING mode:
TestAccPubsubTopic_pubsubTopicSchemaSettingsExample[Debug log]
TestAccPubsubSchema_pubsubSchemaProtobufExample[Debug log]
TestAccPubsubSchema_pubsubSchemaBasicExample[Debug log]
TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example[Debug log]
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample[Debug log]
TestAccDataSourceGoogleServiceAccountJwt[Debug log]
TestAccDataSourceAlloydbLocations_basic[Debug log]

Tests failed during RECORDING mode:
TestAccBillingSubaccount_basic[Error message] [Debug log]
TestAccBillingSubaccount_renameOnDestroy[Error message] [Debug log]
TestAccComputeFirewallPolicyRule_multipleRules[Error message] [Debug log]
TestAccDataSourceGoogleFirebaseAndroidAppConfig[Error message] [Debug log]

Please fix these to complete your PR
View the build log or the debug log for each test

Copy link
Contributor

@SarahFrench SarahFrench left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 This looks good to me, though it would benefit from an acceptance test that reflects that the resource is updatable now!

The google_pubsub_schema resource has generated tests but no handwritten tests, and handwritten tests are the way we test update behaviours. Could you please add a handwritten test to test updating the resource? There is guidance here and preexisting tests that can be referenced.

@kamalaboulhosn
Copy link
Contributor Author

👋 This looks good to me, though it would benefit from an acceptance test that reflects that the resource is updatable now!

@SarahFrench Done.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 4 files changed, 166 insertions(+), 10 deletions(-))
Terraform Beta: Diff ( 4 files changed, 166 insertions(+), 10 deletions(-))
TF Conversion: Diff ( 3 files changed, 8 insertions(+), 3 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2769
Passed tests 2477
Skipped tests: 283
Affected tests: 9

Action taken

Found 9 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
TestAccPubsubSchema_update|TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplate_infoTypeTransformationsUpdate|TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplateSkipCharactersExample|TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplateBasicExample|TestAccComputeFirewallPolicyRule_multipleRules|TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample|TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example|TestAccDataSourceGoogleFirebaseAndroidAppConfig|TestAccDataSourceAlloydbLocations_basic

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

Tests passed during RECORDING mode:
TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplate_infoTypeTransformationsUpdate[Debug log]
TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplateSkipCharactersExample[Debug log]
TestAccDataLossPreventionDeidentifyTemplate_dlpDeidentifyTemplateBasicExample[Debug log]
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample[Debug log]
TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example[Debug log]
TestAccDataSourceGoogleFirebaseAndroidAppConfig[Debug log]
TestAccDataSourceAlloydbLocations_basic[Debug log]

Tests failed during RECORDING mode:
TestAccPubsubSchema_update[Error message] [Debug log]
TestAccComputeFirewallPolicyRule_multipleRules[Error message] [Debug log]

Please fix these to complete your PR
View the build log or the debug log for each test

@SarahFrench
Copy link
Contributor

SarahFrench commented Jun 5, 2023

Here's the error output for TestAccPubsubSchema_update:

    vcr_utils.go:147: Step 3/4 error: After applying this test step and performing a `terraform refresh`, the plan was not empty.
        stdout
        
        
        Terraform used the selected providers to generate the following execution
        plan. Resource actions are indicated with the following symbols:
          ~ update in-place
        
        Terraform will perform the following actions:
        
          # google_pubsub_schema.foo will be updated in-place
          ~ resource "google_pubsub_schema" "foo" {
              ~ definition = <<-EOT
                    syntax = "proto3";
                    message Results {
                    string message_request = 1;
                    string message_response = 2;
                  + string timestamp_request = 3;
                    }
                EOT
                id         = "projects/PROJECT/schemas/tf-test-schema-aqg7zhi9wz"
                name       = "tf-test-schema-aqg7zhi9wz"
                # (2 unchanged attributes hidden)
            }
        
        Plan: 0 to add, 1 to change, 0 to destroy.

Like I said in the other PR I'm at EOD in the UK, but thought that sharing this would help unblock you!

Copy link
Member

@melinath melinath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like TestAccPubsubSchema_update is failing. I'm not sure what's going on here exactly - I see in the debug logs that there are subsequent GET requests where one returns the new field and the following one does not, but there's no request in between that would change it. Possibly an eventual concurrency problem? (You can search for [ REQUEST ] to find requests.)

It looks like this is also causing failures related to TGC (which powers gcloud terraform vet) because of now using the no_send_name.go.erb expander. This seems potentially more "correct" - if that sounds right to you, you can update mmv1/third_party/validator/tests/data/example_pubsub_schema.json to remove the name field and that should resolve the error.

Note that this is technically a breaking change for TGC in that people previously relying on the name field being inside the schema would now need to get it from the object overall. But that's okay if the change is more correct. Basically the question is: would the CAI representation of the schema include the name / schema id in the data for the object?

@kamalaboulhosn
Copy link
Contributor Author

I'll look into what is going on. It is very strange, I could have sworn I ran the acceptance tests and got an "ok" before responding, but I guess something must not have been up to date.

@kamalaboulhosn
Copy link
Contributor Author

It looks like this is caused by caching effects in the Pub/Sub service itself. When a new revision is created, it is possible for a read to return the old schema if it was cached locally in the server reached. So there is nothing wrong with the Terraform code here, really. The service can be updated to fix this, but is there a way one generally deals with this kind of caching possibility in these Terraform modules?

@SarahFrench
Copy link
Contributor

SarahFrench commented Jun 6, 2023

I'm only familiar with how eventual consistency problems are handled on create and delete actions, but from looking at the code generator I think update can also be handled. For example here is an example where a generated resource is defined with an async block that defines how to retry the request and wait for an action to complete (here's the relevant PR). The check_response_func_existence and check_response_func_absence properties are used to supply the name of functions that are used to check that a resource has been created, updated or deleted fully and control retries during polling.

However, the above doesn't seem to be relevant to this error. In the case of the failing acceptance test I can see that the POST request that adds timestamp_request = 3 gets a 200 response that includes the updated definition, so no retries are needed on the update action. The problem is that the Update function for the google_pubsub_schema resource concludes by calling the Read function, and it's that GET that is failing to get up-to-date data from the API. I don't believe we have any logic in the provider that makes assertions and retries Read operations if the API returns data that doesn't match the config.

I think the best solution is for the service to be updated to fix this issue - though I defer to @melinath in case he has any insight

@melinath
Copy link
Member

melinath commented Jun 6, 2023

yeah, Terraform can handle eventual consistency on create (by polling until the API resource exists) or on delete (by polling until it doesn't) but I don't think we have a good pattern for handling it on update, since the end states are less binary.

The consistency issue would ultimately need to be fixed on the API side; however, that doesn't inherently block this PR being merged; the impact on customers from this would likely be minimal, since there will likely be a longer delay for users between this resource being applied and the next plan being run outside of tests (and it would essentially be idempotent from the user perspective if reapplied).

For the tests - if you happen to know how long it would take consistency to be achieved most of the time, I think you could add a time_sleep resource that has lifecycle.replace_triggered_by the pubsub schema - updates to the schema would trigger recreation of the sleep resource, which would then cause a delay in the test. (If that doesn't work, we would basically need to rerun the tests until we get lucky and get a good recording, then merge. That would lessen impact on other inflight PRs but could be annoying / long depending how many retries it takes, and wouldn't help with nightly builds, so a sleep-based solution would be preferred.)

(Note that the changes for the terraform-google-conversion tests also still need to happen.)

@modular-magician modular-magician requested a review from melinath June 8, 2023 12:08
@kamalaboulhosn
Copy link
Contributor Author

The time_sleep did help, though didn't fix it 100%. Still sometimes get a complaint that there is still a diff when a second plan is done. The server-side changes are in flight now and should land in a couple of weeks. Hopefully this is sufficient to get this checked in.

Can you please take a look @SarahFrench or @melinath? Note that the lint errors are for files not touched by this PR.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
Terraform Beta: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
TF Conversion: Diff ( 2 files changed, 3 insertions(+), 3 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2782
Passed tests 2480
Skipped tests: 292
Affected tests: 10

Action taken

Found 10 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
TestAccComputeFirewallPolicyRule_multipleRules|TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example|TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample|TestAccDataSourceGoogleFirebaseAndroidAppConfig|TestAccPubsubTopic_pubsubTopicSchemaSettingsExample|TestAccDataSourceAlloydbLocations_basic|TestAccPubsubSchema_update|TestAccPubsubSchema_pubsubSchemaProtobufExample|TestAccLoggingBucketConfigProject_cmekSettings|TestAccPubsubSchema_pubsubSchemaBasicExample

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

Tests passed during RECORDING mode:
TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example[Debug log]
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample[Debug log]
TestAccDataSourceGoogleFirebaseAndroidAppConfig[Debug log]
TestAccPubsubTopic_pubsubTopicSchemaSettingsExample[Debug log]
TestAccDataSourceAlloydbLocations_basic[Debug log]
TestAccPubsubSchema_pubsubSchemaProtobufExample[Debug log]
TestAccLoggingBucketConfigProject_cmekSettings[Debug log]
TestAccPubsubSchema_pubsubSchemaBasicExample[Debug log]

Tests failed during RECORDING mode:
TestAccComputeFirewallPolicyRule_multipleRules[Error message] [Debug log]
TestAccPubsubSchema_update[Error message] [Debug log]

Please fix these to complete your PR
View the build log or the debug log for each test

@melinath
Copy link
Member

melinath commented Jun 8, 2023

Yes, some flakiness is tolerable (doesn't block merge); we just need to be reasonably certain that we would notice if something changes that breaks the resource. Rerunning the test to try to get a clean recording now that there's a sleep in place. /gcbrun

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
Terraform Beta: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
TF Conversion: Diff ( 2 files changed, 3 insertions(+), 3 deletions(-))

1 similar comment
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
Terraform Beta: Diff ( 4 files changed, 181 insertions(+), 10 deletions(-))
TF Conversion: Diff ( 2 files changed, 3 insertions(+), 3 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2782
Passed tests 2484
Skipped tests: 292
Affected tests: 6

Action taken

Found 6 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample|TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example|TestAccComputeFirewallPolicyRule_multipleRules|TestAccPubsubSchema_update|TestAccDataSourceGoogleFirebaseAndroidAppConfig|TestAccDataSourceAlloydbLocations_basic

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2782
Passed tests 2484
Skipped tests: 292
Affected tests: 6

Action taken

Found 6 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
TestAccPubsubSchema_update|TestAccComputeFirewallPolicyRule_multipleRules|TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample|TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example|TestAccDataSourceGoogleFirebaseAndroidAppConfig|TestAccDataSourceAlloydbLocations_basic

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

Tests passed during RECORDING mode:
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample[Debug log]
TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example[Debug log]
TestAccPubsubSchema_update[Debug log]
TestAccDataSourceGoogleFirebaseAndroidAppConfig[Debug log]
TestAccDataSourceAlloydbLocations_basic[Debug log]

Tests failed during RECORDING mode:
TestAccComputeFirewallPolicyRule_multipleRules[Error message] [Debug log]

Please fix these to complete your PR
View the build log or the debug log for each test

@modular-magician
Copy link
Collaborator

Tests passed during RECORDING mode:
TestAccPubsubSchema_update[Debug log]
TestAccApigeeKeystoresAliasesKeyCertFile_apigeeKeystoresAliasesKeyCertFileTestExample[Debug log]
TestAccApigeeKeystoresAliasesPkcs12_ApigeeKeystoresAliasesPkcs12Example[Debug log]
TestAccDataSourceGoogleFirebaseAndroidAppConfig[Debug log]
TestAccDataSourceAlloydbLocations_basic[Debug log]

Tests failed during RECORDING mode:
TestAccComputeFirewallPolicyRule_multipleRules[Error message] [Debug log]

Please fix these to complete your PR
View the build log or the debug log for each test

Copy link
Contributor

@SarahFrench SarahFrench left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving now that the new test's flakiness is minimised and a successful test recording has been made

@SarahFrench
Copy link
Contributor

SarahFrench commented Jun 8, 2023

The server-side changes are in flight now and should land in a couple of weeks. Hopefully this is sufficient to get this checked in.

I'll make an issue so that any necessary follow up (e.g. removing sleeps) is done. Thanks for this!

Edit 1: Looking into the exchange above about mmv1/third_party/validator/tests/data/example_pubsub_schema.json

Edit 2: The integration tests related to tfc are passing, but I don't know why

@kamalaboulhosn
Copy link
Contributor Author

Looking into the exchange above about mmv1/third_party/validator/tests/data/example_pubsub_schema.json

@SarahFrench I removed the change to use no_send_name.go.erb so hopefully that is now okay. I don't think the change was needed.

@SarahFrench
Copy link
Contributor

Thanks - I also just spotted that at one point more files in the TF Conversion downstream were affected versus now. I'm happy that I'm not missing anything - merging now

@SarahFrench SarahFrench dismissed melinath’s stale review June 8, 2023 19:43

I believe that the changed have been addressed

@SarahFrench SarahFrench merged commit 739b446 into GoogleCloudPlatform:main Jun 8, 2023
ron-gal pushed a commit to ron-gal/magic-modules that referenced this pull request Jun 12, 2023
…ns (GoogleCloudPlatform#8079)

* Add support for Pub/Sub schema evolution

* Add example for Pub/Sub schema evolution

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Fix spacing

* Test update, import

* Reduce flakiness
ericayyliu pushed a commit to ericayyliu/magic-modules that referenced this pull request Jul 26, 2023
…ns (GoogleCloudPlatform#8079)

* Add support for Pub/Sub schema evolution

* Add example for Pub/Sub schema evolution

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Fix spacing

* Test update, import

* Reduce flakiness
wj-chen pushed a commit to wj-chen/magic-modules that referenced this pull request Aug 1, 2023
…ns (GoogleCloudPlatform#8079)

* Add support for Pub/Sub schema evolution

* Add example for Pub/Sub schema evolution

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Remove changes not to be made in a single PR

* Fix spacing

* Test update, import

* Reduce flakiness
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support pubsub schema updates
4 participants