Failing test(s): CloudIdentityGroup still exists #10001

Closed
melinath opened this issue Sep 2, 2021 · 9 comments

Comments

@melinath
Collaborator

melinath commented Sep 2, 2021

Affected Resource(s)

  • google_cloud_identity_group

Failure rate:

  • 10% in August 2021
  • 13% in October 2021
  • 80% in Dec 2021
  • 64.5% in Jan 2022
  • 87% in Mar 2022
  • 76% in Apr 2022
  • 58% in May 2022
  • 66% in Jun 2022

Impacted tests:

  • TestAccCloudIdentityGroup_update
  • TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample

Nightly builds:

Message:

testing_new.go:63: Error running post-test destroy, there may be dangling resources: CloudIdentityGroup still exists at https://cloudidentity.googleapis.com/v1beta1/groups/02ce457m3h3k1ib
@rileykarson rileykarson added this to the Near-Term Goals milestone Sep 13, 2021
@melinath melinath changed the title Failing test(s): TestAccCloudIdentityGroup_update Failing test(s): CloudIdentityGroup still exists Jan 6, 2022
@mitj04

mitj04 commented Dec 1, 2022

Google Cloud-TestAccCloudIdentityGroup_update - 43.2% success rate (50 failures)
Google Cloud-TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample - 55.7% success rate (39 failures)

Google Cloud Beta-TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample - 47.2% success rate (47 failures)
Google Cloud Beta-TestAccCloudIdentityGroup_update - 42.9% success rate (52 failures)

@mitj04

mitj04 commented Dec 13, 2022

b/262375687

@roaks3
Collaborator

roaks3 commented Feb 7, 2023

Current error looks like this (but appears to be the same underlying problem):

------- Stdout: -------
=== RUN   TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample
=== PAUSE TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample
=== CONT  TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample
    provider_test.go:312: Step 1/2 error: Error running apply: exit status 1
        
        Error: Error creating Group: googleapi: Error 409: Error(2018): Cannot create group 'tf-test-my-identity-groupcmynd6la37@terraform-graphite-test.joonix.net' because it already exists.
        Details:
        [
          {
            "@type": "type.googleapis.com/google.rpc.ResourceInfo",
            "description": "Error(2018): Cannot create group 'tf-test-my-identity-groupcmynd6la37@terraform-graphite-test.joonix.net' because it already exists.",
            "owner": "domain:cloudidentity.googleapis.com",
            "resourceType": "cloudidentity.googleapis.com/Group"
          }
        ]
        
          with google_cloud_identity_group.cloud_identity_group_basic,
          on terraform_plugin_test.tf line 3, in resource "google_cloud_identity_group" "cloud_identity_group_basic":
           3: resource "google_cloud_identity_group" "cloud_identity_group_basic" {
        
--- FAIL: TestAccCloudIdentityGroup_cloudIdentityGroupsBasicExample (3.26s)
FAIL

@roaks3
Collaborator

roaks3 commented Feb 15, 2023

This also occurs for TestAccDataSourceCloudIdentityGroups_basic, and all three use the same testAccCloudIdentityGroup_cloudIdentityGroupsBasicExample config in Step 1. So it seems to be something related to that configuration specifically (possibly the initial_group_config config).

Also confirmed that it is possible for all 3 to fail in the same build: https://ci-oss.hashicorp.engineering/viewLog.html?buildId=374114&tab=buildResultsDiv&buildTypeId=GoogleCloud_ProviderGoogleCloudGoogleProject#testNameId-8784264308608877604 and for all 3 to pass in the same build: https://ci-oss.hashicorp.engineering/viewLog.html?buildId=373561&tab=buildResultsDiv&buildTypeId=GoogleCloud_ProviderGoogleCloudGoogleProject

@roaks3
Collaborator

roaks3 commented Feb 15, 2023

I can't reproduce this with locally-running code, and can't readily see where the issue would be coming from, so I've created GoogleCloudPlatform/magic-modules#7308 to see if VCR can reproduce the errors.

Worst case, if we can't determine exactly what is wrong, we could try using different configs for TestAccCloudIdentityGroup_update and TestAccDataSourceCloudIdentityGroups_basic, probably without initial_group_config, and see if the issue resolves for those tests.

@roaks3
Collaborator

roaks3 commented Feb 16, 2023

Also related: some APIs throw a 409 when there are multiple concurrent writes, not necessarily to the exact same resource. This API returns the 409 "Cannot create group ... because it already exists" any time there is an ALREADY_EXISTS (i.e. 409) error internally, so it is possible that a 409 is occurring due to a concurrent write and is being erroneously reported as the resource already existing.

The solution for other APIs seems to be a retry on 409, although if the error message is actually incorrect in this case, it could be hard to distinguish a retryable 409 from a non-retryable 409.

@roaks3
Collaborator

roaks3 commented Feb 17, 2023

I'm going to try running all of the tests that use a cloud_identity_group serially to see if we still observe the error. testAccAccessContextManagerGcpUserAccessBinding_basicTest is one test that I don't think we can include, because it is already run serially as part of the ACM tests.

If the issue is resolved, then that would likely confirm a concurrency issue, and we probably want to implement a mutex as well.

If the issue is not resolved, then we can likely rule out a concurrency problem, and will have to revisit this for a fix. In that case, we should also revert the change that runs these tests serially.
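
To make the mutex idea above concrete, here is a rough sketch of a lock keyed by string, so that all writes sharing a key run one at a time. It is only an illustration under my own assumptions: keyedMutex, newKeyedMutex and runSerialized are hypothetical names, the key string is made up, and this is not the provider's existing locking code.

```go
package main

import "sync"

// keyedMutex hands out one mutex per string key (hypothetical helper).
type keyedMutex struct {
	mu    sync.Mutex             // guards the locks map
	locks map[string]*sync.Mutex // one mutex per key, created on first use
}

func newKeyedMutex() *keyedMutex {
	return &keyedMutex{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for key, creating it if it does not exist yet.
func (k *keyedMutex) get(key string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	return m
}

// Lock blocks until the mutex for key is held by the caller.
func (k *keyedMutex) Lock(key string) { k.get(key).Lock() }

// Unlock releases the mutex for key.
func (k *keyedMutex) Unlock(key string) { k.get(key).Unlock() }

// runSerialized starts n goroutines that each call write while holding the
// lock for key, so the writes never overlap even though they start concurrently.
func runSerialized(km *keyedMutex, key string, n int, write func(i int)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			km.Lock(key)
			defer km.Unlock(key)
			write(i)
		}(i)
	}
	wg.Wait()
}
```

With a single shared key for all group writes, this has the same effect as running the tests serially, but only around the create call itself. Whether one shared key or a per-parent key is the right granularity is an open question here.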

@roaks3
Collaborator

roaks3 commented Mar 3, 2023

These tests have been passing in both providers since the serial-run change was introduced, so I'm considering this resolved. I will follow up with a separate ticket to handle a mutex.

@roaks3 roaks3 closed this as completed Mar 3, 2023
@github-actions

github-actions bot commented Apr 4, 2023

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2023