resource/aws_cloudwatch_log_stream: Prevent early state removal #11617

camlow325
Contributor

Community Note

  • Please vote on this pull request by adding a 👍 reaction to the original pull request comment to help the community and maintainers prioritize this request
  • Please do not leave "+1" comments, they generate extra noise for pull request followers and do not help prioritize the request

Closes #11611

Release note for CHANGELOG:

* resource/aws_cloudwatch_log_stream: Prevent state removal of resource immediately after creation due to eventual consistency ([#11611](https://github.com/terraform-providers/terraform-provider-aws/issues/11611))

The AWS logs service has eventual consistency considerations. The `aws_cloudwatch_log_stream` resource immediately tries to read a stream after creation. If the stream is not found, the logs service returns a 200 OK with an empty list of streams. Since no streams are present, the `aws_cloudwatch_log_stream` resource removes the created resource from state, leading to a "produced an unexpected new value for was present, but now absent" error.

With the changes in this commit, the empty list of streams in the response for the newly created resource will result in a NotFoundError being returned and a retry of the read request. A subsequent retry should hopefully be successful, leading to the state being preserved.

Output from acceptance testing:

$ make testacc TEST=./aws TESTARGS='-run=TestAccAWSCloudWatchLogStream_'
...
--- PASS: TestAccAWSCloudWatchLogStream_disappears_LogGroup (16.77s)
--- PASS: TestAccAWSCloudWatchLogStream_disappears (19.55s)
--- PASS: TestAccAWSCloudWatchLogStream_basic (19.91s)

References:
* hashicorp#11611

@camlow325 camlow325 requested a review from a team January 16, 2020 00:36
@ghost ghost added the size/XS, needs-triage, and service/cloudwatchlogs labels Jan 16, 2020
var ls *cloudwatchlogs.LogStream
var exists bool

err := resource.Retry(2*time.Minute, func() *resource.RetryError {
Contributor Author

@camlow325 camlow325 Jan 16, 2020

I'm not sure what a good upper bound for the retry timeout would be. In our automation today, we run a second Terraform apply after the first one fails, and that second run has always failed with a `ResourceAlreadyExistsException: The specified log stream already exists` error. The second run probably completes in well under 2 minutes in most cases.

Contributor

Two minutes is pretty reasonable 👍 We can/will likely add a constant for this in the future.

var err error
ls, exists, err = lookupCloudWatchLogStream(conn, d.Id(), group, nil)
if err != nil {
	return resource.NonRetryableError(err)
Contributor Author

Maybe it would be good to retry on other kinds of errors, too; I'm not sure. In the cases we've seen so far that would benefit from a retry, the initial lookup returns a 200 OK, so not retrying on errors keeps the change backward compatible.
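
To make the trade-off in this comment concrete, here is a small hedged sketch of the decision being discussed. Everything in it is hypothetical: `classify`, the `decision` values, and the `errThrottled` and `errAccessDenied` errors are illustrative names and are not AWS SDK or provider types. With `retryTransient` set to false it mirrors the behavior in this PR (only an empty result is retried); setting it to true shows one possible way to also retry a selected error.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values for illustration only; not AWS SDK types.
var (
	errThrottled    = errors.New("throttled")
	errAccessDenied = errors.New("access denied")
)

type decision int

const (
	succeed decision = iota
	retryRead
	fail
)

// classify decides what one read attempt should do with a lookup result.
// retryTransient=false matches the behavior discussed in this thread:
// only an empty (not found) result is retried and every error fails fast.
func classify(exists bool, err error, retryTransient bool) decision {
	switch {
	case err == nil && exists:
		return succeed
	case err == nil:
		return retryRead // 200 OK with no streams: likely eventual consistency
	case retryTransient && errors.Is(err, errThrottled):
		return retryRead // assumed extension: retry a selected transient error
	default:
		return fail
	}
}

func main() {
	fmt.Println(classify(false, nil, false) == retryRead)
	fmt.Println(classify(false, errThrottled, false) == fail)
	fmt.Println(classify(false, errThrottled, true) == retryRead)
	fmt.Println(classify(false, errAccessDenied, true) == fail)
}
```

Keeping errors non-retryable by default means existing failure behavior is unchanged, which is the backward-compatibility point made above.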

@bflad bflad added the bug label and removed the needs-triage label Feb 6, 2020
@bflad bflad added this to the v2.48.0 milestone Feb 6, 2020
Contributor

@bflad bflad left a comment

Thanks so much for this fix, @camlow325 🚀

Output from acceptance testing:

--- PASS: TestAccAWSCloudWatchLogStream_disappears_LogGroup (6.57s)
--- PASS: TestAccAWSCloudWatchLogStream_disappears (7.24s)
--- PASS: TestAccAWSCloudWatchLogStream_basic (7.52s)

@bflad bflad merged commit 4cc7019 into hashicorp:master Feb 6, 2020
bflad added a commit that referenced this pull request Feb 6, 2020
ghost commented Feb 7, 2020

This has been released in version 2.48.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

ghost commented Mar 27, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 27, 2020
Labels
bug, size/XS
Development

Successfully merging this pull request may close these issues.

aws_cloudwatch_log_stream resource produced new value for was present but now absent
2 participants