Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error in creating aws copy snaphot request #2042

Merged
merged 6 commits into from
May 2, 2023

Conversation

leuyentran
Copy link
Contributor

@leuyentran leuyentran commented May 2, 2023

Change Overview

Upgrading aws-sdk-go from v1.44.244 to v1.44.2501 exposed failures in awsebs.SanpshotCopy()`. The upgrade was subsequently reverted in order to mitigate errors while waiting for root cause analysis.

This PR has the following changes:

  • And root cause has been found: A recent update in aws-sdk-go v1.44.247 has made it explicit that client's config region is used as DestinationRegion for AWS Copy Snapshot operation. This means that when CopySnapshotRequest() is made to create a presignedURL, the call needs to be made with the destination's client. This PR fix that for awsebs client .

  • As a result, aws-sdk-go can now be upgraded to v1.44.250

  • Lastly, the error mentioned above was not caught in unit testing because blockstorage_test.BlockStorageProviderSuite.TestSnapshotCopy() was skipped due to the following reason. This PR commented out the skip statement. The test took between 10-15s across 3 runs.

c.Skip("Sometimes, snapcopy takes over 10 minutes. go test declares failure if tests are that slow.")

Note that the entire BlockStorageProviderSuite test are currently skipped by make test due to lack of env config. Due to the test's nature of calling external provider's API, they should be placed in integration tests instead of unit tests. Issue is created to track: #2043.
As a result, blockstorage_test.BlockStorageProviderSuite.TestSnapshotCopy() is not run at the moment even with the change of this PR

Pull request type

Please check the type of change your PR introduces:

  • 🚧 Work in Progress
  • 🌈 Refactoring (no functional changes, no api changes)
  • 🐹 Trivial/Minor
  • 🐛 Bugfix
  • 🌻 Feature
  • 🗺️ Documentation
  • 🤖 Test

Test Plan

  • 💪 Manual
  • ⚡ Unit test
  • 💚 E2E

@leuyentran leuyentran requested a review from PrasadG193 May 2, 2023 01:11
@github-actions
Copy link
Contributor

github-actions bot commented May 2, 2023

Thanks for submitting this pull request 🎉. The team will review it soon and get back to you.

If you haven't already, please take a moment to review our project contributing guideline and Code of Conduct document.

si := ec2.CopySnapshotInput{
SourceSnapshotId: aws.String(from.ID),
SourceRegion: aws.String(from.Region),
DestinationRegion: ec2Cli.Config.Region,
}
rq, _ := fromCli.CopySnapshotRequest(&si)
rq, _ := ec2Cli.CopySnapshotRequest(&si)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the region used to create the client (in the constructor above) needs to match the region in the request as well, is that correct? (that's what I gather from the comment above (// Copy operation must be initiated from the destination region.), so double checking.

Copy link
Contributor

@kidgilson kidgilson May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, part of the aws sdk update fixed the issue of letting you have the destination-region be different from the client region (which they're supposed to be the same). We were doing it wrong before, but aws was allowing it in a way, whereas now it just needs to be done the correct way which is destination-region = client.region

@@ -144,7 +144,7 @@ func (s *BlockStorageProviderSuite) TestCreateSnapshot(c *C) {
}

func (s *BlockStorageProviderSuite) TestSnapshotCopy(c *C) {
c.Skip("Sometimes, snapcopy takes over 10 minutes. go test declares failure if tests are that slow.")
// c.Skip("Sometimes, snapcopy takes over 10 minutes. go test declares failure if tests are that slow.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intent to remove the // before merging to prevent checking in commented code?

Copy link
Contributor

@kidgilson kidgilson May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I agree, we should remove the c.Skip completely instead of commenting.

We ran some tests on it yesterday and it only added 5 seconds per run to have it running. This was skipped five years ago, when, I'm assuming there were performance differences, and in the past five years those have been improved to where it doesn't take as long now. This unit test does catch the original error when running so it's important to have it running.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention of leaving this as a comment is to leave some easy breadcrumbs behind on which test took a long time in the past, in case the problem re-surface. Alternatively, a log of test runs should also quickly provide the same information. I'll remove the Skip statement entirely

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@leuyentran
Copy link
Contributor Author

leuyentran commented May 2, 2023

Follow up issue on testing: #2043

Copy link
Contributor

@kidgilson kidgilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mergify mergify bot merged commit 28ccf53 into master May 2, 2023
@mergify mergify bot deleted the Fix-error-in-creating-aws-copy-snaphot-request branch May 2, 2023 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants