provider/scaleway: fix scaleway_volume_attachment with count > 1 #9493

nicolai86 · 2016-10-20T21:17:58Z

since scaleway requires servers to be powered off to attach volumes to we need
to make sure that we don't power down a server twice, or power up a server while
it's supposed to be modified. classic race condition.

sadly terraform doesn't seem to sport serialization primitives for use cases like
this, but putting the code in question behind a sync.Mutex does the trick, too.

this PR also migrates waitForServerState over to terraform's internal retry primitive.

fixes #9417

TF_ACC=1 go test ./builtin/providers/scaleway -v -run=TestAccScaleway -timeout 120m
=== RUN   TestAccScalewayDataSourceBootscript_Basic
--- PASS: TestAccScalewayDataSourceBootscript_Basic (1.46s)
=== RUN   TestAccScalewayDataSourceBootscript_Filtered
--- PASS: TestAccScalewayDataSourceBootscript_Filtered (1.16s)
=== RUN   TestAccScalewayDataSourceImage_Basic
--- PASS: TestAccScalewayDataSourceImage_Basic (7.22s)
=== RUN   TestAccScalewayDataSourceImage_Filtered
--- PASS: TestAccScalewayDataSourceImage_Filtered (7.17s)
=== RUN   TestAccScalewayIP_importBasic
--- PASS: TestAccScalewayIP_importBasic (1.95s)
=== RUN   TestAccScalewaySecurityGroup_importBasic
--- PASS: TestAccScalewaySecurityGroup_importBasic (2.35s)
=== RUN   TestAccScalewayServer_importBasic
--- PASS: TestAccScalewayServer_importBasic (138.48s)
=== RUN   TestAccScalewayVolume_importBasic
--- PASS: TestAccScalewayVolume_importBasic (3.49s)
=== RUN   TestAccScalewayIP_Basic
--- PASS: TestAccScalewayIP_Basic (10.79s)
=== RUN   TestAccScalewaySecurityGroupRule_Basic
--- PASS: TestAccScalewaySecurityGroupRule_Basic (6.57s)
=== RUN   TestAccScalewaySecurityGroup_Basic
--- PASS: TestAccScalewaySecurityGroup_Basic (2.52s)
=== RUN   TestAccScalewayServer_Basic
--- PASS: TestAccScalewayServer_Basic (129.39s)
=== RUN   TestAccScalewayServer_SecurityGroup
--- PASS: TestAccScalewayServer_SecurityGroup (166.02s)
=== RUN   TestAccScalewayVolumeAttachment_Basic
--- PASS: TestAccScalewayVolumeAttachment_Basic (373.03s)
=== RUN   TestAccScalewayVolume_Basic
--- PASS: TestAccScalewayVolume_Basic (3.27s)
PASS
ok      github.com/hashicorp/terraform/builtin/providers/scaleway   854.886s

tboerger · 2016-10-20T21:52:02Z

I will give it a try tomorrow

dcharbonnier · 2016-10-21T07:06:50Z

What do you think about the azure implementation that define a MutexKV in the provider and lock ressources. It take advantage of the terraform official helper mutexkv, that could be more "elegant" don't you think @nicolai86 ?

dcharbonnier · 2016-10-21T07:07:59Z

builtin/providers/scaleway/helpers.go


-	var currentState string
+func waitForServerState(scaleway *api.ScalewayAPI, serverID string, targetState string) error {


nicolai86 · 2016-10-21T09:11:49Z

great find, @dcharbonnier . I'll adjust the PR

nicolai86 · 2016-10-21T09:50:18Z

tests still pass:

TF_ACC=1 go test ./builtin/providers/scaleway -v -run=TestAccScaleway -timeout 120m
=== RUN   TestAccScalewayDataSourceBootscript_Basic
--- PASS: TestAccScalewayDataSourceBootscript_Basic (1.38s)
=== RUN   TestAccScalewayDataSourceBootscript_Filtered
--- PASS: TestAccScalewayDataSourceBootscript_Filtered (1.20s)
=== RUN   TestAccScalewayDataSourceImage_Basic
--- PASS: TestAccScalewayDataSourceImage_Basic (7.30s)
=== RUN   TestAccScalewayDataSourceImage_Filtered
--- PASS: TestAccScalewayDataSourceImage_Filtered (7.37s)
=== RUN   TestAccScalewayIP_importBasic
--- PASS: TestAccScalewayIP_importBasic (2.24s)
=== RUN   TestAccScalewaySecurityGroup_importBasic
--- PASS: TestAccScalewaySecurityGroup_importBasic (2.38s)
=== RUN   TestAccScalewayServer_importBasic
--- PASS: TestAccScalewayServer_importBasic (116.65s)
=== RUN   TestAccScalewayVolume_importBasic
--- PASS: TestAccScalewayVolume_importBasic (3.26s)
=== RUN   TestAccScalewayIP_Basic
--- PASS: TestAccScalewayIP_Basic (11.16s)
=== RUN   TestAccScalewaySecurityGroupRule_Basic
--- PASS: TestAccScalewaySecurityGroupRule_Basic (5.02s)
=== RUN   TestAccScalewaySecurityGroup_Basic
--- PASS: TestAccScalewaySecurityGroup_Basic (2.48s)
=== RUN   TestAccScalewayServer_Basic
--- PASS: TestAccScalewayServer_Basic (161.38s)
=== RUN   TestAccScalewayServer_SecurityGroup
--- PASS: TestAccScalewayServer_SecurityGroup (302.07s)
=== RUN   TestAccScalewayVolumeAttachment_Basic
--- PASS: TestAccScalewayVolumeAttachment_Basic (448.10s)
PASS
ok      github.com/hashicorp/terraform/builtin/providers/scaleway

tboerger · 2016-10-21T10:53:21Z

Is it possible that there is some kind of race while executing plan? It's hanging until I hit Ctrl+c, than it's printing a lot of output. But apply seem to work properly with that change, even if it takes some time to setup my cluster.

I'm just running into another issue from Scaleway where a node is hanging from time to time at starting server (kernel started).

tboerger · 2016-10-21T11:51:40Z

I can confirm that this patch correctly fixes the attachment of multiple volumes. It took nearly 8 minutes to launch one server of the type C2S with 3 additional volumes:

# time terraform apply
data.scaleway_image.store: Refreshing state...
scaleway_server.store: Creating...
  enable_ipv6:  "" => "false"
  image:        "" => "047f1372-3923-471f-82ca-5ff69dbaf0f7"
  name:         "" => "store"
  private_ip:   "" => "<computed>"
  public_ip:    "" => "<computed>"
  state:        "" => "<computed>"
  state_detail: "" => "<computed>"
  type:         "" => "C2S"
scaleway_volume.store.2: Creating...
  name:       "" => "vol-2"
  server:     "" => "<computed>"
  size_in_gb: "" => "150"
  type:       "" => "l_ssd"
scaleway_volume.store.0: Creating...
  name:       "" => "vol-0"
  server:     "" => "<computed>"
  size_in_gb: "" => "150"
  type:       "" => "l_ssd"
scaleway_volume.store.1: Creating...
  name:       "" => "vol-1"
  server:     "" => "<computed>"
  size_in_gb: "" => "150"
  type:       "" => "l_ssd"
scaleway_volume.store.0: Creation complete
scaleway_volume.store.1: Creation complete
scaleway_volume.store.2: Creation complete
scaleway_server.store: Still creating... (10s elapsed)
scaleway_server.store: Still creating... (20s elapsed)
scaleway_server.store: Still creating... (30s elapsed)
scaleway_server.store: Still creating... (40s elapsed)
scaleway_server.store: Still creating... (50s elapsed)
scaleway_server.store: Still creating... (1m0s elapsed)
scaleway_server.store: Still creating... (1m10s elapsed)
scaleway_server.store: Still creating... (1m20s elapsed)
scaleway_server.store: Creation complete
scaleway_volume_attachment.store.0: Creating...
  server: "" => "c0186588-ecd2-44ad-a4de-2926d3654461"
  volume: "" => "0903698a-a952-4e54-bd83-797fe02a410d"
scaleway_volume_attachment.store.2: Creating...
  server: "" => "c0186588-ecd2-44ad-a4de-2926d3654461"
  volume: "" => "e67fc95c-cee8-4dba-9d61-453e4640151a"
scaleway_volume_attachment.store.1: Creating...
  server: "" => "c0186588-ecd2-44ad-a4de-2926d3654461"
  volume: "" => "9df7d61c-41b8-45be-b376-ce927ec10709"
scaleway_volume_attachment.store.0: Still creating... (10s elapsed)
scaleway_volume_attachment.store.2: Still creating... (10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (10s elapsed)
scaleway_volume_attachment.store.0: Still creating... (20s elapsed)
scaleway_volume_attachment.store.2: Still creating... (20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (20s elapsed)
scaleway_volume_attachment.store.0: Still creating... (30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (30s elapsed)
scaleway_volume_attachment.store.1: Still creating... (30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (40s elapsed)
scaleway_volume_attachment.store.0: Still creating... (40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (40s elapsed)
scaleway_volume_attachment.store.2: Still creating... (50s elapsed)
scaleway_volume_attachment.store.1: Still creating... (50s elapsed)
scaleway_volume_attachment.store.0: Still creating... (50s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m0s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m0s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m0s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m10s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m20s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m20s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m30s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m40s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m40s elapsed)
scaleway_volume_attachment.store.0: Still creating... (1m50s elapsed)
scaleway_volume_attachment.store.2: Still creating... (1m50s elapsed)
scaleway_volume_attachment.store.1: Still creating... (1m50s elapsed)
scaleway_volume_attachment.store.0: Creation complete
scaleway_volume_attachment.store.1: Still creating... (2m0s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m0s elapsed)
scaleway_volume_attachment.store.1: Still creating... (2m10s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m10s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (2m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (2m30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m30s elapsed)
scaleway_volume_attachment.store.1: Still creating... (2m40s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (2m50s elapsed)
scaleway_volume_attachment.store.2: Still creating... (2m50s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m0s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m0s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m20s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m30s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (3m50s elapsed)
scaleway_volume_attachment.store.2: Still creating... (3m50s elapsed)
scaleway_volume_attachment.store.2: Creation complete
scaleway_volume_attachment.store.1: Still creating... (4m0s elapsed)
scaleway_volume_attachment.store.1: Still creating... (4m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (4m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (4m30s elapsed)
scaleway_volume_attachment.store.1: Still creating... (4m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (4m50s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m0s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m10s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m20s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m30s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m40s elapsed)
scaleway_volume_attachment.store.1: Still creating... (5m50s elapsed)
scaleway_volume_attachment.store.1: Still creating... (6m0s elapsed)
scaleway_volume_attachment.store.1: Still creating... (6m10s elapsed)
scaleway_volume_attachment.store.1: Creation complete

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate

real    7m39.445s
user    0m0.604s
sys 0m0.606s

tboerger · 2016-10-21T12:19:35Z

And it took 11 minutes to destroy :D

nicolai86 · 2016-10-21T12:58:09Z

@tboerger at the moment all scaleway_volume_attachments are executed in serial. This makes the behaviour slow for multiple volumes, but it's correct.

A future optimization would be to add a barrier on stop & start actions, then we could batch the attachments which would speed things up considerably.

But we should leave this off to future PRs :)

tboerger · 2016-10-21T13:08:00Z

True, at least it should work now properly.

dcharbonnier · 2016-10-26T05:43:54Z

what are we waiting to merge this ?

stack72 · 2016-10-27T14:02:49Z

Hi @nicolai86

Sorry for the time taken to get back to you on this one - can you rebase it for me and I can get it merged?

You may have to change the timeout in the retry introduced by the other PR i merged today by you due to the 11 minutes above. Maybe make it a 20minute retry?

Paul

since scaleway requires servers to be powered off to attach volumes to, we need to make sure that we don't power down a server twice, or power up a server while it's supposed to be modified. sadly terraform doesn't seem to sport serialization primitives for usecases like this, but putting the code in question behind a `sync.Mutex` does the trick, too fixes #9417

@dcharbonnier

following @dcharbonnier suggestion. thanks!

nicolai86 · 2016-10-27T14:36:09Z

I've rebased the PR and also increased the timeouts 👍

thanks for looking after this :)

stack72 · 2016-10-27T15:51:31Z

This LGTM now @nicolai86 :) Tests are green (as always!) - will squash and merge - thanks for the awesome work as usual

% make testacc TEST=./builtin/providers/scaleway                                                                                           1 ↵
==> Checking that code complies with gofmt requirements...
go generate $(go list ./... | grep -v /terraform/vendor/)
2016/10/27 16:38:12 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/scaleway -v  -timeout 120m
=== RUN   TestAccScalewayDataSourceBootscript_Basic
--- PASS: TestAccScalewayDataSourceBootscript_Basic (2.99s)
=== RUN   TestAccScalewayDataSourceBootscript_Filtered
--- PASS: TestAccScalewayDataSourceBootscript_Filtered (2.42s)
=== RUN   TestAccScalewayDataSourceImage_Basic
--- PASS: TestAccScalewayDataSourceImage_Basic (14.49s)
=== RUN   TestAccScalewayDataSourceImage_Filtered
--- PASS: TestAccScalewayDataSourceImage_Filtered (11.31s)
=== RUN   TestAccScalewayIP_importBasic
--- PASS: TestAccScalewayIP_importBasic (2.92s)
=== RUN   TestAccScalewaySecurityGroup_importBasic
--- PASS: TestAccScalewaySecurityGroup_importBasic (3.95s)
=== RUN   TestAccScalewayServer_importBasic
--- PASS: TestAccScalewayServer_importBasic (113.34s)
=== RUN   TestAccScalewayVolume_importBasic
--- PASS: TestAccScalewayVolume_importBasic (5.91s)
=== RUN   TestProvider
--- PASS: TestProvider (0.00s)
=== RUN   TestProvider_impl
--- PASS: TestProvider_impl (0.00s)
=== RUN   TestAccScalewayIP_Basic
--- PASS: TestAccScalewayIP_Basic (15.24s)
=== RUN   TestAccScalewaySecurityGroupRule_Basic
--- PASS: TestAccScalewaySecurityGroupRule_Basic (7.97s)
=== RUN   TestAccScalewaySecurityGroup_Basic
--- PASS: TestAccScalewaySecurityGroup_Basic (3.83s)
=== RUN   TestAccScalewayServer_Basic
--- PASS: TestAccScalewayServer_Basic (126.03s)
=== RUN   TestAccScalewayServer_SecurityGroup
--- PASS: TestAccScalewayServer_SecurityGroup (113.99s)
=== RUN   TestAccScalewayVolumeAttachment_Basic
--- PASS: TestAccScalewayVolumeAttachment_Basic (343.90s)
=== RUN   TestAccScalewayVolume_Basic
--- PASS: TestAccScalewayVolume_Basic (6.41s)
PASS
ok      github.com/hashicorp/terraform/builtin/providers/scaleway   774.719s

nicolai86 · 2016-10-27T15:55:37Z

<3 @stack72

@dcharbonnier

…hicorp#9493) * provider/scaleway: fix scaleway_volume_attachment with count > 1 since scaleway requires servers to be powered off to attach volumes to, we need to make sure that we don't power down a server twice, or power up a server while it's supposed to be modified. sadly terraform doesn't seem to sport serialization primitives for usecases like this, but putting the code in question behind a `sync.Mutex` does the trick, too fixes hashicorp#9417 * provider/scaleway: use mutexkv to lock per-resource following @dcharbonnier suggestion. thanks! * provider/scaleway: cleanup waitForServerState signature * provider/scaleway: store serverID in var * provider/scaleway: correct imports * provider/scaleway: increase timeouts

@dcharbonnier

…hicorp#9493) * provider/scaleway: fix scaleway_volume_attachment with count > 1 since scaleway requires servers to be powered off to attach volumes to, we need to make sure that we don't power down a server twice, or power up a server while it's supposed to be modified. sadly terraform doesn't seem to sport serialization primitives for usecases like this, but putting the code in question behind a `sync.Mutex` does the trick, too fixes hashicorp#9417 * provider/scaleway: use mutexkv to lock per-resource following @dcharbonnier suggestion. thanks! * provider/scaleway: cleanup waitForServerState signature * provider/scaleway: store serverID in var * provider/scaleway: correct imports * provider/scaleway: increase timeouts

@dcharbonnier

…hicorp#9493) * provider/scaleway: fix scaleway_volume_attachment with count > 1 since scaleway requires servers to be powered off to attach volumes to, we need to make sure that we don't power down a server twice, or power up a server while it's supposed to be modified. sadly terraform doesn't seem to sport serialization primitives for usecases like this, but putting the code in question behind a `sync.Mutex` does the trick, too fixes hashicorp#9417 * provider/scaleway: use mutexkv to lock per-resource following @dcharbonnier suggestion. thanks! * provider/scaleway: cleanup waitForServerState signature * provider/scaleway: store serverID in var * provider/scaleway: correct imports * provider/scaleway: increase timeouts

ghost · 2020-04-21T01:58:59Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

catsby added provider/aws bug labels Oct 20, 2016

nicolai86 mentioned this pull request Oct 20, 2016

Scaleway: Adding multiple volumes doesn't work #9417

Closed

catsby added provider/scaleway and removed provider/aws labels Oct 20, 2016

dcharbonnier reviewed Oct 21, 2016

View reviewed changes

dcharbonnier mentioned this pull request Oct 21, 2016

Add support for volumes on creation for scaleway #9499

Closed

tboerger approved these changes Oct 21, 2016

View reviewed changes

nicolai86 added 6 commits October 27, 2016 16:33

provider/scaleway: use mutexkv to lock per-resource

73f4a74

following @dcharbonnier suggestion. thanks!

provider/scaleway: cleanup waitForServerState signature

aac604d

provider/scaleway: store serverID in var

f7d36f5

provider/scaleway: correct imports

e320488

provider/scaleway: increase timeouts

d73ec5f

stack72 merged commit d9a2e0d into hashicorp:master Oct 27, 2016

ghost locked and limited conversation to collaborators Apr 21, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

provider/scaleway: fix scaleway_volume_attachment with count > 1 #9493

provider/scaleway: fix scaleway_volume_attachment with count > 1 #9493

nicolai86 commented Oct 20, 2016 •

edited

Loading

tboerger commented Oct 20, 2016

dcharbonnier commented Oct 21, 2016

dcharbonnier Oct 21, 2016

nicolai86 commented Oct 21, 2016

nicolai86 commented Oct 21, 2016

tboerger commented Oct 21, 2016

tboerger commented Oct 21, 2016

tboerger commented Oct 21, 2016

nicolai86 commented Oct 21, 2016

tboerger commented Oct 21, 2016

dcharbonnier commented Oct 26, 2016

stack72 commented Oct 27, 2016 •

edited

Loading

nicolai86 commented Oct 27, 2016

stack72 commented Oct 27, 2016

nicolai86 commented Oct 27, 2016

ghost commented Apr 21, 2020


		var currentState string
		func waitForServerState(scaleway *api.ScalewayAPI, serverID string, targetState string) error {

provider/scaleway: fix scaleway_volume_attachment with count > 1 #9493

provider/scaleway: fix scaleway_volume_attachment with count > 1 #9493

Conversation

nicolai86 commented Oct 20, 2016 • edited Loading

tboerger commented Oct 20, 2016

dcharbonnier commented Oct 21, 2016

dcharbonnier Oct 21, 2016

Choose a reason for hiding this comment

nicolai86 commented Oct 21, 2016

nicolai86 commented Oct 21, 2016

tboerger commented Oct 21, 2016

tboerger commented Oct 21, 2016

tboerger commented Oct 21, 2016

nicolai86 commented Oct 21, 2016

tboerger commented Oct 21, 2016

dcharbonnier commented Oct 26, 2016

stack72 commented Oct 27, 2016 • edited Loading

nicolai86 commented Oct 27, 2016

stack72 commented Oct 27, 2016

nicolai86 commented Oct 27, 2016

ghost commented Apr 21, 2020

nicolai86 commented Oct 20, 2016 •

edited

Loading

stack72 commented Oct 27, 2016 •

edited

Loading