
rbd: do deep copy for dummyVol struct #2669

Merged: 2 commits, Nov 23, 2021
Conversation

@Madhu-1 (Collaborator) commented Nov 22, 2021

With a shallow copy of rbdVol to dummyVol, the image name update on the dummyVol gets reflected on the rbdVol, which we don't want.
Do a deep copy to avoid this problem.

Fixes #2656 (comment)
Signed-off-by: Madhu Rajanna madhupr007@gmail.com

@mergify mergify bot added the component/rbd Issues related to RBD label Nov 22, 2021
@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

@BenamarMk PTAL.

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

/retest ci/centos/mini-e2e-helm/k8s-1.21

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

/retest ci/centos/mini-e2e-helm/k8s-1.21

Warning: Permanently added '192.168.39.159' (ECDSA) to the list of known hosts. client_loop: send disconnect: Broken pipe script returned exit code 255?

@BenamarMk left a comment

Tested this patch and it works as intended.

@Madhu-1 Madhu-1 added the ci/retry/e2e Label to retry e2e retesting on approved PR's label Nov 22, 2021
yati1998 previously approved these changes Nov 22, 2021
Yuggupta27 previously approved these changes Nov 22, 2021
@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

Before the fix (note the schedule ends up added for the dummy image replicapool/csi-vol-dummy-818d066f-39ab-45de-8943-a117f9d82c8e rather than the actual volume):

I1122 09:05:59.458723       1 utils.go:177] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC call: /replication.Controller/EnableVolumeReplication
I1122 09:05:59.459657       1 utils.go:179] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:05:59.502402       1 omap.go:87] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
I1122 09:05:59.581764       1 rbd_util.go:345] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd: create replicapool/csi-vol-dummy-818d066f-39ab-45de-8943-a117f9d82c8e size 1024M (features: [layering]) using mon 192.168.121.235:6789
I1122 09:06:00.662432       1 utils.go:188] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC response: {}
I1122 09:06:00.668961       1 utils.go:177] ID: 20 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC call: /replication.Controller/PromoteVolume
I1122 09:06:00.669184       1 utils.go:179] ID: 20 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:06:00.683420       1 omap.go:87] ID: 20 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
E1122 09:06:00.720896       1 utils.go:186] ID: 20 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC error: rpc error: code = InvalidArgument desc = mirroring is not enabled on 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004, image is in 2 Mode
I1122 09:06:00.751022       1 utils.go:177] ID: 21 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC call: /replication.Controller/EnableVolumeReplication
I1122 09:06:00.756642       1 utils.go:179] ID: 21 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:06:00.763371       1 omap.go:87] ID: 21 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
I1122 09:06:01.672882       1 utils.go:188] ID: 21 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC response: {}
I1122 09:06:01.692753       1 utils.go:177] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC call: /replication.Controller/PromoteVolume
I1122 09:06:01.693003       1 utils.go:179] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:06:01.698206       1 omap.go:87] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
I1122 09:06:01.736907       1 replicationcontrollerserver.go:526] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 Attempting to tickle dummy image for restarting RBD schedules
I1122 09:06:03.784935       1 replicationcontrollerserver.go:538] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 Added scheduling at interval 5m, start time  for volume replicapool/csi-vol-dummy-818d066f-39ab-45de-8943-a117f9d82c8e
I1122 09:06:03.785582       1 utils.go:188] ID: 22 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC response: {}

After the fix (the schedule is added for the actual volume replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004):

30-0242ac110004 GRPC call: /replication.Controller/EnableVolumeReplication
I1122 09:15:59.930725       1 utils.go:179] ID: 18 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:15:59.960524       1 omap.go:87] ID: 18 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
I1122 09:16:00.033938       1 rbd_util.go:345] ID: 18 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 rbd: create replicapool/csi-vol-dummy-818d066f-39ab-45de-8943-a117f9d82c8e size 1024M (features: [layering]) using mon 192.168.121.235:6789
I1122 09:16:00.036236       1 utils.go:188] ID: 18 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC response: {}
I1122 09:16:00.037846       1 utils.go:177] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC call: /replication.Controller/PromoteVolume
I1122 09:16:00.038253       1 utils.go:179] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC request: {"parameters":{"mirroringMode":"snapshot","schedulingInterval":"5m"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004"}
I1122 09:16:00.039916       1 omap.go:87] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 got omap values: (pool="replicapool", namespace="", name="csi.volume.0c25bdd3-485f-11ec-bd30-0242ac110004"): map[csi.imageid:10b183a48a97 csi.imagename:csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004 csi.volname:pvc-26893f08-ff2b-4a0f-a5c3-884b720ffb2c csi.volume.owner:default]
I1122 09:16:00.093362       1 replicationcontrollerserver.go:526] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 Attempting to tickle dummy image for restarting RBD schedules
I1122 09:16:01.622962       1 replicationcontrollerserver.go:538] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 Added scheduling at interval 5m, start time  for volume replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004
I1122 09:16:01.623516       1 utils.go:188] ID: 19 Req-ID: 0001-0009-rook-ceph-0000000000000002-0c25bdd3-485f-11ec-bd30-0242ac110004 GRPC response: {}

@github-actions (bot)

/retest ci/centos/mini-e2e-helm/k8s-1.21

@github-actions (bot)

@Madhu-1 "ci/centos/mini-e2e-helm/k8s-1.21" test failed. Logs are available at location for debugging

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

@humblec @nixpanic can I get a review on this one on priority?

@Madhu-1 Madhu-1 added Priority-0 highest priority issue DNM DO NOT MERGE labels Nov 22, 2021
@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

Added DNM. QE is still testing this one; looks like we need some more fixes.

@github-actions (bot)

/retest ci/centos/mini-e2e-helm/k8s-1.21

@github-actions (bot)

@Madhu-1 "ci/centos/mini-e2e-helm/k8s-1.21" test failed. Logs are available at location for debugging

@mergify mergify bot dismissed stale reviews from yati1998 and Yuggupta27 November 22, 2021 13:04

Pull request has been modified.

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

@Mergifyio rebase

@mergify (bot) commented Nov 22, 2021

rebase

✅ Branch has been successfully rebased

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

Cannot contact cico-workspace-ptd83: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@6d19570c:JNLP4-connect connection from 10.131.0.145/10.131.0.145:41744": Remote call on JNLP4-connect connection from 10.131.0.145/10.131.0.145:41744 failed. The channel is closing down or has closed down

Looks like a Jenkins issue; restarting the tests.

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

/retest ci/centos/mini-e2e-helm/k8s-1.22

@Madhu-1 (Collaborator, Author) commented Nov 22, 2021

/retest ci/centos/mini-e2e/k8s-1.21

@github-actions (bot)

/retest ci/centos/mini-e2e-helm/k8s-1.21

@github-actions (bot)

/retest ci/centos/mini-e2e-helm/k8s-1.22

@github-actions (bot) — the following k8s-1.22 failure/retest cycle then repeated several times:

@Madhu-1 "ci/centos/mini-e2e-helm/k8s-1.22" test failed. Logs are available at location for debugging

/retest ci/centos/mini-e2e-helm/k8s-1.22

@humblec (Collaborator) commented Nov 23, 2021

@Madhu-1 if I heard right, we are waiting for QE testing, isn't it?

@Madhu-1 (Collaborator, Author) commented Nov 23, 2021

> @Madhu-1 if I heard right, we are waiting for QE testing, isn't it?

Yes, added DNM for the same reason.

@agarwal-mudit agarwal-mudit removed the DNM DO NOT MERGE label Nov 23, 2021
@nixpanic (Member)

@agarwal-mudit could you share some of the test results please? Re-adding DNM for now.

@nixpanic nixpanic added the DNM DO NOT MERGE label Nov 23, 2021
```diff
@@ -290,9 +290,9 @@ func createDummyImage(ctx context.Context, rbdVol *rbdVolume) error {
 	if err != nil {
 		return err
 	}
-	dummyVol := rbdVol
+	dummyVol := *rbdVol
```
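For context, a minimal standalone sketch of why this one-character change matters in Go (the struct and field here are illustrative stand-ins, not the actual ceph-csi rbdVolume definition):

```go
package main

import "fmt"

// Illustrative stand-in for ceph-csi's rbdVolume; the real struct has
// many more fields, including reference types.
type rbdVolume struct {
	RbdImageName string
}

func main() {
	rbdVol := &rbdVolume{RbdImageName: "csi-vol-0c25bdd3"}

	// Before the fix: assigning the pointer creates an alias, so
	// renaming the "dummy" volume also renames the image on rbdVol
	// (the reported bug).
	dummyAlias := rbdVol
	dummyAlias.RbdImageName = "csi-vol-dummy-818d066f"
	fmt.Println(rbdVol.RbdImageName) // csi-vol-dummy-818d066f

	// After the fix: dereferencing copies the struct value, so the
	// rename stays local to dummyVol.
	rbdVol.RbdImageName = "csi-vol-0c25bdd3"
	dummyVol := *rbdVol
	dummyVol.RbdImageName = "csi-vol-dummy-818d066f"
	fmt.Println(rbdVol.RbdImageName) // csi-vol-0c25bdd3

	// Caveat: this value copy is only as deep as the value-typed
	// fields; pointer, slice, and map fields would still be shared,
	// which is what the review comment below is getting at.
}
```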
@nixpanic (Member):

It probably helps to have a comment about this, as the implicit deep-copy is something that is really required here. Or create a DeepCopy() function to make it very explicit?
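For illustration, a minimal sketch of what such an explicit helper could look like (hypothetical only; the field mentioned in the comment is illustrative, not the actual rbdVolume definition):

```go
// DeepCopy returns an independent copy of the volume, so callers can
// mutate the result without affecting the receiver. Hypothetical
// sketch, not the actual ceph-csi implementation.
func (rv *rbdVolume) DeepCopy() *rbdVolume {
	out := *rv // copies all value-typed fields
	// Pointer, slice, and map fields must be cloned explicitly here,
	// since the value copy above only duplicates the references,
	// e.g. (illustrative field):
	//   out.Topology = append([]string(nil), rv.Topology...)
	return &out
}
```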

@Madhu-1 (Collaborator, Author):

I can add a comment. If I make any code changes, this should go through testing again to cover all the cases and avoid regression.

@nixpanic (Member):

Do we have regression tests for this at all? If the e2e does not have a test-case, you can add the ci/skip/e2e label?

The current change is very fragile, and breaking seems easy. If a follow-up is planned, can you share a link to the issue?

@Madhu-1 (Collaborator, Author):

@nixpanic we are running the e2e to make sure nothing breaks at the cephcsi level. Yes, we can add the label here. Nothing is planned as a follow-up code fix; #2675 is a tracker to remove this workaround.

@Madhu-1 (Collaborator, Author):

Regression testing is manual for now.

@agarwal-mudit (Contributor):

> @agarwal-mudit could you share some of the test results please? Re-adding DNM for now.

It was decided not to wait for the test results. We can merge the fix, and if it doesn't help, QE can put it back to ON_QA.
Adding you to the discussion. Removing the DNM label.

@agarwal-mudit agarwal-mudit removed the DNM DO NOT MERGE label Nov 23, 2021
@Madhu-1 Madhu-1 added the ci/skip/e2e skip running e2e CI jobs label Nov 23, 2021
@Madhu-1 (Collaborator, Author) commented Nov 23, 2021

@nixpanic @humblec This has been tested by @BenamarMk and Annette and it's working as expected.

Commits:

With a shallow copy of rbdVol to dummyVol, the image name update of the dummyVol is getting reflected on the rbdVol, which we don't want. Do a deep copy to avoid this problem.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Currently we are first operating on the dummy image to refresh the pool and then we are adding the scheduling. We think the scheduling should be added first and then we should refresh the pool. If we do this, all the existing schedules will be considered by the scheduler.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
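A self-contained sketch of the ordering this second commit describes (all function names and bodies are illustrative stubs, not the actual ceph-csi code):

```go
package main

import (
	"context"
	"fmt"
)

// Illustrative stubs standing in for the real ceph-csi operations.
func addSnapshotScheduling(ctx context.Context, image, interval string) error {
	fmt.Printf("added %s schedule for %s\n", interval, image)
	return nil
}

func tickleDummyImage(ctx context.Context) error {
	fmt.Println("refreshed pool via dummy image")
	return nil
}

// The ordering the commit argues for: register the schedule first,
// then refresh the pool via the dummy image, so the scheduler picks
// up all existing schedules.
func enableSchedule(ctx context.Context, image, interval string) error {
	if err := addSnapshotScheduling(ctx, image, interval); err != nil {
		return err
	}
	return tickleDummyImage(ctx)
}

func main() {
	_ = enableSchedule(context.Background(),
		"replicapool/csi-vol-0c25bdd3-485f-11ec-bd30-0242ac110004", "5m")
}
```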
Labels:
ci/retry/e2e (Label to retry e2e retesting on approved PR's)
ci/skip/e2e (skip running e2e CI jobs)
component/rbd (Issues related to RBD)
Priority-0 (highest priority issue)