
feat(migration): adding support to migrate the PV to a new node #304

Merged: 7 commits merged into openebs:master from the migration branch on May 1, 2021

Conversation

@pawanpraka1 (Contributor) commented Apr 6, 2021

Signed-off-by: Pawan pawan@mayadata.io

Why is this PR required? What issue does it fix?:

Currently we set the node affinity on the PV using the node name, so we cannot migrate the pods to a new node: the k8s scheduler will only try to schedule them on the old node.

What this PR does?:

With this change, we can label the nodes with unique values using the key openebs.io/nodeid. The ZFS LocalPV driver will pick the value from the node and set the PV affinity accordingly.

Now, to migrate the PV to another node, we move the disks to that node, remove the old node from the cluster, and set the same label (same key and value) on the new node, which lets the k8s scheduler schedule the pods to that node.
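For illustration, here is a minimal sketch of the node affinity the driver would set on the PV under this scheme; the value `custom-node-1` is an assumed example of a user-chosen unique id, not taken from this PR:

```yaml
# Sketch (illustrative): node affinity written on the PV by the driver.
# "custom-node-1" is an assumed example value read from the node's
# openebs.io/nodeid label.
nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      - key: openebs.io/nodeid      # key introduced by this PR
        operator: In
        values:
        - custom-node-1             # value copied from the node's label
```

Because the affinity matches a label value rather than the node name, putting the same label on a replacement node re-targets the PV without any change to the PV object itself.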

Refactoring?:

We were using the node name as NodeID in the code; the code has been refactored to call it NodeName for better readability. The ZFSVolume display string has also been changed from Node to NodeID to make clear that it is the NodeID.

Limitation?:

We can only move the PVs to a new node; we cannot move the PVs to a node already in use in the cluster, since only one value is allowed per node for the custom node-label key (see the sketch below).
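The limitation follows from how Kubernetes labels work: labels form a key-to-value map, so a node carries exactly one value for a given key. A sketch, with assumed node name and values:

```yaml
# Sketch: a node has exactly one value for openebs.io/nodeid, so a node
# already labeled custom-node-1 can never simultaneously match a PV whose
# affinity requires custom-node-2.
metadata:
  name: worker-2                    # assumed node name
  labels:
    openebs.io/nodeid: custom-node-1
```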

```diff
@@ -88,7 +88,7 @@ func run(config *config.Config) {
 	klog.Infof("ZFS Driver Version :- %s - commit :- %s", version.Current(), version.GetGitCommit())
 	klog.Infof(
-		"DriverName: %s Plugin: %s EndPoint: %s NodeID: %s",
+		"DriverName: %s Plugin: %s EndPoint: %s Node Name: %s",
```
Member:
NodeID is used here

Contributor Author:
It is config.NodeID, which is just the node name.

```diff
@@ -805,6 +805,8 @@ spec:
           value: "zfs-operator"
         - name: OPENEBS_IO_ENABLE_ANALYTICS
           value: "true"
+        - name: NODE_AFFINITY_KEY
+          value: kubernetes.io/hostname
```
Member:
Is this the label key that we are using? It is added to each node by kubernetes, right? So how does labelling a new node work when another node goes down?

Contributor Author:
It is the default key used by the driver. The user has to label the nodes and provide that key here. See the docs added in this PR.

Member:
If the default key is used, PV migration cannot be done, right?

Contributor Author:
Yep
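To make the thread concrete, a sketch of the override being discussed, under the env-based design in this revision of the PR (the custom key shown is an assumed example):

```yaml
# Sketch: overriding the default affinity key in the operator yaml so the
# driver sets PV affinity from a user-managed label. With the default
# kubernetes.io/hostname, the affinity stays tied to the node name and the
# PV cannot be migrated.
- name: NODE_AFFINITY_KEY
  value: openebs.io/nodeid          # assumed custom key; nodes must be
                                    # labeled with this key beforehand
```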

```diff
@@ -2204,11 +2206,11 @@ spec:
         image: openebs/zfs-driver:ci
         imagePullPolicy: IfNotPresent
         args:
-        - "--nodeid=$(OPENEBS_NODE_ID)"
+        - "--nodeid=$(OPENEBS_NODE_NAME)"
```
Member:
Is it node name or node ID as the arg?

Contributor Author:
Yes, better to rename the arg here; will do that. Thanks.
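For context, a sketch of how the OPENEBS_NODE_NAME env var is typically wired; the downward-API source shown here is an assumption, not taken from this diff:

```yaml
# Sketch (assumption): the env var passed to --nodeid is populated from
# the name of the node the pod is scheduled on, via the downward API.
- name: OPENEBS_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```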

@akhilerm (Member) left a comment:
given some comments

Here we are setting the node affinity on the PV using the node name, so we cannot
migrate the pods to a new node, as the k8s scheduler will try to schedule
on the old node only.

Here, we can now label the nodes with unique values and provide that
key in the operator yaml via the NODE_AFFINITY_KEY env, to be used for setting the affinity.
The ZFS LocalPV driver will pick the value from the nodes and set the affinity
using the provided key.

Now, to migrate the PV to another node, we have to move the disks to that node,
remove the old node from the cluster, and set the same label on the new node using
the same key, which lets the k8s scheduler schedule the pods to that node.

Signed-off-by: Pawan <pawan@mayadata.io>
Signed-off-by: Pawan <pawan@mayadata.io>
Signed-off-by: Pawan <pawan@mayadata.io>
@pawanpraka1 pawanpraka1 force-pushed the migration branch 3 times, most recently from f59bd97 to a3ec2b3 Compare April 26, 2021 09:59
Signed-off-by: Pawan <pawan@mayadata.io>
@shubham14bajpai (Member) left a comment:
LGTM

@akhilerm (Member) left a comment:
@pawanpraka1 Given some comments

pawanpraka1 and others added 2 commits April 28, 2021 22:51
Co-authored-by: Akhil Mohan <akhilerm@gmail.com>
Co-authored-by: Akhil Mohan <akhilerm@gmail.com>
@akhilerm (Member) left a comment:
LGTM

@kmova (Member) commented Apr 30, 2021

A few high-level questions:

=> Is this feature tied to a fixed label called "openebs.io/nodeid"? Or can the user decide on a custom label set by their infra tools, e.g. "coolplatform.io/node-id"?

=> The PR has some changes related to replacing NODEID with NODENAME. Can you add a note to the PR description on why this refactoring was done?

@pawanpraka1 (Contributor Author) commented Apr 30, 2021

@kmova

> Is this feature tied to a fixed label called "openebs.io/nodeid"? Or can the user decide on a custom label set by their infra tools, e.g. "coolplatform.io/node-id"?

The earlier implementation did exactly that: users had to label the nodes with a custom key and then mention that key in the operator yaml. That is a lot of work for users and is not very user friendly. Later I made a change to fix the key (dce4b0c) so that users do not need to do anything else; we just need to label the nodes with the fixed key.
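As an illustration of that final design, the only user action is labeling the nodes with the fixed key (e.g. via `kubectl label node`), equivalent to the following metadata on the Node object; node name and id value are assumed examples:

```yaml
# Sketch: label each node with a unique value under the fixed key. To
# migrate, the replacement node gets the same value the old node carried.
apiVersion: v1
kind: Node
metadata:
  name: new-worker-1                # assumed node name
  labels:
    openebs.io/nodeid: custom-node-1
```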

> The PR has some changes related to replacing NODEID with NODENAME. Can you add a note to the PR description on why this refactoring was done?

done

@pawanpraka1 pawanpraka1 changed the title feat(migration): adding support to migrate the PV to other nodes feat(migration): adding support to migrate the PV to the new node Apr 30, 2021
@pawanpraka1 pawanpraka1 changed the title feat(migration): adding support to migrate the PV to the new node feat(migration): adding support to migrate the PV to a new node Apr 30, 2021
@kmova (Member) commented May 1, 2021

Had a chat with @pawanpraka1 regarding the customization of the label.

In the current implementation, the driver will automatically set the node labels using the node name as a unique identifier. This approach is backward compatible and users don't need to take any action. Going ahead with merging this solution.

This will be followed up with further discussion and design around making the unique label configurable. We have to decide whether it is a good idea to make this configurable at install time or at the storage class level (if that can be done). Based on feedback on the current implementation, we will also look at whether a solution can be designed for migrating volumes to an existing node.
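Per the comment above, a sketch of the backward-compatible default; the exact value is an assumption based on "using the node name as a unique identifier":

```yaml
# Sketch (assumption): with no user-provided label, the driver falls back
# to the node name as the unique id, reproducing the old behavior of
# pinning the PV to the node it was created on.
labels:
  openebs.io/nodeid: worker-1       # assumed: defaults to the node's name
```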

@kmova kmova merged commit 1b30116 into openebs:master May 1, 2021
@pawanpraka1 pawanpraka1 deleted the migration branch May 12, 2021 11:39