Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(plugin): get owner node id from created resource #549

Merged
merged 9 commits into from
Aug 7, 2024

Conversation

jnels124
Copy link
Contributor

@jnels124 jnels124 commented Jun 25, 2024

Pull Request template

Please, go through these steps before you submit a PR.

Why is this PR required? What issue does it fix?:

  • Allows restoring from snapshot when nodeid differs from the node name

What this PR does?:

  • gets the owner node id from the created resource in all cases

Does this PR require any upgrade changes?:
No

If the changes in this PR are manually verified, list down the scenarios covered::

  1. Deployed with zfs nodes and custom nodeid
  2. Ingested data
  3. Created a snapshot from disks
  4. Restored from snapshots
  5. verified data present

Any additional information for your reviewer? :
Mention if this PR is part of any design or a continuation of previous PRs

Checklist:

PLEASE REMOVE BELOW INFORMATION BEFORE SUBMITTING

The PR title message must follow convention:
<type>(<scope>): <subject>.

Where:

  • type is defining if release will be triggering after merging submitted changes, details in CONTRIBUTING.md.
    Most common types are:

    • feat - for new features, not a new feature for build script
    • fix - for bug fixes or improvements, not a fix for build script
    • chore - changes not related to production code
    • docs - changes related to documentation
    • style - formatting, missing semi colons, linting fix etc; no significant production code changes
    • test - adding missing tests, refactoring tests; no production code change
    • refactor - refactoring production code, eg. renaming a variable or function name, there should not be any significant production code changes
  • scope is a single word that best describes where the changes fit.
    Most common scopes are like:

    • data engine (localpv, jiva, cstor)
    • feature (provisioning, backup, restore, exporter)
    • code component (api, webhook, cast, upgrade)
    • test (tests, bdd)
    • chores (version, build, log, travis)
  • subject is a single line brief description of the changes made in the pull request.

@codecov-commenter
Copy link

codecov-commenter commented Jun 25, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.37%. Comparing base (c389127) to head (d397a29).
Report is 2 commits behind head on develop.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #549   +/-   ##
========================================
  Coverage    96.37%   96.37%           
========================================
  Files            1        1           
  Lines          496      496           
========================================
  Hits           478      478           
  Misses          14       14           
  Partials         4        4           
Flag Coverage Δ
bddtests 96.37% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jnels124 jnels124 changed the title fix(plugin): get owner node id consistently fix(plugin): get owner node id from created resource Jun 25, 2024
@niladrih
Copy link
Member

@jnels124 -- do you think you could help me with a test case for this nodeId changes that we've added recently?

@jnels124
Copy link
Contributor Author

@jnels124 -- do you think you could help me with a test case for this nodeId changes that we've added recently?

Yes absolutely.

@niladrih
Copy link
Member

niladrih commented Jun 26, 2024

@jnels124 -- Thank you! We use the ginkgo test framework. The tests sit inside the 'tests' directory. They run on a single-node minikube ('none' driver) kubernetes cluster in our CI. Let me know if you're blocked with any of this. Again, thank you!

@jnels124
Copy link
Contributor Author

jnels124 commented Jul 9, 2024

@niladrih Please review this approach. This of course doubles the test execution time but the test
cases present make sense to run with this configuration. Note, the existing tests don't verify much in many cases. Will create a follow up ticket to address that separately.

@Abhinandan-Purkait
Copy link
Member

@niladrih Please review this approach. This of course doubles the test execution time but the test cases present make sense to run with this configuration. Note, the existing tests don't verify much in many cases. Will create a follow up ticket to address that separately.

cc @niladrih

Copy link
Member

@Abhinandan-Purkait Abhinandan-Purkait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me. However running the ci tests two times looks a bit of an overkill to me, maybe this should be an e2e use case rather than a ci test. Doubling ci affects the PR, PR merge... all workflows.

Copy link
Member

@niladrih niladrih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the work that you've put in here @jnels124! I'm inclined to say that we should both of the sets of tests in parallel. However, the tests are taking 30 min, which isn't too bad and it is testing the core features. I'm okay with this tbh. LGTM

@tiagolobocastro
Copy link
Contributor

The code changes look good to me. However running the ci tests two times looks a bit of an overkill to me, maybe this should be an e2e use case rather than a ci test. Doubling ci affects the PR, PR merge... all workflows.

Yeah I'd also suggest running a smaller number of tests with custom node id, at least on PR being raised. Maybe we can have extended tests running at a separate cadence.

Copy link
Member

@niladrih niladrih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again... this approach might scale badly as newer tests are added, as they'd be run twice.

It makes sense to run a sanity test set with volume provision/de-provision and snapshot creation/deletion as a compromise.

It might be a good idea to add a copy of the test set at https://github.com/openebs/zfs-localpv/blob/develop/tests/provision_test.go in a new file. The prepareCustomNodeIdEnv would be in code, and it would be run as the very first By (before By("####### Creating the storage class : " + fstype + " #######") or By("Creating default storage class", createStorageClass)) in fsVolCreationTest() and blockVolCreationTest() functions. This way what changes is, if there are new tests added they are not run twice, and nodeId is tested for the big workflows. WDYT about this approach @jnels124?

@jnels124
Copy link
Contributor Author

jnels124 commented Jul 22, 2024

Looking at this again... this approach might scale badly as newer tests are added, as they'd be run twice.

It makes sense to run a sanity test set with volume provision/de-provision and snapshot creation/deletion as a compromise.

It might be a good idea to add a copy of the test set at https://github.com/openebs/zfs-localpv/blob/develop/tests/provision_test.go in a new file. The prepareCustomNodeIdEnv would be in code, and it would be run as the very first By (before By("####### Creating the storage class : " + fstype + " #######") or By("Creating default storage class", createStorageClass)) in fsVolCreationTest() and blockVolCreationTest() functions. This way what changes is, if there are new tests added they are not run twice, and nodeId is tested for the big workflows. WDYT about this approach @jnels124?

That makes sense to me I will get that in. Will let you know if I encounter difficulties around the environment config. I should be able to update node labels and delete pods to configure the environment appropriately. Will also need to ensure the test resets the environment back to the original state. This also means, this test can not be ran in parallel with other tests.

@jnels124
Copy link
Contributor Author

jnels124 commented Jul 23, 2024

@niladrih I took a slightly different approach but accomplishes the goal of running only specific tests after the environment has been configured with a custom node id. I went with this approach because there is not an existing client (and other things) in the test set up to allow modification of k8s nodes. Also, this approach allows more flexibility for future testing

@Abhinandan-Purkait
Copy link
Member

Abhinandan-Purkait commented Jul 30, 2024

@jnels124 Can you please rebase the PR branch, a lot of changes went in on the tests and ci. I see we are bumping go version as well, just wanted to double check with @niladrih @tiagolobocastro @avishnu about this.

Copy link
Member

@niladrih niladrih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed the go version bump. The go bump needs to go into this Dockerfile as well https://github.com/openebs/zfs-localpv/blob/develop/buildscripts/zfs-driver/zfs-driver.Dockerfile.

go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
Copy link
Member

@niladrih niladrih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@niladrih
Copy link
Member

niladrih commented Aug 1, 2024

@jnels124 thank you for making the change! You'd have to rebase your PR branch. I see commits from @sinhaashish and @Abhinandan-Purkait on your branch. This is likely because of a merge, instead of a rebase. I think there might be a git rebase option --rebase-merges.

What we're trying to do is to get your commits to sit on top. Let me know if you need help with this.

Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
jnels124 and others added 4 commits August 3, 2024 16:43
Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
Signed-off-by: Jesse Nelson <jesse@swirldslabs.com>
Signed-off-by: Jesse Nelson <jesse@hashgraph.com>
Signed-off-by: Jesse Nelson <jesse@hashgraph.com>
@jnels124
Copy link
Contributor Author

jnels124 commented Aug 3, 2024

@niladrih I have rebased the commits and pr should only contain my feature commits. thanks

Signed-off-by: Jesse Nelson <jesse@hashgraph.com>
@Abhinandan-Purkait
Copy link
Member

@jnels124 Thanks, are we good to merge now?

@jnels124
Copy link
Contributor Author

jnels124 commented Aug 6, 2024

@Abhinandan-Purkait yes everything is complete and good to merge

@Abhinandan-Purkait Abhinandan-Purkait merged commit fc826e1 into openebs:develop Aug 7, 2024
8 checks passed
@Abhinandan-Purkait
Copy link
Member

@jnels124 Thanks. You can test the changes using the 2.7.0-develop helm chart, till the next release is cut.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Volume From Snapshot Fails With Custom Node Ids
5 participants