K8SSAND-1855 ⁃ Cluster level tasks #739

Closed
jsanda opened this issue Oct 25, 2022 · 10 comments · Fixed by #776
Labels: done, enhancement

Comments


jsanda commented Oct 25, 2022

What is missing?
cass-operator provides the CassandraTask CRD for performing various operations like rolling restart, cleanup, and rebuild. Here is an example manifest:

apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: upgradesstables
spec:
  datacenter:
    name: dc1
    namespace: k8ssandra-operator
  jobs:
    - name: upgradesstables-dc1
      command: upgradesstables

Creating this CassandraTask will result in cass-operator running upgradesstables against all nodes in dc1.

There needs to be an analogous cluster-level task CRD to run tasks against all DCs. It might look something like this:

apiVersion: control.k8ssandra.io/v1alpha1
kind: K8ssandraClusterTask
metadata:
  name: upgradesstables
spec:
  datacenters:
    - name: dc1
      namespace: k8ssandra-operator
      k8sContext: east
    - name: dc2 
      namespace: k8ssandra-operator
      k8sContext: west
  jobs:
    - name: upgradesstables-dc1
      command: upgradesstables

The datacenters list specifies the DCs on which the task should be run as well as the order in which it should be run. The k8sContext field tells the operator in which Kubernetes cluster the DC exists.
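
A minimal sketch of how such a cluster-level task could be expanded into one per-DC CassandraTask, preserving the order of the datacenters list. This is only an illustration of the proposal, not the operator's implementation: `DatacenterRef`, `expand_cluster_task`, the `<task>-<dc>` naming scheme, and the use of the DC namespace for the generated task are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatacenterRef:
    """One entry of the proposed `datacenters` list (hypothetical type)."""
    name: str
    namespace: str
    k8s_context: str


def expand_cluster_task(task_name, datacenters, jobs):
    """Return (k8s_context, manifest) pairs, one per DC, in list order.

    Each manifest mirrors the per-DC CassandraTask shown earlier. The
    generated name "<task_name>-<dc name>" is an assumed convention.
    """
    expanded = []
    for dc in datacenters:
        manifest = {
            "apiVersion": "control.k8ssandra.io/v1alpha1",
            "kind": "CassandraTask",
            "metadata": {
                "name": f"{task_name}-{dc.name}",
                "namespace": dc.namespace,
            },
            "spec": {
                "datacenter": {"name": dc.name, "namespace": dc.namespace},
                # Copy each job so the per-DC manifests do not share state.
                "jobs": [dict(job) for job in jobs],
            },
        }
        expanded.append((dc.k8s_context, manifest))
    return expanded


if __name__ == "__main__":
    dcs = [
        DatacenterRef("dc1", "k8ssandra-operator", "east"),
        DatacenterRef("dc2", "k8ssandra-operator", "west"),
    ]
    jobs = [{"name": "upgradesstables-dc1", "command": "upgradesstables"}]
    for context, manifest in expand_cluster_task("upgradesstables", dcs, jobs):
        print(context, manifest["metadata"]["name"])
```

Returning the Kubernetes context alongside each manifest keeps the "where to create it" decision separate from the "what to create" decision, which is how the k8sContext field is described above.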

Why do we need it?
Currently, if I want to run a task such as upgradesstables against a multi-DC cluster, I have to do some amount of orchestration myself. k8ssandra-operator should handle this, since it already provides cluster-level orchestration and management.

Environment

  • K8ssandra Operator version: v1.3.0

Anything else we need to know?

┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: K8SSAND-1855
┆priority: Medium

@jsanda jsanda added the enhancement New feature or request label Oct 25, 2022
@sync-by-unito sync-by-unito bot changed the title Cluster level tasks K8SSAND-1855 ⁃ Cluster level tasks Oct 25, 2022

adutra commented Nov 15, 2022

FYI this feature has become a prerequisite for a cluster-level rebalance task and should be prioritized. cc @adejanovski


adutra commented Nov 15, 2022

@jsanda why do you suggest a list of datacenters instead of a reference to the K8ssandraCluster object itself?

e.g.

apiVersion: control.k8ssandra.io/v1alpha1
kind: K8ssandraClusterTask
metadata:
  name: upgradesstables
spec:
  cluster:
    name: cluster1
    namespace: k8ssandra-operator
  jobs:
    - name: upgradesstables-dc1
      command: upgradesstables

My fear is that users would end up listing DCs that do not belong to the same cluster, in which case a task like rebalance would produce undefined results.


jsanda commented Nov 15, 2022

I suggested a list in the event the user only wants to run the task across a subset of DCs.


adutra commented Nov 15, 2022

Hmm I see. What about:

apiVersion: control.k8ssandra.io/v1alpha1
kind: K8ssandraClusterTask
metadata:
  name: upgradesstables
spec:
  cluster:
    name: cluster1
    namespace: k8ssandra-operator
  datacenters: # a list of target DC names, assume all DCs in spec order if not provided
    - dc3
    - dc1
  jobs:
    - name: upgradesstables-dc1
      command: upgradesstables
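
A rough sketch of the resolution rule described in the comment on the datacenters field: use the listed DCs in the listed order, fall back to every DC in spec order when the list is missing, and reject names that are not part of the referenced cluster. The function name and the choices to treat an empty list like a missing one and to reject duplicates are assumptions made for illustration.

```python
def resolve_target_datacenters(cluster_dcs, requested=None):
    """Return the DC names a cluster-level task should visit, in order.

    cluster_dcs: DC names in the order they appear in the K8ssandraCluster spec.
    requested:   optional list of target DC names from the task spec.
    """
    if not requested:
        # Nothing listed: assume all DCs, in spec order.
        return list(cluster_dcs)

    known = set(cluster_dcs)
    unknown = [dc for dc in requested if dc not in known]
    if unknown:
        # Guards against mixing in DCs that belong to another cluster.
        raise ValueError(f"datacenters not in cluster: {unknown}")

    if len(set(requested)) != len(requested):
        raise ValueError("datacenters must not contain duplicates")

    # Keep the user's order, since it determines execution order.
    return list(requested)


if __name__ == "__main__":
    print(resolve_target_datacenters(["dc1", "dc2", "dc3"], ["dc3", "dc1"]))
    print(resolve_target_datacenters(["dc1", "dc2", "dc3"]))
```

Because the DC names are checked against a single cluster reference, the earlier concern about listing DCs from different clusters turns into a validation error instead of an undefined result.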

@adejanovski adejanovski added the ready Issues in the state 'ready' label Nov 16, 2022

olim7t commented Nov 17, 2022

Will jobs reuse the CassandraJob / CassandraCommand resources from cass-operator, or do we plan to have a different set of commands?


burmanm commented Nov 18, 2022

> Hmm I see. What about:

Would listing datacenters restrict the task to only those and, if the datacenters array is omitted, run it against all datacenters?

...and reading the comment, it seems so ;)


jsanda commented Nov 18, 2022

> Will jobs reuse the CassandraJob / CassandraCommand resources from cass-operator, or do we plan to have a different set of commands?

To help answer this question I think it would be good to look at examples for a few different types of CassandraTask to see how they look at the cluster level. I can provide some examples if you would like.

It would also be good to discuss what the status will look like.


adutra commented Nov 18, 2022

> Will jobs reuse the CassandraJob / CassandraCommand resources from cass-operator, or do we plan to have a different set of commands?

Also, I am a bit concerned about the shape that JobArguments is taking: it lists all arguments to all known commands, and as the number of supported commands grows, I fear that this struct will become a bag of unrelated fields that is easy to misuse. @burmanm do you have the same feeling? What would you suggest to improve this?


burmanm commented Nov 21, 2022

@adutra So I guess the alternative would look like:

jobs:
  rebuild:
    - name: "run-my-upgradesstable"
      source_datacenter: "dc1"

Or did you have some other struct in mind?
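
To make the trade-off concrete, here is a small sketch of per-command argument types as opposed to one shared arguments struct. Everything in it is hypothetical (`RebuildArgs`, `UpgradeSSTablesArgs`, `COMMAND_ARGS`, `parse_job`); only the `source_datacenter` field name comes from the snippet above. It shows the idea that each command accepts only its own arguments, not how cass-operator models them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebuildArgs:
    """Arguments for a rebuild job (hypothetical type)."""
    source_datacenter: str


@dataclass(frozen=True)
class UpgradeSSTablesArgs:
    """upgradesstables takes no arguments in this sketch."""


# Maps each supported command to the only argument type it accepts.
COMMAND_ARGS = {
    "rebuild": RebuildArgs,
    "upgradesstables": UpgradeSSTablesArgs,
}


def parse_job(command, name, raw_args):
    """Build typed arguments for a job, rejecting fields the command does not use."""
    args_type = COMMAND_ARGS.get(command)
    if args_type is None:
        raise ValueError(f"unsupported command: {command}")
    try:
        # Unknown or missing fields raise TypeError in the dataclass constructor.
        args = args_type(**raw_args)
    except TypeError as exc:
        raise ValueError(f"invalid arguments for {command}: {exc}") from exc
    return name, args


if __name__ == "__main__":
    print(parse_job("rebuild", "rebuild-dc2", {"source_datacenter": "dc1"}))
```

With a single shared struct, passing `source_datacenter` to upgradesstables would be silently accepted. Here it is rejected at parse time.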

@adejanovski adejanovski added in-progress Issues in the state 'in-progress' ready-for-review Issues in the state 'ready-for-review' and removed ready Issues in the state 'ready' in-progress Issues in the state 'in-progress' ready-for-review Issues in the state 'ready-for-review' labels Nov 25, 2022
@adejanovski adejanovski added ready-for-review Issues in the state 'ready-for-review' and removed in-progress Issues in the state 'in-progress' labels Dec 7, 2022
@adejanovski adejanovski added ready-for-review Issues in the state 'ready-for-review' review Issues in the state 'review' and removed ready-for-review Issues in the state 'ready-for-review' labels Dec 7, 2022

jsanda commented Dec 9, 2022

@adejanovski Can we please create an issue for documenting this feature and make sure it gets picked up after #739 is merged?

@adejanovski adejanovski added done Issues in the state 'done' and removed review Issues in the state 'review' labels Jan 10, 2023