Replies: 2 comments 1 reply
---
Continuing this idea, this could perhaps be extended to deploy NiFi in another operating mode.
---
I explored this idea and compared it to just extending the existing `NifiCluster`. I found that for standalone clusters, if the operator acts like the cluster coordinator (as in the clustered NiFi case), then we can achieve NiFiKop driving a set of standalone NiFis. For this reason, I'm going to close this discussion and raise a new one to discuss extending NiFiKop to drive a standalone cluster.
---
## Purpose

This proposal captures the high-level design of a new NiFiKop resource called `NifiReplicaCluster`, which complements `NifiCluster` in the sense that it deploys NiFi in a different operating mode: unclustered.

One feature lacking in NiFiKop is the ability to treat a NiFi cluster like a set of unrelated replicas (i.e. cattle) instead of as a clustered set of nodes which require special attention (i.e. pets) during upgrades or calculated NiFi cluster operations. For the remainder of this document, NiFi deployments will be referred to as either clustered or unclustered, where clustered is the default behavior and is the same as `NifiCluster`. An unclustered NiFi is one where none of the nodes are aware of each other, but they are collectively running the same flow.

- Benefits of clustering
- Drawbacks of clustering
- Benefits of unclustered NiFis
- Drawbacks of unclustered NiFis

There are benefits and drawbacks to each of these deployment methods, but NiFiKop only provides the ability to deploy clustered NiFis. This is why we are introducing the `NifiReplicaCluster` concept.

In-depth documentation can be found here and in neighboring pages: https://konpyutaika.github.io/nifikop/docs/1_concepts/3_features
## NiFiKop API

NiFiKop exposes its API through several Kubernetes custom resource definitions (CRDs):

- `NifiCluster`
- `NifiUser`: NiFiKop configures the user in the NiFi application. A `NifiUser` may belong to zero or more groups.
- `NifiDataflow`: a `NifiDataflow` may have zero or more ParameterContexts configured.

Together, these resources allow you to declaratively define a clustered NiFi configuration. NiFiKop processes these resources and translates them into a deployment in Kubernetes. NiFiKop elects to oversee the Pod lifecycle itself instead of leveraging a Kubernetes Deployment or ReplicaSet, because NiFiKop allows granular and varying node configurations within a NiFi cluster, which isn't possible with a Deployment or ReplicaSet: with those resources, the same spec is applied to every Pod.
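To make that last point concrete, here is a deliberately trimmed-down sketch in Go. These types are simplified stand-ins for illustration, not the actual NiFiKop schema:

```go
// Illustrative only: trimmed-down shapes, not the real NiFiKop schema.
// A NifiCluster names each node individually, so every node can carry its
// own configuration; a Deployment/ReplicaSet has a single pod template
// that is stamped out identically for every replica.
package example

// Node is one NiFi node with its own identity and configuration group.
type Node struct {
	ID              int32
	NodeConfigGroup string // selects a per-node configuration profile
}

// NifiClusterSpec can describe a heterogeneous set of nodes, which is why
// NiFiKop manages Pods directly instead of delegating to a ReplicaSet.
type NifiClusterSpec struct {
	Nodes []Node
}
```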
## NifiReplicaCluster Concept

Where NiFiKop's `NifiCluster` deploys NiFi in a clustered configuration and implements various logic to ensure that each node in the cluster remains that way, `NifiReplicaCluster` deploys unclustered NiFi pods that are entirely isolated from each other but each share the following resources:

- `NifiDataflow`
- `NifiUser`
- `NifiUserGroup`
- `NifiRegistryClient`
- `NifiParameterContext`

This enables us to deploy NiFi in an unclustered configuration, but configure each NiFi pod with the same users, user groups, dataflows, parameter contexts, and NiFi Registry client. In this way, we treat each node as a logical cluster, even though the nodes are unclustered in the NiFi sense. This deployment method enables us to perform the following operations more simply than in the clustered NiFi case:

These benefits come with some sacrifices:
As previously mentioned, there are benefits and drawbacks to each approach. For users who do not need cluster coordination features, or who wish to perform zero-downtime upgrades, a `NifiReplicaCluster` might be a suitable alternative.

## NifiReplicaCluster Design

A new NiFiKop API resource, `NifiReplicaCluster`, is introduced, which is quite similar to `NifiCluster`:
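As a rough illustration, the Go types for the new resource might look something like the sketch below. Every field name here is an assumption of this sketch rather than a final API; only `spec.stateful` is referenced elsewhere in this proposal:

```go
// Hypothetical Go types for the proposed CRD, sketched for illustration.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NifiReplicaClusterSpec would mirror much of NifiClusterSpec, minus
// anything tied to cluster coordination (ZooKeeper, cluster ports, etc.).
type NifiReplicaClusterSpec struct {
	// Replicas is the number of identical, unclustered NiFi pods to run.
	Replicas int32 `json:"replicas"`
	// Stateful selects a StatefulSet (true) or a Deployment (false).
	Stateful bool `json:"stateful"`
	// Image is the NiFi container image to run.
	Image string `json:"image,omitempty"`
	// Resources applies one resource profile to every replica; per-node
	// overrides are intentionally absent in this operating mode.
	Resources corev1.ResourceRequirements `json:"resources,omitempty"`
}

// NifiReplicaCluster is the top-level resource, complementing NifiCluster.
type NifiReplicaCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              NifiReplicaClusterSpec `json:"spec,omitempty"`
}
```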
The benefit of this resource being so similar to `NifiCluster`, and of re-using the `NifiDataflow`, `NifiUser`, `NifiUserGroup`, `NifiRegistryClient`, and `NifiParameterContext` resources, is that it can re-use the logic already written in the controllers to configure the NiFi pods in the deployment with all of this information. The lone difference is that the controllers won't be able to use the NiFi cluster coordinator to replicate requests on their behalf. They will need to issue requests to each node in the cluster and ensure that the request to each node was successful in order to guarantee consistent state, as in the sketch below. We need not concern ourselves with embedded/remote ZooKeeper since we're not using any cluster coordination features.
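A minimal sketch of that fan-out requirement, assuming a hypothetical per-pod NiFi REST client (`NodeClient` and `UpdateUser` are illustrative names, not NiFiKop APIs):

```go
// Hypothetical sketch: fan out a REST call to every replica and require
// success from all of them before considering the state reconciled.
package replicacluster

import (
	"context"
	"errors"
	"fmt"
)

// NodeClient is a hypothetical per-pod NiFi REST client.
type NodeClient interface {
	BaseURL() string
	UpdateUser(ctx context.Context, name string) error
}

// applyToAllNodes issues the same request against every unclustered pod.
// In clustered mode the coordinator replicates requests for us; here the
// operator must do that replication itself and surface partial failures.
func applyToAllNodes(ctx context.Context, nodes []NodeClient, user string) error {
	var errs []error
	for _, n := range nodes {
		if err := n.UpdateUser(ctx, user); err != nil {
			errs = append(errs, fmt.Errorf("node %s: %w", n.BaseURL(), err))
		}
	}
	return errors.Join(errs...)
}
```

Returning an aggregated error keeps the reconcile loop retrying until every replica has converged, which stands in for the request replication that the cluster coordinator performs in clustered mode.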
A controller will be created for this API resource that performs the following actions as part of its reconciliation loop:

- A `NifiReplicaCluster` Deployment or StatefulSet, depending on whether the `spec.stateful` flag is set (see the sketch below), with a `PodSpec` composed of:
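A minimal sketch of that workload choice, building on the hypothetical `NifiReplicaCluster` type sketched earlier; the label key and field names are assumptions of this illustration:

```go
// Minimal sketch, not the actual NiFiKop implementation.
package replicacluster

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// desiredWorkload renders the replica workload: a StatefulSet when
// spec.stateful is set (stable identity, per-pod volumes), otherwise a
// Deployment (fully disposable, interchangeable replicas).
// NifiReplicaCluster is the hypothetical type sketched earlier.
func desiredWorkload(rc *NifiReplicaCluster, podSpec corev1.PodSpec) client.Object {
	labels := map[string]string{"app": rc.Name}
	template := corev1.PodTemplateSpec{
		ObjectMeta: metav1.ObjectMeta{Labels: labels},
		Spec:       podSpec,
	}
	meta := metav1.ObjectMeta{Name: rc.Name, Namespace: rc.Namespace}
	if rc.Spec.Stateful {
		return &appsv1.StatefulSet{
			ObjectMeta: meta,
			Spec: appsv1.StatefulSetSpec{
				Replicas: &rc.Spec.Replicas,
				Selector: &metav1.LabelSelector{MatchLabels: labels},
				Template: template,
			},
		}
	}
	return &appsv1.Deployment{
		ObjectMeta: meta,
		Spec: appsv1.DeploymentSpec{
			Replicas: &rc.Spec.Replicas,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: template,
		},
	}
}
```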
## Existing Controller Updates

Each of the below controllers will be updated to handle two separate cases: one where the resources have a `clusterRef`, and one where they have a `replicaClusterRef`. For each controller, nothing will change in the existing `clusterRef` scenario. For a `replicaClusterRef`, the implementation will be the same, except that any interaction with NiFi via a REST client must now interact with every pod in the deployment, and the controller should ensure that the request to each pod is successful (see the sketch after this list):

- `NifiDataflow`
- `NifiUser`
- `NifiUserGroup`
- `NifiRegistryClient`
- `NifiParameterContext`
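One way these controllers could resolve the pods behind a `replicaClusterRef` is to list them by label with controller-runtime's client, as in this sketch; the `app` label and port 8080 are assumptions of this illustration:

```go
// Sketch: resolve the pods of a NifiReplicaCluster so each controller can
// address every NiFi node individually.
package replicacluster

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// nodeBaseURLs lists the workload's pods and derives one REST base URL per
// pod, failing fast if a pod is not yet addressable.
func nodeBaseURLs(ctx context.Context, c client.Client, namespace, name string) ([]string, error) {
	var pods corev1.PodList
	if err := c.List(ctx, &pods,
		client.InNamespace(namespace),
		client.MatchingLabels{"app": name},
	); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(pods.Items))
	for _, p := range pods.Items {
		if p.Status.PodIP == "" {
			return nil, fmt.Errorf("pod %s has no IP yet", p.Name)
		}
		urls = append(urls, fmt.Sprintf("http://%s:8080/nifi-api", p.Status.PodIP))
	}
	return urls, nil
}
```

The returned URLs can then feed the request fan-out sketched in the design section above.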
## Deployment/StatefulSet Update Strategies

Because we're using Kubernetes-native Deployments and/or StatefulSets, we can take advantage of the native Kubernetes update strategies.
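For example, a Deployment's rolling update can be tuned so that a new NiFi image rolls out with no loss of capacity. The sketch below uses the standard Kubernetes `appsv1` types; the 0/1 values are illustrative choices:

```go
// Sketch of leaning on a native update strategy: a rolling update with
// zero unavailable pods and one surge pod during an image change.
package replicacluster

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func rollingStrategy() appsv1.DeploymentStrategy {
	maxUnavailable := intstr.FromInt(0) // never take a replica down early
	maxSurge := intstr.FromInt(1)       // bring each new pod up first
	return appsv1.DeploymentStrategy{
		Type: appsv1.RollingUpdateDeploymentStrategyType,
		RollingUpdate: &appsv1.RollingUpdateDeployment{
			MaxUnavailable: &maxUnavailable,
			MaxSurge:       &maxSurge,
		},
	}
}
```

With `maxUnavailable: 0` and `maxSurge: 1`, Kubernetes waits for each new pod to become ready before terminating an old one (assuming the pods have meaningful readiness probes), which is what enables the zero-downtime upgrade described earlier.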
There is also interest in implementing various upgrade strategies for unclustered NiFis. These are the scenarios we wish to support, perhaps through the use of finalizers: