NETOBSERV-555 NETOBSERV-552 CRD updates - many breaking changes #169
Conversation
You can check just the first commit (9599fd4) to review the FLP reconciler refactoring separately.
Overall, great job! I have a few comments.
@@ -4,12 +4,19 @@ metadata:
  name: cluster
spec:
  namespace: netobserv
  deploymentType: DIRECT
The name "deploymentType" sounds a bit ambiguous to me. Maybe "communication"?
I'm not a fan of "deploymentType", but maybe not "communication" either... let's get more ideas :)
What about "optionalComponents", where we can list "Kafka" but also "Prometheus" and even "Loki" in the future if it becomes optional? It would be very flexible and would allow any new options.
If we really want to be explicit, we can simply use "components" with values like "Agent-Processor" and "Agent-Kafka-Processor".
Some ideas: "wiring", "connect", "messaging", "linking"...
"transfer"...
@@ -4,12 +4,19 @@ metadata:
  name: cluster
spec:
  namespace: netobserv
  deploymentType: DIRECT
  agent:
    type: EBPF
Not 100% related to this task, but I was thinking about the case where a customer would like to override some properties of the eBPF agent, which is the only supported agent downstream.
They would provide a YAML fragment like:
agent:
  ebpf:
    sampling: 1
Since "ebpf" would be the only section they should use, wouldn't it be cleaner to:
- remove agent.type
- move all the eBPF stuff as properties of the "agent" section
- try to find a "hidden" place for the "ipfix" section, so the eBPF agent can be overridden without making our users "pay" the price of a discriminated union
Another option could be to move into "agent" the properties that are common to all agents (sampling, cacheMaxLength, cacheMaxTimeout...) and keep implementation details as subsections of "ebpf" (like the image, the image pull policy, etc.).
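As a rough sketch, the layout proposed in this comment could look like the types below. All type and field names here are illustrative assumptions, not the actual FlowCollector API: common fields sit directly under "agent", implementation details stay in a per-agent subsection.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EBPFDetails holds eBPF-specific implementation details.
// Hypothetical names, for illustration only.
type EBPFDetails struct {
	Image           string `json:"image,omitempty"`
	ImagePullPolicy string `json:"imagePullPolicy,omitempty"`
}

// AgentSpec carries fields shared by all agents (e.g. sampling) at the top
// level, with agent-specific details nested underneath.
type AgentSpec struct {
	Sampling int32        `json:"sampling,omitempty"`
	EBPF     *EBPFDetails `json:"ebpf,omitempty"`
}

// agentJSON renders the spec as it would appear in a serialized resource.
func agentJSON(a AgentSpec) string {
	out, _ := json.Marshal(a)
	return string(out)
}

func main() {
	a := AgentSpec{Sampling: 1, EBPF: &EBPFDetails{Image: "quay.io/netobserv/netobserv-ebpf-agent"}}
	fmt.Println(agentJSON(a))
}
```

With this shape, a user overriding only the sampling rate never has to touch the "ebpf" subsection, which was the motivation stated above.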
(Commenting here for the third time because I am reading this too fast...)
My understanding is that we are finally keeping both agents downstream, but we are going to document clearly that only the eBPF agent is officially supported.
But maybe we could move to the agent section the fields that are relevant to both agents:
agent:
  sampling: 1
  ebpf:
    ...
  ipfix:
    ...
They are not shared fields, and I don't think we want to make them shared, for instance because we want different default values (eBPF supports a higher sampling rate than IPFIX, for instance).
The feedback we had during the CRD API review was also to not use booleans in the API; are we going to change this part?
I only reviewed the CRD change, not the impacted code. I will continue the review tomorrow.
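For context, the usual reasoning behind "no booleans in the API" is that a string enum leaves room for new states without a breaking change. A minimal sketch of what that substitution could look like, with hypothetical names that are not the actual FlowCollector API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TLSSpec replaces a boolean "enable" flag with a string enum, so a third
// state (e.g. a mutual-TLS mode) could be added later without breaking the
// API. Illustrative sketch only; field and value names are assumptions.
type TLSSpec struct {
	// +kubebuilder:validation:Enum=DISABLED;ENABLED
	// +kubebuilder:default:=DISABLED
	Mode string `json:"mode,omitempty"`
}

// tlsJSON renders the spec as it would appear in a serialized resource.
func tlsJSON(mode string) string {
	out, _ := json.Marshal(TLSSpec{Mode: mode})
	return string(out)
}

func main() {
	fmt.Println(tlsJSON("ENABLED"))
}
```

This is the same pattern the PR applies at the top level, where "kafka.enable" became a DIRECT/KAFKA enum.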
api/v1alpha1/flowcollector_types.go
Outdated
//+kubebuilder:default:=1
// consumerReplicas defines the number of replicas (pods) to start for flowlogs-pipeline-transformer, which consumes Kafka messages.
// This setting is ignored when Kafka is disabled.
ConsumerReplicas int32 `json:"consumerReplicas,omitempty"`
I understand the logic of moving the field here, but I think it would be better to leave it in the FLP section:
- while this field only applies when Kafka is enabled, it applies to FLP. Maybe we could give it a better name in FLP, like "KafkaConsumerReplicas"
- we would be more consistent with the console plugin section, which also has this kind of field
done
api/v1alpha1/flowcollector_types.go
Outdated
// consumerAutoscaler spec of a horizontal pod autoscaler to set up for flowlogs-pipeline-transformer, which consumes Kafka messages.
// This setting is ignored when Kafka is disabled.
// +optional
ConsumerAutoscaler *FlowCollectorHPA `json:"consumerAutoscaler,omitempty"`
Same comment here
api/v1alpha1/flowcollector_types.go
Outdated
AgentIPFIX = "IPFIX"
AgentEBPF = "EBPF"
DeploymentTypeDirect = "DIRECT"
DeploymentTypeKafka = "KAFKA"
Maybe call it DeploymentModelSimple and DeploymentModelKafka?
I think we will need to organize a vote, because everyone has different ideas :)
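For reference, the rename under discussion would look something like the sketch below. The constant names follow the reviewer's suggestion and the later "s/DeploymentType/DeploymentModel" change note; the "SIMPLE" value for the non-Kafka mode is an assumption (the PR at this point used "DIRECT"):

```go
package main

import "fmt"

// Sketch of the suggested rename from DeploymentType* to DeploymentModel*.
// Names and the SIMPLE value are the reviewer's suggestion, not necessarily
// the final API.
const (
	DeploymentModelSimple = "SIMPLE"
	DeploymentModelKafka  = "KAFKA"
)

func main() {
	fmt.Println(DeploymentModelSimple, DeploymentModelKafka)
}
```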
api/v1alpha1/flowcollector_types.go
Outdated
//+kubebuilder:default:=1
// consumerReplicas defines the number of replicas (pods) to start for flowlogs-pipeline-transformer, which consumes Kafka messages.
// This setting is ignored when Kafka is disabled.
ConsumerReplicas int32 `json:"consumerReplicas,omitempty"`
Suggest FLPConsumerReplicas
api/v1alpha1/flowcollector_types.go
Outdated
// consumerAutoscaler spec of a horizontal pod autoscaler to set up for flowlogs-pipeline-transformer, which consumes Kafka messages.
// This setting is ignored when Kafka is disabled.
// +optional
ConsumerAutoscaler *FlowCollectorHPA `json:"consumerAutoscaler,omitempty"`
Are you sure you want to change HPA to Autoscaler everywhere? There is also a VPA (Vertical Pod Autoscaler), so HPA makes it clearer.
I can roll it back, but it was to avoid acronyms that not everybody may know. And the full "horizontalPodAutoscaler" is... very long :/
cases as it offers better performances and should work regardless
of the CNI installed on the cluster. "IPFIX" works with OVN-Kubernetes
CNI (other CNIs could work if they support exporting IPFIX,
but they would require manual configuration).
enum:
- IPFIX
- EBPF
If this is the order in the UI dropdown, then put EBPF first.
to start for flowlogs-pipeline-transformer, which consumes Kafka
messages. This setting is ignored when Kafka is disabled.
format: int32
minimum: 0
Shouldn't this be 1?
enable:
  default: false
  description: enable TLS
  type: boolean
Looks like TLS has a similar situation with "enable" as Kafka. Might be okay though.
I'll check the remaining booleans in a follow-up.
I'll review the boolean fields as @OlivierCazade and @stleerh pointed out.
Decoupling the inner "single reconcilers" into different files and structs:
- The main entry point is still flp_reconciler.go, which now only initialises and calls the inner reconcilers
- The ingest/transfo/monolith inner reconcilers do their own stuff in a decoupled way, with their own "*_objects" files
- "flp_common_objects" is like a common lib for the specialized objects files
Main items:
- NETOBSERV-555: "spec.flowlogsPipeline.kind" is removed, assuming DaemonSet; processor.replicas/hpa are now kafka.consumerReplicas/consumerAutoscaler (hpa renamed autoscaler, also for console)
- NETOBSERV-552: "ovn-kubernetes" and "cno" options moved into the IPFIX section; IPFIX documented as a legacy option
- NETOBSERV-544: remove the kafka.enable option. Instead, use a top-level enum for the deployment mode: DIRECT or KAFKA
Smaller changes:
- "flp.prometheus" option renamed to "flp.metricsServer", trying to be more agnostic (it's actually the openmetrics/opentelemetry standard) and not make users think it actually communicates with Prometheus
- Refactored the main controller tests to adapt to the new deployment options
- Made "FetchAll" logs less verbose
- move consumer replicas/hpa to the FLP section rather than kafka
- s/DeploymentType/DeploymentModel
Force-pushed from 71d42f7 to b106ca9
LGTM. A minor change suggestion.
Co-authored-by: Mario Macias <mmaciasl@redhat.com>
LGTM, thanks!
/ok-to-test
New image: ["quay.io/netobserv/network-observability-operator:eeb83b9"]. It will expire after two weeks.
New changes are detected. LGTM label has been removed.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jotak. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
update cli title