Collectors self-telemetry pipelines. #1431
…for source metrics
…head. Adding otelcol_processor_acceptes_spans to our processor
Maybe it's worth adding an option to disable the metrics via a flag or an odigos-config field?
k8sutils/pkg/consts/consts.go
Outdated
// Label used to identify the Odigos pod which is acting as a node collector.
OdigosNodeCollectorLabel = "odigos.io/data-collection"
// Label used to identify the Odigos pod which is acting as a cluster collector.
OdigosClusterCollectorLabel = "odigos.io/gateway"
The structure before this change was:
- chunk of things related to cluster collector
- chunk of things related to node collector
Do we want to change it?
Reverted to separate chunks for the node collector and the cluster collector.
func (p *dataSizesMetricsProcessor) processTraces(ctx context.Context, td ptrace.Traces) (ptrace.Traces, error) {
if p.samplingFraction != 0 && rand.Float64() < p.samplingFraction {
p.traceSize.Add(ctx, int64(p.tracesSizer.TracesSize(td)) * p.inverseSamplingFraction, metric.WithAttributes(p.traceAttributes(td)...))
p.tracesSizer.TracesSize(td) needs to encode the traces to proto.
Could this cause a performance hit?
Yes, this is the reason for the SamplingRatio config of the processor.
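The trade-off discussed above can be sketched roughly as follows: size only a fraction of the payloads and scale each measured size by the inverse of that fraction, so the accumulated total remains an estimate of the full volume. This is a minimal illustration, not the processor's actual code. The `sampledCounter` type, its fields, and the injected `rnd` and `sizeFn` functions are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampledCounter is a hypothetical stand-in for the processor's counter.
// It estimates a byte total while sizing only a fraction of the payloads.
type sampledCounter struct {
	fraction float64 // share of payloads that are measured; 0 disables measurement
	inverse  float64 // 1/fraction, applied to every measured size
	total    int64   // estimated total, in bytes
}

func newSampledCounter(fraction float64) *sampledCounter {
	c := &sampledCounter{fraction: fraction}
	if fraction > 0 {
		c.inverse = 1 / fraction
	}
	return c
}

// observe calls sizeFn (the potentially expensive sizing step) only when the
// payload is sampled, then scales the result so the total stays an estimate
// of all traffic rather than only the sampled part.
func (c *sampledCounter) observe(rnd func() float64, sizeFn func() int) {
	if c.fraction == 0 || rnd() >= c.fraction {
		return
	}
	c.total += int64(float64(sizeFn()) * c.inverse)
}

func main() {
	c := newSampledCounter(0.1)
	const payloads, payloadSize = 100000, 200
	for i := 0; i < payloads; i++ {
		c.observe(rand.Float64, func() int { return payloadSize })
	}
	fmt.Println("estimated bytes:", c.total, "actual bytes:", payloads*payloadSize)
}
```

A lower fraction means fewer sizing calls but a noisier estimate, which is the kind of guidance the config comment could give.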
// ResourceAttributesKeys is a list of resource attributes keys that will be used to add labels for the metrics.
ResourceAttributesKeys []string `mapstructure:"res_attributes_keys"`

// SamplingRatio is the ratio of payloads that are measured. Values between 0.0 and 1.0 are valid.
"the ratio of payloads that are measured"? Perhaps add the motivation for using it and some guidance on how to choose a value?
I updated the comment here, PTAL.
@@ -138,6 +140,13 @@ func startHTTPServer(flags *Flags) (*gin.Engine, error) {
apis.PUT("/actions/types/RenameAttribute/:id", func(c *gin.Context) { actions.UpdateRenameAttribute(c, flags.Namespace, c.Param("id")) })
apis.DELETE("/actions/types/RenameAttribute/:id", func(c *gin.Context) { actions.DeleteRenameAttribute(c, flags.Namespace, c.Param("id")) })

// Metrics
apis.GET("/metrics/namespace/:namespace/kind/:kind/name/:name", func(c *gin.Context) { endpoints.GetSingleSourceMetrics(c, odigosMetrics) })
Should it be under /metrics/source, to make it explicit?
tracesDataSent int64
logsDataSent int64
metricsDataSent int64 |
what are the units? bytes?
Yes, added comments
return err
}

if strings.HasPrefix(senderPod, k8sconsts.OdigosNodeCollectorDaemonSetName) { |
I think relying on the pod name to classify a collector as node or cluster is risky.
A more robust way would be to populate another resource attribute that records the role of the collector which produced the data.
I agree. However, we also use the pod name as an identifier for the collector, so it serves two purposes.
lgtm
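For reference, the suggestion in this thread could look roughly like the sketch below: prefer an explicit role attribute when one is present, and fall back to the pod-name prefix otherwise, so the pod name can keep serving as the collector identifier. This is an illustration only. The attribute key `odigos.collector.role`, the prefix values, and the `classify` helper are assumptions made for the example and are not taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

type collectorRole string

const (
	roleNode    collectorRole = "node"
	roleCluster collectorRole = "cluster"
	roleUnknown collectorRole = "unknown"
)

// All values below are assumed for illustration, not taken from the PR.
const (
	roleAttrKey      = "odigos.collector.role"  // hypothetical resource attribute
	podNameAttrKey   = "k8s.pod.name"           // pod name, still usable as the collector ID
	nodePodPrefix    = "odigos-data-collection" // assumed node collector name prefix
	clusterPodPrefix = "odigos-gateway"         // assumed cluster collector name prefix
)

// classify prefers the explicit role attribute and only falls back to the
// pod-name prefix when that attribute is missing or unrecognized.
func classify(attrs map[string]string) collectorRole {
	switch collectorRole(attrs[roleAttrKey]) {
	case roleNode:
		return roleNode
	case roleCluster:
		return roleCluster
	}
	pod := attrs[podNameAttrKey]
	switch {
	case strings.HasPrefix(pod, nodePodPrefix):
		return roleNode
	case strings.HasPrefix(pod, clusterPodPrefix):
		return roleCluster
	}
	return roleUnknown
}

func main() {
	fmt.Println(classify(map[string]string{roleAttrKey: "cluster"}))
	fmt.Println(classify(map[string]string{podNameAttrKey: "odigos-data-collection-x7k2p"}))
}
```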
LGTM.
Few last nits :)
k8sutils/pkg/consts/consts.go
Outdated
const (
// Cluster collector is responsible for exporting observability data from the cluster.
ClusterCollector CollectorType = "cluster"
// Node collector is receiving data from different instrumentation SDKs in the same node.
NodeCollector CollectorType = "node"
) |
I wonder, because we already have this here:
// +kubebuilder:validation:Enum=CLUSTER_GATEWAY;NODE_COLLECTOR
type CollectorsGroupRole string
const (
CollectorsGroupRoleClusterGateway CollectorsGroupRole = "CLUSTER_GATEWAY"
CollectorsGroupRoleNodeCollector CollectorsGroupRole = "NODE_COLLECTOR"
)
and this here:
if (cg.Spec.Role == odigosv1.CollectorsGroupRoleNodeCollector || cg.Spec.Role == "DATA_COLLECTION") && cg.Status.Ready {
Should we use the same names? And should we share the enum everywhere, either now or later?
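If the enum were shared, one possible shape is a single role type plus a small normalization helper that absorbs the legacy "DATA_COLLECTION" value from the condition quoted above, so call sites would not have to compare against a raw string. This is only a sketch of the idea. `normalizeRole` and `legacyNodeCollectorRole` are hypothetical names, and the type is redeclared here just to keep the example self-contained.

```go
package main

import "fmt"

// CollectorsGroupRole mirrors the existing enum quoted above.
type CollectorsGroupRole string

const (
	CollectorsGroupRoleClusterGateway CollectorsGroupRole = "CLUSTER_GATEWAY"
	CollectorsGroupRoleNodeCollector  CollectorsGroupRole = "NODE_COLLECTOR"
)

// legacyNodeCollectorRole is the raw value still accepted by the condition
// quoted above; the constant name is hypothetical.
const legacyNodeCollectorRole = "DATA_COLLECTION"

// normalizeRole maps a raw role string to the shared enum. The boolean is
// false when the value is not a known role.
func normalizeRole(raw string) (CollectorsGroupRole, bool) {
	switch raw {
	case string(CollectorsGroupRoleClusterGateway):
		return CollectorsGroupRoleClusterGateway, true
	case string(CollectorsGroupRoleNodeCollector), legacyNodeCollectorRole:
		return CollectorsGroupRoleNodeCollector, true
	}
	return "", false
}

func main() {
	for _, raw := range []string{"CLUSTER_GATEWAY", "NODE_COLLECTOR", "DATA_COLLECTION", "node"} {
		role, ok := normalizeRole(raw)
		fmt.Printf("%q -> %q (known: %v)\n", raw, role, ok)
	}
}
```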
Send collector metrics from node and cluster collectors.
The goal is to have metrics about the data sent and throughput from sources and to destinations.
The main parts of this PR are:
- odigostrafficmetrics, a processor that collects metrics in addition to those already provided by the collector. The processor can be configured to attach different attributes to the metrics. It can also be configured to perform its measurements on only a fraction of the spans/metrics/logs.
- collectormetrics, a package in the UI server that acts as an OTLP receiver and exposes endpoints for the UI.