Reducer

The role of the reducer is to combine information received from eBPF collectors into metrics. Metrics are aggregated across time and across dimensions. The output is in the form of timestamped data points, which can be stored in a time-series database.

Running

To list all the available command-line options:

$ reducer --help

By default, reducer will write its logging output to /var/log/ebpf_net.log. If that file is not writable, reducer will fail with an error message.

The EBPF_NET_LOG_FILE_PATH environment variable can be used to control the log file location:

$ EBPF_NET_LOG_FILE_PATH=/tmp/reducer.log reducer

To disable writing a log file and instead output logging messages to the console:

$ reducer --no-log-file --log-console

Reducer will normally listen for incoming connections from collectors on port 8000. To select a different port, use the --port command-line parameter:

$ reducer --port=9000

Make sure to use the same port number when running the collectors (the EBPF_NET_INTAKE_PORT environment variable).

Prometheus output

By default, reducer makes metrics available for scraping in the Prometheus format. Unless a different address is specified with the --prom command-line parameter, reducer runs an HTTP server on 127.0.0.1:7010, which is reachable only from the local host. To use a different port, or to make scraping externally accessible:

$ reducer --prom=0.0.0.0:7010

Now an external Prometheus instance will be able to access it. Example Prometheus scraping configuration:

scrape_configs:
- job_name: 'opentelemetry-ebpf-reducer'
  static_configs:
  - targets:
    - '192.168.0.101:7010'
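
Before pointing Prometheus at the reducer, it can help to confirm from the scraping host that the endpoint answers and to see which metric names it exposes. The following is a minimal sketch, not part of reducer. It assumes the metrics are served at the /metrics path (the path Prometheus requests when metrics_path is not set in the scrape configuration) and it reuses the example address from the configuration above.

```python
from urllib.request import urlopen


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "<name>{<labels>} <value>" or "<name> <value>",
        # so the name ends at the first "{" or the first space.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def scrape(address="192.168.0.101:7010", path="/metrics", timeout=5):
    """Fetch the scrape page. The path is an assumption (Prometheus default)."""
    with urlopen(f"http://{address}{path}", timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling metric_names(scrape()) on the scraping host lists what Prometheus would see. An empty result or a connection error usually means the --prom bind address is still restricted to 127.0.0.1.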

OpenTelemetry Protocol (OTLP) over gRPC

To send metrics using OTLP over gRPC, e.g. to an OpenTelemetry collector:

$ reducer --disable-prometheus-metrics --enable-otlp-grpc-metrics --otlp-grpc-metrics-host=192.168.0.212

The default OTLP-over-gRPC port is 4317. To send to a different port, use the --otlp-grpc-metrics-port command-line parameter.

To conserve bandwidth, metric descriptions are not sent in messages. To enable sending metric descriptions, use the --enable-otlp-grpc-metric-descriptions command-line parameter.

Choosing metrics

The --disable-metrics and --enable-metrics command-line parameters select which metrics are generated. Both parameters accept a comma-separated list of metrics to disable or enable, respectively. When a metric is matched by both lists, --enable-metrics takes precedence.

Metrics are grouped into four groups: tcp, udp, dns and http. To refer to all metrics within a group, <group>.all can be used.

For example, to disable all TCP metrics except the number of bytes transferred:

$ reducer --disable-metrics=tcp.all --enable-metrics=tcp.bytes
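
The way the two lists combine can be modelled in a few lines. This is a sketch of the rule as described above, not reducer's actual implementation. It assumes that the disable list is applied first and the enable list then re-enables its entries. The resolve_metrics helper and the catalogue passed to it are hypothetical, and metric names other than tcp.bytes are placeholders.

```python
def resolve_metrics(catalogue, disable="", enable=""):
    """Return the metrics left enabled after applying both lists.

    catalogue: iterable of all known metric names (illustrative here).
    disable, enable: comma-separated lists, as passed on the command line.
    """
    known = set(catalogue)

    def expand(spec):
        selected = set()
        for item in (part.strip() for part in spec.split(",")):
            if not item:
                continue
            if item.endswith(".all"):
                # "<group>.all" selects every metric in that group.
                prefix = item[: -len("all")]  # keeps the trailing dot
                selected |= {m for m in known if m.startswith(prefix)}
            elif item in known:
                selected.add(item)
        return selected

    enabled = known - expand(disable)
    # Assumed precedence: enabling wins over disabling.
    enabled |= expand(enable)
    return enabled
```

With a catalogue of tcp.bytes, tcp.packets and udp.bytes, calling resolve_metrics(catalogue, disable="tcp.all", enable="tcp.bytes") leaves tcp.bytes and udp.bytes enabled, which mirrors the command above.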

Feature flags

Certain features are disabled by default, but can be enabled using a command-line flag:

  • --enable-aws-enrichment: Enables enrichment using AWS metadata received from the Cloud Collector. A Cloud Collector instance needs to be running and connected to the reducer for this feature to work.
  • --enable-autonomous-system-ip: Enables using IP addresses for autonomous systems (AS). This feature requires loading a GeoIP database by specifying its path using the GEOIP_PATH environment variable.
  • --enable-id-id: Enables id-id time-series generation. The id-id time-series carry the lowest-level information but have the greatest volume and cardinality, so they are disabled by default.
  • --enable-az-id: Enables az-id time-series generation.

If id-id time-series generation is enabled, the --disable-node-ip-field command-line parameter can be used to disable the IP address dimension. In some cases this can greatly reduce the cardinality of the id-id time-series.

Scaling

At present, scaling the reducer is a manual, trial-and-error task. The reducer runs a data-processing pipeline separated into three stages: ingest, matching and aggregation. Each stage can be scaled individually using the --num-ingest-shards, --num-matching-shards and --num-aggregation-shards command-line parameters.

Usually, the best approach is to scale all the stages by the same factor. Keep in mind that each shard consumes a certain amount of memory, whether it is heavily loaded or not.
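
Scaling all stages by the same factor amounts to multiplying each shard count by one number and passing the results to the three parameters. The helper below is a hypothetical convenience for generating that command line. The starting shard counts in the usage line are illustrative, not reducer's defaults.

```python
import shlex


def scaled_reducer_command(base_shards, factor):
    """Build a reducer command line with every stage scaled by `factor`.

    base_shards: dict with the current shard count for each stage,
                 keyed by "ingest", "matching" and "aggregation".
    """
    flags = {
        "ingest": "--num-ingest-shards",
        "matching": "--num-matching-shards",
        "aggregation": "--num-aggregation-shards",
    }
    args = ["reducer"]
    for stage, flag in flags.items():
        args.append(f"{flag}={base_shards[stage] * factor}")
    return shlex.join(args)


# Hypothetical starting point of one shard per stage, doubled.
print(scaled_reducer_command({"ingest": 1, "matching": 1, "aggregation": 1}, 2))
# prints: reducer --num-ingest-shards=2 --num-matching-shards=2 --num-aggregation-shards=2
```

Because every shard uses memory even when idle, it is worth raising the factor one step at a time and watching memory use after each change.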

Internal metrics

Internal metrics (also known as stats) are time-series that report on the performance of the reducer and the collectors.

If OTLP over gRPC is enabled (with the --enable-otlp-grpc-metrics flag), then internal metrics are sent to the same OTLP receiver as are normal metrics, using ebpf_net as the group name.

If OTLP over gRPC is not enabled, internal metrics are published in Prometheus format. By default, reducer runs an HTTP server on 0.0.0.0:7001 where internal metrics can be scraped. The --internal-prom command-line parameter can be used to change the bind address and port number.

Selecting which internal metrics are generated is also done using the --disable-metrics and --enable-metrics command-line parameters. Internal metrics are contained in the ebpf_net group. For example, to enable just the "up" internal metric:

$ reducer --disable-metrics=ebpf_net.all --enable-metrics=ebpf_net.up

Logging and debugging

There are six logging levels: trace, debug, info, warning, error and critical. The minimum logging level can be set using the --<level> command-line parameter, e.g. --debug. The default logging level is info.

Most debug- and trace-level logging is not generated unless explicitly enabled. It is enabled per subsystem and component:

  • --log-whitelist-client-type: Enables logging based on collector type, e.g. kernel, cloud, k8s.
  • --log-whitelist-node-resolution-type: Enables logging based on resolution type.
  • --log-whitelist-channel: Enables logging for various communication channel types, e.g. tcp, upstream, reconnecting_channel.
  • --log-whitelist-ingest: Enables logging for the ingest stage of the data-processing pipeline.
  • --log-whitelist-matching: Enables logging for the matching stage of the data-processing pipeline.

The --log-whitelist-all command-line flag will enable logging for all components and subsystems.

Environment variables

  • EBPF_NET_DATA_DIR: Directory in which the program will read and potentially write data files. If not specified, the current working directory is used.
  • EBPF_NET_LOG_FILE_PATH: Location of the file in which logging messages are written. Default value is /var/log/ebpf_net.log.
  • GEOIP_PATH: Location of the geolocation database file.