feat(apps): add otel exporter for graphql, service and web-api (#1528)

## Description


- Sets up a local OTEL stack that matches the OTEL configuration in Azure
Container Apps
- Adds FusionCache telemetry
- Adds Entity Framework telemetry
- liveMetrics is missing; whether we want it needs to be considered. Other
than that, the most relevant traces are pulled out from the AzureMonitor
package.
- Metrics are only visible locally for now. It turns out that the Azure
Monitor Workspace has a Prometheus instance, but we cannot send metrics
to it because it does not have an OTEL endpoint 🙃
The solution for now is to add the MetricsMonitor to send metrics
directly to App Insights.
- Logging will be added in the next PR

To see your metrics, spin up the OTEL services defined in
`docker-compose-otel.yml`. The service should then start sending to the
OTLP collector automatically.
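
As a rough illustration of what sending to the collector involves with the values in `.env`, the Python sketch below shows how an OTLP/HTTP exporter derives per-signal URLs from `OTEL_EXPORTER_OTLP_ENDPOINT`. The helper `resolve_otlp_url` is hypothetical and not part of this PR; the .NET SDK does the equivalent internally. The `v1/traces`, `v1/metrics` and `v1/logs` paths are the defaults from the OTLP specification.

```python
from urllib.parse import urlsplit, urlunsplit

# Default per-signal paths for OTLP over HTTP.
SIGNAL_PATHS = {"traces": "v1/traces", "metrics": "v1/metrics", "logs": "v1/logs"}


def resolve_otlp_url(endpoint: str, protocol: str, signal: str) -> str:
    """Return the URL an exporter would target for one signal (hypothetical helper)."""
    if protocol == "grpc":
        # gRPC exporters use the endpoint as-is; the signal is chosen by the RPC service.
        return endpoint
    if protocol not in ("http/protobuf", "http/json"):
        raise ValueError(f"unsupported OTLP protocol: {protocol}")
    parts = urlsplit(endpoint)
    base_path = parts.path.rstrip("/")
    # Append the signal path to whatever base path the endpoint already has.
    return urlunsplit(parts._replace(path=f"{base_path}/{SIGNAL_PATHS[signal]}"))


# Values taken from the .env file in this commit.
endpoint = "http://otel-collector:4318"
protocol = "http/protobuf"
traces_url = resolve_otlp_url(endpoint, protocol, "traces")
metrics_url = resolve_otlp_url(endpoint, protocol, "metrics")
print(traces_url)
print(metrics_url)
```

With `grpc` as the protocol (port 4317 in the compose file), the endpoint would be used unchanged.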

Example of a trace in the local Jaeger:

![CleanShot 2024-12-09 at 17 52 11@2x](https://github.com/user-attachments/assets/295eba27-84e8-4735-9a0e-be4f2fcfed9c)

## Related Issue(s)

- #1465 

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and
we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia
or a separate linked PR in
[altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if
applicable)

---------

Co-authored-by: Ole Jørgen Skogstad <skogstad@softis.net>
arealmaas and oskogstad authored Dec 16, 2024
1 parent 198f735 commit cb9238e
Showing 22 changed files with 1,217 additions and 43 deletions.
5 changes: 5 additions & 0 deletions .env
@@ -5,3 +5,8 @@ POSTGRES_DB=dialogporten
DB_CONNECTION_STRING=Server=dialogporten-postgres;Port=5432;Database=${POSTGRES_DB};User ID=${POSTGRES_USER};Password=${POSTGRES_PASSWORD};

COMPOSE_PROJECT_NAME=digdir

# OTEL
OTEL_NAMESPACE=dialogporten-local
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
62 changes: 62 additions & 0 deletions README.md
@@ -113,6 +113,68 @@ Besides ordinary unit and integration tests, there are test suites for both func

See `tests/k6/README.md` for more information.

## Health Checks

The project includes integrated health checks that are exposed through standard endpoints:
- `/health/startup` - Dependency checks
- `/health/liveness` - Self checks
- `/health/readiness` - Critical service checks
- `/health` - General health status
- `/health/deep` - Comprehensive health check including external services

These health checks are integrated with Azure Container Apps' health probe system and are used to monitor the application's health status.
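
As a minimal sketch of how these endpoints could be polled during local development, the Python snippet below checks each path and reports whether it answered with HTTP 200. The function names, the base URL passed in, and the rule that 200 means healthy are assumptions made for illustration; they are not taken from the project's code.

```python
from typing import Callable, Optional
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Endpoints listed in the section above.
HEALTH_PATHS = [
    "/health/startup",
    "/health/liveness",
    "/health/readiness",
    "/health",
    "/health/deep",
]


def fetch_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the service is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # Non-2xx responses still carry a status code.
        return error.code
    except (URLError, OSError):
        return None


def check_all(base_url: str, fetch: Callable[[str], Optional[int]] = fetch_status) -> dict:
    """Map each health path to True when it answers with HTTP 200 (assumed to mean healthy)."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in HEALTH_PATHS}
```

Calling `check_all("http://localhost:8080")` would probe a locally running service, assuming port 8080 is reachable from the host; the `fetch` parameter lets the check be exercised without a network.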

## Observability with OpenTelemetry

This project uses OpenTelemetry for distributed tracing and metrics collection. The setup includes:

### Core Features
- Distributed tracing across services
- Runtime and application metrics
- Integration with Azure Monitor/Application Insights
- Support for both OTLP and Azure Monitor exporters
- Automatic instrumentation for:
  - ASP.NET Core
  - HTTP clients
  - Entity Framework Core
  - PostgreSQL
  - FusionCache

### Configuration

OpenTelemetry is configured through environment variables that are automatically provided by Azure Container Apps in production environments:

```json
{
"OTEL_SERVICE_NAME": "your-service-name",
"OTEL_EXPORTER_OTLP_ENDPOINT": "http://your-collector:4317",
"OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
"OTEL_RESOURCE_ATTRIBUTES": "key1=value1,key2=value2",
"APPLICATIONINSIGHTS_CONNECTION_STRING": "your-connection-string"
}
```
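
To make the `OTEL_RESOURCE_ATTRIBUTES` format concrete, here is a small Python sketch of how the comma-separated `key=value` list combines with `OTEL_SERVICE_NAME` into one set of resource attributes. The function names are hypothetical and the OpenTelemetry SDK performs this step itself; the sketch also skips the percent-decoding of values that the specification describes. The example values mirror the ones `docker-compose.yml` sets for the web API.

```python
def parse_resource_attributes(raw: str) -> dict:
    """Parse 'key1=value1,key2=value2' into a dictionary, skipping malformed entries."""
    attributes = {}
    for pair in raw.split(","):
        key, sep, value = pair.strip().partition("=")
        if not sep or not key:
            continue  # empty segment or entry without '='
        attributes[key.strip()] = value.strip()
    return attributes


def effective_resource(env: dict) -> dict:
    """Combine both variables; OTEL_SERVICE_NAME takes precedence for service.name."""
    attributes = parse_resource_attributes(env.get("OTEL_RESOURCE_ATTRIBUTES", ""))
    service_name = env.get("OTEL_SERVICE_NAME")
    if service_name:
        attributes["service.name"] = service_name
    return attributes


example_env = {
    "OTEL_SERVICE_NAME": "dialogporten-webapi",
    "OTEL_RESOURCE_ATTRIBUTES": (
        "service.instance.id=dialogporten-webapi,service.namespace=dialogporten-local"
    ),
}
resource = effective_resource(example_env)
print(resource)
```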

### Local Development

For local development, the project includes a docker-compose setup with:
- OpenTelemetry Collector
- Grafana
- Other supporting services

To run the local observability stack:
```bash
podman compose -f docker-compose-otel.yml up
```

### Request Filtering

The telemetry setup filters requests in order to:
- Exclude health check endpoints from tracing
- Filter out duplicate traces from Azure SDK clients
- Only record relevant HTTP client calls

For more details about the OpenTelemetry setup, see the `ConfigureTelemetry` method in `AspNetUtilitiesExtensions.cs`.
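
The first filtering rule above can be pictured with a short Python sketch. This is an illustration of the idea only, not a port of `ConfigureTelemetry`: the function names are hypothetical, the `/health` prefix check is an assumption about how the exclusion might be expressed, and `/api/v1/dialogs` is a made-up path.

```python
HEALTH_PREFIX = "/health"


def is_health_check(path: str) -> bool:
    """Return True for /health and anything below it, ignoring query string and case."""
    normalized = path.split("?", 1)[0].rstrip("/").lower()
    return normalized == HEALTH_PREFIX or normalized.startswith(HEALTH_PREFIX + "/")


def should_trace(path: str) -> bool:
    """Record a trace for every request except health check probes."""
    return not is_health_check(path)


# Health probes are dropped, ordinary requests are kept.
for example in ["/health/liveness", "/health?verbose=true", "/api/v1/dialogs"]:
    print(example, should_trace(example))
```

Matching on the exact prefix plus a trailing slash avoids accidentally dropping unrelated routes that merely start with the same letters.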

## Updating the SDK in global.json
When RenovateBot updates `global.json` or base image versions in Dockerfiles, make sure they match.
The `global.json` file should always have the same SDK version as the base image in the Dockerfiles.
45 changes: 45 additions & 0 deletions docker-compose-otel.yml
@@ -0,0 +1,45 @@
services:
  # OpenTelemetry Collector
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.114.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./local-otel-configuration/otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP http receiver
      - "8888:8888" # Prometheus metrics exposed by the collector
      - "8889:8889" # Prometheus exporter metrics
    depends_on:
      jaeger:
        condition: service_healthy

  # Jaeger for trace visualization
  jaeger:
    image: jaegertracing/all-in-one:1.64.0
    ports:
      - "16686:16686" # Jaeger UI
      - "14250:14250" # Model used by collector
    environment:
      - COLLECTOR_OTLP_ENABLED=true

  # Prometheus for metrics
  prometheus:
    image: prom/prometheus:v3.0.1
    volumes:
      - ./local-otel-configuration/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  # Grafana for metrics visualization
  grafana:
    image: grafana/grafana:11.4.0
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    volumes:
      - ./local-otel-configuration/grafana-datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml
      - ./local-otel-configuration/grafana-dashboards.yml:/etc/grafana/provisioning/dashboards/dashboards.yml
      - ./local-otel-configuration/dashboards:/etc/grafana/provisioning/dashboards
11 changes: 10 additions & 1 deletion docker-compose.yml
@@ -1,6 +1,7 @@
 include:
   - docker-compose-db-redis.yml
   - docker-compose-cdc.yml
+  - docker-compose-otel.yml

 services:
   dialogporten-webapi-ingress:
@@ -14,7 +15,7 @@ services:
     restart: always

   dialogporten-webapi:
-    scale: 2
+    scale: 1
     build:
       context: .
       dockerfile: src/Digdir.Domain.Dialogporten.WebApi/Dockerfile
@@ -34,6 +35,10 @@ services:
       - Serilog__MinimumLevel__Default=Debug
       - ASPNETCORE_URLS=http://+:8080
       - ASPNETCORE_ENVIRONMENT=Development
+      - OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_EXPORTER_OTLP_ENDPOINT}
+      - OTEL_EXPORTER_OTLP_PROTOCOL=${OTEL_EXPORTER_OTLP_PROTOCOL}
+      - OTEL_SERVICE_NAME=dialogporten-webapi
+      - OTEL_RESOURCE_ATTRIBUTES=service.instance.id=dialogporten-webapi,service.namespace=${OTEL_NAMESPACE}
     volumes:
       - ./.aspnet/https:/https

@@ -70,5 +75,9 @@ services:
       - Serilog__WriteTo__0__Name=Console
       - Serilog__MinimumLevel__Default=Debug
       - ASPNETCORE_ENVIRONMENT=Development
+      - OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_EXPORTER_OTLP_ENDPOINT}
+      - OTEL_EXPORTER_OTLP_PROTOCOL=${OTEL_EXPORTER_OTLP_PROTOCOL}
+      - OTEL_SERVICE_NAME=dialogporten-graphql
+      - OTEL_RESOURCE_ATTRIBUTES=service.instance.id=dialogporten-graphql,service.namespace=${OTEL_NAMESPACE}
     volumes:
       - ./.aspnet/https:/https
176 changes: 176 additions & 0 deletions local-otel-configuration/dashboards/aspnet-core-metrics.json
@@ -0,0 +1,176 @@
{
"annotations": {
"list": []
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 20,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "smooth",
"lineWidth": 2,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"title": "HTTP Request Duration",
"type": "timeseries",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"expr": "sum(rate(dialogporten_http_server_request_duration_seconds_bucket[$__rate_interval])) by (le)",
"legendFormat": "{{le}}",
"refId": "A"
}
]
},
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"id": 2,
"options": {
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"showThresholdLabels": false,
"showThresholdMarkers": true
},
"pluginVersion": "10.2.0",
"title": "Active Requests",
"type": "gauge",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"expr": "dialogporten_http_server_active_requests",
"refId": "A"
}
]
}
],
"refresh": "5s",
"schemaVersion": 38,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-15m",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "ASP.NET Core Metrics",
"uid": "aspnet-core-metrics",
"version": 1,
"weekStart": ""
}