feat(apps): add otel exporter for graphql, service and web-api #1528

Merged Dec 16, 2024 (37 commits)
Commits
4e66bdc
feat(web-api): add OTEL exporter
arealmaas Nov 25, 2024
4a032b3
cleanup
arealmaas Nov 25, 2024
a1cd82f
cleanup
arealmaas Nov 27, 2024
b92ca9f
Merge branch 'main' into feat/add-otel-exporter
arealmaas Nov 28, 2024
2b7530d
clean up
arealmaas Nov 28, 2024
f224c95
clean up
arealmaas Nov 28, 2024
3780a6b
add dashboard
arealmaas Nov 28, 2024
2e5c7a4
clean up le dashboards and improve dashboards
arealmaas Nov 28, 2024
cf4a658
clean up
arealmaas Nov 28, 2024
0f6d127
cleanup
arealmaas Nov 28, 2024
c2d6d74
cleanup
arealmaas Nov 28, 2024
b718cbe
add azure traces
arealmaas Dec 2, 2024
380c922
Merge branch 'main' into feat/add-otel-exporter
arealmaas Dec 9, 2024
1fded53
finish up
arealmaas Dec 9, 2024
6d51b05
cleanup
arealmaas Dec 9, 2024
da54e00
cleanup
arealmaas Dec 9, 2024
a08580f
cleanup
arealmaas Dec 9, 2024
d0cd2d7
cleanup
arealmaas Dec 9, 2024
539cf2d
ensure tests run successfully
arealmaas Dec 10, 2024
7b5a3b0
cleanup
arealmaas Dec 10, 2024
f30cc58
cleanup
arealmaas Dec 10, 2024
5200455
add fallback to service name
arealmaas Dec 10, 2024
59b8c77
cleanup
arealmaas Dec 10, 2024
d101149
refactor resource attributes
arealmaas Dec 10, 2024
ad6852e
yayyay
arealmaas Dec 10, 2024
df8825f
yayyay
arealmaas Dec 10, 2024
ab2bedb
yayyay
arealmaas Dec 10, 2024
f11e024
chore: otel suggestions (#1586)
oskogstad Dec 11, 2024
e6130a0
Merge branch 'main' into feat/add-otel-exporter
oskogstad Dec 13, 2024
07fa1f5
lele
arealmaas Dec 13, 2024
864f074
banan
arealmaas Dec 16, 2024
998fc9d
env variables
arealmaas Dec 16, 2024
9024680
remove log
arealmaas Dec 16, 2024
63eb75a
use correct protocol
arealmaas Dec 16, 2024
73f2c0b
muurge
arealmaas Dec 16, 2024
db3d5fa
add otel docker image versions
arealmaas Dec 16, 2024
c80d833
cleanup
arealmaas Dec 16, 2024
5 changes: 5 additions & 0 deletions .env
@@ -5,3 +5,8 @@ POSTGRES_DB=dialogporten
DB_CONNECTION_STRING=Server=dialogporten-postgres;Port=5432;Database=${POSTGRES_DB};User ID=${POSTGRES_USER};Password=${POSTGRES_PASSWORD};

COMPOSE_PROJECT_NAME=digdir

# OTEL
OTEL_NAMESPACE=dialogporten-local
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
62 changes: 62 additions & 0 deletions README.md
@@ -113,6 +113,68 @@ Besides ordinary unit and integration tests, there are test suites for both func

See `tests/k6/README.md` for more information.

## Health Checks

The project includes integrated health checks that are exposed through standard endpoints:
- `/health/startup` - Dependency checks
- `/health/liveness` - Self checks
- `/health/readiness` - Critical service checks
- `/health` - General health status
- `/health/deep` - Comprehensive health check including external services

These health checks are integrated with Azure Container Apps' health probe system and are used to monitor the application's health status.
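
As a rough illustration of how the endpoints listed above could be polled from a script, here is a minimal Python sketch. The helper names `health_urls` and `probe` are hypothetical and not part of this repository, and the base URL is an assumption (any address where the service is reachable would do).

```python
import urllib.error
import urllib.request
from typing import Optional

# Endpoint paths exactly as listed in the section above.
HEALTH_ENDPOINTS = [
    "/health/startup",
    "/health/liveness",
    "/health/readiness",
    "/health",
    "/health/deep",
]


def health_urls(base_url: str) -> list[str]:
    """Join a base URL with each health endpoint path."""
    return [base_url.rstrip("/") + path for path in HEALTH_ENDPOINTS]


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None
```

Calling `probe(url)` for each entry of `health_urls("http://localhost:8080")` would print one status per endpoint; the port here is only a guess based on the `ASPNETCORE_URLS` value used in the compose file.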

## Observability with OpenTelemetry

This project uses OpenTelemetry for distributed tracing and metrics collection. The setup includes:

### Core Features
- Distributed tracing across services
- Runtime and application metrics
- Integration with Azure Monitor/Application Insights
- Support for both OTLP and Azure Monitor exporters
- Automatic instrumentation for:
  - ASP.NET Core
  - HTTP clients
  - Entity Framework Core
  - PostgreSQL
  - FusionCache

### Configuration

OpenTelemetry is configured through environment variables that are automatically provided by Azure Container Apps in production environments:

```json
{
"OTEL_SERVICE_NAME": "your-service-name",
"OTEL_EXPORTER_OTLP_ENDPOINT": "http://your-collector:4317",
"OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
"OTEL_RESOURCE_ATTRIBUTES": "key1=value1,key2=value2",
"APPLICATIONINSIGHTS_CONNECTION_STRING": "your-connection-string"
}
```
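
To make the `key1=value1,key2=value2` format of `OTEL_RESOURCE_ATTRIBUTES` concrete, the following Python sketch splits such a string into a dictionary. It is a simplified illustration, not the parser used by the OpenTelemetry SDK; the function name `parse_resource_attributes` is hypothetical, and percent-decoding of values is an assumption about how encoded characters should be handled.

```python
from urllib.parse import unquote


def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Split a comma-separated list of key=value pairs into a dict."""
    attributes: dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty segments such as a trailing comma
        key, separator, value = pair.partition("=")
        if not separator:
            continue  # skip malformed entries that have no '='
        # Assumption: values may be percent-encoded, so decode them.
        attributes[key.strip()] = unquote(value.strip())
    return attributes
```

For instance, the value `service.instance.id=dialogporten-webapi,service.namespace=dialogporten-local` would yield two entries, one per attribute.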

### Local Development

For local development, the project includes a docker-compose setup with:
- OpenTelemetry Collector
- Grafana
- Other supporting services

To run the local observability stack:
```bash
podman compose -f docker-compose-otel.yml up
```

### Request Filtering

The telemetry setup filters requests to:
- Exclude health check endpoints from tracing
- Filter out duplicate traces from Azure SDK clients
- Only record relevant HTTP client calls

For more details about the OpenTelemetry setup, see the `ConfigureTelemetry` method in `AspNetUtilitiesExtensions.cs`.
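
The first filtering rule above (excluding health check endpoints) can be pictured as a simple predicate on the request path. The Python sketch below is only an illustration of the idea; the actual logic lives in the C# `ConfigureTelemetry` method and may differ. The function name `should_trace` and the case-insensitive matching are assumptions.

```python
def should_trace(path: str) -> bool:
    """Return False for health check paths, True for everything else."""
    # Drop any query string, trailing slash, and case differences first.
    normalized = path.split("?", 1)[0].rstrip("/").lower() or "/"
    is_health_check = normalized == "/health" or normalized.startswith("/health/")
    return not is_health_check
```

Matching on `/health` exactly or on the `/health/` prefix avoids accidentally excluding unrelated routes that merely begin with the same letters.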

## Updating the SDK in global.json
When RenovateBot updates `global.json` or base image versions in Dockerfiles, make sure they match.
The `global.json` file should always have the same SDK version as the base image in the Dockerfiles.
45 changes: 45 additions & 0 deletions docker-compose-otel.yml
@@ -0,0 +1,45 @@
services:
# OpenTelemetry Collector
otel-collector:
image: otel/opentelemetry-collector-contrib:0.114.0
command: ["--config=/etc/otel-collector-config.yaml"]
volumes:
- ./local-otel-configuration/otel-collector-config.yaml:/etc/otel-collector-config.yaml
ports:
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP http receiver
- "8888:8888" # Prometheus metrics exposed by the collector
- "8889:8889" # Prometheus exporter metrics
depends_on:
jaeger:
condition: service_healthy

# Jaeger for trace visualization
jaeger:
image: jaegertracing/all-in-one:1.64.0
ports:
- "16686:16686" # Jaeger UI
- "14250:14250" # Model used by collector
environment:
- COLLECTOR_OTLP_ENABLED=true

# Prometheus for metrics
prometheus:
image: prom/prometheus:v3.0.1
volumes:
- ./local-otel-configuration/prometheus.yml:/etc/prometheus/prometheus.yml
ports:
- "9090:9090"

# Grafana for metrics visualization
grafana:
image: grafana/grafana:11.4.0
ports:
- "3000:3000"
environment:
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
volumes:
- ./local-otel-configuration/grafana-datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml
- ./local-otel-configuration/grafana-dashboards.yml:/etc/grafana/provisioning/dashboards/dashboards.yml
- ./local-otel-configuration/dashboards:/etc/grafana/provisioning/dashboards
11 changes: 10 additions & 1 deletion docker-compose.yml
@@ -1,6 +1,7 @@
include:
- docker-compose-db-redis.yml
- docker-compose-cdc.yml
- docker-compose-otel.yml

services:
dialogporten-webapi-ingress:
@@ -14,7 +15,7 @@ services:
restart: always

dialogporten-webapi:
-scale: 2
+scale: 1
build:
context: .
dockerfile: src/Digdir.Domain.Dialogporten.WebApi/Dockerfile
@@ -34,6 +35,10 @@ services:
- Serilog__MinimumLevel__Default=Debug
- ASPNETCORE_URLS=http://+:8080
- ASPNETCORE_ENVIRONMENT=Development
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_EXPORTER_OTLP_ENDPOINT}
- OTEL_EXPORTER_OTLP_PROTOCOL=${OTEL_EXPORTER_OTLP_PROTOCOL}
- OTEL_SERVICE_NAME=dialogporten-webapi
- OTEL_RESOURCE_ATTRIBUTES=service.instance.id=dialogporten-webapi,service.namespace=${OTEL_NAMESPACE}
volumes:
- ./.aspnet/https:/https

@@ -70,5 +75,9 @@ services:
- Serilog__WriteTo__0__Name=Console
- Serilog__MinimumLevel__Default=Debug
- ASPNETCORE_ENVIRONMENT=Development
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_EXPORTER_OTLP_ENDPOINT}
- OTEL_EXPORTER_OTLP_PROTOCOL=${OTEL_EXPORTER_OTLP_PROTOCOL}
- OTEL_SERVICE_NAME=dialogporten-graphql
- OTEL_RESOURCE_ATTRIBUTES=service.instance.id=dialogporten-graphql,service.namespace=${OTEL_NAMESPACE}
volumes:
- ./.aspnet/https:/https
176 changes: 176 additions & 0 deletions local-otel-configuration/dashboards/aspnet-core-metrics.json
@@ -0,0 +1,176 @@
{
"annotations": {
"list": []
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 20,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "smooth",
"lineWidth": 2,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"title": "HTTP Request Duration",
"type": "timeseries",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"expr": "sum(rate(dialogporten_http_server_request_duration_seconds_bucket[$__rate_interval])) by (le)",
"legendFormat": "{{le}}",
"refId": "A"
}
]
},
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"id": 2,
"options": {
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"showThresholdLabels": false,
"showThresholdMarkers": true
},
"pluginVersion": "10.2.0",
"title": "Active Requests",
"type": "gauge",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "Prometheus"
},
"expr": "dialogporten_http_server_active_requests",
"refId": "A"
}
]
}
],
"refresh": "5s",
"schemaVersion": 38,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-15m",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "ASP.NET Core Metrics",
"uid": "aspnet-core-metrics",
"version": 1,
"weekStart": ""
}