go-spancheck

Checks usage of OpenTelemetry spans (go.opentelemetry.io/otel/trace) and OpenCensus spans (go.opencensus.io/trace).

Example

spancheck -checks 'end,set-status,record-error' ./...

func _() error {
    // span.End is not called on all paths, possible memory leak
    // span.SetStatus is not called on all paths
    // span.RecordError is not called on all paths
    _, span := otel.Tracer("foo").Start(context.Background(), "bar")

    if true {
        // return can be reached without calling span.End
        // return can be reached without calling span.SetStatus
        // return can be reached without calling span.RecordError
        return errors.New("err")
    }

    return nil // return can be reached without calling span.End
}

Configuration

golangci-lint

Docs on configuring the linter are also available at https://golangci-lint.run/usage/linters/#spancheck:

linters:
  enable:
    - spancheck

linters-settings:
  spancheck:
    # Checks to enable.
    # Options include:
    # - `end`: check that `span.End()` is called
    # - `record-error`: check that `span.RecordError(err)` is called when an error is returned
    # - `set-status`: check that `span.SetStatus(codes.Error, msg)` is called when an error is returned
    # Default: ["end"]
    checks:
      - end
      - record-error
      - set-status
    # A list of regexes for function signatures that silence `record-error` and `set-status` reports
    # if found in the call path to a returned error.
    # https://github.com/jjti/go-spancheck#ignore-check-signatures
    # Default: []
    ignore-check-signatures:
      - "telemetry.RecordError"
    # A list of regexes for additional function signatures that create spans. This is useful if you have a utility
    # method to create spans. Each entry should be of the form <regex>:<telemetry-type>, where `telemetry-type`
    # can be `opentelemetry` or `opencensus`.
    # https://github.com/jjti/go-spancheck#extra-start-span-signatures
    # Default: []
    extra-start-span-signatures:
      - "github.com/user/repo/telemetry/trace.Start:opentelemetry"

CLI

To install the linter as a CLI:

go install github.com/jjti/go-spancheck/cmd/spancheck@latest
spancheck ./...

Only the span.End() check is enabled by default. The others can be enabled with -checks 'end,set-status,record-error'.

$ spancheck -h
...
Flags:
  -checks string
        comma-separated list of checks to enable (options: end, set-status, record-error) (default "end")
  -extra-start-span-signatures string
        comma-separated list of regex:telemetry-type for function signatures that indicate the start of a span
  -ignore-check-signatures string
        comma-separated list of regex for function signatures that disable checks on errors

Ignore Check Signatures

The span.SetStatus() and span.RecordError() checks warn when there is:

  1. a path to a return statement
  2. that returns an error
  3. without a call (to SetStatus or RecordError, respectively)

But it's convenient to call SetStatus and RecordError from utility methods [1]. To support that, the ignore-check-signatures setting suppresses these warnings when a function matching one of the configured patterns is called in the path to the returned error.

For example, by default, the code below would have warnings as shown:

func task(ctx context.Context) error {
    ctx, span := otel.Tracer("foo").Start(ctx, "bar") // span.SetStatus is not called on all paths
    defer span.End()

    if err := subTask(ctx); err != nil {
        return recordErr(span, err) // return can be reached without calling span.SetStatus
    }

    return nil
}

func recordErr(span trace.Span, err error) error {
    span.SetStatus(codes.Error, err.Error())
    span.RecordError(err)
    return err
}

The warnings can be suppressed by setting the -ignore-check-signatures flag to recordErr:

spancheck -checks 'end,set-status,record-error' -ignore-check-signatures 'recordErr' ./...
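
Because each entry is a regex, a bare name such as recordErr may match more functions than intended. The sketch below illustrates this with the standard regexp package. It assumes patterns are applied unanchored to a fully qualified function name; the ignored helper and the example.com/app names are hypothetical, and this is not the linter's own code.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignored reports whether funcName matches any of the ignore patterns.
// Assumption: patterns are applied unanchored to the fully qualified name.
func ignored(patterns []string, funcName string) (bool, error) {
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return false, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		if re.MatchString(funcName) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	names := []string{
		"example.com/app.recordErr",      // the helper from the example above
		"example.com/app.recordErrCount", // hypothetical unrelated function
	}
	// Compare a bare name with a pattern anchored to the end of the name.
	for _, pattern := range []string{`recordErr`, `\.recordErr$`} {
		for _, name := range names {
			ok, err := ignored([]string{pattern}, name)
			if err != nil {
				panic(err)
			}
			fmt.Printf("pattern %q, function %s: ignored=%v\n", pattern, name, ok)
		}
	}
}
```

If over-matching is a concern, anchoring the pattern (for example with a trailing $) keeps it limited to the intended helper.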

Extra Start Span Signatures

By default, Span creation will be tracked from calls to (go.opentelemetry.io/otel/trace.Tracer).Start, go.opencensus.io/trace.StartSpan, or go.opencensus.io/trace.StartSpanWithRemoteParent.

You can use the -extra-start-span-signatures flag to list additional Span creation functions. For all such functions:

  1. their Spans will be linted (for all enabled checks)
  2. checks will be disabled inside the functions themselves (i.e. Spans are not linted within the creation functions)

You must pass a comma-separated list of regex patterns and the telemetry library corresponding to the returned Span. Each entry should be of the form <regex>:<telemetry-type>, where telemetry-type can be opentelemetry or opencensus. For example, if you have created a function named StartTrace in a telemetry package, using the go.opentelemetry.io/otel library, you can include this function for analysis like so:

spancheck -extra-start-span-signatures 'github.com/user/repo/telemetry.StartTrace:opentelemetry' ./...
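
To make the <regex>:<telemetry-type> entry format concrete, here is a minimal sketch of how one entry could be split and validated. The startSpanMatcher type and parseStartSpanSignature function are hypothetical names for illustration, and splitting at the last colon is an assumption; the linter's actual parsing may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// startSpanMatcher pairs a compiled signature regex with its telemetry type.
// Hypothetical type for illustration; not taken from the linter's source.
type startSpanMatcher struct {
	Signature *regexp.Regexp
	Telemetry string
}

// parseStartSpanSignature splits one "<regex>:<telemetry-type>" entry at its
// last colon, so a colon inside the regex itself does not break parsing.
func parseStartSpanSignature(entry string) (startSpanMatcher, error) {
	i := strings.LastIndex(entry, ":")
	if i < 0 {
		return startSpanMatcher{}, fmt.Errorf("entry %q is not of the form <regex>:<telemetry-type>", entry)
	}
	pattern, telemetry := entry[:i], entry[i+1:]
	if telemetry != "opentelemetry" && telemetry != "opencensus" {
		return startSpanMatcher{}, fmt.Errorf("unknown telemetry type %q", telemetry)
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return startSpanMatcher{}, fmt.Errorf("invalid regex %q: %w", pattern, err)
	}
	return startSpanMatcher{Signature: re, Telemetry: telemetry}, nil
}

func main() {
	m, err := parseStartSpanSignature(`github.com/user/repo/telemetry\.StartTrace:opentelemetry`)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Telemetry, m.Signature.MatchString("github.com/user/repo/telemetry.StartTrace"))
}
```

Escaping the dot before the function name (\.) keeps the regex from matching arbitrary characters at that position.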

Problem Statement

Tracing is a celebrated [1,2] and well-marketed [3,4] pillar of observability. But self-instrumented tracing requires a lot of easy-to-forget boilerplate:

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/codes"
)

func task(ctx context.Context) error {
    ctx, span := otel.Tracer("foo").Start(ctx, "bar")
    defer span.End() // call `.End()`

    if err := subTask(ctx); err != nil {
        span.SetStatus(codes.Error, err.Error()) // call SetStatus(codes.Error, msg) to set status:error
        span.RecordError(err) // call RecordError(err) to record an error event
        return err
    }

    return nil
}

For spans to be really useful, developers need to:

  1. call span.End() always
  2. call span.SetStatus(codes.Error, msg) on error
  3. call span.RecordError(err) on error
  4. call span.SetAttributes() liberally

This linter helps developers with steps 1-3.
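
One way to reduce the boilerplate for steps 1-3 is to centralize them in a single deferred helper that inspects the function's named error result. The sketch below uses a hypothetical fakeSpan type in place of a real trace.Span so that it runs with only the standard library; finish and task are illustrative names as well. Whether the linter's analysis recognizes this deferred pattern is not established here, so a helper like this may still need an ignore-check-signatures entry.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeSpan is a hypothetical stand-in for a tracing span. It only records
// which methods were called.
type fakeSpan struct {
	ended    bool
	status   string
	recorded []error
}

func (s *fakeSpan) End()                       { s.ended = true }
func (s *fakeSpan) SetStatus(code, msg string) { s.status = code + ": " + msg }
func (s *fakeSpan) RecordError(err error)      { s.recorded = append(s.recorded, err) }

// finish performs steps 1-3 in one place. It is meant to be deferred with a
// pointer to the caller's named error result.
func finish(s *fakeSpan, errp *error) {
	if err := *errp; err != nil {
		s.SetStatus("Error", err.Error())
		s.RecordError(err)
	}
	s.End()
}

// task uses a named result so the deferred finish call sees the final error.
func task(s *fakeSpan, fail bool) (err error) {
	defer finish(s, &err)

	if fail {
		return errors.New("subtask failed")
	}
	return nil
}

func main() {
	for _, fail := range []bool{false, true} {
		s := &fakeSpan{}
		err := task(s, fail)
		fmt.Printf("err=%v ended=%v status=%q recorded=%d\n", err, s.ended, s.status, len(s.recorded))
	}
}
```

The named result matters: the return statement assigns err before deferred functions run, so finish observes the error on every return path.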

Checks

This linter supports three checks, each documented below. Only the check for span.End() is enabled by default. See Configuration for instructions on enabling the others.

span.End()

Enabled by default.

Not calling End can cause memory leaks and prevents spans from being closed.

Any Span that is created MUST also be ended. This is the responsibility of the user. Implementations of this API may leak memory or other resources if Spans are not ended.

source: trace.go

func task(ctx context.Context) error {
    otel.Tracer("app").Start(ctx, "foo") // span is unassigned, probable memory leak
    _, span := otel.Tracer("app").Start(ctx, "foo") // span.End is not called on all paths, possible memory leak
    return nil // return can be reached without calling span.End
}

span.SetStatus(codes.Error, "msg")

Disabled by default. Enable with -checks 'set-status'.

Developers should call SetStatus on spans. A span's status is an important, first-class attribute:

  1. observability platforms and APMs differentiate "success" vs "failure" using spans' status codes.
  2. telemetry collector agents, like the OpenTelemetry Collector's Tail Sampling Processor, are configurable to sample Error spans at a higher rate than OK spans.
  3. observability platforms, like Datadog, have trace retention filters that use spans' status.

In other words, status:error spans often receive special treatment on the assumption that they are more useful for debugging. Forgetting to set the status can therefore lead to spans with useful debugging information being dropped.

func _() error {
    _, span := otel.Tracer("foo").Start(context.Background(), "bar") // span.SetStatus is not called on all paths
    defer span.End()

    if err := subTask(); err != nil {
        span.RecordError(err)
        return err // return can be reached without calling span.SetStatus
    }

    return nil
}

OpenTelemetry docs: Set span status.

span.RecordError(err)

Disabled by default. Enable with -checks 'record-error'.

Calling RecordError creates a new exception-type event (structured log message) on the span. This is recommended to capture the error's type and message and, when the trace.WithStackTrace(true) option is passed, its stack trace.

func _() error {
    _, span := otel.Tracer("foo").Start(context.Background(), "bar") // span.RecordError is not called on all paths
    defer span.End()

    if err := subTask(); err != nil {
        span.SetStatus(codes.Error, err.Error())
        return err // return can be reached without calling span.RecordError
    }

    return nil
}

OpenTelemetry docs: Record errors.

Note: this check is not applied to OpenCensus spans because they have no RecordError method.

Attribution

This linter is the product of liberal copying of:

And the contributions of: