Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(instrumentation-undici): Add responseHook #2356

Merged
merged 10 commits into from
Aug 9, 2024

Conversation

timfish
Copy link
Contributor

@timfish timfish commented Jul 29, 2024

Which problem is this PR solving?

Short description of the changes

This PR adds a responseHook to the instrumentation config

@timfish timfish requested a review from a team July 29, 2024 15:34
Copy link
Contributor

@mydea mydea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great from my POV! Would be super helpful.

Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

added few minors

plugins/node/instrumentation-undici/src/types.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@david-luna david-luna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @timfish

Thanks for contributing to the undici instrumentation. The PR looks very good :)

The new hook is executed upon undici:request:headers diagnostics channel publishes a message when headers are received but there is no guarantee the body is available yet. I wonder if responseHook name will let devs think they can inspect the whole response (body included).

plugins/node/instrumentation-undici/src/types.ts Outdated Show resolved Hide resolved
plugins/node/instrumentation-undici/test/fetch.test.ts Outdated Show resolved Hide resolved
@timfish
Copy link
Contributor Author

timfish commented Jul 31, 2024

I wonder if responseHook name will let devs think they can inspect the whole response (body included).

The http instrumentation uses a hook with the same name which is called before the body is received:
https://github.com/open-telemetry/opentelemetry-js/blob/30a46ae5471f9268c2cd241a805e8e68e3ab4f28/experimental/packages/opentelemetry-instrumentation-http/src/http.ts#L332

The readme docs for http say:

Function for adding custom attributes before response is handled

My description says:

Function for adding custom attributes after the response headers are received.

@blumamir
Copy link
Member

I wonder if responseHook name will let devs think they can inspect the whole response (body included).

The http instrumentation uses a hook with the same name which is called before the body is received: https://github.com/open-telemetry/opentelemetry-js/blob/30a46ae5471f9268c2cd241a805e8e68e3ab4f28/experimental/packages/opentelemetry-instrumentation-http/src/http.ts#L332

Although the readme docs for http say:

Function for adding custom attributes before response is handled

So maybe that is a better description than mine?

Function for adding custom attributes after the response headers are received.

I think hooks are very popular to record payloads. with http instrumentation, it is possible to collect the payload in the responseHook, but is it possible with undici? or will we need a different hook for accessing the response payload?

Copy link

codecov bot commented Jul 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.52%. Comparing base (dfb2dff) to head (47fa91f).
Report is 208 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2356      +/-   ##
==========================================
- Coverage   90.97%   90.52%   -0.46%     
==========================================
  Files         146      155       +9     
  Lines        7492     7459      -33     
  Branches     1502     1537      +35     
==========================================
- Hits         6816     6752      -64     
- Misses        676      707      +31     
Files Coverage Δ
plugins/node/instrumentation-undici/src/undici.ts 96.35% <100.00%> (ø)

... and 72 files with indirect coverage changes

@timfish
Copy link
Contributor Author

timfish commented Jul 31, 2024

I think hooks are very popular to record payloads. with http instrumentation, it is possible to collect the payload in the responseHook, but is it possible with undici?

How are they doing this? As far as I can see, the http instrumentation does not offer an easy way to access the response payload from outgoing requests.

It's possible to get the response buffers from undici via the undici:request:trailers diagnostic channel event which in the undici plugin currently just calls onDone to end the span. The downside to this event is that it doesn't send pay you the request object. It's nice to have a single event with both request and response at the same time so you can make more granular choices.

@trentm
Copy link
Contributor

trentm commented Jul 31, 2024

[Amir]

with http instrumentation, it is possible to collect the payload in the responseHook

Are you sure? I don't think it is. The responseHook is being called synchronously in the 'response' event handler (https://nodejs.org/api/http.html#event-response). The response body is not yet available at this point.

@timfish
Copy link
Contributor Author

timfish commented Jul 31, 2024

The body isn't available at that point, but you have access to the response which is a stream, so you could response.on('data', (d) => {}).

Is anyone actually doing this just access the resulting buffer in the responseHook? What are the use cases?

Copy link
Contributor

@trentm trentm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Waiting for a bit before merging to give @blumamir a chance to follow-up on earlier comments and @david-luna to review.

| [`startSpanHook`](https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/instrumentation-undici/src/types.ts#L67) | `StartSpanHookFunction` | Function for adding custom attributes before a span is started. |
| [`requireParentforSpans`](https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/instrumentation-undici/src/types.ts#L69) | `Boolean` | Require a parent span is present to create new span for outgoing requests. |
| [`headersToSpanAttributes`](https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/instrumentation-undici/src/types.ts#L71) | `Object` | List of case insensitive HTTP headers to convert to span attributes. Headers will be converted to span attributes in the form of `http.{request\|response}.header.header-name` where the name is only lowercased, e.g. `http.response.header.content-length` |
| [`ignoreRequestHook`](https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/instrumentation-undici/src/types.ts#L73) | `IgnoreRequestFunction` | Undici instrumentation will not trace all incoming requests that matched with custom function. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, these updates are to all the line numbers. Thanks for doing that.
In a separate change I think we should drop those links. It is too much of a maintenance burden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Contributor

@david-luna david-luna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we get questions/issues about body not being available in the hook we can always update the description and highlight it.

@trentm trentm merged commit 60a99c9 into open-telemetry:main Aug 9, 2024
21 checks passed
@dyladan dyladan mentioned this pull request Aug 9, 2024
@timfish timfish deleted the feat/undici-response-hook branch August 14, 2024 08:05
@trentm trentm mentioned this pull request Aug 20, 2024
@foklepoint
Copy link

foklepoint commented Oct 20, 2024

heyo!
Wanted to share that Rye needed undici response bodies in our traces. Throwing our implementation of using these newly added hooks (Thnks @timfish). It's not a great implementation by any standard and lots of jankiness trying to access body. But just in case someone wants to improve upon this or use it. Tried the path with the undici::trailers mentioned in this thread and just wanted to call out had a really hard time trying to get the body off that.

Drawbacks of the approach below - couldn't quite get the request body to be added to the trace spans and janky async added on hooks to make type system happy.

import { Span } from '@opentelemetry/api';
import { UndiciRequest, UndiciResponse } from '@opentelemetry/instrumentation-undici';
import stringify from 'fast-safe-stringify';
import * as zlib from 'zlib';

// Tees the body then restores it back to the original request object
async function readAndRestoreRequestBody(request: UndiciRequest) {
  const originalBody = request.body;
  const chunks: Uint8Array[] = [];

  for await (const chunk of originalBody) {
    chunks.push(chunk);
  }

  // Restore the body
  request.body = {
    async *[Symbol.asyncIterator]() {
      yield* chunks;
    },
  };

  return Buffer.concat(chunks).toString();
}

// Constants
const MAX_BODY_SIZE = 1024 * 1024; // 1MB

/**
 * Represents the options for the Undici instrumentation configuration.
 */
interface UndiciInstrumentationOptions {
  requestHeadersSpanAttrName?: string;
  responseHeaderSpanAttrName?: string;
  requestBodySpanAttrName?: string;
  responseBodySpanAttrName?: string;
}

/**
 * Represents the capture context for response data.
 */
interface CaptureContext {
  rawResponse: Buffer;
  span?: Span;
}

/**
 * Represents the handler for Undici instrumentation.
 */
class UndiciInstrumentationResponseBodyHandler {
  private context: CaptureContext;
  private responseHeaderSpanAttrName: string;
  private responseBodySpanAttrName: string;

  constructor(span: Span, responseHeaderSpanAttrName: string, responseBodySpanAttrName: string) {
    this.responseHeaderSpanAttrName = responseHeaderSpanAttrName;
    this.responseBodySpanAttrName = responseBodySpanAttrName;
    this.context = {
      rawResponse: Buffer.alloc(0),
      span,
    };
  }

  /**
   * Handles each data chunks from the response as they come in.
   */
  private handleChunk(chunk: Buffer): void {
    if (chunk.length > MAX_BODY_SIZE) {
      throw new Error(`Response body exceeded ${MAX_BODY_SIZE} bytes. Skipping further capturing.`);
    }
    this.context.rawResponse = Buffer.concat([this.context.rawResponse, chunk]);
  }

  /**
   * The overridden onData method to capture response chunks.
   */
  public onData(chunk: Buffer): void {
    try {
      this.handleChunk(chunk);
    } catch (error) {
      console.error(`Error in undici telemetry onData:`, error);
    }
  }

  /**
   * The overridden onFinally method to finalize capturing.
   */
  public onFinally(_error: Error | null): void {
    try {
      if (this.context.rawResponse.length === 0) {
        return;
      }
      const stringifiedBody = this.stringifiedResponseBody();
      this.context.span?.setAttribute(this.responseBodySpanAttrName, stringifiedBody);
    } catch (error) {
      console.error(`Error in undici telemetry onFinally:`, error);
    }
  }

  private stringifiedResponseBody(): string {
    // If not content-encoded, return utf8 of rawResponse buffer
    // Read response headers from the span, very hacky but it works. Unfortunatly the response object
    // is not available here because this class hooks into the requestHook, not the responseHook and it
    // has to do that so we can listen to onData and onFinally BEFORE the response is returned.
    if (!this.context.span) {
      throw new Error('Span is not set in the context.');
    }

    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    const responseHeadersJson = (this.context.span as any).attributes[
      this.responseHeaderSpanAttrName
    ];
    if (!responseHeadersJson) {
      throw new Error('Response headers not found in span attributes.');
    }

    const responseHeaders: Record<string, string> = JSON.parse(responseHeadersJson);
    const contentEncoding = responseHeaders['content-encoding'];
    if (!contentEncoding) {
      return this.context.rawResponse.toString('utf8');
    }

    // Decompress the rawResponse buffer
    return this.decompressBody(this.context.rawResponse, contentEncoding);
  }

  private decompressBody(rawResponse: Buffer, contentEncoding: string): string {
    switch (contentEncoding) {
      case 'gzip':
      case 'x-gzip':
        return zlib.gunzipSync(rawResponse).toString('utf8');
      case 'br':
        return zlib.brotliDecompressSync(rawResponse).toString('utf8');
      case 'deflate':
        return zlib.inflateSync(rawResponse).toString('utf8');
      default:
        throw new Error(`Unsupported content encoding: ${contentEncoding}`);
    }
  }
}

/**
 * Converts an array of headers to a map/object.
 */
function headerArrayToMap(headers: Buffer[] | string[]): Record<string, string> {
  const headersMap: Record<string, string> = {};
  for (let i = 0; i < headers.length; i += 2) {
    const key = headers[i].toString().toLowerCase();
    const value = headers[i + 1].toString();
    headersMap[key] = value;
  }
  return headersMap;
}

/**
 * Factory function to get UndiciInstrumentation with custom hooks.
 */
export function getUndiciInstrumentationConfig(options: UndiciInstrumentationOptions = {}) {
  const {
    requestHeadersSpanAttrName = 'http.request.headers',
    responseHeaderSpanAttrName = 'http.response.headers',
    requestBodySpanAttrName = 'http.request.body',
    responseBodySpanAttrName = 'http.response.body',
  } = options;

  return {
    /**
     * Hook to handle requests.
     */
    requestHook: async (span: Span, request: UndiciRequest) => {
      try {
        const handler = new UndiciInstrumentationResponseBodyHandler(
          span,
          responseHeaderSpanAttrName,
          responseBodySpanAttrName,
        );
        const requestHeaders = headerArrayToMap(request.headers as string[]);
        span.setAttribute(requestHeadersSpanAttrName, stringify(requestHeaders));

        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        const originalOnData = (request as any).onData.bind(request);
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        (request as any).onData = (chunk: Buffer) => {
          handler.onData(chunk);
          return originalOnData(chunk);
        };

        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        const originalOnFinally = (request as any).onFinally.bind(request);
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        (request as any).onFinally = (error: Error | null) => {
          handler.onFinally(error);
          return originalOnFinally(error);
        };
      } catch (error) {
        console.error(`Error in undici telemetry requestHook:`, error);
      }
    },

    /**
     * Adds custom attributes to the span on response.
     */
    responseHook: async (
      span: Span,
      { request, response }: { request: UndiciRequest; response: UndiciResponse },
    ) => {
      try {
        const responseHeaders = headerArrayToMap(response.headers as Buffer[]);
        span.setAttribute(responseHeaderSpanAttrName, stringify(responseHeaders));

        // Since the response hook is not async we need to read the body AFTER we are done sending the request and have
        // received the response so we can be sure we aren't going to be consuming the body mid request. Doing so
        // leads to mismatched content-length errors.
        const requestBody = await readAndRestoreRequestBody(request);
        span.setAttribute(requestBodySpanAttrName, requestBody);
      } catch (error) {
        console.error(
          `Error in undici telemetry responseHook: Failed to set headers as span attributes:`,
          error,
        );
      }
    },
  };
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants