Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add id in logs in order to identify all the logs related to an error, to a request #1981

Closed
2 tasks done
bnjjj opened this issue Oct 21, 2022 · 5 comments · Fixed by #1982
Closed
2 tasks done

Add id in logs in order to identify all the logs related to an error, to a request #1981

bnjjj opened this issue Oct 21, 2022 · 5 comments · Fixed by #1982
Assignees

Comments

@bnjjj
Copy link
Contributor

bnjjj commented Oct 21, 2022

Here is a sub task from #1840

The original idea was to provide a way to identify logs related to an error received in a GraphQL response with for example an error_id. I think we can do better by introduce a request_id in logs instead of just an error_id. In that way people who wants to debug a behavior without an error would be able to easily debug just by grepping the request_id in the logs.

My proposal is to introduce request_id in fields for JSON formatter and directly in the text with of the TextFormatter (including nice formatting).

An addition to this experience we could introduce a way to directly ask to the router to return the request_id in the response thanks to GraphQL extensions to not limit it only for error cases. The way I imagine it is a configuration to indicate if we want to enable this debug mode and what kind of header should we pass to enable the ability to return the request_id.

An addition to this experience we could introduce a way to directly ask to the router to return the request_id in the response headers.

Something like

expose_request_id:
    enabled: true # Disabled by default
    header_name: "ROUTER_DEBUG_MODE" # Send ROUTER_DEBUG_MODE = true in headers to return the request_id in extensions

and have an answer like this:

{
  "data": {
    "topProducts": [
      {
        "name": "Table"
      },
      {
        "name": "Couch"
      },
      {
        "name": "Chair"
      }
    ]
  }
}

headers:

apollo_request_id: b060011c6c016070fc21a6c543dda554

And the formatted text log for example:

2022-10-21T14:49:11.557950Z ERROR [request_id=b060011c6c016070fc21a6c543dda554] apollo_router::query_planner::execution: Fetch error: HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)

TODO:

  • Rename to trace_id
  • Send it in apollo studio in response headers
@bnjjj bnjjj self-assigned this Oct 21, 2022
@garypen
Copy link
Contributor

garypen commented Oct 21, 2022

I like it. I wonder why have header control though? If it's good to have a request_id, why not just always return it as an extension? Security? Performance?

I still think we want error codes as well so we can easily classify errors and simplify downstream processing.

@bnjjj
Copy link
Contributor Author

bnjjj commented Oct 21, 2022

I don't know why not enabling it by default, it could just be a noise I think. But I don't have strong opinions.

For error codes we will still be able to classify errors thanks to the type field in extension which specifies the kind of error it is.

For example:

{
  "data": {},
  "errors": [
    {
      "message": "HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)",
      "path": [],
      "extensions": {
        "type": "SubrequestHttpError",
        "service": "accounts",
        "reason": "HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)",
        "request_id": "5e6a6bda8d0dca26e5aec14dafa6d96f"
      }
    }
  ]
}

@bnjjj
Copy link
Contributor Author

bnjjj commented Oct 21, 2022

However it could be also way more easy to always enable the return of request_id because I won't have to manually add it on every errors in the errors array...

@garypen
Copy link
Contributor

garypen commented Oct 21, 2022

Regarding error classification. I'm thinking about the use case where an admin is looking at server logs, not the use case where a client is looking at a response.

@bnjjj bnjjj mentioned this issue Oct 24, 2022
4 tasks
@bnjjj
Copy link
Contributor Author

bnjjj commented Oct 24, 2022

@garypen To make sure there isn't any misunderstandings could you provide a real world example of what you would like to have as an admin. Maybe it's already done, but I want to be sure I understand correctly.

BTW I made a changes in the way we will expose request_id. At the end I think it's definitely better to expose it in response headers instead of graphql extensions.

bnjjj added a commit that referenced this issue Oct 26, 2022
…1982)

close #1981 

It automatically adds a `trace_id` on logs to identify which log is
related to a specific request. Also adds `trace_id` to an example error
to have an example.

This PR also refactor the way we handled formatters before, this
refactor will let us be more flexible on which specific fields we want
to display in logs. For now it keeps the current behavior which do not
display any fields coming from current span and parent spans, only
fields written directly in the log macro.

Example of an error response:
```json
{
  "data": {},
  "errors": [
    {
      "message": "HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)",
      "path": [],
      "extensions": {
        "type": "SubrequestHttpError",
        "service": "accounts",
        "reason": "HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)"
      }
    }
  ]
}
```

with response header:
```
apollo_trace_id: 5e6a6bda8d0dca26e5aec14dafa6d96f
```

with related logs:

```logs
2022-10-21T15:17:45.562553Z ERROR [trace_id=5e6a6bda8d0dca26e5aec14dafa6d96f] apollo_router::services::subgraph_service: fetch_error="hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 111, kind: ConnectionRefused, message: \"Connection refused\" }))"
2022-10-21T15:17:45.565768Z ERROR [trace_id=5e6a6bda8d0dca26e5aec14dafa6d96f] apollo_router::query_planner::execution: Fetch error: HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)
```

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants