Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How can I debug a RST_STREAM with code PROTOCOL_ERROR response on an http2 stream? #3422

Closed
murgatroid99 opened this issue Jun 22, 2021 · 24 comments

Comments

@murgatroid99
Copy link

  • Node.js Version: 16.3.0
  • OS: Linux
  • Scope (install, code, runtime, meta, other?): runtime
  • Module (and version) (if relevant): http2

I am trying to debug a scenario in which a Node server is responding to an http2 request with an RST_STREAM with the code PROTOCOL_ERROR. This is happening without calling any application code on the server side, so the response seems to be generated by the http2 module itself. Setting NODE_DEBUG=http2 environment variable on the server just shows the following, which does not seem to show anything relevant:

NET 778668: onconnection
STREAM 778668: read 0
STREAM 778668: need readable false
STREAM 778668: length less than watermark true
STREAM 778668: do read
NET 778668: _read
NET 778668: Socket._handle.readStart
HTTP2 778668: Http2Session server: received a connection
HTTP2 778668: Http2Session server: setting up session handle
HTTP2 778668: Http2Session server: sending settings
HTTP2 778668: Http2Session server: submitting settings
HTTP2 778668: Http2Session server: created
(node:778668) Warning: Setting the NODE_DEBUG environment variable to 'http2' can expose sensitive data (such as passwords, tokens and authentication headers) in the resulting log.
HTTP2 778668: Http2Session server: settings received

What else can I do to determine what protocol error the server is seeing to cause it to respond that way?

@bl-ue
Copy link

bl-ue commented Jun 22, 2021

cc @jasnell

@travisghansen
Copy link

Now is a great time to get a response to this before I tear my whole testing environment down.

@murgatroid99
Copy link
Author

I want to add that I suspect that the root cause of the error in question is a bug in Node: I think Node may be rejecting some valid HTTP/2 traffic. But we cannot determine if that is the case or provide any details about that bug until we can determine exactly what it is that Node is rejecting.

@travisghansen
Copy link

Anyone? Is the correct forum for submitting bug reports?

@travisghansen
Copy link

Where is the appropriate place to get this filed as a bug so it will actually get looked at?

@RafaelGSS
Copy link
Member

Hey @murgatroid99, I'm looking into it, but, I can't reproduce it locally. Could you create a reproductible code? Also, what NODE_DEBUG_NATIVE=http2 tells you?

@murgatroid99
Copy link
Author

The question here is: if a Node server is closing an http2 session with a protocol error, how can I determine why the server sent it, or in other words, what protocol constraint was violated, so that whatever behavior is incorrect can be corrected? Or, alternatively, to determine that it is returning a protocol error incorrectly? Is NODE_DEBUG_NATIVE=http2 supposed to provide that information?

@RafaelGSS
Copy link
Member

It may provide that information. Those flags should not be enabled in production environments though.

In fact, PROTOCOL_ERROR is used when a more specific code is not available, according to the RFC 7540 - RST_STREAM:

RST_STREAM frames MUST be associated with a stream. If a RST_STREAM
frame is received with a stream identifier of 0x0, the recipient MUST
treat this as a connection error (Section 5.4.1) of type
PROTOCOL_ERROR.

RST_STREAM frames MUST NOT be sent for a stream in the "idle" state.
If a RST_STREAM frame identifying an idle stream is received, the
recipient MUST treat this as a connection error (Section 5.4.1) of
type PROTOCOL_ERROR.

@murgatroid99
Copy link
Author

That is not what the quoted parts of the spec say. They describe specific situations in which a PROTOCOL_ERROR should be sent. They are also both referring to connection errors, which result in a GOAWAY with that code, not a RST_STREAM.

The user who originally filed the issue said

NODE_DEBUG_NATIVE=http2 provided no additional output in my setup.

@travisghansen
Copy link

travisghansen commented Mar 5, 2022

I'm now attempting to migrate another project to the purejs grpc code (a csi driver for k8s/nomad/etc) and am getting a similar issue when deployed in k8s:

I0305 17:19:06.153301       1 main.go:166] Version: v2.4.0
I0305 17:19:06.153468       1 main.go:167] Running node-driver-registrar in mode=registration
I0305 17:19:06.155552       1 main.go:191] Attempting to open a gRPC connection with: "/csi-data/csi.sock"
I0305 17:19:06.155651       1 connection.go:154] Connecting to unix:///csi-data/csi.sock
I0305 17:19:06.157526       1 main.go:198] Calling CSI driver to discover driver name
I0305 17:19:06.157601       1 connection.go:183] GRPC call: /csi.v1.Identity/GetPluginInfo
I0305 17:19:06.157620       1 connection.go:184] GRPC request: {}
I0305 17:19:06.169008       1 connection.go:186] GRPC response: {"name":"org.democratic-csi.nfs","vendor_version":"1.5.5"}
I0305 17:19:06.169245       1 connection.go:187] GRPC error: <nil>
I0305 17:19:06.169277       1 main.go:208] CSI driver name: "org.democratic-csi.nfs"
I0305 17:19:06.169548       1 node_register.go:53] Starting Registration Server at: /registration/org.democratic-csi.nfs-reg.sock
I0305 17:19:06.170355       1 node_register.go:62] Registration Server started at: /registration/org.democratic-csi.nfs-reg.sock
I0305 17:19:06.171047       1 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I0305 17:19:08.175405       1 main.go:102] Received GetInfo call: &InfoRequest{}
I0305 17:19:08.177074       1 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/org.democratic-csi.nfs/registration"
I0305 17:19:08.211273       1 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: PROTOCOL_ERROR,}
E0305 17:19:08.211373       1 main.go:122] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: PROTOCOL_ERROR, restarting registration container.

Of particular note, the 'clients' to the grpc service are all written in golang (both for the k8s bits, and the envoy package). There are several binaries in k8s/csi that interact with the server and some of them seem to work fine others not.

Also of note, with the csi project as invoked by k8s resulted in an issue I don't have when using the c-based node grpc server and also I didn't experience the issue with several other clients. The issue was that the call object had some cyclic references requiring me to cleanse the object before logging it with JSON.stringify(). I'm not sure why some clients result in a cyclic object somewhere in the call object and others not, but I thought it may be relevant here.

@RafaelGSS
Copy link
Member

Well, I have tested the PROTOCOL_ERROR in a very basic example from Node core: https://github.com/nodejs/node/blob/master/test/parallel/test-http2-multi-content-length.js#L62
and using the NODE_DEBUG_NATIVE=http2 I got a "reason":

Http2Session server (5) receiving 70 bytes [wants data? 1]
Http2Session server (5) complete frame received: type: 4
Http2Session server (5) beginning headers for stream 1
Http2Session server (5) invalid frame received (0/1000), code: -532
Http2Session server (5) complete frame received: type: 4

Indeed, without a reproducible code, I can't help with this issue. At least, it sounds like a feature request to add a "cause" inside the error event.

@travisghansen
Copy link

To reproduce the issue requires setting up k8s/csi or envoy in an appropriate fashion to invoke a grpc service backed by nodejs. It's possible but has a fair amount of setup to get to that state. I'll see if I can put a repo together that will simplify that a bit.

Does node need to be built with debug symbols or anything of that nature to get the extra output? I tried that env var and didn't seem to see anything addition being spewed out, but I'll try again.

@travisghansen
Copy link

travisghansen commented Mar 6, 2022

OK, after some debugging I feel quite stupid (or I lost my mind somewhere along the way). I currently have 2 scenarios resulting in the RST_STREAM response:

envoy invoking grpc (this was the original issue)

I'm not entirely sure what transpired here, but I have this working currently after realizing that the c-based grpc package binds 0.0.0.0 as:

tcp6       0      0 :::9002                 :::*                    LISTEN      2239465/node

and @grpc/grpc-js binds as:

tcp        0      0 0.0.0.0:9002            0.0.0.0:*               LISTEN      2358602/node

When I fired up envoy in debug mode now I realized it was translating localhost to :: and so I forced envoy to connect to 127.0.0.1 and traffic started to flow. I ran into some other minor issues but ultimately got envoy working with the @grpc/grpc-js implementation.

This is very strange as reviewing the notes from last year it appears requests were actually making it to node in some degree, but I can no longer get any activity when using @grpc/grpc-js + localhost :(

In short, the RST may have been simply because it was trying to connect to an address that wasn't bound.

csi invoking over a uds

This is a new issue I've just bumped into while trying to use the @grpc/grpc-js code in conjunction with csi. localhost / tcp6 issues do not apply here as csi runs entirely over unix socket files. Using your NODE_DEBUG_NATIVE=http2 suggestion I have been able to gather the following:

Http2Session server (39) session created
Http2Session server (39) i/o stream consumed
Http2Session server (39) scheduling write
Http2Session server (39) sending pending data
Http2Session server (39) nghttp2 has 9 bytes to send
Http2Session server (39) wants read? 1
Http2Session server (39) receiving 394 bytes, offset 0
Http2Session server (39) receiving 394 bytes [wants data? 1]
Http2Session server (39) complete frame received: type: 4
Http2Session server (39) complete frame received: type: 8
Http2Session server (39) complete frame received: type: 6
Http2Session server (39) beginning headers for stream 1
/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node[2350548]: ../src/debug_utils-inl.h:76:std::string node::SPrintFImpl(const char*, Arg&&, Args&& ...) [with Arg = long unsigned int&; Args = {const char*&}; std::string = std::basic_string<char>]: Assertion `(p) != nullptr' failed.
 1: 0xa1a640 node::Abort() [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 2: 0xa1a6be  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 3: 0x96c8df std::string node::SPrintFImpl<unsigned long&, char const*&>(char const*, unsigned long&, char const*&) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 4: 0x96c996 std::string node::SPrintFImpl<unsigned long&, char const*&>(char const*, unsigned long&, char const*&) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 5: 0x96cbb2 void node::UnconditionalAsyncWrapDebug<unsigned long&, char const*&>(node::AsyncWrap*, char const*, unsigned long&, char const*&) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 6: 0xa52555 node::http2::Http2Session::OnNghttpError(nghttp2_session*, char const*, unsigned long, void*) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 7: 0x18ae876  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 8: 0x18b9478 nghttp2_session_mem_recv [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
 9: 0xa54877 node::http2::Http2Session::ConsumeHTTP2Data() [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
10: 0xa54c9e node::http2::Http2Session::OnStreamRead(long, uv_buf_t const&) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
11: 0xaf54e1  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
12: 0x1380009  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
13: 0x1380630  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
14: 0x1386f55  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
15: 0x13746df uv_run [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
16: 0xa5d726 node::NodeMainInstance::Run() [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
17: 0x9eab6c node::Start(int, char**) [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
18: 0x7fd02ff0bb25 __libc_start_main [/usr/lib/libc.so.6]
19: 0x982005  [/home/thansen/.asdf/installs/nodejs/lts-erbium/bin/node]
./dev/start.sh: line 41: 2350547 Aborted                 sudo -E ./bin/democratic-csi --log-level "${LOG_LEVEL}" --driver-config-file "./dev/${CONFIG}" --csi-version "${CSI_VERSION}" --csi-name "${CSI_NAME}" --server-address "0.0.0.0" --server-port "50051" --server-socket /tmp/csi.sock

When using the grpc package (for the server) I do not receive this error no matter what client I use (go, grpc or @grpc/grpc-js.

So:

grpc server + grpc (grpc-uds package to be exact, for both server and client) client I can connect to the endpoint with the following endpoints:

  • unix:///tmp/csi.sock
  • /tmp/csi.sock

grpc server + @grpc/grpc-js client I can connect with the following endpoints:

  • unix:///tmp/csi.sock
  • /tmp/csi.sock fails with Name resolution failed for target dns:/tmp/csi.sock

@grpc/grpc-js server + grpc client I can connect with the following endpoints:

  • /tmp/csi.sock fails with the trace above
  • unix:///tmp/csi.sock works as expected

@grpc/grpc-js server + @grpc/grpc-js client I can connect with the following endpoints:

  • unix:///tmp/csi.sock
  • /tmp/csi.sock fails with Name resolution failed for target dns:/tmp/csi.sock

So, it seems there's something with processing the beginning headers for stream 1 or whatever comes immediately after it the blows up when the endpoint/host does not include unix://. This explains why some of the k8s bits fail because the go client may be behaving the same as the grpc package when given an endpoint in the form of /foo/bar without the unix:// prefix. I'm not sure if adding the prefix to the go client connection works but doing so would be highly irregular syntax in that world.

takeaways

  • the difference in bind behavior between the grpc package and @grpc/grpc-js should probably be documented (perhaps aligned?)
  • @grpc/grpc-js should probably default to adding unix:// to the server/client when the path looks like a uds (or does not look like a host:port). maybe this should be left to higher-level code?
  • the trace above should probably not crash node (I guess it only does this when the debug var is set), and maybe the logic should be slightly relaxed
    • I have produced the above trace/crash manually with the grpc client
    • I have produced the above trace/crash as triggered by some k8s bits (kubelet) against both node 12 and 16
  • it would be great if errors of this nature could be caught in some shape or form
  • it would be great if errors of this nature included more descriptive reasoning

@RafaelGSS
Copy link
Member

As I said previously, without a minimal reproducible code, I'm not able to help.

I know that some issues are very hard to reproduce easily, but, regardless of the http2 client, you should be able to reproduce it In any client.

@travisghansen
Copy link

Sure. I’m going to attempt to address that but there are several other items that can be acted/commented upon in the mean time (catchable error, better descriptions, etc).

The stack trace has actionable info as well no?

@travisghansen
Copy link

OK, it's a bit messy because node v12 (I use asdf to switch around versions) is required for the grpc-uds package, but this should be relatively easy:

# get code
git clone https://github.com/democratic-csi/democratic-csi.git
cd democratic-csi
git checkout next

# install deps, ensure using node v12 here (grpc-uds does not work with newer node versions)
npm i

# start the server (at this point, you may use whatever node version you want here as `grpc-uds` will not be used)
NODE_DEBUG_NATIVE=http2 GRPC_IMPLEMENTATION="@grpc/grpc-js" ./bin/democratic-csi --csi-version=1.5.0 --csi-name=org.democratic-csi.nfs --driver-config-file=./examples/local-hostpath.yaml --log-level=debug --csi-mode=node --server-socket=/tmp/csi.sock

# in another terminal (must be done with node 12)
comment ~ line 60 `csiAddress = "unix://" + csiAddress;` in `bin/liveness-probe`
GRPC_IMPLEMENTATION="grpc-uds" ./bin/liveness-probe --csi-version 1.5.0 --csi-address=/tmp/csi.sock

@RafaelGSS
Copy link
Member

The stack trace has actionable info as well no?

Not really, because I can't figure out what was the cause of the above error.


The above code throws in v12, but, looks fixed in v16:

info: starting csi server - name: org.democratic-csi.nfs, version: 1.5.5, driver: local-hostpath, mode: node, csi version: 1.5.0, address: , socket: unix:///tmp/csi.sock
Http2Session server (38) session created
Http2Session server (38) i/o stream consumed
Http2Session server (38) scheduling write
Http2Session server (38) sending pending data
Http2Session server (38) nghttp2 has 9 bytes to send
Http2Session server (38) wants read? 1
Http2Session server (38) receiving 394 bytes, offset 0
Http2Session server (38) receiving 394 bytes [wants data? 1]
Http2Session server (38) complete frame received: type: 4
Http2Session server (38) complete frame received: type: 8
Http2Session server (38) complete frame received: type: 6
Http2Session server (38) beginning headers for stream 1
Http2Session server (38) Error 'Invalid HTTP header field was received: frame type: 1, stream: 1, name: [:authority], value: [/tmp/csi.sock]'
Http2Session server (38) invalid frame received (0/1000), code: -531
Http2Session server (38) complete frame received: type: 8
Http2Session server (38) complete frame received: type: 8
Http2Session server (38) sending pending data
Http2Session server (38) nghttp2 has 9 bytes to send
Http2Session server (38) nghttp2 has 17 bytes to send
Http2Session server (38) stream 1 closed with code: 1
HttpStream 1 (41) [Http2Session server (38)] closed with code 1
HttpStream 1 (41) [Http2Session server (38)] destroying stream
Http2Session server (38) nghttp2 has 13 bytes to send
Http2Session server (38) wants read? 1
Http2Session server (38) wants read? 1
HttpStream 1 (41) [Http2Session server (38)] tearing down stream
Http2Session server (38) receiving -4095 bytes, offset 0
Http2Session server (38) submitting goaway
Http2Session server (38) scheduling write
Http2Session server (38) destroying session
Http2Session server (38) closing session
Http2Session server (38) make done session callback
Http2Session server (38) sending pending data
Http2Session server (38) freeing nghttp2 session
Http2Session server (38) Error 'Invalid HTTP header field was received: frame type: 1, stream: 1, name: [:authority], value: [/tmp/csi.sock]'

@travisghansen
Copy link

The trace above was running with v12? What am I doing wrong that makes it abort/exit before getting the rest of the output after the beginning headers line?

I’ll do further debugging but with v16 I’ve been getting the same behavior as v12 where it hits the error and exits (when the debug env var is set). If I can figure out how to make it not exit while spewing that debugging data I can possibly figure out what the error is from the k8s client.

@bobaaaaa
Copy link

bobaaaaa commented Mar 6, 2022

@travisghansen maybe some docker images + docker-compose would help here?

@travisghansen
Copy link

There are docker images for the project, but I think we're pretty well at the bottom of it. It would be nice to have some mechanism to catch these. Looking around it appears to be a client issue and it's unlikely any changes will be made to node (beyond hopefully emitting some sort of event/error as mentioned).

I have been able to confirm this is the exact issue hitting my k8s install now:

Http2Session server (58) session created
Http2Session server (58) i/o stream consumed
Http2Session server (58) scheduling write
Http2Session server (58) sending pending data
Http2Session server (58) nghttp2 has 9 bytes to send
Http2Session server (58) wants read? 1
Http2Session server (58) receiving 33 bytes, offset 0
Http2Session server (58) receiving 33 bytes [wants data? 1]
Http2Session server (58) complete frame received: type: 4
Http2Session server (58) sending pending data
Http2Session server (58) nghttp2 has 9 bytes to send
Http2Session server (58) wants read? 1
Http2Session server (58) wants read? 1
Http2Session server (58) receiving 9 bytes, offset 0
Http2Session server (58) receiving 9 bytes [wants data? 1]
Http2Session server (58) complete frame received: type: 4
Http2Session server (58) settings refreshed for session
Http2Session server (58) sending pending data
Http2Session server (58) wants read? 1
Http2Session server (58) receiving 140 bytes, offset 0
Http2Session server (58) receiving 140 bytes [wants data? 1]
Http2Session server (58) beginning headers for stream 1
Http2Session server (58) Error 'Invalid HTTP header field was received: frame type: 1, stream: 1, name: [:authority], value: [/var/lib/kubelet/plugins/org.democratic-csi.nfs/csi.sock]'
Http2Session server (58) invalid frame received (0/1000), code: -531
Http2Session server (58) sending pending data
Http2Session server (58) stream 1 closed with code: 1
HttpStream 1 (61) [Http2Session server (58)] closed with code 1
HttpStream 1 (61) [Http2Session server (58)] destroying stream
Http2Session server (58) nghttp2 has 13 bytes to send
Http2Session server (58) wants read? 1
Http2Session server (58) wants read? 1
HttpStream 1 (61) [Http2Session server (58)] tearing down stream
Http2Session server (58) receiving -4095 bytes, offset 0
Http2Session server (58) submitting goaway
Http2Session server (58) scheduling write
Http2Session server (58) destroying session
Http2Session server (58) closing session
Http2Session server (58) make done session callback
Http2Session server (58) sending pending data
Http2Session server (34) receiving -4095 bytes, offset 0
Http2Session server (34) submitting goaway
Http2Session server (34) scheduling write
Http2Session server (34) destroying session
Http2Session server (34) closing session
Http2Session server (34) make done session callback
Http2Session server (34) sending pending data
Http2Session server (34) freeing nghttp2 session
Http2Session server (58) freeing nghttp2 session

It's just a giant pain because there are so many clients accessing the same socket so we have to (im)patiently wait for the whole ecosystem to move along (which clearly isn't happening very fast) :(

@mcollina mcollina closed this as completed Mar 7, 2022
@travisghansen
Copy link

I take it we have no intention of surfacing these errors?

@RafaelGSS
Copy link
Member

We certainly have. Nonetheless, without a minimal reproducible code, we are not able to help. If you still are having issues, create a new issue as a feature request and ping me if you want. I'd be happy to help!

@travisghansen
Copy link

Just to make sure we’re clear, I’m referring to the invalid header error being bubbled up (and any others that would sever the connection before hitting higher-level code). Are we referring to the same thing?

@RafaelGSS
Copy link
Member

Yes, It looks like a feature request, right? I'm assuming that you would like to get this information in the HTTP2 server somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants