Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2 module crashes and sometimes segfaults when running tests against it #21416

Closed
aaronjwood opened this issue Jun 19, 2018 · 11 comments
Closed
Labels
http2 Issues or PRs related to the http2 subsystem.

Comments

@aaronjwood
Copy link

aaronjwood commented Jun 19, 2018

  • Version: 10.3.0, 10.4.0, 10.4.1, 10.5.0, 10.6.0
  • Platform: Linux + OSX
  • Subsystem: HTTP2

We're using the http2 module to implement a GRPC proxy that just passes things through. When running tests against our proxy module we hit this crash:

node[466]: ../src/node_http2.cc:781:static int node::http2::Http2Session::OnHeaderCallback(nghttp2_session*, const nghttp2_frame*, nghttp2_rcbuf*, nghttp2_rcbuf*, uint8_t, void*): Assertion `(stream) != (nullptr)' failed.
 1: node::Abort() [node]
 2: 0x876c55 [node]
 3: node::http2::Http2Session::OnHeaderCallback(nghttp2_session*, nghttp2_frame const*, nghttp2_rcbuf*, nghttp2_rcbuf*, unsigned char, void*) [node]
 4: nghttp2_session_mem_recv [node]
 5: node::http2::Http2Session::OnStreamRead(long, uv_buf_t const&) [node]
 6: 0x90a15e [node]
 7: 0x97b349 [node]
 8: 0x97b968 [node]
 9: 0x981848 [node]
10: uv_run [node]
11: node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) [node]
12: node::Start(int, char**) [node]
13: __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
14: 0x845c25 [node]
Aborted (core dumped)

and occasionally we hit a bare segfault:

Segmentation fault (core dumped)

The test is being run inside a ubuntu 16.04 container. Some of our tests throw a lot of data at it such as these:

    test("It medium ping 400k msg",  () => {
        // loop polling for service to be ready
        return new Promise((resolve, reject)=> {
            cli.Echo({sequence: 1210, msg: "test".repeat(100000)}, function (err, resp) {
                expect(err).toBe(null);
                expect(resp.status).toBe(0);
                expect(resp.sequence).toBe("1210");
                expect(resp.msg).toBe("test".repeat(100000));
                resolve();
            });
        });
    });
    test("It concurrent ping 10k threads",  () => {
        // loop polling for service to be ready
        let PLIST = []
        // 100 parallel pings
        for (let i = 0; i < 1000; i=i+1) {
            PLIST.push(new Promise((resolve, reject)=> {
                cli.Echo({sequence: i, msg: "test"+(10000 - i)}, function (err, resp) {
                    expect(err).toBe(null);
                    expect(resp.status).toBe(0);
                    expect(resp.sequence).toBe("" + i);
                    expect(resp.msg).toBe("test"+(10000 - i));
                    resolve();
                });
            }))
        }
        return Promise.all(PLIST);
    });
    test("It concurrent ping 100 thread 4k msg",  () => {
        // loop polling for service to be ready
        let PLIST = []
        // 100 parallel pings
        for (let i = 0; i < 100; i=i+1) {
            PLIST.push(new Promise((resolve, reject)=> {
                cli.Echo({sequence: i, msg: "test".repeat(1000)}, function (err, resp) {
                    expect(err).toBe(null);
                    expect(resp.status).toBe(0);
                    expect(resp.sequence).toBe("" + i);
                    expect(resp.msg).toBe("test".repeat(1000));
                    resolve();
                });
            }))
        }
        return Promise.all(PLIST);
    });
@richardlau
Copy link
Member

@apapirovski Is this related to the hypothetical situation fixed by #21194?

@richardlau richardlau added the http2 Issues or PRs related to the http2 subsystem. label Jun 20, 2018
@apapirovski
Copy link
Member

@richardlau Not sure. There's a few http2 fixes going into the next release, I think.

@aaronjwood Could you test against 10.5 when it comes out?

@aaronjwood
Copy link
Author

Absolutely.

@addaleax
Copy link
Member

10.5.0 is out, and the security release in 10.4.1 also had some fixes relevant to HTTP2 – might be good to try again now?

@aaronjwood
Copy link
Author

Still seems to be a sporadic issue with 10.5.0:

node[37254]: ../src/node_http2.cc:795:static int node::http2::Http2Session::OnHeaderCallback(nghttp2_session *, const nghttp2_frame *, nghttp2_rcbuf *, nghttp2_rcbuf *, uint8_t, void *): Assertion `(stream) != nullptr' failed.
 1: 0x1000334aa node::Abort() [/usr/local/bin/node]
 2: 0x100032587 node::MakeCallback(v8::Isolate*, v8::Local<v8::Object>, char const*, int, v8::Local<v8::Value>*, node::async_context) [/usr/local/bin/node]
 3: 0x10006b458 node::http2::Http2Stream::AddHeader(nghttp2_rcbuf*, nghttp2_rcbuf*, unsigned char) [/usr/local/bin/node]
 4: 0x10090504a nghttp2_session_mem_recv [/usr/local/bin/node]
 5: 0x10006b1a6 node::http2::Http2Session::Write(uv_buf_t const*, unsigned long) [/usr/local/bin/node]
 6: 0x10006dd51 node::http2::Http2Session::OnStreamRead(long, uv_buf_t const&) [/usr/local/bin/node]
 7: 0x1000b38b7 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [/usr/local/bin/node]
 8: 0x100767d5b uv__stream_io [/usr/local/bin/node]
 9: 0x10076f7e0 uv__io_poll [/usr/local/bin/node]
10: 0x100760576 uv_run [/usr/local/bin/node]
11: 0x10003c2d9 node::Start(v8::Isolate*, node::IsolateData*, int, char const* const*, int, char const* const*) [/usr/local/bin/node]
12: 0x10003ba38 node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) [/usr/local/bin/node]
13: 0x10003b6c0 node::Start(int, char**) [/usr/local/bin/node]
14: 0x100003034 start [/usr/local/bin/node]
15: 0x6 
Abort trap: 6

@aaronjwood
Copy link
Author

This seems to be fixed with 10.6.0, at least on OSX. Let me confirm it works in our Linux CI environment...

@aaronjwood
Copy link
Author

Yup, seems to be fixed in general. Related to 5a71e7941d 3ba9a445de in the 10.6.0 release?

@jasnell
Copy link
Member

jasnell commented Jul 18, 2018

I believe this particular issue is fixed, but there's still at least one bug lurking in there. I spotted a segfault running tests on 10.6.0 yesterday but haven't had a chance to chase it down yet.

@aaronjwood
Copy link
Author

Guess I spoke too soon :(

/usr/bin/node[509]: ../src/node_http2.cc:884:static int node::http2::Http2Session::OnHeaderCallback(nghttp2_session*, const nghttp2_frame*, nghttp2_rcbuf*, nghttp2_rcbuf*, uint8_t, void*): Assertion `(stream) != nullptr' failed.
 1: 0x89c2f0 node::Abort() [/usr/bin/node]
 2: 0x89c3d5  [/usr/bin/node]
 3: 0x8f144e node::http2::Http2Session::OnHeaderCallback(nghttp2_session*, nghttp2_frame const*, nghttp2_rcbuf*, nghttp2_rcbuf*, unsigned char, void*) [/usr/bin/node]
 4: 0x15e5b48 nghttp2_session_mem_recv [/usr/bin/node]
 5: 0x8f0ef4 node::http2::Http2Session::OnStreamRead(long, uv_buf_t const&) [/usr/bin/node]
 6: 0x94427e  [/usr/bin/node]
 7: 0x9ca4a9  [/usr/bin/node]
 8: 0x9caac8  [/usr/bin/node]
 9: 0x9d09c8  [/usr/bin/node]
10: 0x9bfcab uv_run [/usr/bin/node]
11: 0x8a6b9d node::Start(v8::Isolate*, node::IsolateData*, int, char const* const*, int, char const* const*) [/usr/bin/node]
12: 0x8a5e36 node::Start(int, char**) [/usr/bin/node]
13: 0x7f40c0335830 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
14: 0x863a95  [/usr/bin/node]

@jasnell
Copy link
Member

jasnell commented Aug 10, 2018

I believe this is being caused by a race condition. When the stream is closed by the receiving end, it is removed from the local map. However, the sender may still be transmitting header data. I've got a potential fix coming, but this has been a bit difficult to reliably verify in a test case.

jasnell added a commit to jasnell/node that referenced this issue Aug 10, 2018
@gdams gdams closed this as completed in ac92a42 Aug 13, 2018
rvagg pushed a commit that referenced this issue Aug 15, 2018
Fixes: #21416

PR-URL: #22256
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: George Adams <george.adams@uk.ibm.com>
@aaronjwood
Copy link
Author

Awesome, thanks!

kjin pushed a commit to kjin/node that referenced this issue Sep 25, 2018
Fixes: nodejs#21416

PR-URL: nodejs#22256
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: George Adams <george.adams@uk.ibm.com>
kjin pushed a commit to kjin/node that referenced this issue Oct 16, 2018
Fixes: nodejs#21416

PR-URL: nodejs#22256
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: George Adams <george.adams@uk.ibm.com>
BethGriggs pushed a commit that referenced this issue Oct 17, 2018
Fixes: #21416

Backport-PR-URL: #22850
PR-URL: #22256
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: George Adams <george.adams@uk.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http2 Issues or PRs related to the http2 subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants