Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling stream.end() in callback of stream.write() can leave ClientHttp2Session in an invalid state #49147

Open
wwilfinger opened this issue Aug 13, 2023 · 9 comments
Labels
http2 Issues or PRs related to the http2 subsystem.

Comments

@wwilfinger
Copy link

Version

v18.16.1

Platform

Linux laptop 6.4.7-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jul 27 20:01:18 UTC 2023 x86_64 GNU/Linux

Subsystem

http2

What steps will reproduce the bug?

const http2 = require('node:http2')
const fs = require('node:fs')

async function connect(authority, options) {
  return new Promise((resolve, reject) => {
    const session = http2.connect(authority, options)

    session.once('connect', (session) => {
      console.log('session connect')
      resolve(session)
    })

    session.once('close', () => {
      console.log('session close')
    })

    session.once('error', (err) => {
      console.error('session error')
      reject(err)
    })

  })
}

async function request(session) {
  return new Promise((resolve, reject) => {
    let stream

    const headers = {
      ":path": '/',
      ":method": 'POST'
    }

    try {
      stream = session.request(headers)
    } catch (err) {
      reject(err)
    }

    console.log(`request() session state: ${JSON.stringify(session.state)}`)

    stream.on('close', () => {
      console.info(`stream close with code ${stream.rstCode}`)
      if (stream.rstCode) {
        reject(`stream closed with code ${stream.rstCode}`)
      } else {
        resolve()
      }
    })

    stream.on('end', () => {
      console.info('stream end')
      resolve()
    })

    stream.on('error', (err) => {
      console.error(`stream error: ${err.message}`)
      reject(err)
    })

    stream.setTimeout(500, () => {
      console.log('stream timeout')
      reject(new Error('stream timeout'))
    })

    console.log('stream.write() with callback')
    stream.write(Buffer.from('writing some data', 'utf8'), (maybeErr) => {
      console.log(`stream write callback error: ${maybeErr}`)
      console.log('stream write callback: calling stream.end()')
      stream.end()
    })
  })
}

async function main() {
  // localhost:50051 is a http/2 server with a 1 millisecond timeout
  const authority = 'https://localhost:50051'

  const options = {
    ca: fs.readFileSync('./../cert/ca-cert.pem'),
    servername: 'example.cloudapis.test',
  }

  let session

  while (session === undefined) {
    try {
      session = await connect(authority, options)
    } catch (err) {
      console.error(`error creating session ${err.message}`)
      await new Promise(resolve => setTimeout(resolve, 100))
    }
  }

  while (true) {
    try {
      await request(session)
    } catch (err) {
      console.error(`request() error ${err.message}`)
    }
    await new Promise(resolve => setTimeout(resolve, 1000))
  }
}

function doOtherWork() {
  // To keep the node process alive
  setTimeout(doOtherWork, 1000)
}

main()
doOtherWork()

How often does it reproduce? Is there a required condition?

A server-side timeout of 1ms is never (for me) long enough for the TLS handshake to completely finish.
Once 'connect' is emitted by a session, reproduces every time.
Reproduces with a ':method' header of POST, but not with GET.

What is the expected behavior? Why is that the expected behavior?

Session should be marked as destroyed, which would make subsequent calls to session.request() return an error.

If you change the POST to a GET, the session is destroyed and this example logs

request() session state: {}
request() error The session has been destroyed

What do you see instead?

The session is not marked as destroyed. Calls to session.request() do not error. Streams can continue to be added to the session. These pile up forever (session.state.outboundQueueSize increases forever) until the process is killed (oom or otherwise).

Additional information

This is a bit involved to set up with the server end and the certificates. I have a docker compose setup available here:

https://github.com/wwilfinger/nodejs-invalid-http2session

This is a stripped-down reproduction of this issue grpc/grpc-node#2502 which also has a docker compose setup available https://github.com/wwilfinger/grpc-pubsub-fake-public/

Several things seem to be required to reproduce:

  • TLS handshake is partially complete. Client has ACK'd ServerHello but has not sent a TLS response yet.
  • The server-side will reject any futher communication because of a timeout configured on the server.
  • stream configured for :method=POST (at least, :method=GET won't reproduce)
  • stream.write(msg, () => stream.end()). The write causes SIGPIPE.

Some thoughts/feelings:

I don't think the TLS handshake being incomplete is strictly required, but it is a reliable way for the write() to fail with a system error? The handshake is what we were seeing in the wild.

It doesn't really matter what the http/2 server is configured to respond to at the http/2 layer, because from the server's point of view the TLS handshake never finishes.

stream.write(); stream.end() won't reproduce. The end() has to be called in the write() callback.

GET doesn't reproduce. POST does. I haven't tried out other methods.

node v18.16.1 is what I did most of my testing with. It's easy enough to change the base image in the Dockerfile. node:16-slim does not reproduce, node:20-slim does.

@sanket-1305
Copy link

I want to work on this issue, I am a beginner so will need some guidance from the community . Can you please assign me this issue?

@debadree25 debadree25 added the http2 Issues or PRs related to the http2 subsystem. label Aug 14, 2023
@debadree25
Copy link
Member

hey @sanket-1305 feel free to work on the issue and submit a PR, there's normally no need to assign issues you can probably find the code relevant to the issue in lib/internal/http2

@sanket-1305
Copy link

@debadree25 thankyou for the reply, will start working on this but i will need some time, because my exams are going on still i will try my best to solve this asap!

@murgatroid99
Copy link
Contributor

You say "stream.write(); stream.end() won't reproduce. The end() has to be called in the write() callback." Have you tried enclosing the stream.end() call in process.nextTick or setImmediate? In other words, is this triggered specifically by calling stream.end() in the write() callback, or more generally by calling stream.end() any time after the write() callback is called?

@sanket-1305
Copy link

can anyone tell me if i am on right path or completely wrong here.
node-#49147

@murgatroid99
Copy link
Contributor

You're looking at Http2ServerResponse. The example reproduction code provided in the issue uses a server written in Go, so no Node server code is going to be relevant to the problem. In addition, you are looking at compat.js, which provides the compatibility API that is similar to the HTTP/1 API, but it uses HTTP/2. The example code does not use the compatibility API, so that is not the right place to look for the problem.

Finally, there is a good chance that this problem is not in the http2 code at all. There are indications in the logs in the linked gRPC bug that the underlying TLS socket is reporting incorrect state information to the http2 session. So this is at least in part a tls module bug. It could even be a bug in the C++ code that interfaces with openssl.

@sanket-1305
Copy link

As I beginner i am not able to understand this and will not be able to work on this. I will start working on some basic issues first.

@wwilfinger
Copy link
Author

Have you tried enclosing the stream.end() call in process.nextTick or setImmediate? In other words, is this triggered specifically by calling stream.end() in the write() callback, or more generally by calling stream.end() any time after the write() callback is called?

Both process.nextTick and setImmediate avoid the situation.

stream.write(Buffer.from('writing some data', 'utf8'), (maybeErr) => {
  console.log(`stream write callback error: ${maybeErr}`)
  console.log('stream write callback: calling stream.end()')
  // reproduces
  stream.end()
  // will NOT reproduce
  // process.nextTick(() => stream.end())
  // will NOT reproduce
  // setImmediate(() => stream.end)
})

@wwilfinger
Copy link
Author

Some observations with a debugger.

console.log('stream.write()')
debugger // ss shows CLOSE-WAIT, fd=27. Node debugger shows the same fd number
stream.write(Buffer.from('writing some data', 'utf8'), (maybeErr) => {
  console.log(`stream write callback error: ${maybeErr}`)
  console.log('stream write callback: calling stream.end()')
  debugger // strace printed SIGPIPE. ss no longer shows a TCP connection
  stream.end()
  debugger // node debugger still shows fd=27, session.destroyed=false, session.Symbol(socket).destroyed=false
})

During all three of those breakpoints, the debugger shows the same info for the TCP object. This is at (pseudo-code I guess, because of the Symbols) session.Symbol(socket).Symbol(kHandle)

VSCode debugger screenshot

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http2 Issues or PRs related to the http2 subsystem.
Projects
None yet
Development

No branches or pull requests

4 participants