Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Error handling for streaming API handlers in the browser, has it ever worked? #2519

Closed
achingbrain opened this issue Oct 7, 2019 · 5 comments
Labels
exp/expert Having worked on the specific codebase is important exploration kind/bug A bug in existing code (including security flaws) kind/support A question or request for support P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked

Comments

@achingbrain
Copy link
Member

A few of our APIs are streaming, in that they produce long-lived HTTP responses containing newline-delimited JSON of arbitrary length. When these APIs encounter an error mid-stream, they end the response and write HTTP trailers containing error information.

Thing is, no browsers support HTTP trailers so the browser experience is to end the stream (pull stream, readable stream, etc) with no error message.

It seem there's no plan to add HTTP trailers to the fetch API either, so we need a better way of handling errors. In fact the whole API needs overhauling to be leaner, more consistent and less tied to golang conventions. I did a bunch of work on a RESTful HTTP server for IPFS a while ago, though TBF it could use bringing in line with ipfsx.

Errors though, we could:

  1. Deprecate the HTTP API and implement RPC over websockets - this would also allow us to do realtime events
  • ✅ Fast (to use)
  • ✅ Simple mapping from existing API
  • ❌ Big job, incredibly disruptive
  1. Chunk streaming responses into multipart messages (e.g. instead of each stream entity being delimited by \n it would be a full-blown message part with appropriate mime-type headers, etc which would let us know if we are about to parse an error or not)
  • ✅ Uses existing web conventions
  • ❌ Adds overhead for small payloads
  1. Return a unique identifier with each response and allow the client to use that id to look up an error for a limited time via an error endpoint
  • ❌ Adds complexity
  1. Just write the error into the response body and hope the client notices
  • ❌ Unsuspecting clients likely to encounter significant surprise

My preference would be for 2, though preferably as part of a wider rethink of the API, a la ipfsx.

@achingbrain
Copy link
Member Author

cc: @ipfs/repos-javascript

@mikeal
Copy link
Contributor

mikeal commented Oct 7, 2019

An alternative I would recommend is to simply wrap the existing line separated JSON messages in the metadata we’re looking to replace in the HTTP layer.

So instead of

{ cid: baseEncodedString }

It would be:

{ payload: { cid: baseEncodedString } }

When you need to stream errors or other metadata you can just do it as a top level property:

{ error: { status: 404, message: ‘Block not found’ }}

Chunk streaming responses into multipart messages (e.g. instead of each stream entity being delimited by \n it would be a full-blown message part with appropriate mime-type headers, etc which would let us know if we are about to parse an error or not)

I haven’t found a compelling reason to use multipart messages when the input is not also a multipart message. The format is difficult to parse so implementations tend to be buggy and it’s not any better than just doing your own messages w/ JSON.

Deprecate the HTTP API and implement RPC over websockets - this would also allow us to do realtime events

The one thing I can think of that would make this the most compelling option is if we’re pushing a lot of binary down these streams. With JSON we’re going to suffer base encoding so a new WebSocket based thing would make a lot of sense if we’re serving a lot of block data. Otherwise this seems like a rather large endeavor to shave a few bytes off the response messages.

HTTP Trailers

So, HTTP Trailers are one of several features in HTTP 1.1 that ended up being highly problematic and poorly supported. My general rule is to simply not use them, and I would especially caution against using them for errors.

The naive approach to HTTP, which is what most interfaces take, is that there are two phases: status/headers and body. Once you’ve returned status/headers it’s best to consider HTTP’s role in error messaging complete. Any errors you need to do mid-stream are going to have to be handled in the body, either through messages like I described above or by simply closing the stream prematurely causing a parser error in the client (this is certainly the most common, as anyone who’s dealt with HTTP in the wild has encountered it).

While future versions of HTTP have tried to do a better job at these use cases they still end up being accessed through API’s that assume “HTTP 1.1 semantics” which usually means the naive approach I mentioned above.

@daviddias
Copy link
Member

daviddias commented Oct 8, 2019

The crux of this issue is not designing a better API, is upgrading the HTTP API itself. We need a migration path for go-ipfs to upgrade its HTTP API in a way that it is non breaking for the users.

One of the attempts was to make it spec first -- https://github.com/ipfs/http-api-spec -- but got superseded by autogen docs.

For more background, read:

If only time for one more issue, then my rec is ipfs/specs#208

@achingbrain
Copy link
Member Author

An alternative I would recommend is to simply wrap the existing line separated JSON messages in the metadata we’re looking to replace in the HTTP layer.

To me this looks a lot like what go-IPFS has done with the existing HTTP API. It would be better to use the primitives that HTTP gives us - that is, instead of wrapping things in envelope objects with keys that hint at the content type, just use actual content-types and be done with it.

We need a migration path for go-ipfs to upgrade its HTTP API in a way that it is non breaking for the users.

Agreed, any improvements would have to be done in both codebases. We could do it one command at a time as part of ipfsx as outlined in #2509, the idea it's opt-in initially so people can continue to use the existing API in the interim without breaking anything.

@mikeal
Copy link
Contributor

mikeal commented Oct 8, 2019

To me this looks a lot like what go-IPFS has done with the existing HTTP API. It would be better to use the primitives that HTTP gives us - that is, instead of wrapping things in envelope objects with keys that hint at the content type, just use actual content-types and be done with it.

This sounds good, but HTTP doesn’t actually have support for streaming errors. You get one error state in HTTP and that’s the status code, after that you’re on your own. Once you’re parsing the body HTTP’s role in errors is effectively over.

We need a migration path for go-ipfs to upgrade its HTTP API in a way that it is non breaking for the users.

The easiest thing to do would be to put any changed/new API at a new endpoint and leave the old one alone until nobody is using it anymore.

@alanshaw alanshaw added kind/bug A bug in existing code (including security flaws) exp/expert Having worked on the specific codebase is important exploration P2 Medium: Good to have, but can wait until someone steps up kind/support A question or request for support status/ready Ready to be worked labels Oct 21, 2019
SgtPooki referenced this issue in ipfs/js-kubo-rpc-client Aug 18, 2022
Adds a server running a gRPC endpoint over websockets running on port 5003, a `ipfs-grpc-client` module to access the server and a `ipfs-client` module that uses the gRPC client with HTTP fallback.

This is to solve shortcomings and limitations of the existing HTTP API and addresses the concerns raised in the 'Streaming HTTP APIs and errors, y u no work?' session we had at IPFS team week in NYC.

## Key points

1. Enables full duplex communication with a remote node

When making an HTTP request in the browser, a [FormData][] object must be created. In order to add all the values to the FormData object, an incoming stream must be consumed in its entirety before the first byte is sent to the server.

This means you cannot start processing a response before the request has been sent, so you cannot have full-duplex communication between client and server over HTTP. This seems unlikely to change in the near future.

With a websocket transport for gRPC-web, individual messages can be sent backwards and forwards by the client or the server enabling full-duplex communication.  This is essential for things like progress events from `ipfs.add` in the short term, and exposing the full stream capabilities of libp2p via remote client in the long term.

2. Enables streaming errors

The existing HTTP API sends errors as HTTP trailers.  No browser supports HTTP trailers so when a stream encounters an error, from the client's point of view the stream just stops with no possibility of finding out what happened.

This can also mask intended behaviour cause users to incorrectly interpret the API. For example if you specify a timeout to a DHT query and that timeout is reached, in the browser the stream ends without an error and you take away the results you've received thinking all is well but on the CLI the same operation results in a non-zero exit code.

A websocket transport has no restrictions here, since full-duplex communication is possible, errors can be received at any time.

3. Listens on websockets with no HTTP fallback

gRPC-web exists and is a way of exposing a gRPC service over HTTP.  Whereas gRPC supports four modes (unary, e.g. one request object and one response object, client streaming, server streaming and bidirectional streaming), gRPC-web only supports [unary and server streaming](https://github.com/grpc/grpc-web#wire-format-mode).  This is due to limitations of the web platform mentioned above and doesn't give us anything over our existing HTTP API.

The gRPC-web team are evaluating several options for client and bidirectional streaming, all of which require new capabilities to be added to browsers and none of which will be available in a reasonable time frame.

Notably they have [no plans to use websockets](https://github.com/grpc/grpc-web/blob/master/doc/streaming-roadmap.md#issues-with-websockets) as a transport, even though it solves the problems we have today.

The team from [improbable](https://improbable.io/) maintain a [gRPC-web-websockets bridge](https://github.com/improbable-eng/grpc-web) which the client added by this PR is compatible with.  Their bridge also has a go implementation of a [reverse proxy](https://github.com/improbable-eng/grpc-web/tree/master/go/grpcwebproxy) for use with gRPC servers to turn them into gRPC-web servers with an optional websocket transport.

My proposal is to embrace the use of websockets to solve our problems right now, then move to whatever streaming primitive the gRPC-web team settle on in the years to come.

As implemented there's only websockets here and no HTTP fallback as the existing HTTP API works fine for unary operations so there's little to be gained by blocking this work on reimplementing the whole of the HTTP API in gRPC-web, and the client can pick and choose which API it'll use per-call.

By running the websocket server on a different port to the existing HTTP API it gives us room to add gRPC-web fallback for the API if we find that useful.

4. Has protobuf definitions for all requests/responses

See the [ipfs-grpc-protocol](https://github.com/ipfs/js-ipfs/tree/feat/add-grpc-server-and-client/packages/ipfs-grpc-protocol) module, which contains definitions for API requests/reponses.

They've been ported from the existing API and will need some checking.

The [ipfs-grpc-server/README.md](https://github.com/ipfs/js-ipfs/blob/feat/add-grpc-server-and-client/packages/ipfs-grpc-server/README.md) has a rundown of the websocket communication protocol that was ported from [improbable-eng/grpc-web](https://github.com/improbable-eng/grpc-web).

5. Options as metadata

When making a request, metadata is sent during the preamble - these take the form of a string identical to HTTP headers as the initial websocket message - I've used this mechanism to send the options for a given invocation.

Notably these are not defined as a protocol buffer, just an unspecified list of simple key/value pairs - maybe they should be to ensure compatibility between implementations?

This will be trivial in the implementation in the PR as it contains a server implementation too but to do it in go will require patching or forking the improbable gRPC proxy.

6. Errors as metadata

Similar to the existing HTTP API, message trailers are used to send errors.  Four fields are used to re-construct the error on the client side:

| Field | Notes | 
| ----- | ----- |
| grpc-status  | 0 for success, 1+ for error |
| grpc-message | An error message |
| grpc-stack   | A stack trace with `\n` delimited lines |
| grpc-code    | A string code such as `'ERROR_BAD_INPUT'` that may be used for i18n translations to show a message to the user in their own language |

Similar to options these fields are unspecified, if a convention is not enough, perhaps they should be specified as a protobuf and the trailer sent as binary?

7. Streams

When sending data as part of an `ipfs.add`, we send repeated messages that contain a path, a content buffer and an index.  The index is used to differentiate between streams - path cannot be used as it could be empty.  Only the first supplied `path` is respected for a given index. On the server separate input streams are created for each file being added.  A file stream is considered closed when an unset or empty content buffer is received.  Ultimately this will allow us to apply backpressure on a per-file basis and read from different file streams in parallel and asymmetrically based on the available server capacity.

8. Performance

Observed performance pegs gRPC-web over websockets as similar to the HTTP Client with pretty much zero optimisation work performed

9. Security

Browsers require TLS for all use of websocket connections to localhost. They do not require it for the loopback address, however, which this PR uses, though loopback means the traffic will not leave the local machine.

The incoming requests start as HTTP requests so have a referer header and user agent so would follow the same restrictions as the existing HTTP API.

Fixes #2519
Fixes #2838
Fixes #2943
Fixes #2854
Fixes #2864

[FormData]: https://developer.mozilla.org/en-US/docs/Web/API/FormData
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
exp/expert Having worked on the specific codebase is important exploration kind/bug A bug in existing code (including security flaws) kind/support A question or request for support P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked
Projects
None yet
Development

No branches or pull requests

4 participants