HTTP/2 #309
Comments
HTTP/2 layers overview
Stream Priority/Dependencies
It seems that a full-featured scheduler for
HTTP/2 layers implementation
Since
Starting/switching HTTP/2
Considering #1176 (comment), a possible variant for starting HTTP/2 and subsequently switching between HTTP/1.1 <-> HTTP/2 processing inside Tempesta FW can be the special flag for
HTTP/2 and QUIC with HTTP/3
In the context of the HTTP/2 features that must be implemented, the main parts of a very rough future QUIC architecture may look like:
However, the described scheme raises several questions that can lead to different QUIC implementations and should be discussed:
Besides, for HTTP/2 over QUIC (which in fact is HTTP/3), it is worth mentioning the most significant HTTP/3 specifics compared with HTTP/2:
Therefore, three implementation variants for HTTP/2 and QUIC<->HTTP/3 could be specified:
In my opinion, the second of the variants described above is the most suitable for implementation. The drawback of this approach is the need to implement what are in fact two different models with a rather large overlap in functionality, but very likely quite a lot of code can be reused from the HTTP/2 implementation for future
Flow Control
It seems that we should implement at least the WINDOW_UPDATE frame in #309 in order to announce the largest possible window (second paragraph in https://tools.ietf.org/html/rfc7540#section-5.2.2).
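As an illustration of the flow-control remark above, here is a minimal user-space sketch of building a connection-level WINDOW_UPDATE frame that raises the default 65,535-byte window to the protocol maximum of 2^31-1. The frame layout follows RFC 7540 sections 4.1 and 6.9; the function and macro names are hypothetical and are not taken from the Tempesta FW sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define H2_MAX_WINDOW        0x7fffffffu /* 2^31 - 1, RFC 7540 section 6.9.1 */
#define H2_DEFAULT_WINDOW    65535u      /* initial window size */
#define H2_WINDOW_UPDATE_LEN 13          /* 9-byte header + 4-byte payload */

/*
 * Hypothetical helper: serializes a WINDOW_UPDATE frame into @out.
 * Returns the frame size, or 0 if @increment is not a valid value.
 */
static size_t
h2_build_window_update(uint8_t *out, uint32_t stream_id, uint32_t increment)
{
	assert(out != NULL);
	if (increment == 0 || increment > H2_MAX_WINDOW)
		return 0;

	/* 24-bit payload length: always 4 for WINDOW_UPDATE. */
	out[0] = 0;
	out[1] = 0;
	out[2] = 4;
	out[3] = 0x8; /* frame type: WINDOW_UPDATE */
	out[4] = 0;   /* no flags are defined for this frame */
	/* Reserved bit (cleared) + 31-bit stream identifier. */
	out[5] = (stream_id >> 24) & 0x7f;
	out[6] = (stream_id >> 16) & 0xff;
	out[7] = (stream_id >> 8) & 0xff;
	out[8] = stream_id & 0xff;
	/* Reserved bit (cleared) + 31-bit window size increment. */
	out[9]  = (increment >> 24) & 0x7f;
	out[10] = (increment >> 16) & 0xff;
	out[11] = (increment >> 8) & 0xff;
	out[12] = increment & 0xff;

	return H2_WINDOW_UPDATE_LEN;
}
```

Calling it with stream 0 and an increment of `H2_MAX_WINDOW - H2_DEFAULT_WINDOW` would announce the largest connection window. Per-stream windows are a separate matter (they are governed by `SETTINGS_INITIAL_WINDOW_SIZE`) and are not covered by this sketch.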
Several comments:
In HTTP/2 and HTTP/3 there is always a 0th control stream, so probably it makes sense to embed
How is it different from the current assembling of HTTP messages from ingress skb chunks? I'd expect that after decoding we can fall through to our current message processing logic.
Not necessarily. In #498 we want exactly to proxy message chunks as is. So probably in HTTP/2, once we have assembled the HTTP headers and can execute the scheduling logic, we can proxy data frames w/o assembling. However, if it's simpler for now just to assemble whole messages, I'm fine with it. I'm only against introducing an additional architectural layer which will be reworked in #498: I just don't want to do the same work twice. Maybe the layer is required - I didn't actually get this
Why do we need the assembling layer for full HTTP/2 proxying?
I think technically we can just insert two skb frames with appropriate HTTP/2 frame headers and, of course, update
Yes. GFSM is a generalization of the hooks mechanism which allows several FSMs to subscribe to particular events and do context switches. FreeBSD netgraph uses a similar approach for network protocol handling, but I don't think that it's an efficient way. Moreover,
Agree.
The question is how much of the code we'll be able to reuse for HTTP/3. For now we can only keep HTTP/3 in mind during HTTP/2 development.
I think the framing layer will provide downcalls in addition to upcalls for HTTP/2, just the same way as current
Yes, that's true. The only thing that we can do now is to separate the framing layer as much as possible to be able to call it from the new UDP
The 3rd variant is absolutely not an option. We'll definitely die if we try to implement TCP/TLS/HTTP/2 concurrently with UDP/HTTP/3, especially given that the QUIC standard is constantly changing. Now we need to implement HTTP/2 ASAP and should only keep HTTP/3 in mind to choose a more HTTP/3-friendly architecture if we have a choice. It'd be good if we can reuse the same logic, at least as a copy & paste foundation (e.g. we can copy some HTTP/2 framing logic and adjust it for HTTP/3). So I propose to go back to choosing between the 1st and 2nd options when we have HTTP/2 and start on #724.
1. Correction of comment about HTTP/2 design; 2. Some corrections in the Huffman decoding functionality.
…o HTTP/2-parser(decoder) layer (#309).
HTTP/2 HPACK layer implementation (#309).
1. Changes in HPACK decoder to copy only Huffman-decoded and dynamically indexed headers; 2. Appropriate changes in HPACK-decoder/parser unit-tests.
Corrections as a result of HPACK decoder/encoder/parser unit-tests debugging.
Parse name, colon, LWS, value and RWS of HTTP/1.1-response headers into separate chunks to facilitate the name/value splitting and colon/OWS eviction during HTTP/1.1=>HTTP/2 response transformation.
…ion-h2-cache HTTP/2 implementation: HTTP/2-cache (#309).
HTTP/2 Parser implementation (#309).
Done in #1368
The initial implementation of HTTP/2 was done in #1176. The code needs a review and further development. I suggest creating a new pull request instead of developing #1176.
Consider Cache Digests as important for Ideal HTTP Performance.
There is an HTTP/2 test suite which should be used to test the implementation.
The context of the issue is just a robust and performant architecture of HTTP/2 (with #687 in mind: we should not let similar issues pass to the master, and probably there will be cheap (quick) opportunities to fix some of the #687 problems). All the small extensions, protocol features and bugs should be reported in small separate tasks.
A minimal requirement for the issue to be done is that an HTTP/2-capable browser must be able to load tempesta-tech.com via HTTP/2.
QUIC foundations
Some of the HTTP/2 mechanisms are moved to QUIC (#724): streams, frames, compression. So please make the required TODO comments and necessary code adjustments for further extensions to QUIC. In particular, at least these docs affect the current HTTP/2 design:
QUIC transport defines the main primitives of QUIC.
QPACK for HPACK. Probably Avoiding Head-of-Line Blocking is the most influential feature of QPACK; it affects the synchronization design of HTTP/2 streams and maybe more connection-wide attributes.
HTTP/3 for the whole new code. Chapters 4 and A.2 discuss similarities and differences in the framing layers.
Notes
We should not support Huffman encoding, see #1176 (comment). Only Huffman decoding is required now. However, in #1125 we'll generate HTTP/2 requests, and Huffman encoding does make a difference for HTTP headers. So I'd leave the current Huffman encoding in the source code, but leave it unused until #1125.
Unfortunately we can not pass Huffman-encoded strings to the HTTP parser because upper and lower case characters are encoded with a different number of bits and have no clear transition (like 0x20 for base ASCII). So we have to decode HTTP headers before passing them to the HTTP parser.
HTTP/2 amplification threat
HTTP/2 HPACK introduces an HTTP/2 amplification threat. Protection against the attack is left for #488.
Framing
HTTP/3 (QUIC) uses frames that are very close to HTTP/2 frames but still differ in many details (see QUIC transport chapter 12 and HTTP/3 chapters 4 and A.2), so it seems we can not just reuse the code, but, probably, some logic is reusable.
Anyway, the logic is relatively complex and defines a logical layer, so I propose to develop it in a new source code file (logic module) `http_frame.c`, which later can be split into HTTP/2- and HTTP/3-specific code.
Since the maximum frame size is 16MB, the framing layer should call the decoder layer on each frame chunk, unless there is not enough payload for processing.
The framing layer determines the type of frames: service, headers, body. Header frame payload should go through the decoder layer, while body (data) frame payload should skip the decoder and go to the HTTP parser directly. The parser code must be split to handle body and headers separately. Service frames must manage the current stream state and don't imply calling the HTTP parser.
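The dispatch described above can be sketched roughly as follows. The 9-byte frame header layout and the frame type codes come from RFC 7540 (sections 4.1 and 6); the structure, enum and function names are hypothetical and only illustrate the headers/body/service split, not the actual `http_frame.c` design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame type codes from RFC 7540, section 6. */
enum {
	H2_DATA = 0x0, H2_HEADERS = 0x1, H2_RST_STREAM = 0x3,
	H2_SETTINGS = 0x4, H2_PING = 0x6, H2_GOAWAY = 0x7,
	H2_WINDOW_UPDATE = 0x8, H2_CONTINUATION = 0x9
};

#define H2_FRAME_HDR_LEN 9

/* Hypothetical descriptor of a parsed frame header. */
typedef struct {
	uint32_t length;    /* 24-bit payload length */
	uint8_t  type;
	uint8_t  flags;
	uint32_t stream_id; /* 31 bits, reserved bit masked out */
} FrameHdr;

typedef enum {
	FRAME_CLS_HEADERS, /* payload goes through the decoder layer */
	FRAME_CLS_BODY,    /* payload goes straight to the body parser */
	FRAME_CLS_SERVICE  /* manages stream state, no HTTP parser call */
} FrameClass;

/* Returns 0 on success, -1 if the header is not fully available yet. */
static int
frame_hdr_parse(const uint8_t *buf, size_t len, FrameHdr *hdr)
{
	assert(buf != NULL && hdr != NULL);
	if (len < H2_FRAME_HDR_LEN)
		return -1;

	hdr->length = ((uint32_t)buf[0] << 16) | ((uint32_t)buf[1] << 8)
		      | buf[2];
	hdr->type = buf[3];
	hdr->flags = buf[4];
	hdr->stream_id = ((uint32_t)(buf[5] & 0x7f) << 24)
			 | ((uint32_t)buf[6] << 16)
			 | ((uint32_t)buf[7] << 8) | buf[8];
	return 0;
}

static FrameClass
frame_classify(uint8_t type)
{
	switch (type) {
	case H2_HEADERS:
	case H2_CONTINUATION:
		return FRAME_CLS_HEADERS;
	case H2_DATA:
		return FRAME_CLS_BODY;
	default:
		return FRAME_CLS_SERVICE;
	}
}
```

A real implementation would also have to cope with a header that is itself split across skb chunks, which this sketch only reports as "need more data".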
Streams
An HTTP/2 or HTTP/3 stream is essentially an FSM with defined state transitions. A stream and a frame type must be passed to `tfw_http_req_process()`. If we define a stream with a `TfwStream` data structure, then `TfwStream->parser` should replace the current `TfwConn->parser`.
I propose to use 128 for `SETTINGS_MAX_CONCURRENT_STREAMS` and handle the streams in a binary heap storing identifiers and pointers to dynamically allocated stream descriptors. Maybe there is a better data structure.
Flow control should be left until #498; for now we should announce the largest possible window and ignore the client window, i.e. just send the whole response to a client. Keeping the response in Tempesta memory and sending it in small chunks may lead to a significant security flaw. It's bad to ignore the client window, so #498 was marked as crucial and moved to the next milestone.
Stream prioritization and dependencies are the subject of #1196.
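To make the "stream is an FSM" point concrete, below is a deliberately reduced sketch of the server-side stream states from RFC 7540 section 5.1. It models only received HEADERS, DATA and RST_STREAM frames plus the moment the server finishes its response; CONTINUATION, PRIORITY, WINDOW_UPDATE and the reserved states are left out. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { H2_DATA = 0x0, H2_HEADERS = 0x1, H2_RST_STREAM = 0x3 };

#define H2_F_END_STREAM 0x1

/* A subset of the RFC 7540 stream states relevant for a server. */
typedef enum {
	STRM_IDLE,
	STRM_OPEN,
	STRM_HC_REMOTE, /* half-closed (remote): the request is complete */
	STRM_CLOSED
} StreamState;

/*
 * Applies a frame received from the client to the stream state.
 * Returns 0 on success, -1 if the frame is not allowed in this state.
 */
static int
stream_recv(StreamState *st, uint8_t type, uint8_t flags)
{
	int end = flags & H2_F_END_STREAM;

	assert(st != NULL);
	switch (*st) {
	case STRM_IDLE:
		/* Only HEADERS may open a stream. */
		if (type != H2_HEADERS)
			return -1;
		*st = end ? STRM_HC_REMOTE : STRM_OPEN;
		return 0;
	case STRM_OPEN:
		if (type == H2_RST_STREAM) {
			*st = STRM_CLOSED;
			return 0;
		}
		if (type != H2_DATA && type != H2_HEADERS)
			return -1;
		if (end)
			*st = STRM_HC_REMOTE;
		return 0;
	case STRM_HC_REMOTE:
		/* The client may still reset, but not send more data. */
		if (type == H2_RST_STREAM) {
			*st = STRM_CLOSED;
			return 0;
		}
		return -1;
	default:
		return -1;
	}
}

/* Called once the server has sent its last frame with END_STREAM. */
static void
stream_sent_end(StreamState *st)
{
	assert(st != NULL);
	if (*st == STRM_HC_REMOTE)
		*st = STRM_CLOSED;
}
```

In this model a descriptor would carry a `StreamState` next to its parser context, and the framing layer would call `stream_recv()` before handing the payload to the decoder or the body parser.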
All in all, the stream logic implies relatively complex logic for flow control, prioritization and so on, so I propose to move it to a new `http_stream.c` as well. Control stream(s) logic, in the sense of the frames mentioned below, must also be handled here by calling the appropriate HTTP, framing, and connection routines. Note the stream operation in HTTP/3 described in https://tools.ietf.org/html/draft-ietf-quic-transport-18#section-2 and https://tools.ietf.org/html/draft-ietf-quic-http-18#section-3 as well.
Only `HEADERS`, `DATA`, `CONTINUATION`, `RST_STREAM`, `PING`, `GOAWAY` frames must be implemented in this issue; probably `SETTINGS` should also be supported for some cases. A client never sends a `PUSH_PROMISE` frame, so its implementation is left for #1194. The `WINDOW_UPDATE` frame is left for #498.
HTTP/2 <-> HTTP/1.1 message transformation
I propose to extend `TfwHttpHdrTbl` so that it can contain plain HTTP/1.1 header strings as well as indexes to encoded HTTP/2 headers. The interfaces to the current table should be generalized in such a way that all the logic can get the necessary information about a header regardless of whether it's HPACK-encoded or in plain TfwStr format.
However, in `TfwHttpHdrTbl` we can handle only Huffman-decoded headers.
Note that HTTP/2 defines message length in other ways than HTTP/1.1, so the current chunked, `Content-Length` and connection close logic must be adjusted.
Decoder layer
The HTTP parser function must be called from a new decoder layer. The decoder layer must be aware of the HTTP parser states and feed the current decoded chunk to the parser. The decoder should be responsible for:
We should take advantage of the translation of a bit string to a character string. In particular, there should be multiple Huffman decoding tables implementing character filtration (just return an error for particular bits mapping to a prohibited character). Next, the Huffman decoding state machine can be generated in such a way that, for example, `010101 00000 100001` (the bit string means `%0A`, i.e. `\n`) will be immediately blocked or translated to a space. There is no need to implement any functionality of #2, but the decoder must be designed appropriately. (In #2 we can add support for a run-time updated Huffman decoder table to support reconfiguration of the translation tables.) However, the current alphabet checking must be implemented for HTTP/2 using the Huffman decoder. Also, please remove `http_norm.h` and all the related stuff.
The decoder must receive a chunk of data, execute the decoding logic for an encoded string, if any, and call the parser for the string. In this case the parser will have many entries, so #1131 is crucial now.
The decoder must write data to a new HTTP request because decoded data is usually larger than the original.
In the general case a Huffman-encoded symbol can cross byte boundaries, so the decoder FSM must store some context data to be able to process frame chunks passed from the framing layer.
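A toy sketch of two of the ideas above: decoder context that survives between frame chunks, and character filtration done inside the Huffman decoder. It decodes bit by bit against a tiny table holding only a handful of codes from RFC 7541 Appendix B (a real decoder would use the full table and a generated multi-bit state machine), and filters per character rather than per sequence. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint32_t code;
	uint8_t  len; /* code length in bits */
	char     sym;
} HuffSym;

/* A small subset of the HPACK Huffman code (RFC 7541, Appendix B). */
static const HuffSym toy_table[] = {
	{ 0x00, 5, '0' }, { 0x03, 5, 'a' }, { 0x04, 5, 'c' },
	{ 0x05, 5, 'e' }, { 0x15, 6, '%' }, { 0x16, 6, '-' },
	{ 0x21, 6, 'A' },
};
#define TOY_TABLE_SZ (sizeof(toy_table) / sizeof(toy_table[0]))

/* Context kept between chunks: bits of a not yet completed symbol. */
typedef struct {
	uint32_t acc;
	uint8_t  nbits;
} HuffCtx;

/*
 * Decodes one chunk. @allowed is the filtration table: a zero entry
 * makes the decoder fail on that character. Returns the number of
 * characters written to @dst, or -1 on error.
 */
static int
huff_decode_chunk(HuffCtx *ctx, const uint8_t *src, size_t n,
		  const uint8_t allowed[256], char *dst, size_t cap)
{
	size_t out = 0;

	assert(ctx != NULL && dst != NULL);
	for (size_t i = 0; i < n; i++) {
		for (int b = 7; b >= 0; b--) {
			ctx->acc = (ctx->acc << 1) | ((src[i] >> b) & 1);
			/* No HPACK code is longer than 30 bits. */
			if (++ctx->nbits > 30)
				return -1;
			for (size_t k = 0; k < TOY_TABLE_SZ; k++) {
				unsigned char c = toy_table[k].sym;

				if (toy_table[k].len != ctx->nbits
				    || toy_table[k].code != ctx->acc)
					continue;
				if (!allowed[c] || out == cap)
					return -1;
				dst[out++] = (char)c;
				ctx->acc = 0;
				ctx->nbits = 0;
				break;
			}
		}
	}
	return (int)out;
}

/* After the last chunk only padding of up to 7 one-bits may remain. */
static int
huff_decode_finish(const HuffCtx *ctx)
{
	assert(ctx != NULL);
	if (ctx->nbits > 7)
		return -1;
	return ctx->acc == (1u << ctx->nbits) - 1 ? 0 : -1;
}
```

For instance, the bit string `010101 00000 100001` padded with ones is the three bytes `0x54 0x10 0xff`. Feeding `0x54` as one chunk and `0x10 0xff` as the next would be expected to produce `%`, then `0A`, with the two leftover bits of the first byte carried over in `HuffCtx`. Clearing `allowed['%']` makes the first call fail instead.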
HTTP parser
The parser must eat HTTP/1.1 and HTTP/2 messages depending on information from the framing/streaming layer; the HTTP/2 path must be preferred (in the sense of conditions and `likely` paths). This is required since HTTP/2 headers use binary separators instead of CRLF.
It's also possible to process the binary delimiters on the decoder side (after all, the delimiters determine the type of the current header: indexed, encoded, or plain) and leave the current HTTP parser to process only the ASCII parts of the headers.
`tfw_http_parse_req()` must be split into method, URI, HTTP version, headers, and body parsing parts. Only some of the parts must be called for HTTP/2.
Character filtration, i.e. the SIMD alphabet checking, must not be executed for HTTP/2 strings since we have Huffman decoders.
Headers conversion
Header adjustments, `tfw_http_adjust_{req,resp}()`, are the right place to convert headers between formats (from/to HTTP/2 to/from HTTP/1.1). They're also good for this because of #1103, and we need/have extra space for the format conversion and for changing the headers.
Responses should be encoded in-place (they're always smaller than HTTP/1.1) in `tfw_http_adjust_resp()`, leaving more optimization opportunities for #1103. Newly added headers must be immediately compressed in-place, so the HTTP XFRM logic must be adjusted for HTTP/2 (create HPACK'ed headers instead of compressing plain strings added by the current logic).
Caching
It'd be good to keep both HTTP/2 and HTTP/1.1 headers in cached entries, but for now it's not necessary and we can translate HTTP/1.1 headers to HTTP/2 in `tfw_http_adjust_resp()`.
TODO
The code must be done in separate branches producing many sequential small PRs, each of which must go through a review and be merged into the master. It's easier to review smaller code and we can catch issues earlier. Also this way we can avoid heavy rebases. I suggest splitting PRs in such a way that there is not more than 1 of the tasks below in a PR.
Framing decoding. In general, the framing layer should be hooked the same way as the TLS and current HTTP layers. It's not necessary to use GFSM, but it might be useful for handling data offsets. The current `tfw_http_req_process()` handles message parsing and its processing logic (scheduling, caching etc.) and this can remain: different HTTP/2 frames can be treated by the function just as separate chunks. However, from the framing layer we know the exact type of the current frame, and we should call the right (headers or body) parser instead of analyzing the current parsing state.
Starting HTTP/2 (RFC 7540 chapter 3): ALPN handling, initial transmission of the 101 (Switching Protocols) response and parsing the client preface.
Stream FSMs handling. At this stage a functional test for the `PING` frame must work.
HTTP/2 <-> HTTP/1.1 message transformation to transfer messages between HTTP/2 clients and HTTP/1.1 backends.
HPACK and Huffman logic.
HTTP/1.1 response streaming for pipelined requests: with HTTP/2 we don't have blocking requests any more, so each received response can be forwarded to a client immediately. So it seems `seq_queue` becomes HTTP/1.1 specific, and in the context of "Improve the architecture that supports the correct order of HTTP responses" (#687) it'd be good not to use it for HTTP/2 at all.
Update the Wiki list of known implementations of HTTP/2.
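For the HTTP/1.1 => HTTP/2 response transformation discussed under "Headers conversion", here is a rough sketch of rewriting a single HTTP/1.1 header line in place into an HPACK "Literal Header Field without Indexing, New Name" representation (RFC 7541 section 6.2.2), without Huffman coding. It assumes the whole line sits in one contiguous writable buffer, which is not how skb-backed `TfwStr` chunks work, and the function name is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrites "Name: value\r\n" in @buf into:
 *   0x00 | name length | lowercased name | value length | value
 * Only lengths below 127 are handled, so each HPACK integer fits
 * into a single byte. Returns the new length, or 0 on failure.
 */
static size_t
h1_hdr_to_hpack_literal(char *buf, size_t len)
{
	char *colon, *vb, *ve;
	size_t n, v;

	assert(buf != NULL);
	if (len < 2 || buf[len - 2] != '\r' || buf[len - 1] != '\n')
		return 0;
	colon = memchr(buf, ':', len);
	if (!colon)
		return 0;

	/* Split the name and the value, evicting the colon and OWS. */
	n = (size_t)(colon - buf);
	vb = colon + 1;
	ve = buf + len - 2;
	while (vb < ve && (*vb == ' ' || *vb == '\t'))
		vb++;
	while (ve > vb && (ve[-1] == ' ' || ve[-1] == '\t'))
		ve--;
	v = (size_t)(ve - vb);
	if (n == 0 || n > 126 || v > 126)
		return 0;

	/* Move the value first: its destination never overlaps the name. */
	memmove(buf + n + 3, vb, v);
	memmove(buf + 2, buf, n);
	for (size_t i = 0; i < n; i++)
		buf[2 + i] = (char)tolower((unsigned char)buf[2 + i]);

	buf[0] = 0x00;        /* literal without indexing, new name */
	buf[1] = (char)n;     /* H = 0: the name is not Huffman-encoded */
	buf[n + 2] = (char)v; /* H = 0: the value is not Huffman-encoded */

	return n + v + 3;
}
```

With short names and values the encoded form takes `n + v + 3` bytes, while the HTTP/1.1 line needs at least `n + v + 3` bytes for the colon and CRLF, so in this sketch the in-place rewrite never runs past the original line. Static or dynamic table indexing and Huffman coding would shrink the result further.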