splithttp transport #3412

mmmray · 2024-06-01T22:29:26Z

how it works:

each transported stream is identified by a random UUID <uuid>
download using long-living chunked-transfer from GET /?session=<uuid>
upload using POST /?session=<uuid>

personal design goals:

see Implement Meek Protocol HTTP CDN #2162 [Potential new transport] Streamed xSSE Meek (HTTP/1.1) #3334 for extended motivation
transport that can pass more CDNs, CDN does not have to support websocket
support of not only CDNs, but anything that can proxy HTTP
someday, in the future: ability to split up/download to different CDN (without dialerProxy/loopback tricks)
simpler than meek

implementation notes, concerns:

putting session ID in URL achieves two goals: cache-busting, and avoiding requirement that the HTTP proxy forwards HTTP headers correctly. this can be debated.
no decoupling of connection lifetime from inner stream lifetime. particularly, CDNs may enforce a total response timeout, which would affect connection quality.

on the other hand, this makes it easier to prevent DoS attacks: the in-memory map for storing sessions is always bounded by the number of physical TCP connections
connection upload can be hijacked without active MITM, by reading the UUID passively and using it from another machine. I don't think this is a practical issue, the answer is always to use TLS in any case.
i noticed a lot of HTTP-ish transports write to the socket directly. However, I need HTTP connection reuse for performance, so I want to use http.Client.
I have a lot of trouble getting http.Client to work correctly with uTLS. as a result, ALPN protocol negotiation is not correctly implemented, and HTTP2 prior knowledge is assumed for https://. For http://, HTTP/1.1 is used anyway.

usage notes:

On server and client:

      "streamSettings": {
        "network": "splithttp",
        "splithttpSettings": {
          "host": "example.com",
          "path": "/"
        }
      }

if the connection hangs, most likely the CDN is buffering the response body of GET. i found that nginx and apache respond to X-Accel-Buffering: no response header and disable buffering.

**how it works:** * each transported stream is identified by a random UUID `<uuid>` * download using long-living chunked-transfer from `GET /?session=<uuid>` * upload using `POST /?session=<uuid>` **personal design goals:** * transport that can pass more CDNs, CDN does not have to support websocket * support of not only CDNs, but anything that can proxy HTTP * someday, in the future: ability to split up/download to different CDN (without dialerProxy/loopback tricks) * simpler than meek **implementation notes, concerns:** * putting session ID in URL achieves two goals: cache-busting, and avoiding requirement that the HTTP proxy forwards HTTP headers correctly. this can be debated. * no decoupling of connection lifetime from inner stream lifetime. particularly, CDNs may enforce a total response timeout, which would affect connection quality. on the other hand, this makes it easier to prevent DoS attacks: the in-memory map for storing sessions is always bounded by the number of physical TCP connections * connection upload can be hijacked **without active MITM**, by reading the UUID passively and using it from another machine. I don't think this is a practical issue, the answer is always to use TLS in any case. * i noticed a lot of HTTP-ish transports write to the socket directly. However, I need HTTP connection reuse for performance, so I want to use http.Client. * I have a lot of trouble getting `http.Client` to work correctly with uTLS. as a result, ALPN protocol negotiation is not correctly implemented, and HTTP2 prior knowledge is assumed for https://. For http://, HTTP/1.1 is used anyway. **usage notes:** On server and client: ``` "streamSettings": { "network": "splithttp", "splithttpSettings": { "host": "example.com", "path": "/" } } ``` * if the connection hangs, most likely the CDN is buffering the response body of `GET`. i found that nginx and apache respond to `X-Accel-Buffering: no` response header and disable buffering.

Add test file

Fangliding · 2024-06-02T20:06:18Z

How many tested CDNs?

irgfw · 2024-06-02T20:08:47Z

In theory, this could solve many problems in Iran.

mmmray · 2024-06-02T20:12:06Z

How many tested CDNs?

I have tested two of the major ones + [redacted]

I will check the tests you added

Fangliding · 2024-06-02T20:15:16Z

@mmmray Don't mind about the test fail. It didn't happen in new test file, it's sometimes like this, not your problem((((

mmmray · 2024-06-02T20:56:01Z

docs: XTLS/Xray-docs-next#514

mmmray · 2024-06-03T00:25:34Z

it seems that the tests actually fail because they attempt to use the same ports as the websocket tests, while running concurrently with them

testing.servers.tcp.port.PickPort appears to be the correct solution (used in http2 transport), but the implementation is broken for me on Ubuntu. It is missing SO_REUSEPORT, but the implementation of that is platform-specific.

I will hardcode different port numbers, but I wonder if pulling in another dependency go-reuseport is ok for testing?

yuhan6665 · 2024-06-03T00:26:17Z

Thanks for your great work @mmmray !
@Fangliding I've been improving xray's test for a long time. I'm 99.9% sure the test failure are caused by your test. I notice sometimes Test_listenSHAndDial fail and sometimes Test_listenWSAndDial fail. Maybe they share something in common and it is due to run them in parallel? Please help to fix it
@RPRX please take a look before we can merge

yuhan6665 · 2024-06-03T00:31:33Z

lol that is fast @mmmray

yuhan6665 · 2024-06-03T00:45:07Z

it seems that the tests actually fail because they attempt to use the same ports as the websocket tests, while running concurrently with them

testing.servers.tcp.port.PickPort appears to be the correct solution (used in http2 transport), but the implementation is broken for me on Ubuntu. It is missing SO_REUSEPORT, but the implementation of that is platform-specific.

I will hardcode different port numbers, but I wonder if pulling in another dependency go-reuseport is ok for testing?

The idea of PickPort() is that it tries to listen a port, and then quickly release self and pass this port number to the user. It is a simple yet proven solution. I highly recommend it over magic like SO_REUSEPORT ;)

mmmray · 2024-06-03T00:50:54Z

My bad. i already tried it before and saw some "address already in use" errors in logs. I can't reproduce them anymore, my code must have been faulty.

RPRX · 2024-06-03T02:25:41Z

非常感谢 PR！我觉得实现这样的传输层，最大的挑战在于 客户端的请求不一定会按顺序到达服务端，有针对这种情况做处理吗？

RPRX · 2024-06-03T02:29:12Z

另外目前对于 h2 的情况是一个连接占一条 h2 还是多个连接复用同一条 h2

mmmray · 2024-06-03T02:44:14Z

may not necessarily arrive at the server in order . Is there any handling for this situation?

the start of one HTTP request is hard-blocked on the successful 200 OK response of the previous data chunk. basically, there is a single for { .. } goroutine in dialer.go that sends http requests synchronously.

is one connection occupying one h2 or multiple connections reuse the same h2?

ah, currently there is one entire http.Client per connection. this could be improved, but I am not sure if there is another way than global sync.Map? I think the http.Client would have to be created per-outbound somehow.

RPRX · 2024-06-03T03:02:14Z

may not necessarily arrive at the server in order . Is there any handling for this situation?

the start of one HTTP request is hard-blocked on the successful 200 OK response of the previous data chunk. basically, there is a single for { .. } goroutine in dialer.go that sends http requests synchronously.

这样处理比较简单，但效率非常低，且无法应对实时性要求比较高的应用，实用性会打折扣，试试实现一个序号与服务端缓存机制

（不过对于 h1 好像只能是现在这样）

is one connection occupying one h2 or multiple connections reuse the same h2?

ah, currently there is one entire http.Client per connection. this could be improved, but I am not sure if there is another way than global sync.Map? I think the http.Client would have to be created per-outbound somehow.

目前是你说的那种处理方式的话确实无法多个连接复用同一条 h2，换成我说的那种方式的话可以，代码参考 h2 transport

RPRX · 2024-06-03T03:05:43Z

比如说你可以让同一个连接用一个共同的 UUID 再加上序号，0 代表下行，1 2 3 4 5... 代表上行，一直发，不用等服务端返回 200

mmmray · 2024-06-03T03:28:13Z

Reducing the bandwidth overhead is already attempted by batching fast sequential conn.Write calls into fewer, larger HTTP requests. However, there is a (hardcoded) limit to how large those HTTP requests can be, and it forces a tradeoff between bandwidth and latency. If I understood you right, the concurrent uploads could help with this, so I will try it.

I find the upload speedtest to be okay, but it is definitely lower than websocket.

For the code, please refer to h2 transport.

Ah, I see. I'm new to Go, so I didn't really understand how things like this can be used as map keys. I will just copy it.

RPRX · 2024-06-03T04:05:39Z

Reducing the bandwidth overhead

不是为了这件事，而是客户端要等服务端回复后再发第二个包的话，会造成高延迟和低带宽，所以 TCP 不会等 ACK 再发第二个包

RPRX · 2024-06-03T04:09:22Z

For the code, please refer to h2 transport.

Ah, I see. I'm new to Go, so I didn't really understand how things like this can be used as map keys. I will just copy it.

好像找错地方了，相关代码在这里，查看 globalDialerMap

mmmray · 2024-06-03T04:20:28Z

I have seen globalDialerMap, I am just confused because it can use dialerConf as key, and dialerConf contains *MemoryStreamConfig, which indirectly contains a bunch of interfaces. How can a hashmap use this as a key at all? But it is more of a Golang beginner question, as said I will just copy it.

it seems the answer is, golang does a lot of dynamic type checking to determine equality: https://stackoverflow.com/a/48289238/1544347

RPRX · 2024-06-03T04:31:05Z

type dialerConf struct {
	net.Destination
	*internet.MemoryStreamConfig
}

其实 *internet.MemoryStreamConfig 就是个指针，用于区分不同的出站，这里应该是只对比了指针本身，没有深层对比

比如说两个出站的配置一样，但指针肯定不一样，还是会被认定为两个出站，~~深层对比的话认定为同一个出站就不好了~~

Fangliding · 2024-06-03T04:34:39Z

Thanks for your great work @mmmray ! @Fangliding I've been improving xray's test for a long time. I'm 99.9% sure the test failure are caused by your test. I notice sometimes Test_listenSHAndDial fail and sometimes Test_listenWSAndDial fail. Maybe they share something in common and it is due to run them in parallel? Please help to fix it @RPRX please take a look before we can merge

I have run a single test file several times on my own device without any issues. I forgot that using the same port may cause them to fight 🙈

RPRX · 2024-06-12T04:50:14Z

it uses http.Client which always reads response headers after writing

可以读取但不能阻塞下一个请求，否则需要被改变

i think it is (theoretically) possible that it is perceived as attack by some servers

是有这种可能，还有可能服务端偶尔会漏收一两个数据，所以客户端需要新建协程来等待 response 并重发数据，比如最多 3 次

对于 H2 和 HTTP/1.1 都可以有这个机制，但是会有点占内存，服务端那个基于序号的缓存机制也是如此

RPRX · 2024-06-12T04:55:44Z

不过客户端这个重发机制是否有必要，需要实测，我觉得没被视作攻击时服务端基本上不会漏收数据，而被视作攻击时重发也没用

mmmray · 2024-06-12T04:56:56Z

Can be read but cannot block the next request, otherwise it needs to be changed

it can be tried.. should it be done in this PR? I think in order to implement this, http.Client needs to be removed entirely from HTTP/1.1 mode, it is a bigger change. client.Do does not allow to write and read separately, it does it all at once.

http retransmission is automatically done by http.Client (without extra code), if it is removed, it may need to be reimplemented.

RPRX · 2024-06-12T05:02:15Z

~~我不熟悉 Golang 的这些库，但我觉得对于 HTTP/1.1 应该有方便的可并发且不阻塞的 request 方法吧，不然的确很 inelegant~~

mmmray · 2024-06-12T05:07:10Z

I will give it a try, but am not confident I will succeed quickly. I believe the code complexity will increase by a lot, and I think the code is already too complex.

RPRX · 2024-06-15T01:26:19Z

@yuhan6665 将那些信息转发给我了。我希望减少 HTTP/1.1 的连接数是因为 it's the right way to go，减少连接数不仅更优雅，还可以减少对资源的需求从而可以在更苛刻的条件下工作，当然还可能可以少吸引些来自 GFW 的注意。但如果这件事实在难以在短期内实现且伊朗人急需这个新 transport，我们可以先合并了这个 PR 并发一个新版本，@mmmray 你来决定。

mmmray · 2024-06-16T12:30:23Z

@RPRX the upload optimization is now implemented for HTTP/1.1 specifically. for HTTP/2, the code is unchanged, because it never had lots of connections on upload anyway, and I am a bit afraid of regressions. I think the transport can now be merged.

I think in the future this code needs some cleanup, but I want to defer more work until there is more feedback.

yuhan6665

Looks good to me!

RPRX · 2024-06-18T05:33:28Z

LGTM 虽然我没有看代码但是代码能跑就是好样的这就发版+更新分享链接标准

不过建议你在文档中写一下实现细节方便其它软件实现这个 transport

RPRX · 2024-06-18T05:46:36Z

https://github.com/XTLS/Xray-core/releases/tag/v1.8.15 #716

mmmray · 2024-06-18T07:02:07Z

thanks rprx for releasing!

thanks rprx and yuhan for reviewing and contributing!

thanks fangliding for hotfixing docs!!

EDIT:
https://xtls.github.io/en/config/transports/splithttp.html#protocol-details

for implementors (but maybe also for people who need to debug their CDN)

can be moved to developer documentation, up to you

RPRX · 2024-06-20T01:17:45Z

看了 @mmmray 的文档和 @yuhan6665 的 release note 之后，我把新版标注为了 pre-release，有几个地方：

据文档的“协议细节”所述，客户端会等到 GET 返回 200 后才发送上行数据，这会徒增 1-RTT，此处需要修改为直接开始发
“CDN必须禁用缓存, 或者缓存不忽略查询字符串”，这里需要一个 breaking change，把所有 ? 都改为路径，即 POST prefix/UUID/0，这样的话 CDN 天然不会认为是同一个路径从而不会缓存，就是说全用路径而不要用任何传参
文档中关于 ALPN 的“已知问题”是过期描述，现在就是这么设计的，应该移到“协议细节”
文档中关于效率的问题，下行效率肯定拉满，需要标注，只是上行效率可能不如 WebSocket，但也不会差太多。@yuhan6665 提到了 meek 和 v2fly 的实现，就是说 SplitHTTP 和 meek 除了都能穿透不支持 WebSocket 的 CDN 外有什么很大的共同点吗？SplitHTTP 是基于我提出的实现方式，实现细节也是我在这里说的，我没有仔细研究过 meek 且我相信 @mmmray 的实现没有参考过半点 v2fly 的 meek 代码，就是说可以给 credit 但不要乱给 credit，此外文档需要标注至少下行效率比 meek 更高
文档中有两处 WebSocket 需要改为 SplitHTTP

以上。

mmmray · 2024-06-20T01:49:09Z

that's unfortunate. I think those issues should be caught in code review. Or maybe I should have updated the documentation more actively while the PR was still open, I am not sure.

According to the "Protocol Details" of the document, the client will wait until GET returns 200 before sending upstream data. This will increase 1-RTT. This needs to be modified to start sending directly.

The RTT issue is true. This was done intentionally to prevent somebody from opening arbitrary amount of session IDs and never closing them (similar to http2 rapid reset attack). This issue can be solved two ways:

put TTL and a total limit on open sessions, then POST requests can be sent before GET without DoS concerns
initial GET request contains early data, like websocket, but GET must still come before POST. It's not clear how this would work with the "no use of headers" constraint though.

which approach is preferred?

"The CDN must disable caching, or the cache does not ignore the query string", a breaking change is needed here to change all ?All are changed to paths, that is, POST prefix/UUID/0, in this case the CDN will naturally not think it is the same path and will not cache it. That is to say, use the entire path without any parameters.

The initial prototype was this way, but moved it to querystring for consistency with other transports (because no other transport uses subpaths for anything). I was not sure if subpaths would be more difficult for some users to route correctly. I only encountered one CDN that needed explicit configuration. ~~Perhaps the documentation is too intimidating, pessimistic and preemptively mentions too many potential problems. Should it be moved to a Troubleshooting section?~~

Regarding efficiency issues in the document, the downlink efficiency is definitely full and needs to be marked.

I hear upload is much worse than websocket in Iran. Given throttling behavior of Iran GFW based on detected features, I did not want to make definitive statements about performance just yet.

The implementation does not refer to the meek code of v2fly

This is true, but I tried v2fly meek as a user to make my own opinion of what protocol is viable. I think it's ok to give credit to research even if this design and code is different.

In general, what do you think about fixing these issues in later releases with more settings? For example, a setting to change from querystring to paths, and a setting to opt into 0-RTT could be added in 1.8.16 if desired. It is currently not clear to me whether those issues are important, and more feedback from Iran is really needed. They are currently just waiting for mobile clients to update. Currently it seems the biggest issue is upload bandwidth.

RPRX · 2024-06-20T02:15:25Z

我不认为现有的实现能够“prevent somebody from opening arbitrary amount of session IDs and never closing them”，毕竟 GET 也没有任何的鉴权，所以先把这个不必要的 1-RTT 减掉，防止恶意请求的逻辑我们可以放到以后的版本中再实现

此外把传参改成路径即可，还有文档照我说的修改即可（上行效率已经尽可能高了），希望我们能于明天发布下一个版本

mmmray · 2024-06-20T02:28:35Z

What i mean is, in order to have N concurrent sessions, currently the attacker needs to open N concurrent TCP connections (in case of HTTP/1.1) or N h2 streams. Because when the GET is closed, the session ID is also closed.

But if upload can also open sessions, only N POSTs are needed to create N sessions, they do not need to be concurrent and can potentially be performed on a single TCP connection.

The cost is different. Let's say you want 6 million open sessions, currently you have to maintain 6 million TCP connections, it's a lot of resources for the attacker. But 6 million POST requests with different UUID each is feasible.

Anyway, you are right we can fix it later. I will open sessions on either POST or GET, and close when GET is closed.

EDIT: I will still have to implement session TTL to prevent memory leaks. The client does not have to be malicious, it could simply lose internet connection after early POST.

I can prepare the PR and doc fix tomorrow at earliest, I think the deadline may not be met unfortunately.

RPRX · 2024-06-20T02:45:11Z

I believe HTTP/2 is our prior concern, and therefore the cost of attack is very low, considering our current code.

But, of course it's not bad to have "session TTL", and you have enough time to prepare the PR and doc fix. Just ASAP, thanks!

mmmray · 2024-06-20T21:50:13Z

I think everything is addressed in XTLS/Xray-docs-next#516 XTLS/Xray-docs-next#516

except this:

There are two places in the document where WebSocket needs to be changed to SplitHTTP

this is in ZH docs, need somebody else to do it

RPRX · 2024-06-20T23:32:44Z

this is in ZH docs, need somebody else to do it

@Fangliding

Fangliding · 2024-06-21T00:09:25Z

是我瞎了我看英文版本的时候意识到这是ws翻译过来的描述为了行文统一就去复制过来了没想到改漏了地方我自己读了两三遍还没发现

liushengqi000 · 2024-06-30T04:01:16Z

干得好，终于看到tcp拆成多个http的实现了。我只会把流量转成http3，然后把udp包在不同连接上乱发，反正一切都有http3来维护

mmmray and others added 4 commits June 2, 2024 00:28

bypass more buffering

6d34cfc

better naming

dcd2847

Create splithttp_test.go

9a120f9

Add test file

sync map

3af7b2a

fix concurrency issue with splithttp vs websocket tests

3a628c7

try PickPort

27a29b1

remove dead code

344d4f4

Fangliding mentioned this pull request Jun 3, 2024

splithttp XTLS/Xray-docs-next#514

Merged

1 task

mmmray added 2 commits June 15, 2024 15:30

split up download and upload client, to work around keepalive bug

92e30d3

optimized upload for http/1.1

c5281d4

yuhan6665 approved these changes Jun 16, 2024

View reviewed changes

RPRX merged commit c10bd28 into XTLS:main Jun 18, 2024
34 checks passed

mmmray deleted the splithttp-pr branch June 18, 2024 06:59

This was referenced Jun 20, 2024

next iteration on splithttp #3462

Merged

update splithttp docs XTLS/Xray-docs-next#516

Merged

doggie1989 mentioned this pull request Jun 30, 2024

Xray最新Splithttp支持。 mack-a/v2ray-agent#1061

Closed

tqangxl mentioned this pull request Jul 12, 2024

SplitHTTP 支持 MatsuriDayo/NekoBoxForAndroid#713

Closed

RPRX mentioned this pull request Jul 21, 2024

SplitHTTP h3 h2 multiplex controller #3560

Closed

leninalive pushed a commit to amnezia-vpn/amnezia-xray-core that referenced this pull request Oct 29, 2024

Add SplitHTTP Transport (XTLS#3412)

65b7a1a

splithttp transport #3412

splithttp transport #3412

Conversation

mmmray commented Jun 1, 2024

Fangliding commented Jun 2, 2024

irgfw commented Jun 2, 2024

mmmray commented Jun 2, 2024 • edited Loading

Fangliding commented Jun 2, 2024 • edited Loading

mmmray commented Jun 2, 2024

mmmray commented Jun 3, 2024 • edited Loading

yuhan6665 commented Jun 3, 2024

yuhan6665 commented Jun 3, 2024

yuhan6665 commented Jun 3, 2024

mmmray commented Jun 3, 2024

RPRX commented Jun 3, 2024

RPRX commented Jun 3, 2024

mmmray commented Jun 3, 2024 • edited Loading

RPRX commented Jun 3, 2024

RPRX commented Jun 3, 2024

mmmray commented Jun 3, 2024

RPRX commented Jun 3, 2024

RPRX commented Jun 3, 2024

mmmray commented Jun 3, 2024 • edited Loading

RPRX commented Jun 3, 2024

Fangliding commented Jun 3, 2024

RPRX commented Jun 12, 2024

RPRX commented Jun 12, 2024

mmmray commented Jun 12, 2024 • edited Loading

RPRX commented Jun 12, 2024

mmmray commented Jun 12, 2024

RPRX commented Jun 15, 2024

mmmray commented Jun 16, 2024

yuhan6665 left a comment

Choose a reason for hiding this comment

RPRX commented Jun 18, 2024

RPRX commented Jun 18, 2024

mmmray commented Jun 18, 2024 • edited Loading

RPRX commented Jun 20, 2024

mmmray commented Jun 20, 2024 • edited Loading

RPRX commented Jun 20, 2024

mmmray commented Jun 20, 2024 • edited Loading

RPRX commented Jun 20, 2024

mmmray commented Jun 20, 2024

RPRX commented Jun 20, 2024

Fangliding commented Jun 21, 2024

liushengqi000 commented Jun 30, 2024

mmmray commented Jun 2, 2024 •

edited

Loading

Fangliding commented Jun 2, 2024 •

edited

Loading

mmmray commented Jun 3, 2024 •

edited

Loading

mmmray commented Jun 3, 2024 •

edited

Loading

mmmray commented Jun 3, 2024 •

edited

Loading

mmmray commented Jun 12, 2024 •

edited

Loading

mmmray commented Jun 18, 2024 •

edited

Loading

mmmray commented Jun 20, 2024 •

edited

Loading

mmmray commented Jun 20, 2024 •

edited

Loading