Address change and consent to send #161
This would be the equivalent of an Optimistic ACK attack, after completing the crypto handshake. If the server randomly skips packet numbers, it will notice the attack eventually and close the connection. |
Packet skipping would seem to be a possible answer, though the client only needs to ack a few packets to keep things moving along. If it is eventually unlucky enough to hit a skipped packet the attack ends. I'd need to do the math, but 800k sounds right for an expected value, which is pretty high when you compare it to the cost of a handshake (around 1400 octets and some crypto for clients). |
You could maybe send a ping frame with a random number if you see a new IP address...? |
In the case of port or IP address change, skipping a larger number of packets makes it much harder to guess. |
FWIW, PING doesn't work here. The response to a PING is an ACK, which contains no information from the PING. |
Yes, I meant to actually extend the ping mechanism here to make it possible to reflect a random number chosen by the ping sender. Having this as an explicit check rather than doing some implicit guess-work might make sense given it's simple and the cost is low. |
The idea of a PING packet with a random number and a PONG in response seems reasonable, though if this is the only use case, I think skipping packet numbers is sufficient, given the 5-tuple is changing anyway. |
Having thought about this some more, skipping packet numbers does allow for some fairly high confidence that there is return routeability. Since you are starting a new congestion state for the new path anyway, ACKs will be critical in driving the congestion window open. I do think that we need to be a lot more aggressive in dropping/skipping packet numbers with high probability (> 0.5) for a short while. |
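A minimal sketch of the packet-number-skipping idea discussed above, assuming a hypothetical sender object; the names and the 0.5 skip probability are illustrative, not taken from the draft:

```python
import random

class SkippingSender:
    """Sketch: randomly skip packet numbers so that a spoofed ACK covering a
    number that was never sent reveals an Optimistic-ACK-style attacker."""

    def __init__(self, skip_probability=0.5):
        # The thread suggests skipping aggressively (> 0.5) right after a path change.
        self.skip_probability = skip_probability
        self.next_pn = 0
        self.sent = set()  # packet numbers actually used on this path

    def allocate_packet_number(self):
        # Burn zero or more packet numbers before using the next one.
        while random.random() < self.skip_probability:
            self.next_pn += 1
        pn = self.next_pn
        self.sent.add(pn)
        self.next_pn += 1
        return pn

    def on_ack(self, acked_packet_numbers):
        # An ACK for a packet number that was never sent can only come from a
        # peer that is guessing: treat it as an attack and close the connection.
        for pn in acked_packet_numbers:
            if pn not in self.sent:
                raise ConnectionError("ACK for unsent packet number; closing connection")
```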
There are a number of other protocols that have the same issue, including MPTCP, SCTP, and Mobile IP, and the usual solution is what Mirja proposes: a return routability check before sending packets to the new address. Some references where this has been documented include section 5 of RFC6181 for MPTCP, RFC4225 for MIPv6, RFC4218 for Shim6, and RFC5062 for SCTP. |
There are a few issues here
Alternatively, we can add an optional random number to the PING frame, which when present means that the receiver MUST respond immediately with a PING frame carrying the same random number.
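A rough sketch of that echo mechanism, assuming hypothetical frame objects; the field names and nonce length are placeholders, and a dedicated PONG frame (as discussed further down) would work identically:

```python
import os

def make_ping_with_nonce(nonce_len=8):
    # The sender picks an unguessable value and remembers it for later comparison.
    nonce = os.urandom(nonce_len)
    return {"type": "PING", "data": nonce}, nonce

def handle_ping(frame):
    # A PING carrying a random number demands an immediate echo of that number;
    # an empty PING would just elicit an ACK (not shown here).
    if frame["type"] == "PING" and frame.get("data"):
        return {"type": "PING", "data": frame["data"]}  # echo, per the proposal above
    return None

def echo_matches(frame, expected_nonce):
    # Only a peer that actually received the original PING can produce this.
    return frame is not None and frame.get("data") == expected_nonce
```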
|
@janaiyengar, on point three, could that just be spontaneous multipath? :p |
I really support putting a random number in the PING: that's easier and faster than your step 2 above. |
For the multipath case I would assume that you tell the other beforehand which IP you are going to use... |
I wouldn't make ping include an optional random number, I would add a new explicit return routeability frame (or two). I also wouldn't necessarily make the value random, just hard to guess (source address token lite). |
About the multipath case, it is not always the case that you know the new address. |
In this case the old address goes away which I wouldn't call multipath anymore |
:-) |
Note on "doesn't have to be random". I think the term you want here is "unpredictable" or "unguessable" |
You need two things here:
|
@janaiyengar, you are assuming that a jump in packet number is what will carry the unguessable property. I don't think that's sound. In order for the jump to be valuable, it has to be large, which leads to the packet number encoding not being big enough. The suggestion here, which I think is reasonable, is that you have an unguessable value that is sent to the endpoint that moved. Overloading the packet number for this is nice in theory, and it might be the right answer in the end, but I wouldn't presume that it is the best answer.

Also, you can't just "restore" the cwnd on a new path. That path might look nothing like the previous path. Unless you are also suggesting that your initial cwnd can safely be anything provided that you react properly, you have to allow for the possibility that the new path could be orders of magnitude less capable than the previous path. Starting congestion control all over is probably the only sensible thing you can do. That is, you have a different congestion state for every path used. |
Isn't the following enough to defend against this?
|
As stated earlier (here and here for instance), I don't think that a single packet number gap contains enough entropy and it could in fact be a mistake to rely on it for other reasons. ACK efficiency or resistance to delays and reordering being some concrete reasons against leaving intentional gaps. I would rather we build something more concrete than rely on these implicit mechanisms. As stated on the mailing list, Public Reset and ICMP Unreachable can't or won't necessarily be sent by the entity under attack. |
On cwnd: When I said "restore the cwnd", I meant to the initial cwnd on the new path. Whatever cwnd you use on a new IP, the server cuts the cwnd down to min_cwnd (which is 2 x MSS), and then restores it to the earlier cwnd on validation.

On packet number jumps: Are you saying that a 4-byte packet number is inadequate entropy? If so, then we'll need to revisit Version Negotiation, since a VN packet includes a new server-selected connection ID and the only correlator is the packet number. An attacker could just as easily inject a VN packet with the correct packet number, causing connection establishment to break. (Actually, the problem in VN is worse than here, since here the attacker cannot get a single packet wrong when trying to guess a jump. We haven't nailed down rules for when an endpoint receives a VN with a packet number that does not match any sent packets.) |
I've heard of the idea of serving a single connection from multiple servers, assigning a range of packet numbers to each server. I haven't thought all the details through, but if this is possible, there will be larger gaps (although you wouldn't need a 31-bit packet number space for this). |
Oh yeah. As for triggering an aggressive ACK, I agree that an explicit signal removes the uncertainty, but if the gap needs to be on the order of 2^31, then you can simply step up the ACK schedule if you see a large gap. We could even recommend a minimum gap. But that all assumes that packet number gaps are the right way to validate consent to receive. I don't think that they are. |
If an entity does not send Public Reset or ICMP Unreachable, and our CWND after a "migration" event was low, the CWND would not open up w/o ACKs, so the server (reflector) would not send anything more to the victim. This is the same effect as not getting a PONG back.
You need a lot of entropy only when it is cheap for an attacker to keep guessing. In this case, if your MIN_CWND is, say, 2, and you jump the packet number by rand(min=0, max=1023) (uniformly distributed), then with 99.8% probability the attacker's ACK will ack a packet you never sent, and he will kill this connection. |
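A quick check of that figure under the stated assumptions (MIN_CWND of 2 packets, a uniform jump of rand(0, 1023)):

```python
# Only 2 of the 1024 candidate packet numbers were actually sent, so a blind
# ACK hits a real packet with probability 2/1024 and is caught otherwise.
cwnd_packets = 2
jump_space = 1024
p_detected = 1 - cwnd_packets / jump_space
print(f"{p_detected:.1%}")  # 99.8%
```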
So, to confirm my understanding of this method: we expect a client to change connection ID when it knows it is changing IP address, for the linkability reasons we have rehearsed elsewhere. This method would then get used whenever that new connection ID is seen for the first time, and when a new IP is presented on the old connection ID, because we presume that this is an indication of NAT rebinding. Would a change in source port, re-using the same connection ID and IP, also trigger this? It seems like that presents a pretty small risk of being an attack, which causes me to wonder whether we also ought to exempt other changes that are most likely to be due to either NAT rebinding or IPv6 address selection mechanics. The same v4 /28 or v6 /64 seem very likely to be the same host. Is it worth PINGing there? Or is it not worth the special logic to test the change? |
Are you assuming that you would create a 31-bit jump? The draft currently
suggests 8 bits, which is not sufficient to defend against this attack.
Yes, I was. That's an easy enough thing to change in the draft.
The entropy from a 31-bit jump is actually less than you think. The
attacker can always pretend that packets were lost and guess high. That has
an effect on cwnd which might reduce the effectiveness of the attack if the
guess is wrong, so the wiggle room is probably not that big, but the actual
entropy is in fact smaller.
Not following. If an attacker guesses high or not, when the sender receives
an ACK for a packet it never sent, it MUST terminate the connection. I
don't think the cwnd or pretending that packet loss happened matters here.
|
That said, I'm definitely leaning towards the PING-PONG thing myself. It's simpler to describe, implement, and not get wrong. |
Ted -- your summary is correct. I think we probably should require (MUST) it on an IP address change, but port number changes are probably ok. If we decide to go with a PING-PONG frame (which is where I'm leaning), then the cost of sending this frame isn't that high, so it's fine to be liberal in deciding when to send it. Reducing the cwnd however has costs, and I would be more cautious there. Doing so on IP address changes seems like the safest thing to do. If we wanted to be slightly more performance-sensitive, we could do something like allow /24s without cwnd reduction, but still send a PING-PONG or something like it. |
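A sketch of that kind of heuristic, assuming the /24 and /64 cutoffs floated in this thread (they are examples, not normative values) and hypothetical function names:

```python
import ipaddress

def probably_same_host(old_addr, new_addr, v4_prefix=24, v6_prefix=64):
    """Treat a port-only change, or an IP change within the same small prefix,
    as 'probably the same host/path'. The caller would still send a PING/PONG;
    the heuristic only decides whether to also reduce the congestion window."""
    old_ip = ipaddress.ip_address(old_addr)
    new_ip = ipaddress.ip_address(new_addr)
    if old_ip == new_ip:
        return True  # port-only change, e.g. NAT rebinding
    if old_ip.version != new_ip.version:
        return False
    prefix = v4_prefix if old_ip.version == 4 else v6_prefix
    old_net = ipaddress.ip_network(f"{old_ip}/{prefix}", strict=False)
    return new_ip in old_net

# Example: a rebinding within the same /24 would skip the cwnd reduction.
assert probably_same_host("192.0.2.10", "192.0.2.77") is True
assert probably_same_host("192.0.2.10", "203.0.113.5") is False
```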
IP might not be OK. I think that a CGN will randomize in an unpredictable fashion, but if they don't it would be possible to do a reflection attack on hosts that share a CGN. I agree that's narrow, and probably not very compelling, but nonetheless it suggests that in general we can't consider a change in port alone to be perfectly safe. Probably safe with caveats might be OK, and we might allow a higher initial value for that, but I don't see any real harm in starting the congestion control over again. |
On Sat, May 20, 2017 at 4:08 AM, Martin Thomson ***@***.***> wrote:
IP might not be OK. I think that a CGN will randomize in an unpredictable
fashion, but if they don't it would be possible to do a reflection attack
on hosts that share a CGN. I agree that's narrow, and probably not very
compelling, but nonetheless it suggests that in general we can't consider a
change in port alone to be perfectly safe.
I'm not really following the difference between CGN behavior and NAT
behavior here that is causing you to draw a different conclusion for NAT
rebindings and CGN rebindings. Are you seeing a case where the NAT has
re-bound and the CGN then rebinds as increasing the risk that the return
traffic is not seen?
Probably safe with caveats might be OK, and we might allow a higher
initial value for that, but I don't see any real harm in starting the
congestion control over again.
I think the data we have so far indicates that a port-only change doesn't
require this, and I think it would substantially increase the number of
times you had to restart the controller.
Ted
|
Yes, this is generically applicable to NAT. I'd be interested in that data. You might be seeing less enterprise traffic, so I'd keep that as a caveat. When you have a whitelist for port-only changes, you get some potential issues there. Unless you think that there is no reason why one enterprise user might be motivated to attack another host on the same network. |
On Mon, May 22, 2017 at 5:11 PM, Martin Thomson ***@***.***> wrote:
Unless you think that there is no reason why one enterprise user might be
motivated to attack another host on the same network.
So, the threat model is user X sets up a connection to server Z, receiving
port N of IP xxx.xxx.xxx.xxx.
user X detects that if User Y were to set up a connection, Port K would be
used with the same xxx.xxx.xxx.xxx. Alternatively, port K has been
allocated for some other connection of user Y's.
User X then begins to emit packets with xxx.xxx.xxx.xxx:K as source.
(The firewall and NAT pass these for reasons we cannot ken)
User Y receives the traffic. In case one, it clogs the (physical) port.
In case two, it messes with the established connection.
I'd hazard a guess that firewall and NAT prevention of this is high enough
that keeping the congestion window open while doing a PING/PONG still makes
sense, but I agree that I did not think of this particular threat model.
Ted
|
Yes, if we assume that the attacker's resources are limited to those that exist behind the same NAT/firewall, it's pretty tricky to mount that attack. In the end, I would prefer to have some simple blanket rules, followed by an explanation of some cases in which certain rules might be lifted. Same-port migrations might be one of those cases. |
On Mon, May 22, 2017 at 5:29 PM, Martin Thomson ***@***.***> wrote:
Yes, if we assume that the attacker's resources are limited to those that
exist behind the same NAT/firewall, it's pretty tricky to mount that attack.
In the end, I would prefer to have some simple blanket rules, followed by
an explanation of some cases in which certain rules might be lifted.
Same-port migrations might be one of those cases.
I think the simplest blanket rules are:
1) If there has been an IP migration, reset the congestion window until a
PING returns a PONG.
2) If there has been a port migration, send a PING and reset the congestion
window if the PONG does not return within 2x previous RTT.
While I use PING/PONG here, any of the methods discussed (packet skipping,
new return routability frame) have the same semantics; I'm more concerned
that we get simple blanket rules that don't have us reset the state when we
don't need to.
regards,
Ted
|
If the PONG doesn't come back, I would say that the path is unusable. I can see the case for restoration in the case of a port-only change, on the assumption that the path between the remote endpoint and the NAT is unlikely to change significantly in terms of characteristics. But an IP change could be onto a path with substantially different characteristics.

Here's my proposal: If the IP address or port of a peer changes, reset the congestion window and initiate <insert path validation method here>. Once the validation completes, if the IP address of the peer did not change, the congestion window MAY be increased to the value it had prior to migration. If the path validation method fails completely after <some time>, mark the path as unusable, which might cause the connection to fail.

Question: do we have to consider the possibility that the prior congestion window might decay due to inactivity and account for that here, or is it better to continue to run the old controller, feed all the new signals into it and switch it in? |
On Tue, May 23, 2017 at 6:05 PM, Martin Thomson ***@***.***> wrote:
If the PONG doesn't come back, I would say that the path is unusable.
Agreed.
I can see the case for restoration in the case of a port-only change, on
the assumption that the path between the remote endpoint and the NAT is
unlikely to change significantly in terms of characteristics. But an IP
change could be onto a path with substantially different characteristics.
Here's my proposal:
If the IP address or port of a peer changes, reset the congestion window
and initiate <insert path validation method here>. Once the validation
completes, if the IP address of the peer did not change, the congestion
window MAY be increased to the value it had prior to migration. If the path
validation method fails completely after <some time>, mark the path as
unusable, which might cause the connection to fail.
So, there are two types of path changes which could happen here. One is a
change in origin of the path, and an IP change would typically signal that
by being in a different address range. So you'd have rough heuristics that
said if the new IP is in the same /N that it should be treated as likely
using the same path but that if it is in a substantially different CIDR
block it is likely using a different path. The other change is one in
which some path elements are using functions like ECMP to share load across
links. In those cases, a shift in port may also result in a shift in path
because the source port is used in the load sharing function. While those
functions are generally aimed at keeping the load (and thus congestion) on
the relevant links in balance, there is no way to guarantee that they are.
As a result, what we do here depends a lot on what we're trying to
optimize. If we want to minimize the loss of congestion control state, I
think we end up with something like:
If the source IP address or port of a peer changes, reset the congestion
window and initiate path validation. If the path validation method fails
after <some time>, mark the path as unusable. If the path validation
completes, the congestion window MAY be increased to the value it had prior
to migration; this increase should occur only when it is likely that the
change in IP address or port did not result in a path change. A change in both
port and IP should always be taken as evidence of a path change.
If we are trying to minimize the chance that a congestion window gets
re-opened inappropriately, I think we get something like this:
If the source IP address or port of a peer changes, reset the congestion
window and initiate path validation. If the path validation method fails
after <some time>, mark the path as unusable. If the path validation
completes, the congestion window MAY be increased to the value it had prior
to migration if only the port has changed and there is no other evidence of
a change in path.
Question: do we have to consider the possibility that the prior congestion
window might decay due to inactivity and account for that here, or is it
better to continue to run the old controller, feed all the new signals into
it and switch it in?
In the latter case, you mean that you'd feed the signals into both the new
controller created on IP/port change and the controller that existed before
the change, swapping the latter in when the path validation succeeded?
Ted
|
I guess that I'm in the "minimize the chance that a congestion window gets re-opened inappropriately" camp, absent more evidence that doing so wouldn't be too bad.
Yes. There might need to be some tweaks to that if we consider the possibility that the previous path could be reused, but that's essentially it. |
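A sketch of that "feed both controllers" idea, assuming a generic congestion controller with on_ack/on_loss hooks; all names here are illustrative:

```python
class MigrationCongestionState:
    """Run the pre-migration controller and a fresh one in parallel while the
    new path is being validated, so the old state stays current rather than
    decaying, and can be swapped back in if restoration is justified."""

    def __init__(self, old_controller, new_controller):
        self.old = old_controller   # controller from before the migration
        self.new = new_controller   # fresh, conservative controller for the unvalidated path
        self.active = self.new

    def on_ack(self, acked_bytes, rtt):
        # Every signal goes to both controllers until validation settles.
        self.old.on_ack(acked_bytes, rtt)
        self.new.on_ack(acked_bytes, rtt)

    def on_loss(self, lost_bytes):
        self.old.on_loss(lost_bytes)
        self.new.on_loss(lost_bytes)

    def on_path_validated(self, ip_changed):
        # Per the discussion above: only swap the old controller back in when
        # the path probably did not change (e.g. a port-only rebinding).
        if not ip_changed:
            self.active = self.old

    def congestion_window(self):
        return self.active.congestion_window()
```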
Would there be a concern related to address change forced by overcrowded mobile networks at conferences, festivals, and disasters?
|
I think "the congestion window MAY be increased to the value it had prior Another question: do we need to decrease to the minimal congestion window or the initial congestion window? |
Based on the discussion at the interim, I think that we have a different model in mind now:

If there is a path change, reset the congestion controller. Absent other information, assume that any change in remote address is indicative of a path change. You might consider a port-only change to be indicative of hitting the same path. I don't think that this is a universally safe assumption, because it assumes a great deal about what exists on the other side of the NAT, but we might simply rely on the congestion controller being properly responsive to deal with cases where it genuinely isn't a new path.

If there is a change in the remote address, make sure to obtain proof that the other endpoint can receive your packets at their new address. Until this is successful, reduce the number of packets you are willing to send on that path to <some low rate that might be incidentally related to the initial congestion window>. If this validation fails, terminate the connection.

For the purposes of validation, we will define a new frame that has a randomized payload. An endpoint would be required to send a copy of that payload back in another frame. An endpoint can initiate this validation process at any time and for any reason (though it might have to deal with the resulting ENHANCE_YOUR_CALM if it sends it too much).

Note that I said to terminate the connection on this last bit because if validation of the path fails, either you have such a terrible loss issue that you don't want to be using the path, or the other side is spoofing this new source address. |
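A sketch of that model as endpoint logic; the frame name, the unvalidated packet limit, and the timeout are placeholders invented for illustration, not values from the draft:

```python
import os
import time

class PathState:
    UNVALIDATED_PACKET_LIMIT = 3   # assumed low cap until the peer proves receipt
    VALIDATION_TIMEOUT = 3.0       # seconds, illustrative only

    def __init__(self, remote_addr, congestion_controller):
        self.remote_addr = remote_addr
        self.cc = congestion_controller
        self.validated = True      # the handshake already validated the original path
        self.challenge = None
        self.challenge_sent_at = None
        self.sent_unvalidated = 0

    def on_remote_address_change(self, new_addr, fresh_cc):
        # Any change in remote address is treated as a path change: reset the
        # congestion controller and demand proof of receipt at the new address.
        self.remote_addr = new_addr
        self.cc = fresh_cc
        self.validated = False
        self.challenge = os.urandom(8)   # unguessable payload the peer must echo
        self.challenge_sent_at = time.monotonic()
        self.sent_unvalidated = 0
        return {"type": "VALIDATE", "data": self.challenge}   # hypothetical frame

    def may_send(self):
        # Until validation succeeds, hold sending on this path to a trickle.
        return self.validated or self.sent_unvalidated < self.UNVALIDATED_PACKET_LIMIT

    def on_packet_sent(self):
        if not self.validated:
            self.sent_unvalidated += 1

    def on_validation_response(self, data):
        if self.challenge is not None and data == self.challenge:
            self.validated = True

    def check_validation_timeout(self):
        # If the path can't be validated, either it is unusably lossy or the
        # new source address is being spoofed: terminate the connection.
        if not self.validated and self.challenge_sent_at is not None:
            if time.monotonic() - self.challenge_sent_at > self.VALIDATION_TIMEOUT:
                raise ConnectionError("path validation failed; terminating connection")
```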
This has been much-discussed, and it's a relatively isolated change, so I did it. This modifies PING to have an optional payload and adds a PONG frame to echo the PING. An empty PING generates an ACK; a PING with a payload demands a PONG. Generating an unguessable PING is the basis of mid-connection address validation. If the PING is sent on the new path, and the PONG comes back, then the remote address is probably OK to use. I've taken the discussion in the issue into consideration here. There's a lot of potential nuance to capture in terms of how an endpoint might reduce and restore send rates, but I've done what I can to thread the gap between allowing unbounded sending along new and untested paths and allowing connections to get back to doing business. It's annoying that this makes PING and PONG so disparate. I think that we have a re-ordering of frames in our near future to correct minor infidelities like this. I didn't want to do that here and pollute this PR though. Closes #161.
When a client changes IP addresses (for any reason, including NAT rebinding), the server has no way of gaining an assurance that the client has actually seen its packets. That opens an attack where the attacker establishes a connection, then starts spoofing source address. If the server starts to send to that source address without verifying that the client can receive packets, that's potentially a powerful packet amplification attack.
It's not sufficient to look for ACKs or other such things, because no client message includes proof that the client has actually received a packet from the server. A client can just generate ACK and WINDOW_UPDATE frames as necessary, even new requests. The only risk a client runs is that it overestimates the number of packets that the server has sent and sends too many spurious ACK frames, which the server will then recognize and react to.