-
Notifications
You must be signed in to change notification settings - Fork 52
Temporary ttl=0
returned on edge case by Route 53
#51
Comments
Root cause of the issue is that the Amazon Route 53 nameserver appears to round/truncate the remaining ttl for a query to 0. This is an Amazon bug since the nameserver should never report Since the amazon nameserver reports the "remaining" ttl it has in its own cache, the DNS client will automatically "zoom in" on the edge where the record expires. This is especially true when a system is under load, in a way that in every possible fraction of a second a dns request is done. So when the local cache (our own) expires, we immediately fire a query, and automatically are hitting the edge on the nameserver. Since the timing has to be precise we're not always hitting the edge, but the user reported it happening every 2 minutes. So it seems that when the remaining time calculated by the nameserver is close to 0 (but not exactly since it would invalidate its own cache in that case). It rounds, or truncates (integer cast maybe) the value to 0. This causes it to report ttl=0 for a non-0 ttl record. This in turn causes the loadbalancer to switch behaviour to do dns queries on every request, which shouldn't happen. |
See Kong/kong#3641 (comment)
for a possible fix. The problem causes the Balancer to temporarily switch from the proper record (A) to a vritual SRV one, because
ttl=0
is detected.The text was updated successfully, but these errors were encountered: