-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core) embedded dns resolver #1225
Conversation
…the balancer_by_lua implementation.
return {host = result.host, port = result.port} | ||
end | ||
|
||
-- Query for A record |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an SRV record is supposed to start with _
, which is an illegal character for A records iirc. So instead of querying them both successively, check for the underscore and query either A or SRV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an SRV record is supposed to start with _
I am not aware of this convention. In all the Consul examples an SRV record looks like an A record. Also in my tests SRV records were just returning port information, not IP information, so querying both is required.
Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See RFC2782. But apparently consul has multiple ways of looking up the same service; 'Standard lookup' and 'RFC 2782 lookup', see https://www.consul.io/docs/agent/dns.html
The way I understand it if you define a service in Consul it constructs multiple ways to retrieve the same service info.
It will be known by this name; [tag.]<service>.service[.datacenter].<domain>
as well as; _<service>._<protocol>.service[.datacenter][.domain]
They return both a port as well as a hostname/ip. So in case of a hostname, a second lookup may be required.
try; dig srvtest.thijsschreijer.nl SRV
(not Consul, but regular DNS)
As a general comment; can we cache the dns resolver objects? this creates a new dns resolver object for every request (assuming ttl = 0). Can't we reuse them. Less setting up and tearing down, and less GC cycles. |
Yes, we should - but when trying to cache them though the system was returning errors after a few queries. In the resty-dns examples the resolver is initialized every time too: https://github.com/openresty/lua-resty-dns#synopsis The TTL is handled differently, since the resolutions are stored in the shared dict and not in the object itself. |
Did you provide a locking mechanism? you must make sure (due to the co-sockets nature) that every DNS resolver object only handles a single complete cycle at a time. Before reusing the object for the next query. |
That didn't really concern me a lot since the base example of resty-dns creates a new resolver anyways, which may be the intended use, or not have any particular performance impact. It's an open question. Locking would slow down resolution across the workers on high throughputs, maybe more than initializing a new resolver. I will benchmark this branch to figure it out. |
Locking is rather simple. Something like this; Create two lists at module level; The problem is that when a request is sent, the coroutine/light-thread is yielded by openresty. Now if a second request comes in and uses the same sockets before the response is returned, it goes awry. Cosocket code is written as sync, but really is async. Yielding is implicit whenever you do a socket operation. |
My point is that we may not need locking in the first place, because creating the resolver every time doesn't really have any performance slowdowns (which seems to be the intended use in the resty-dns examples I linked above). At a closer look when the library is initialized it creates a new socket for every request (at https://github.com/openresty/lua-resty-dns/blob/master/lib/resty/dns/resolver.lua#L112), which is a non-blocking operation, and then sets a timeout on the socket (https://github.com/openresty/lua-resty-dns/blob/master/lib/resty/dns/resolver.lua#L133) in order to enforce the DNS resolution timeout. Your system seems more like a pool of DNS resolvers which while may be doing the work, adds additional complexity. I would first benchmark the performance of the current system, and then take an action depending if any performance loss is obtained. In other words, let's make it work, let's run it, and see :) |
we should test this with other Service discovery tools other than consul. Like zookeeper. |
I am aiming at supporting Consul and etcd - we could always keep the |
Just updated my PR, hope that it got 🈴 soon~, then you guys can finishes this feature~~ |
Hey, i built this branch to try it, but i have some problem with a service configuration. How do you set the query to do to the dns discovery service ? |
|
||
-- Query for SRV record | ||
local final_port = port | ||
if enable_srv then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that on the enable_srv he didn't load corrrectly the configuration, I put the srv: true on the yaml configuration file but it is always nil
So when im querying consul, i never got the port of the service. I made it to work by adding a realy bad fix.
On resolver.lua
if dns_config.srv == nil then
dns_config.srv = true
end
before the call of the
dns:resolve(parsed_url.host, parsed_url.port, dns_config.srv)
If we stick to the standards, they all should work. Specific optimization can be done later.
It's not so much in resolving where the slowdown is, but in the GC. Everything created also needs to be cleaned up. Anyway, currently I don't think they can be reused, see openresty/lua-resty-dns#13 |
So some more issues to deal with;
|
https://github.com/docker/libkv kv abstraction library with support for consul, etcd and zookeeper. Might not fit the use case / technology stack but thought it worth mentioning |
Whats the eta on this? We publish a microservice container endpoint info into consul via registrator. Will this PR cover the use-case where kong could leverage that named service info in consul to dynamically route all requests to the correct backends as containers come up/down? |
it will not make it in the upcoming 0.9. But is scheduled for the next one. Presumably 0.10 if all goes well 4 weeks after 0.9 release (as 2 weeks for an RC, 2 weeks later final). |
@bitsofinfo just so you know, you can currently use Consul if you are not leveraging |
This branch is now superseded by #1541. If anyone wants to test the current status, please see #1541 (comment) |
This PR drastically changes how DNS resolutions work.
/etc/hosts
and/etc/resolv.d
files. During resolutions, the entries in/etc/hosts
always have the highest priority, with fallback to the configured DNS server.dns_resolvers_available
has been removed, anddns_resolver
has been introduced. By default the system uses the local settings configured at/etc/resolv.d
(using the firstnameserver
and being compliant with anyoptions
specified, if any). The DNS resolver can be manually set by uncommentingdns_resolver
.A
andSRV
resolutions are supported.SRV
resolutions are enabled by default and can be disabled to gain some performance in scenarios when it's not required, setting the propertysrv: false
intodns_resolver
.balancer_by_lua
in the configuration file.We can wait
refactor/cli
to be merged intonext
before merging this, as you like.The PR is almost complete: