-
Notifications
You must be signed in to change notification settings - Fork 712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dedupe probe POSTs #485
Dedupe probe POSTs #485
Conversation
71460bd
to
b57965c
Compare
@tomwilkie PTAL |
} | ||
r.set(peer.hostname, endpoints) |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
If I read this right, you're doing the get for /api on a single goroutine (the resolver goroutine), and you're doing them every 1 min. These gets will block for ~30s for endpoints that don't exist; I'm worried that when scope is 'joining' a large cluster with multiple unreachable endpoints, it will take a long time (many minutes) for the data to show up in all the apps. I suggest we have a cache of ip->id (which expires its entire after some longer timeout, say 5mins) and we do these lookups in parallel, on background goroutines. |
Will add a pool of goroutines for lookups, but I'm not sure adding a cache makes sense, I'm not sure it will make a significant difference to the behavior and it's an extra layer of complexity. I'll sketch it out and see how it feels. |
@tomwilkie OK, that's a bit of a refactor, but it's nice and straightforward (and tested) :) |
result := map[target][]string{} | ||
for _, t := range targets { | ||
c := make(chan []string) | ||
go func(t target) { s.p(); defer s.v(); c <- resolveOne(t) }(t) |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
@peterbourgon do you plan on doing the /api lookups on separate goroutines? You seem to have moved the dns lookups on separate goroutines (but they max out at 5s), but the /api lookups could take 30s and are all on one goroutine? |
(sigh) When you said "lookup" I interpreted it as "DNS lookup" and not "GET /api". I'll change it. Why do you think GET /api will block for 30 seconds? |
The http calls could block for 30s if, say, the host is unreachable. Which is why we did all the weave integration, so you could deploy scope on a cluster which spans two datacenters - where ips from one aren't reachable from the other. |
- Set instead of Add, to allow replacement of endpoints - Break out individual Publishers to their own files and tests
1560adf
to
eccc74a
Compare
- Process DNS resolution serially - Process up to 10 HTTP GET (for app ID) concurrently More than 10 concurrent GET requests will block on the semaphore. This will cause the staticResolver.resolve method to block, which is probably fine: it will just delay the next resolve loop, currently at 1m intervals. To make this a little bit more robust, I've also added a fastClient for app ID resolution, with a timeout (total, including connect, request, and response) of 5s.
@tomwilkie 8602132 should do it. I hope. |
type Publisher interface { | ||
Publish(*bytes.Buffer) error | ||
Publish(io.Reader) error |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
Have tested and seems to work well. |
Agh, should have squashed. Sorry. Not enough coffee yet. |
Addresses #463