Consider doing async reverse dns lookups in probe, breaking out the internet node into something more interesting. #364

Closed
tomwilkie opened this issue Aug 17, 2015 · 14 comments

@tomwilkie
Contributor

Especially when running apps in the cloud, it would be nice to show which components are calling out to which services (S3, SES, etc.).

@peterbourgon
Contributor

Ooh, I like it: we could just tag remote nodes with a variety of whois-y lookup data...

"123.45.6.78;80": {
    "country": "UK",
    "city":    "Birmingham",
    "owner":   "HP Foods LLC"
}

@awh
Contributor

awh commented Aug 25, 2015

Other metadata types that could be interesting: GeoIP, autonomous system.

@tomwilkie
Contributor Author

Design considerations:

  • we don't want to hammer any backend (DNS) service, so we'll want some caching
  • we don't want the DNS lookups to block the probe's main thread
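
The caching consideration could be sketched as a small TTL cache in front of the resolver. This is only an illustration: the `rdnsCache` type, its method names, and the idea of caching failed lookups as empty answers are assumptions, not anything taken from the Scope code base.

```go
package main

import (
	"sync"
	"time"
)

// entry is one cached reverse-DNS answer. An empty names slice records a
// lookup that found nothing, so the same address is not retried on every
// report.
type entry struct {
	names   []string
	expires time.Time
}

// rdnsCache is a hypothetical TTL cache keyed by IP address string.
type rdnsCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]entry
}

func newRDNSCache(ttl time.Duration) *rdnsCache {
	return &rdnsCache{ttl: ttl, entries: map[string]entry{}}
}

// get returns the cached names for ip and whether an unexpired entry exists.
// Expired entries are removed on access.
func (c *rdnsCache) get(ip string) ([]string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[ip]
	if !ok {
		return nil, false
	}
	if !time.Now().Before(e.expires) {
		delete(c.entries, ip)
		return nil, false
	}
	return e.names, true
}

// put stores names for ip; pass nil to cache a lookup that found nothing.
func (c *rdnsCache) put(ip string, names []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[ip] = entry{names: names, expires: time.Now().Add(c.ttl)}
}
```

A caller would check `get` first and only go to DNS on a miss, storing the result with `put`. Caching empty answers is a design choice: many remote addresses have no PTR record, and without negative caching those would be queried again on every report.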

@peterbourgon
Contributor

For geolocation stuff, it may be better to ship a GeoIP database with every Scope release, rather than pinging some remote service.

@inercia
Contributor

inercia commented Aug 25, 2015

@peterbourgon there are some GeoIP databases publicly available (with some Golang libs). Maybe Scope could download it on the first run and keep it until a new version is available...

@peterbourgon
Contributor

The probes would need to download it, and you may have lots of probes in your infrastructure. If I were an ops guy, I would be very unhappy with a monitoring agent reaching out to the internet and downloading some megabytes from a server when it first boots. I would be very unhappy with it reaching out to the internet at all, actually, unless I explicitly turned some feature on, and understood the implications of doing so.

@inercia
Contributor

inercia commented Aug 25, 2015

@peterbourgon you are right: we should include the GeoIP database in the Scope image. Maybe we could add an option for overriding the embedded database in the future...

@inercia
Contributor

inercia commented Aug 25, 2015

And what about adding this kind of information not in the probe but in the server? The result of geolocating an IP would be the same, and it would be easier to keep an up-to-date database in the server...

@tomwilkie
Contributor Author

The app server might not be running in the same context as the probes, and might not have access to the same dns (for instance).

Let's worry less about GeoIP for now, and more about reverse DNS lookups.

@awh
Contributor

awh commented Aug 25, 2015

> Let's worry less about GeoIP for now, and more about reverse DNS lookups.

Yes sorry about that - I didn't mean to pollute this PR, just capture the conversation we had around it the other day...

inercia self-assigned this Aug 25, 2015

@inercia
Contributor

inercia commented Aug 26, 2015

Some implementation questions:

  • Where should we add this reverse resolution metadata? In the Address/EdgeMetadatas? In Address/NodeMetadatas?
  • To avoid contention, maybe the first time we see an IP we could trigger an async reverse resolution and then add the metadata the next time we generate a report... could this be feasible?

@tomwilkie
Contributor Author

  1. Address/Endpoint NodeMetadatas is the obvious choice, but these don't exist for remote ends of connections. We'll need to have the probes generate the pseudo nodes, which is part of #357 ("Consider gently converging NodeMetadata struct and RenderableNode").
  2. Yes that sounds like a good idea. Don't block the probe goroutine, have it pick it up next time.
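
The "don't block, pick it up next time" approach might look roughly like the sketch below. Everything here is an assumption made for illustration: the `asyncResolver` type, the single background worker, the bounded queue that drops on overflow, and the `Wait` helper. The only real API referenced is `net.LookupAddr` from the Go standard library, used as the default resolver.

```go
package main

import (
	"net"
	"sync"
)

// asyncResolver is a hypothetical helper. The report loop calls Lookup, which
// never waits on DNS: a miss queues the address for a background worker, and
// the answer becomes visible on a later call.
type asyncResolver struct {
	mu      sync.Mutex
	names   map[string][]string // finished lookups, including empty answers
	pending map[string]bool     // addresses queued or in flight
	queue   chan string
	resolve func(ip string) ([]string, error)
	wg      sync.WaitGroup
}

// newAsyncResolver starts the worker. If resolve is nil, net.LookupAddr is used.
func newAsyncResolver(resolve func(string) ([]string, error), queueLen int) *asyncResolver {
	if resolve == nil {
		resolve = net.LookupAddr
	}
	r := &asyncResolver{
		names:   map[string][]string{},
		pending: map[string]bool{},
		queue:   make(chan string, queueLen),
		resolve: resolve,
	}
	go r.worker()
	return r
}

// worker performs lookups one at a time, which also limits the query rate.
func (r *asyncResolver) worker() {
	for ip := range r.queue {
		names, _ := r.resolve(ip) // errors ignored; whatever came back is cached
		r.mu.Lock()
		r.names[ip] = names
		delete(r.pending, ip)
		r.mu.Unlock()
		r.wg.Done()
	}
}

// Lookup returns the cached names for ip, if any. On a miss it queues the
// address (unless it is already pending or the queue is full) and returns
// immediately.
func (r *asyncResolver) Lookup(ip string) ([]string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if names, ok := r.names[ip]; ok {
		return names, true
	}
	if !r.pending[ip] {
		r.wg.Add(1)
		select {
		case r.queue <- ip:
			r.pending[ip] = true
		default: // queue full: drop it, a later report will try again
			r.wg.Done()
		}
	}
	return nil, false
}

// Wait blocks until every queued lookup has finished (mainly useful in tests).
func (r *asyncResolver) Wait() { r.wg.Wait() }
```

With this shape, the first report that sees a new IP gets a miss from `Lookup` and carries no name for it; once the worker has finished, a later report gets the cached answer. This sketch has no expiry, so in practice it would be combined with some form of TTL.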

@inercia
Contributor

inercia commented Aug 27, 2015

Ok, then I'll start by adding new nodes in Address/NodeMetadatas for remote ends of connections iff we have a cached reverse DNS resolution, with the reverse resolution in the "addr" (I'll add these in Addresses as I've seen in the comments in the code that Endpoints need pids, but correct me if I'm wrong)
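
As a rough sketch of that plan, the probe could add a node for a remote address only when a cached reverse resolution exists. The types below are simplified stand-ins rather than Scope's actual report types, and the node ID scheme (the bare IP) and the `addRemoteNodes` function are hypothetical. The "addr" key follows the comment above.

```go
package main

// NodeMetadata and NodeMetadatas are simplified stand-ins for Scope's report
// types; the real definitions may differ.
type NodeMetadata map[string]string
type NodeMetadatas map[string]NodeMetadata

// addRemoteNodes adds one address node per remote IP, but only when a reverse
// resolution is already cached. Uncached IPs are skipped, never looked up here.
func addRemoteNodes(nodes NodeMetadatas, remoteIPs []string, cached map[string]string) {
	for _, ip := range remoteIPs {
		name, ok := cached[ip]
		if !ok {
			continue // no cached answer yet; a later report may include it
		}
		// Using the bare IP as node ID is an assumption for this sketch.
		nodes[ip] = NodeMetadata{"addr": name}
	}
}
```

Because uncached addresses are skipped instead of resolved inline, building the report stays cheap, and remote nodes simply appear once their names are known.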

@peterbourgon
Contributor

Done
