show well known services #1082

Closed
pidster opened this issue Mar 1, 2016 · 6 comments

@pidster
Contributor

pidster commented Mar 1, 2016

Well-known hosts and/or services, such as Amazon RDS and SQS (others TBC), should be displayed as a Host node, separate from 'The Internet' node(s).

@errordeveloper
Contributor

I think we should try an approach which would allow us to discover concrete entities, i.e. use AWS APIs, rather than the current traffic observation mechanisms inside the machines the user owns. One additional benefit of such an approach is that we could tie in things like CloudWatch metrics, and generally extend the horizon. This way we could probably show more information, e.g. when an ELB gave up on routing to an instance due to failing health checks.

@rade rade changed the title Show non-containerised well known hosts separately to the Internet nodes show well known services Jul 4, 2016
@rade rade added this to the July2016 milestone Jul 4, 2016
@rade rade added the feature Indicates that issue is related to new end user functionality label Jul 4, 2016
@rade rade modified the milestones: July2016, August2016 Jul 4, 2016
@rade
Member

rade commented Aug 2, 2016

What are our options here?

Reverse-resolving the IP does not give us any info beyond "it's an aws address":

$ dig +short dynamodb.us-east-1.amazonaws.com
54.239.21.107
$ dig -x 54.239.21.107

; <<>> DiG 9.10.3-P4-Ubuntu <<>> -x 54.239.21.107
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 22840
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;107.21.239.54.in-addr.arpa.    IN  PTR

;; AUTHORITY SECTION:
239.54.in-addr.arpa.    600 IN  SOA dns-external-master.amazon.com. root.amazon.com. 37 3600 900 604800 900

;; Query time: 21 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Aug 02 21:44:39 BST 2016
;; MSG SIZE  rcvd: 126

We could have some probe that repeatedly queries for all the names in all the regions. The trouble is that there are loads of them, and there is some round-robin or other load-balancing going on:

$ dig +short dynamodb.us-east-1.amazonaws.com
54.239.20.120
$ dig +short dynamodb.us-east-1.amazonaws.com
52.94.2.50
$ dig +short dynamodb.us-east-1.amazonaws.com
54.239.20.72
$ dig +short dynamodb.us-east-1.amazonaws.com
54.239.20.128
$ dig +short dynamodb.us-east-1.amazonaws.com
54.239.21.107

It's hard to know how many attempts it would take to extract all the IPs. I suppose it wouldn't take much effort to conduct an experiment.
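
The experiment could be sketched roughly as follows. This is only a minimal illustration, not anything from the probe: `resolve_ipv4` and `collect_addresses` are hypothetical names, and it assumes the system resolver is not serving every lookup from a local cache (a caching resolver would hide the rotation).

```python
import socket
from typing import Callable, Iterable, List, Set


def resolve_ipv4(name: str) -> List[str]:
    """Resolve a name to its IPv4 addresses using the system resolver."""
    _, _, addresses = socket.gethostbyname_ex(name)
    return addresses


def collect_addresses(
    name: str,
    attempts: int,
    resolver: Callable[[str], Iterable[str]] = resolve_ipv4,
) -> List[int]:
    """Return the number of distinct addresses seen after each attempt."""
    seen: Set[str] = set()
    growth: List[int] = []
    for _ in range(attempts):
        seen.update(resolver(name))
        growth.append(len(seen))
    return growth
```

Calling `collect_addresses("dynamodb.us-east-1.amazonaws.com", 100)` and checking whether the returned counts flatten out would give a rough idea of how many queries are needed, though a flat curve does not prove the set is complete.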

We also do not know how often the info changes.

Another alternative is for probes to spy on DNS answers coming into the host, e.g. capture packets on the network interfaces which have a src port of 53 and perhaps some extra filters. Provided the probe is running when applications make their DNS queries, that should allow us to build up a map from looked-up-names to IPs, which we can then use, in reverse, to associate the destination IPs of connections with a service name.

The map may need some expiration for entries, though we can probably get away without that for quite a while. Note that DNS TTLs don't help us here since established connections may last well beyond the TTL. I suppose we could simply expire entries for IPs we haven't seen for a while.
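
A minimal sketch of such a map, assuming the packet capture and DNS parsing happen elsewhere and hand over (name, IPs) pairs. `SnoopedNames` and its methods are hypothetical names, and the last answer wins when two names share an IP, which is a simplification.

```python
import time
from typing import Dict, Iterable, Optional, Tuple


class SnoopedNames:
    """Maps IPs back to the names whose DNS answers contained them."""

    def __init__(self, max_idle_seconds: float) -> None:
        self.max_idle_seconds = max_idle_seconds
        # ip -> (name, time the IP was last seen in an answer or a lookup)
        self._by_ip: Dict[str, Tuple[str, float]] = {}

    def record_answer(
        self, name: str, ips: Iterable[str], now: Optional[float] = None
    ) -> None:
        """Remember that a DNS answer for `name` contained `ips`."""
        now = time.monotonic() if now is None else now
        for ip in ips:
            self._by_ip[ip] = (name, now)

    def name_for(self, ip: str, now: Optional[float] = None) -> Optional[str]:
        """Look up a connection's destination IP and refresh its last-seen time."""
        now = time.monotonic() if now is None else now
        entry = self._by_ip.get(ip)
        if entry is None:
            return None
        name, _ = entry
        self._by_ip[ip] = (name, now)
        return name

    def expire(self, now: Optional[float] = None) -> int:
        """Drop IPs not seen for longer than max_idle_seconds; return the count."""
        now = time.monotonic() if now is None else now
        stale = [
            ip
            for ip, (_, seen) in self._by_ip.items()
            if now - seen > self.max_idle_seconds
        ]
        for ip in stale:
            del self._by_ip[ip]
        return len(stale)
```

Expiry here is keyed on when an IP was last seen rather than on the DNS TTL, so an entry stays alive for as long as connections to that IP keep being looked up.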

What happens when we see resolutions for different names mapping to the same IP? A prime example would be region-less names ("Some services, such as Amazon EC2, let you specify an endpoint that does not include a specific region, for example, https://ec2.amazonaws.com. In that case, AWS routes the endpoint to us-east-1."). I guess we would create nodes that represent the transitive closure of names and IPs reachable by traversing the name/IP relation. This effectively partitions the name/IP relation.
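
One way to compute that partition is a union-find over the observed (name, IP) pairs, so that each resulting group would become one node. This is a sketch under that assumption; `partition` is a hypothetical helper and the addresses in the usage note are documentation placeholders, not real AWS IPs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def partition(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group names and IPs into connected components of the name/IP relation."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Each observed resolution links a name and an IP into the same group.
    for name, ip in pairs:
        root_name, root_ip = find(name), find(ip)
        if root_name != root_ip:
            parent[root_name] = root_ip

    groups: Dict[str, Set[str]] = defaultdict(set)
    for member in list(parent):
        groups[find(member)].add(member)
    return list(groups.values())
```

For example, if `ec2.amazonaws.com` and `ec2.us-east-1.amazonaws.com` both resolved to `192.0.2.1`, and the latter also to `192.0.2.2`, all four strings would land in a single group.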

@tomwilkie
Contributor

I was under the impression this contained all the info we need, but it doesn't seem to:

https://aws.amazon.com/blogs/aws/aws-ip-ranges-json/
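
For reference, a lookup against that file could look roughly like the sketch below. It assumes the published layout of a `prefixes` list whose entries carry `ip_prefix`, `region` and `service`. The entries in `SAMPLE` are invented for illustration (documentation address ranges, not copied from the real file), and `services_for` is a hypothetical helper.

```python
import ipaddress
import json
from typing import List, Tuple

# Invented sample in the assumed shape of ip-ranges.json; not real AWS data.
SAMPLE = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "AMAZON"},
    {"ip_prefix": "198.51.100.0/25", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "203.0.113.0/24", "region": "eu-west-1", "service": "AMAZON"}
  ]
}
""")


def services_for(ip: str, document: dict) -> List[Tuple[str, str]]:
    """Return (service, region) for every listed prefix that contains `ip`."""
    address = ipaddress.ip_address(ip)
    return [
        (prefix["service"], prefix["region"])
        for prefix in document["prefixes"]
        if address in ipaddress.ip_network(prefix["ip_prefix"])
    ]
```

If the real file labels an endpoint's address only with a broad service value, as the sample does, the lookup yields a region and a coarse label but not the specific service, which would be consistent with the file not being enough on its own.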

@rade
Member

rade commented Aug 10, 2016

Interestingly a few AWS IPs do reverse resolve; I can see s3-1-w.amazonaws.com and dynamodb.us-east-1.amazonaws.com in some of our reports. So a stepping stone here would be to simply show all distinct DNS names as separate nodes. We'd probably have to treat "ec2-A-B-C-D.compute-*.amazonaws.com" addresses specially though, i.e. collapse them all into a single "EC2" node.
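
The special-casing could be as small as a regular expression. The sketch below assumes two hostname shapes: the `compute-*` form quoted above and a `<region>.compute` form used in other regions. `node_label` is a hypothetical helper and the patterns may well need widening.

```python
import re

# Assumed shapes for EC2 public hostnames:
#   ec2-A-B-C-D.compute-1.amazonaws.com
#   ec2-A-B-C-D.<region>.compute.amazonaws.com
EC2_HOSTNAME = re.compile(
    r"^ec2-\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\."
    r"(?:compute-[^.]+|[^.]+\.compute)\.amazonaws\.com\.?$"
)


def node_label(hostname: str) -> str:
    """Collapse per-instance EC2 hostnames into one label; keep other names."""
    name = hostname.lower()
    if EC2_HOSTNAME.match(name):
        return "EC2"
    return name.rstrip(".")  # drop the trailing dot of a fully qualified name
```

With this, `node_label("ec2-54-1-2-3.compute-1.amazonaws.com")` yields `"EC2"`, while names such as `dynamodb.us-east-1.amazonaws.com` pass through unchanged and would each get their own node.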

@2opremio 2opremio self-assigned this Sep 13, 2016
@2opremio
Contributor

I can't think of a better alternative than what @rade has proposed, so I am going to go ahead and implement it.

@2opremio
Contributor

> I suppose we could simply expire entries for IPs we haven't seen for a while

I think that this effectively means a classic LRU eviction scheme. I don't think we need to be strict about when exactly to expire the IPs if they are not being used.
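
A classic LRU scheme along those lines could be sketched with an `OrderedDict`. This is only an illustration, not the implementation that went into the probe; `LRUNames` and the fixed `capacity` bound are assumptions.

```python
from collections import OrderedDict
from typing import Optional


class LRUNames:
    """Bounded IP -> name map that evicts the least recently used IP."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def put(self, ip: str, name: str) -> None:
        """Record a resolution and mark the IP as most recently used."""
        self._entries[ip] = name
        self._entries.move_to_end(ip)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def get(self, ip: str) -> Optional[str]:
        """Return the name for an IP and mark it as most recently used."""
        name = self._entries.get(ip)
        if name is not None:
            self._entries.move_to_end(ip)
        return name
```

Eviction only happens when the map is full, so no timer or exact expiry deadline is needed; IPs that are no longer used simply fall out as new ones arrive.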
