Unavailable Replicas When Running in AWS Using DNS Replication #1134

Closed
JosephWitthuhnTR opened this issue Oct 15, 2018 · 4 comments

@JosephWitthuhnTR
Contributor

We are running Eureka, and every replica shows up on the Eureka page under unavailable replicas. Each server lists all three servers, so it successfully finds the others, but all of them are shown as unavailable. Each server also lists itself as a replica (I'm not sure if that is normal).

Configuration:

  • Three EC2 instances. Each with an extra network interface attached.
  • The extra network interface is then referenced from our DNS records.
  • This allows us to take down an instance and replace it, and just attach the network interface to the new instance to allow our Eureka instances to maintain the same IP address.
  • For the sake of my examples below, let's say the EC2 instances are:
    AZ          EC2 Instance  Extra Network Adapter
    eu-west-1a  10.1.1.1      10.1.1.2
    eu-west-1b  10.1.2.1      10.1.2.2
    eu-west-1c  10.1.3.1      10.1.3.2

I believe the DNS record is working fine, since it is able to read the IP addresses of the other Eureka instances without issue.

The replicas show up in "unavailable-replicas" on the Eureka page in this format:
http://10.1.1.2:8080/eureka/v2/,http://10.1.2.2:8080/eureka/v2/,http://10.1.3.2:8080/eureka/v2/,
(where those IP addresses are the correct IP addresses for the extra network interfaces that are referred to by our DNS records)

If I sign into one of the boxes, I am able to curl these URLs, so I do not believe there is a network issue here. Also, application instances are successfully registering with Eureka, including all three instances of Eureka (as I'd expect, it shows them with the EC2 instance IPs of 10.1.1.1, 10.1.2.1, and 10.1.3.1).

We are running on 1.9.5, but also experienced the issue with a much older version, 1.3.1.

Looking in the logs, we are seeing this output every sixty seconds:

2018-10-15 16:02:20,812 ERROR com.netflix.eureka.aws.EIPManager$EIPBindingTask:444 [Eureka-EIPBinder] [run] Could not bind to EIP
java.lang.StringIndexOutOfBoundsException: String index out of range: -4
        at java.lang.String.substring(String.java:1967)
        at com.netflix.eureka.aws.EIPManager.getEIPsFromServiceUrls(EIPManager.java:360)
        at com.netflix.eureka.aws.EIPManager.getEIPsForZoneFromDNS(EIPManager.java:394)
        at com.netflix.eureka.aws.EIPManager.getCandidateEIPs(EIPManager.java:316)
        at com.netflix.eureka.aws.EIPManager.isEIPBound(EIPManager.java:165)
        at com.netflix.eureka.aws.EIPManager$EIPBindingTask.run(EIPManager.java:431)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)

(this is interesting because I don't believe we are even using EIP based configuration, right?)

We also see this every five minutes (with the correct DNS name):
2018-10-15 16:01:35,083 INFO com.netflix.discovery.shared.resolver.aws.ConfigClusterResolver:39 [AsyncResolver-bootstrap-executor-0] [getClusterEndpoints] Resolving eureka endpoints via DNS: txt.eu-west-1.eureka-qa.cloud.ourcompany.com

When we start our Eureka server, we are passing in system properties like this:
-Deureka.environment=qa -Deureka.datacenter=cloud -Deureka.shouldUseDns=true -Deureka.eurekaServer.domainName=eureka-qa.cloud.ourcompany.com -Deureka.eurekaServer.context=eureka/v2 -Deureka.enableSelfPreservation=false -Deureka.datacenter=cloud -Deureka.region=eu-west-1
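
For reference, the TXT record name in the log line above appears to be built from the region and domain name passed in on the command line. A minimal sketch of that relationship, inferred only from the log line and the properties shown here (the class and method names are hypothetical, and this is not Eureka's actual code):

```java
// Hypothetical helper: shows how the name in the log lines up with the properties.
class DnsNameSketch {

    // Assumption: the region-level record is "txt." + region + "." + domainName,
    // which is what the logged name looks like for the properties above.
    static String regionTxtRecord(String region, String domainName) {
        return "txt." + region + "." + domainName;
    }

    public static void main(String[] args) {
        // Values taken from -Deureka.region and -Deureka.eurekaServer.domainName.
        System.out.println(regionTxtRecord("eu-west-1", "eureka-qa.cloud.ourcompany.com"));
    }
}
```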

Are we doing something wrong with our configuration? The StringIndexOutOfBoundsException issue almost looks like a bug with Eureka. We've spent a few days trying to solve this ourselves, but haven't made much progress.

I'd appreciate any help anyone can give!

Thanks,
Joseph Witthuhn

@JosephWitthuhnTR
Contributor Author

Is the issue with lines 353-358 of EIPManager (https://github.com/Netflix/eureka/blob/master/eureka-core/src/main/java/com/netflix/eureka/aws/EIPManager.java#L353)?

The comment indicates that the if check on line 358 is designed to keep us out of that code when "ec2-" isn't in the hostname. But that isn't the effect of the code. If "ec2-" isn't in the hostname, indexOf returns -1, so beginIndex is set to 3 (-1 + 4) and we still go into the if check. As written, I think it is impossible for that check to fail.
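
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is a simplified illustration, not the actual EIPManager source: the class name, method names, and sample hostnames are all made up for this example.

```java
// Simplified illustration of the indexOf + offset pitfall; not the EIPManager code.
class EipParseSketch {

    // Naive version: assumes names look like "ec2-10-1-1-2.<region>.compute...".
    static String naiveExtract(String cname, String region) {
        int beginIndex = cname.indexOf("ec2-") + 4;               // 3 when "ec2-" is absent
        int endIndex = cname.indexOf("." + region + ".compute");  // -1 when the suffix is absent
        if (-1 < beginIndex) {                                     // always true, since beginIndex >= 3
            // substring(3, -1) throws StringIndexOutOfBoundsException
            return cname.substring(beginIndex, endIndex).replace('-', '.');
        }
        return null;
    }

    // Guarded version: check each indexOf result before doing arithmetic on it.
    static String guardedExtract(String cname, String region) {
        int prefix = cname.indexOf("ec2-");
        int end = cname.indexOf("." + region + ".compute");
        if (prefix < 0 || end < prefix + 4) {
            return null; // not an EC2-style public hostname, nothing to extract
        }
        return cname.substring(prefix + 4, end).replace('-', '.');
    }

    public static void main(String[] args) {
        String ec2Style = "ec2-10-1-1-2.eu-west-1.compute.amazonaws.com";
        String plainIp = "10.1.1.2";
        System.out.println(guardedExtract(ec2Style, "eu-west-1"));
        System.out.println(guardedExtract(plainIp, "eu-west-1"));
        try {
            naiveExtract(plainIp, "eu-west-1");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("naive version failed: " + e.getMessage());
        }
    }
}
```

With a begin index of 3 and an end index of -1, the requested length is -4, which matches the "String index out of range: -4" message in the stack trace above (the exact message text depends on the Java version).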

@mgtriffid
Contributor

mgtriffid commented Oct 16, 2018

From what I can see, a replica is displayed as "unavailable" if Eureka cannot find an InstanceInfo for a Eureka server whose hostname matches the one in the configured replica URL. See https://github.com/Netflix/eureka/blob/master/eureka-core/src/main/java/com/netflix/eureka/util/StatusUtil.java#L68 . In your situation each Eureka instance is in the registry under its "EC2" IP, but DNS resolves to a different IP. I think this explains the behavior.
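
A rough sketch of that matching rule, using the example addresses from this issue. This is only a simplified model of the behavior described above, not the StatusUtil implementation; the class and method names are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model: a replica URL is "available" only if its host is in the registry.
class ReplicaStatusSketch {

    static Map<String, Boolean> classify(List<String> replicaUrls, Set<String> registeredHosts) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String url : replicaUrls) {
            String host = URI.create(url).getHost(); // e.g. "10.1.1.2"
            result.put(url, registeredHosts.contains(host));
        }
        return result;
    }

    public static void main(String[] args) {
        // Replica URLs come from DNS and point at the extra network interfaces.
        List<String> replicas = Arrays.asList(
                "http://10.1.1.2:8080/eureka/v2/",
                "http://10.1.2.2:8080/eureka/v2/",
                "http://10.1.3.2:8080/eureka/v2/");
        // The servers registered themselves under their primary EC2 instance IPs.
        Set<String> registered = new HashSet<>(Arrays.asList("10.1.1.1", "10.1.2.1", "10.1.3.1"));

        for (Map.Entry<String, Boolean> e : classify(replicas, registered).entrySet()) {
            System.out.println(e.getKey() + " available=" + e.getValue());
        }
    }
}
```

Under this model none of the three replica hosts appears in the registry, so all three are reported as unavailable, which is consistent with what the Eureka page shows.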

@JosephWitthuhnTR
Contributor Author

Is there a property that I could set that would fix that, and tell the Eureka client to register under the IP address of the network interface that I attached? Is that something that eureka.vipAddress might control?

Also, I submitted a pull request (#1135) for the exception I was seeing. I'm not 100% sure it resolves my particular issue, but I think it is a bug fix that is needed anyhow, right?

@troshko111
Contributor

Can you supply the desired IP through eureka-client/src/main/java/com/netflix/appinfo/providers/EurekaConfigBasedInstanceInfoProvider.java?


3 participants