Periodic failures in kitchen-ec2 #10
Comments
Good question, my money would be on a race condition as well. It's possible that the wait logic is returning just a little too quickly when it sees an open TCP socket. Do you see this using any other drivers? Or possibly even certain AMIs?
I'm seeing the same issue using an Ubuntu 10.04 image (ami-1ab3ce73).
After that I can kitchen login 10 and kitchen converge 10 with no issues.
Ditto here with ami-1ebb2077 (12.04 LTS).
This may help to deal with instances that show an open TCP socket on port 22 but are not yet ready for an SSH client connection. References test-kitchen/kitchen-ec2#10
I'm hopeful that the above commit in Test Kitchen core will help us here. Will be in the next release of both gems.
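To illustrate the difference between "port 22 accepts TCP connections" and "sshd is ready to talk", here is a minimal Ruby sketch. It is not the code from the commit referenced above. The helper names `ssh_banner_ready?` and `wait_for_ssh` are hypothetical, and the sketch assumes the server sends its `SSH-` identification string as the first data on the connection, as OpenSSH typically does.

```ruby
require "socket"

# Hypothetical helper (not part of Test Kitchen): returns true only when the
# remote end sends something that looks like an SSH identification string,
# rather than merely accepting the TCP connection.
def ssh_banner_ready?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) do |sock|
    # Wait up to `timeout` seconds for the server to send data.
    return false unless IO.select([sock], nil, nil, timeout)
    # Assumption: the identification string arrives first, e.g. "SSH-2.0-...".
    sock.readpartial(255).start_with?("SSH-")
  end
rescue SystemCallError, IOError, SocketError
  # Connection refused, reset, timed out, closed early, or host not resolvable.
  false
end

# Hypothetical retry loop: polls until the banner appears or attempts run out.
def wait_for_ssh(host, port, attempts: 10, delay: 1)
  attempts.times do
    return true if ssh_banner_ready?(host, port)
    sleep delay
  end
  false
end
```

A caller could run `wait_for_ssh(hostname, 22)` before the first login attempt. Note that a banner check only shows the daemon is answering; it does not prove that SSH keys have already been installed on the instance.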
So far so good with the updated Test Kitchen. :D Thanks!
We ran into the issue in the initial report of this issue (where the exception message is a username) because of a race condition with the population of SSH keys on our AMI. It's important to verify that cloud-init is configured and working properly because, if it is, key population will take place before the SSH daemon is launched. (Check In our case, on a CentOS AMI, the default cloud-init user was misconfigured and our root SSH key was only being populated by the
Boom this just fixed my tests as well. Thanks @fnichol!
@lancefrench ah, that's pretty insightful and makes a ton of sense if you're baking your own AMIs. @jejohns and @rayrod2030, thanks for confirming! /me hopes we're all good here now. 🍰
@fnichol Which version of test kitchen is this in? I am running
Periodically I get failures starting up EC2 machines, like this:
However, the machine is actually created; if I immediately do "kitchen converge oracle-7-fedora-18", then kitchen successfully logs into the machine and starts converging.
Perhaps there's a race condition in here somewhere? Or kitchen is trying to connect to the SSH port even though it's really not quite ready?