Issues when tests are run inside a container #134

Closed
ostacey opened this issue Apr 21, 2016 · 4 comments

@ostacey

ostacey commented Apr 21, 2016

My organization is running Jenkins build jobs in Docker. That makes configuring Jenkins a lot simpler; there's only one type of build node regardless of what you're building.

However, I've run into some problems when triggering testcontainers from inside Docker.

  1. Issue with getContainerIpAddress(). If you are talking to Docker over a socket, DockerClientConfigUtils assumes the address of the Docker host is "localhost". That's not accurate if the test is running inside a container. (This scenario shows up when you give your container access to Docker by exposing the Docker socket as a volume.)

This issue is fixable. First, you need to detect if the test is running inside Docker. For that, you can look to see if the file "/.dockerenv" exists - inside a container, it is always present. Second, you need to know what IP to provide in that case. Luckily, you don't need the IP of the docker host; the default gateway (usually 172.17.0.1) is normally sufficient. You can get this by running "netstat -nr" and parsing the output. (That may vary with more complex networking scenarios.)
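The two steps above can be sketched roughly as follows. This is only an illustration, not code from testcontainers: the class and method names (`DockerHostGuess`, `insideContainer`, `defaultGateway`, `dockerHostAddress`) are hypothetical, and the parser assumes the column layout that Linux `netstat -nr` prints (Destination, Gateway, Genmask, ...).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper; names are illustrative and not part of testcontainers.
final class DockerHostGuess {

    /** Returns true when the marker file (for example /.dockerenv) exists. */
    static boolean insideContainer(Path marker) {
        return Files.exists(marker);
    }

    /**
     * Scans `netstat -nr` output for the default route (destination 0.0.0.0)
     * and returns the gateway column of that row, if there is one.
     */
    static Optional<String> defaultGateway(String netstatOutput) {
        for (String line : netstatOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Destination Gateway Genmask Flags ... Iface
            if (cols.length >= 2 && cols[0].equals("0.0.0.0")) {
                return Optional.of(cols[1]);
            }
        }
        return Optional.empty();
    }

    /** Falls back to "localhost" outside a container or when no default route is found. */
    static String dockerHostAddress(boolean inContainer, String netstatOutput) {
        if (!inContainer) {
            return "localhost";
        }
        return defaultGateway(netstatOutput).orElse("localhost");
    }
}
```

A caller would pass `Paths.get("/.dockerenv")` to `insideContainer` and the captured output of `netstat -nr` to `dockerHostAddress`. More complex networking setups would need a manual override on top of this.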

If there's interest, and the approach seems palatable, I can assemble a pull request for this.

  2. Issue with the way Docker does port mapping. Take a look at this. I've verified this behavior in Docker 1.10 on OSX docker-machine and on a pure Ubuntu host with Docker 1.9.
# get your docker IP (change for your docker setup)
DOCKER_IP=`docker-machine ip docker-vm`

# run a container that maps port 9999, but doesn't listen on it.
docker run -d --name dummy -p 9999:80 alpine:3.3 /bin/sh -c "sleep 1000"

# try to connect to it; fails and returns right away
nc -v $DOCKER_IP 9999

# try to connect to it from inside another container
docker run --rm -it alpine:3.3 nc -v $DOCKER_IP 9999

The final nc command will connect to the socket and just sit there. That's right, for some reason Docker is accepting connections on the mapped port, even though there's nothing listening. If you try to do anything with the port, you'll get "connection reset by peer" or another error. (I'm not sure if this is intended behavior on the Docker side or an accident; I haven't dug in that deeply.)

This behavior significantly breaks the container startup flow in GenericContainer. There, we consider the container to have started as soon as we can get a socket connection to the port. Sadly, the "new Socket().close()" code there is just as fooled as "nc", so it can proceed to testing with a not-yet-functional container.

Obviously, a Thread.sleep() call at the top of a test is a (bad) workaround for this. The changes in pull request 133 would provide a solution for most users out-of-the-box, and will allow developers who are testing non-http services to write their own wait strategies.
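To illustrate the kind of check that is not fooled by a bare TCP connect, here is a minimal sketch of a readiness probe that polls for an application-level response. It is not the code from the pull request mentioned above; `ReadinessProbe`, `httpOk`, and `waitUntil` are hypothetical names, and treating HTTP 200 as "ready" is an assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the testcontainers wait-strategy API.
final class ReadinessProbe {

    /** Returns true only if the URL answers with HTTP 200 within the timeout. */
    static boolean httpOk(String url, int timeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            try {
                // A connection accepted by the port mapping alone fails here
                // (reset or timeout) instead of being reported as ready.
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return false;
        }
    }

    /** Polls the check until it passes or the timeout elapses. */
    static boolean waitUntil(BooleanSupplier check, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        do {
            if (check.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } while (System.nanoTime() - deadline < 0);
        return false;
    }
}
```

For the example container above, usage might look like `ReadinessProbe.waitUntil(() -> ReadinessProbe.httpOk("http://" + host + ":9999/", 1000), 30_000, 500)`, where `host` is whatever address the test uses to reach the mapped port. A non-HTTP service would supply its own `BooleanSupplier`, for example one that performs a protocol handshake.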

@rnorth
Member

rnorth commented Apr 22, 2016

Hi Oliver
Thanks for this analysis - it's another permutation to consider, but definitely sounds valid :)

Re 1, I'm definitely supportive in principle! My only slight worry is that you might have a fair amount of work to do to cover all the possibilities. I think if we could have a way to 'do the right thing automatically' 80% of the time but allow some kind of manual override for teams who manage to find a more exotic network configuration, that would be fine.

I did a small portion of the work on this for #90, which is actually sitting in a branch and could be picked up if you want (container-ip-override). That approach was purely manual, but if you're willing to look into automating the IP address discovery that'd be a big improvement.

Re 2, the problem sounds interesting - like the docker TCP proxy is leaving its listening socket open after the first attempt at connecting through fails. I think you're quite right that the solution to this is some smarter liveness checks, and #133 seems like the right way to do this. I'll work with @outofcoffee to get this PR in as soon as possible.

Thanks

Richard

@outofcoffee
Contributor

@ostacey #133 is merged :)

@ostacey
Author

ostacey commented May 2, 2016

I share your concern about exotic network configurations. I guess there are a few variables:

    1. the network configuration of the containers run by the tests
    2. the network configuration of the container the test is running in
    3. the environment the test is running in

If I make the change somewhere in DockerClientFactory.dockerHostIpAddress() or DockerClientConfigUtils, then the affected scope (as far as I can see) is just GenericContainer and its subclasses. Currently, there's only support in GenericContainer for running containers in "bridge" mode (unless a user subclasses applyConfiguration), which simplifies point 1.

Point 2 is more of an issue. As I see the cases:

  • If the container is running in "bridge" mode, then the approach with the default gateway is correct
  • If it's running in "host" mode, then "localhost" is what you want to use.
  • If it's running in "container" mode, you're sharing the network stack of another container, so there isn't a separate case - it's dependent on whatever you are running.
  • If it's running using the new "network" mode, I think it might depend on your driver, but using the "bridge" driver I see results very similar to the default "bridge" mode.
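The case analysis above could be written down as a small lookup. This is a sketch under the assumptions stated in the list, not testcontainers code: the `Mode` enum and `resolve` method are hypothetical, and the user-defined network case only reflects the behavior observed with the "bridge" driver.

```java
import java.util.Optional;

// Hypothetical sketch of the per-mode decision; names are illustrative only.
final class HostAddressByMode {

    enum Mode { BRIDGE, HOST, CONTAINER, USER_DEFINED_NETWORK }

    /**
     * Picks the address a test should use to reach ports published on the Docker host.
     * Returns empty when the answer depends on another container's network stack.
     */
    static Optional<String> resolve(Mode mode, String defaultGateway) {
        switch (mode) {
            case BRIDGE:
                // Default bridge network: the default gateway reaches the host.
                return Optional.of(defaultGateway);
            case HOST:
                // Host networking shares the host's stack.
                return Optional.of("localhost");
            case USER_DEFINED_NETWORK:
                // Assumption: behaves like BRIDGE, as observed with the "bridge" driver.
                return Optional.of(defaultGateway);
            case CONTAINER:
            default:
                // Depends on the mode of the container whose stack is shared.
                return Optional.empty();
        }
    }
}
```

Other network drivers may behave differently, so a manual override would still be needed for those cases.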

Point 3 is also nontrivial. Fortunately, since we're only concerned with cases where the tests are running inside a Docker container, we can assume that the test environment is Linux-based - no BSD or Windows to worry about. I suggested using "netstat -nr" to look at routing information, but that command is not installed by default in debian:jessie or in various Ubuntu images. The more "modern" replacement is "ip route", which is present everywhere... except the freshly released Ubuntu 16.04. I'm still looking for an alternative there.
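Because either command may be missing from a given image, a lookup would have to tolerate a failed launch. A minimal sketch, assuming the usual `default via <gateway> dev <iface>` line in `ip route` output; `RouteLookup`, `tryRun`, and `gatewayFromIpRoute` are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical sketch; not part of testcontainers.
final class RouteLookup {

    /** Runs a command and returns its output, or empty if it is missing or fails. */
    static Optional<String> tryRun(String... command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String out = r.lines().collect(Collectors.joining("\n"));
                return p.waitFor() == 0 ? Optional.of(out) : Optional.empty();
            }
        } catch (IOException e) {
            // start() throws IOException when the executable is not installed.
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    /** Extracts the gateway from a line of the form "default via 172.17.0.1 dev eth0". */
    static Optional<String> gatewayFromIpRoute(String ipRouteOutput) {
        for (String line : ipRouteOutput.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length >= 3 && tokens[0].equals("default") && tokens[1].equals("via")) {
                return Optional.of(tokens[2]);
            }
        }
        return Optional.empty();
    }
}
```

`RouteLookup.tryRun("ip", "route").flatMap(RouteLookup::gatewayFromIpRoute)` yields an empty result when `ip` is not installed, at which point a caller could try `netstat -nr` or fall back to a manually configured address.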

@bsideup
Member

bsideup commented Jan 15, 2017

Hi @ostacey,

FYI

  1. getContainerIpAddress should be fixed once this PR is merged: Make it possible to run TestContainers inside a container #267
  2. open port issue was fixed in Command-based port check in HostPortWaitStrategy in case of Docker for Mac #236

We would be happy if you test it!

@bsideup bsideup added this to the 1.1.8 milestone Jan 15, 2017
@bsideup bsideup self-assigned this Jan 15, 2017
@rnorth rnorth closed this as completed Jan 22, 2017