Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent failure in CI environment #479

Closed
stuarthendren opened this issue Oct 27, 2017 · 32 comments
Closed

Intermittent failure in CI environment #479

stuarthendren opened this issue Oct 27, 2017 · 32 comments

Comments

@stuarthendren
Copy link

I have a simple Redis integration test:

public class RedisResourceTest {

 private static final String DOCKER_IMAGE = "redis:3.2.9";
   private static final int TIMEOUT = 120;

  @ClassRule
  public static GenericContainer redis = new GenericContainer(DOCKER_IMAGE).withExposedPorts(6379)
      .withStartupTimeout(Duration.ofSeconds(TIMEOUT));

  @Test
  public void testResourceCanSendAndRecieve() throws ResourceInitializationException {
    String ipAddredd = redis.getContainerIpAddress()),
    int redisPort = redis.getMappedPort(6379)))};
   
    //TEST code  
  
  }
}

This passes every time running locally, however, when running in Jenkins (BlueOcean) it intermittently fails with the exception:

java.lang.AssertionError: A port exposed by a docker container should be accessible: 'Connection refused (Connection refused)' does not equal expected 'hello'
	at org.rnorth.visibleassertions.VisibleAssertions.fail(VisibleAssertions.java:440)
	at org.rnorth.visibleassertions.VisibleAssertions.assertEquals(VisibleAssertions.java:174)
	at org.testcontainers.DockerClientFactory.checkExposedPort(DockerClientFactory.java:190)
	at org.testcontainers.DockerClientFactory.lambda$client$2(DockerClientFactory.java:125)
	at org.testcontainers.DockerClientFactory.runInsideDocker(DockerClientFactory.java:226)
	at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:114)
	at org.testcontainers.containers.GenericContainer.<init>(GenericContainer.java:116)
	at io.committed.RedisResourceTest.<clinit>(RedisResourceTest.java:27)
	at sun.misc.Unsafe.ensureClassInitialized(Native Method)
	at sun.reflect.UnsafeFieldAccessorFactory.newFieldAccessor(UnsafeFieldAccessorFactory.java:43)
	at sun.reflect.ReflectionFactory.newFieldAccessor(ReflectionFactory.java:156)
	at java.lang.reflect.Field.acquireFieldAccessor(Field.java:1088)
	at java.lang.reflect.Field.getFieldAccessor(Field.java:1069)
	at java.lang.reflect.Field.get(Field.java:393)
	at org.junit.runners.model.FrameworkField.get(FrameworkField.java:73)
	at org.junit.runners.model.TestClass.getAnnotatedFieldValues(TestClass.java:230)
	at org.junit.runners.ParentRunner.classRules(ParentRunner.java:255)
	at org.junit.runners.ParentRunner.withClassRules(ParentRunner.java:244)
	at org.junit.runners.ParentRunner.classBlock(ParentRunner.java:194)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:362)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

I've tried larger timeout values, but that doesn't seem to help. The docker sock port is exposed to the machine, the logs report lots of activity and the fact it sometimes passes must imply it's not simply that it can't interact with Docker.

I realize that this must be something to do with the environment, maybe as the build is already run in a Docker Container, so I don't expect you to have a fix for this, however, my question is, how do I start to debug such an issue? Maybe there is a way to see the logs from the container?

@bsideup
Copy link
Member

bsideup commented Oct 27, 2017

Hi @stuarthendren,

Please check this page: https://www.testcontainers.org/usage/inside_docker.html
It explains how to configure your build if you want to run the tests inside a container

The error is a pre-flight check and verifies that your environment is correct, nothing special to debug :)

@stuarthendren
Copy link
Author

Thank you for the suggestion but I had already done that.

@kiview
Copy link
Member

kiview commented Nov 14, 2017

@stuarthendren Could you provide your Jenkins pipeline configuration? So you are mounting the Docker socket?

@stuarthendren
Copy link
Author

pipeline {
    agent { 
        docker {
            image 'maven:3-jdk-8'
            args '--volumes-from ci3_cache_1 -v /var/run/docker.sock:/var/run/docker.sock'
        }
    }
    environment {
        NEXUS_PASSWORD = credentials('NEXUS_PASSWORD')
        NEXUS_AUTH = credentials('NEXUS_AUTH')
        SONAR_PASSWORD = credentials('SONAR_PASSWORD')
    }

    stages {
        stage('env') {
            steps {
                notifyBuild()
                sh 'printenv'
            }
        }
        stage('build') {
            steps {    
                sh 'mvn -s $MAVEN_SETTINGS  -B clean install'
            }
        }
        stage('quality') {
            steps {
                sh 'mvn -s $MAVEN_SETTINGS -B sonar:sonar'
            }
        }
    }
    post {
        always {
            junit '**/target/surefire-reports/*.xml'
        }
        success {
             notifyBuild('SUCCESS')
        }
        unstable {
             notifyBuild('UNSTABLE')
        }
        failure {
            notifyBuild('FAILED')
        }
    }
}

leaving out the notify function - just a slackSend call.

@jochenhebbrecht
Copy link
Contributor

Hi,

I'm running into exactly same problem, but unfortunately, also still looking for a solution ...

Jochen

@kiview
Copy link
Member

kiview commented Jan 9, 2018

@stuarthendren
Can you provide the log output for the EnvironmentAndSystemPropertyClientProviderStrategy and
DockerClientProviderStrategy log lines?

Also we use a build image with installed Docker client in our CI.

@stuarthendren
Copy link
Author

I can only access some old logs from jenkins at the moment, neither of those classes logged anything, I can look at the log level settings and rerun over the weekend.

@kiview
Copy link
Member

kiview commented Jan 9, 2018

Should be loglevel info, see this logback config for sensible defaults:
https://www.testcontainers.org/usage.html#logging

@jochenhebbrecht
Copy link
Contributor

jochenhebbrecht commented Jan 9, 2018

Hi @kiview ,

I'm facing the same problem, so I wanted to report my logs. I added the Logback library on my classpath:

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.3</version>
    <scope>test</scope>
</dependency>

I added the following xml configuration:

<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="info">
        <appender-ref ref="STDOUT"/>
    </root>

    <logger name="org.testcontainers" level="DEBUG"/>
</configuration>

But when I run my tests, I get this logging:

log4j:WARN Continuable parsing error 1 and column 16
log4j:WARN Document root element "configuration", must match DOCTYPE root "null".
log4j:WARN Continuable parsing error 1 and column 16
log4j:WARN Document is invalid: no grammar found.
log4j:WARN The <configuration> element has been deprecated.
log4j:WARN Use the <log4j:configuration> element instead.
log4j:ERROR Could not create an Appender. Reported error follows.
java.lang.ClassNotFoundException: ch.qos.logback.core.ConsoleAppender
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.log4j.helpers.Loader.loadClass(Loader.java:198)
	at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:247)
	at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:176)
	at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:191)
	at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:523)
	at org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:492)
	at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:1001)
	at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:867)
	at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:773)
	at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
	at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
	at org.apache.log4j.Logger.getLogger(Logger.java:104)
	at org.testcontainers.shaded.io.netty.util.internal.logging.Log4JLoggerFactory.newInstance(Log4JLoggerFactory.java:38)
	at org.testcontainers.shaded.io.netty.util.internal.logging.InternalLoggerFactory.newDefaultFactory(InternalLoggerFactory.java:47)
	at org.testcontainers.shaded.io.netty.util.internal.logging.InternalLoggerFactory.getDefaultFactory(InternalLoggerFactory.java:62)
	at org.testcontainers.shaded.io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:88)
	at org.testcontainers.shaded.io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:81)
	at org.testcontainers.shaded.io.netty.bootstrap.Bootstrap.<clinit>(Bootstrap.java:51)
	at com.github.dockerjava.netty.NettyDockerCmdExecFactory.init(NettyDockerCmdExecFactory.java:196)
	at com.github.dockerjava.core.DockerClientImpl.withDockerCmdExecFactory(DockerClientImpl.java:161)
	at com.github.dockerjava.core.DockerClientBuilder.build(DockerClientBuilder.java:45)
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.getClientForConfig(DockerClientProviderStrategy.java:161)
	at org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy.test(EnvironmentAndSystemPropertyClientProviderStrategy.java:39)
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.lambda$getFirstValidStrategy$1(DockerClientProviderStrategy.java:104)
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:302)
	at java.util.stream.Streams$ConcatSpliterator.tryAdvance(Streams.java:728)
	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.findAny(ReferencePipeline.java:469)
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.getFirstValidStrategy(DockerClientProviderStrategy.java:137)
	at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:90)
	at org.testcontainers.containers.GenericContainer.<init>(GenericContainer.java:116)
	at org.testcontainers.containers.JdbcDatabaseContainer.<init>(JdbcDatabaseContainer.java:35)
	at org.testcontainers.containers.OracleContainer.<init>(OracleContainer.java:17)
	at com.MYCOMPANY.mpd.pcontrol.client.impl.PControlReportDbClientImplTest.<clinit>(PControlReportDbClientImplTest.java:33)
	at sun.misc.Unsafe.ensureClassInitialized(Native Method)
	at sun.reflect.UnsafeFieldAccessorFactory.newFieldAccessor(UnsafeFieldAccessorFactory.java:43)
	at sun.reflect.ReflectionFactory.newFieldAccessor(ReflectionFactory.java:156)
	at java.lang.reflect.Field.acquireFieldAccessor(Field.java:1088)
	at java.lang.reflect.Field.getFieldAccessor(Field.java:1069)
	at java.lang.reflect.Field.get(Field.java:393)
	at org.junit.runners.model.FrameworkField.get(FrameworkField.java:73)
	at org.junit.runners.model.TestClass.getAnnotatedFieldValues(TestClass.java:230)
	at org.junit.runners.ParentRunner.classRules(ParentRunner.java:255)
	at org.junit.runners.ParentRunner.withClassRules(ParentRunner.java:244)
	at org.junit.runners.ParentRunner.classBlock(ParentRunner.java:194)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:362)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
log4j:WARN No appenders could be found for logger (org.testcontainers.shaded.io.netty.util.internal.logging.InternalLoggerFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
        ℹ︎ Checking the system...
        ✔ Docker version should be at least 1.6.0
        ✔ Docker environment should have more than 2GB free disk space
        ✔ File should be mountable
...

@jochenhebbrecht
Copy link
Contributor

Ok, I managed to fix the logging, this is the outcome:

[main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved: 
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='jenkins'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=jenkins,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfig=/home/jenkins/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION}]'

[main] INFO org.testcontainers.DockerClientFactory - Docker host IP address is localhost
[main] INFO org.testcontainers.DockerClientFactory - Connected to docker: 
  Server Version: 17.06.2-ce
  API Version: 1.30
  Operating System: Amazon Linux AMI 2017.09
  Total Memory: 32177 MB
        ?? Checking the system...
        ? Docker version should be at least 1.6.0
        ? Docker environment should have more than 2GB free disk space
        ? File should be mountable
        ? A port exposed by a docker container should be accessible
            'Connection refused (Connection refused)' does not equal expected 'hello'

So although our Jenkins slave is a Docker container (it's running in an ECS cluster on AWS), the testcontainers code doesn't recognize he's currently working from within an existing Docker container and it's trying to reach out to 'localhost' (which is not valid!)

Is it possible there's a problem with the recognition of 'RunInsideDocker'?

@bsideup
Copy link
Member

bsideup commented Jan 10, 2018

@jochenhebbrecht it seems so. Could you please share your ECS task def?

@jochenhebbrecht
Copy link
Contributor

Sure, here you go!

screencapture-eu-west-1-console-aws-amazon-ecs-home-1515577988117

@bsideup
Copy link
Member

bsideup commented Jan 10, 2018

@jochenhebbrecht may I kindly ask you to share a JSON definition as text instead? :) You can get it by clicking on "JSON" tab

@kiview
Copy link
Member

kiview commented Jan 10, 2018

When running inside Docker, the Docker host IP address should ideally be something like 172.17.0.1.

@jochenhebbrecht
Copy link
Contributor

jochenhebbrecht commented Jan 10, 2018

Ha, sorry :) ... Didn't notice the "JSON" tab. Yes, of course :-)

{
  "executionRoleArn": null,
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": null,
      "entryPoint": null,
      "portMappings": [],
      "command": null,
      "linuxParameters": null,
      "cpu": 1,
      "environment": [
        {
          "name": "DOCKER_API_VERSION",
          "value": "1.27"
        },
        {
          "name": "TRUSTED_SSH_HOSTS",
          "value": "bitbucket.MYCOMPANY.com:7999 svn.MYCOMPANY.com subversion.MYCOMPANY.com"
        }
      ],
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [
        {
          "readOnly": false,
          "containerPath": "/run/docker.sock",
          "sourceVolume": "DockerSocket"
        },
        {
          "readOnly": false,
          "containerPath": "/home/jenkins/.m2",
          "sourceVolume": "MavenRepository"
        }
      ],
      "workingDirectory": null,
      "dockerSecurityOptions": null,
      "memory": 2850,
      "memoryReservation": null,
      "volumesFrom": [],
      "image": "MYREPOID.dkr.ecr.eu-west-1.amazonaws.com/jenkins-cloud-slave:4f4538e",
      "disableNetworking": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "privileged": false,
      "name": "ecs-cloud-jenkins-cloud-slave-3GB",
      "expanded": true
    }
  ],
  "placementConstraints": [],
  "memory": null,
  "taskRoleArn": null,
  "compatibilities": [
    "EC2"
  ],
  "taskDefinitionArn": "arn:aws:ecs:eu-west-1:MYREPOID:task-definition/ecs-cloud-jenkins-cloud-slave-3GB:12",
  "family": "ecs-cloud-jenkins-cloud-slave-3GB",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    }
  ],
  "requiresCompatibilities": null,
  "networkMode": null,
  "cpu": null,
  "revision": 12,
  "status": "ACTIVE",
  "volumes": [
    {
      "name": "DockerSocket",
      "host": {
        "sourcePath": "/var/run/docker.sock"
      }
    },
    {
      "name": "MavenRepository",
      "host": {
        "sourcePath": "/data/nfs/maven"
      }
    }
  ]
}

@jochenhebbrecht
Copy link
Contributor

@kiview yes, I know, but apparently, the testcontainers code doesn't do that (and returns 'localhost'), although it's pretty unclear to me why ...

@jochenhebbrecht
Copy link
Contributor

jochenhebbrecht commented Jan 10, 2018

Some extra information: I've added more logging in the DockerClientConfigUtils class:

Index: core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java	(date 1507982746000)
+++ core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java	(date 1515589691000)
@@ -23,10 +23,12 @@
                     cmd -> cmd.withCmd("sh", "-c", "ip route|awk '/default/ { print $3 }'"),
                     (client, id) -> {
                         try {
-                            return client.logContainerCmd(id)
-                                    .withStdOut(true)
-                                    .exec(new LogToStringContainerCallback())
-                                    .toString();
+                            String logContainer = client.logContainerCmd(id)
+                                             .withStdOut(true)
+                                             .exec(new LogToStringContainerCallback())
+                                             .toString();
+                            log.info(logContainer);
+                            return logContainer;
                         } catch (Exception e) {
                             log.warn("Can't parse the default gateway IP", e);
                             return null;
@@ -37,6 +39,10 @@
             .filter(StringUtils::isNotBlank);
 
     public static String getDockerHostIpAddress(DockerClientConfig config) {
+        log.info("IN_A_CONTAINER: " + IN_A_CONTAINER);
+        log.info("Get detected docker host IP: " + getDetectedDockerHostIp());
+        log.info("Config docker host schema: " + config.getDockerHost().getScheme());
+        log.info("Config docker host host: " + config.getDockerHost().getHost());
         return getDetectedDockerHostIp().orElseGet(() -> {
             switch (config.getDockerHost().getScheme()) {
                 case "http":

... and this results in the following logging ...

main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved: 
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='jenkins'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=jenkins,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfig=/home/jenkins/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION}]'

[main] INFO org.testcontainers.dockerclient.DockerClientConfigUtils - IN_A_CONTAINER: true
[main] INFO org.testcontainers.dockerclient.DockerClientConfigUtils - 
[main] INFO org.testcontainers.dockerclient.DockerClientConfigUtils - Get detected docker host IP: Optional.empty
[main] INFO org.testcontainers.dockerclient.DockerClientConfigUtils - Config docker host schema: unix
[main] INFO org.testcontainers.dockerclient.DockerClientConfigUtils - Config docker host host: null
[main] INFO org.testcontainers.DockerClientFactory - Docker host IP address is localhost
[main] INFO org.testcontainers.DockerClientFactory - Connected to docker: 
  Server Version: 17.06.2-ce
  API Version: 1.30
  Operating System: Amazon Linux AMI 2017.09
  Total Memory: 32177 MB
        ?? Checking the system...
        ? Docker version should be at least 1.6.0
        ? Docker environment should have more than 2GB free disk space
        ? File should be mountable
        ? A port exposed by a docker container should be accessible
            'Connection refused (Connection refused)' does not equal expected 'hello'

So the shell command ip route|awk '/default/ { print $3 } cannot be trusted to determine the host IP address, I think?

I executed the command ip route just before the mvn clean install and it does give me some output

+ ip route
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.4 

@bsideup
Copy link
Member

bsideup commented Jan 10, 2018

@jochenhebbrecht oh, nice, thanks!
Yes, it seems that it doesn't work with ECS (probably because they introduced ECS-native networking a few months ago)

Thanks for debugging it! Also, if you found how to fix that - that would be highly appreciated (not even talking about the PR :D )

@jochenhebbrecht
Copy link
Contributor

jochenhebbrecht commented Jan 11, 2018

I created a shell script on the Jenkins Cloud Slave which executed, in for loop for like 1.000.000 times, the 'ip route' command and it always returned a valid value. So I don't think it's related to ECS-native networking.

I found out the LogToStringContainerCallback class doesn't receive the output of the Docker container. I'm afraid there's something wrong with the capturing of the stdout messages (probably in combination with ECS)

@bsideup
Copy link
Member

bsideup commented Jan 11, 2018

@jochenhebbrecht wow, thanks! Yes, we observed some flakiness with LogToStringContainerCallback before, thanks for pointing at, I think I know how to fix it (or at least to workaround)

@bsideup
Copy link
Member

bsideup commented Jan 11, 2018

@jochenhebbrecht
Copy link
Contributor

Thanks, but meanwhile, I could solve the issue myself. This is what I've changed:

Index: core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java	(date 1507982746000)
+++ core/src/main/java/org/testcontainers/dockerclient/DockerClientConfigUtils.java	(date 1515667022000)
@@ -9,6 +9,8 @@
 import java.io.File;
 import java.util.Optional;
 
+import static java.util.concurrent.TimeUnit.SECONDS;
+
 @Slf4j
 public class DockerClientConfigUtils {
 
@@ -23,10 +25,15 @@
                     cmd -> cmd.withCmd("sh", "-c", "ip route|awk '/default/ { print $3 }'"),
                     (client, id) -> {
                         try {
-                            return client.logContainerCmd(id)
-                                    .withStdOut(true)
-                                    .exec(new LogToStringContainerCallback())
-                                    .toString();
+                            LogToStringContainerCallback loggingCallback = new LogToStringContainerCallback();
+                            client.logContainerCmd(id).withStdOut(true)
+                                                      .withStdErr(true)
+                                                      .withFollowStream(true)
+                                                      .withTailAll()
+                                                      .exec(loggingCallback)
+                                                      .awaitStarted();
+                            loggingCallback.awaitCompletion(3, SECONDS);
+                            return loggingCallback.toString();
                         } catch (Exception e) {
                             log.warn("Can't parse the default gateway IP", e);
                             return null;

I'm now always retrieving the output of the 'ip route' command. I think it's related to the waiting time now being added.

@bsideup
Copy link
Member

bsideup commented Jan 11, 2018

@jochenhebbrecht that is also a good solution 👍 probably even better than mine :)

The key here is withFollowStream and awaitCompletion. And it's a good idea because here we can await the completion because it is an one-off command 👍

Do you want to submit a PR? :)

@jochenhebbrecht
Copy link
Contributor

Yes, I'll make a PR. I'll keep you updated.

@jochenhebbrecht
Copy link
Contributor

PR created. All tests are green.

bsideup pushed a commit that referenced this issue Jan 12, 2018
…ing inside docker (#541)

* fix: introduce 'withFollowStream' and 'awaitCompletion' methods to await the completion of one-off command

* Fixed retrieval of Docker host IP when running inside Docker (#479)

* fix: withStdErr() is not needed - withTailAll() is not needed (default: tail=all)

* Update CHANGELOG.md
@stuarthendren
Copy link
Author

I updated to 1.5.1 and was not able to reproduce the error. Happy for you to close it.

@jochenhebbrecht
Copy link
Contributor

... which is weird, because they still need to release version 1.5.2 which will contain the fix? :-) ...
It might be you were suffering from another problem? I had this problem with 1.4.3 and with 1.5.1.

@stuarthendren
Copy link
Author

Just as an update - the error did occur again, but a lot less often than before.
I've had a look through the logs and pulled out any mention of the classes you asked about above:

16:36:31.125 [main] DEBUG org.testcontainers.shaded.io.netty.util.internal.PlatformDependent0 - jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable
java.lang.ClassNotFoundException: jdk.internal.misc.Unsafe
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.testcontainers.shaded.io.netty.util.internal.PlatformDependent0$6.run(PlatformDependent0.java:295)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.testcontainers.shaded.io.netty.util.internal.PlatformDependent0.<clinit>(PlatformDependent0.java:288)
	at org.testcontainers.shaded.io.netty.util.internal.PlatformDependent.getSystemClassLoader(PlatformDependent.java:895)
	at org.testcontainers.shaded.io.netty.util.internal.PlatformDependent.isAndroid0(PlatformDependent.java:919)
	at org.testcontainers.shaded.io.netty.util.internal.PlatformDependent.<clinit>(PlatformDependent.java:70)
	at org.testcontainers.shaded.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:192)
	at org.testcontainers.shaded.io.netty.channel.epoll.Native.<clinit>(Native.java:61)
	at org.testcontainers.shaded.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
	at org.testcontainers.shaded.io.netty.channel.epoll.EpollEventLoopGroup.<clinit>(EpollEventLoopGroup.java:38)
	at com.github.dockerjava.netty.NettyDockerCmdExecFactory$UnixDomainSocketInitializer.epollGroup(NettyDockerCmdExecFactory.java:241)
	at com.github.dockerjava.netty.NettyDockerCmdExecFactory$UnixDomainSocketInitializer.init(NettyDockerCmdExecFactory.java:233)
	at com.github.dockerjava.netty.NettyDockerCmdExecFactory.init(NettyDockerCmdExecFactory.java:206)
	at com.github.dockerjava.core.DockerClientImpl.withDockerCmdExecFactory(DockerClientImpl.java:161)
	at com.github.dockerjava.core.DockerClientBuilder.build(DockerClientBuilder.java:45)
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.getClientForConfig(DockerClientProviderStrategy.java:161)
	at org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy.test(EnvironmentAndSystemPropertyClientProviderStrategy.java:39)
16:36:32.563 [main] INFO org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy - Found docker client settings from environment
16:36:32.036 [pool-1-thread-1] DEBUG org.testcontainers.dockerclient.DockerClientProviderStrategy - Pinging docker daemon...
16:36:32.610 [main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with Environment variables, system properties and defaults. Resolved: 
    dockerHost=unix:///var/run/docker.sock
    apiVersion='{UNKNOWN_VERSION}'
    registryUrl='https://index.docker.io/v1/'
    registryUsername='?'
    registryPassword='null'
    registryEmail='null'
    dockerConfig='DefaultDockerClientConfig[dockerHost=unix:///var/run/docker.sock,registryUsername=?,registryPassword=<null>,registryEmail=<null>,registryUrl=https://index.docker.io/v1/,dockerConfig=?/.docker,sslConfig=<null>,apiVersion={UNKNOWN_VERSION}]'

This was just to update, but hoefully the changes in 1.5.2 will fix it anyway.

@agibalov
Copy link

We have similar issues on 1.4.3. I've noticed that it happens when there are 2 or more parallel builds on our build server. I've never had this issue when running the builds locally.

@bsideup
Copy link
Member

bsideup commented Mar 27, 2018

@agibalov 1.4.3 is old, the latest is 1.6.0, please update

@agibalov
Copy link

@bsideup upgrading to 1.6.0 fixed the issue for us. Thanks!

@bsideup
Copy link
Member

bsideup commented Apr 11, 2018

@agibalov errr... FYI the latest now is 1.7.0 😂
Anyway, good to know that it is fixed, I'll close the issue :)

@bsideup bsideup closed this as completed Apr 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants