Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Artifact caching proxy is unreliable #3481

Closed
basil opened this issue Mar 30, 2023 · 28 comments
Closed

Artifact caching proxy is unreliable #3481

basil opened this issue Mar 30, 2023 · 28 comments

Comments

@basil
Copy link
Collaborator

basil commented Mar 30, 2023

Service(s)

Artifact-caching-proxy

Summary

Despite #2752 (comment), https://ci.jenkins.io/job/Tools/job/bom/job/master/1568/ failed after approximately 90 minutes with

[2023-03-30T21:32:37.394Z] [ERROR] Failed to execute goal on project script-security: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202) -> [Help 1]
[2023-03-30T21:32:37.394Z] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project script-security: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:245)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.394Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.394Z] Caused by: org.apache.maven.project.DependencyResolutionException: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:187)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.394Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.394Z] Caused by: org.eclipse.aether.resolution.DependencyResolutionException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:402)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:182)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.395Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.395Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:506)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:278)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:385)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:182)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.395Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.395Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:52)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:388)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:73)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-03-30T21:32:37.395Z]     at java.lang.Thread.run (Thread.java:829)
[2023-03-30T21:32:37.395Z] Caused by: org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:178)
[2023-03-30T21:32:37.395Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-03-30T21:32:37.395Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:193)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:106)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:95)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:626)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:371)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:298)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:72)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:466)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:383)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:73)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-03-30T21:32:37.395Z]     at java.lang.Thread.run (Thread.java:829)
[2023-03-30T21:32:37.395Z] [ERROR] 
[2023-03-30T21:32:37.395Z] [ERROR] Re-run Maven using the -X switch to enable full debug logging.
[2023-03-30T21:32:37.395Z] [ERROR] 
[2023-03-30T21:32:37.395Z] [ERROR] For more information about the errors and possible solutions, please read the following articles:
[2023-03-30T21:32:37.395Z] [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[2023-03-30T21:32:37.395Z] org.jenkins.tools.test.exception.PomExecutionException: mvn -B -V -e -ntp -s /home/jenkins/agent/workspace/Tools_bom_master@tmp/config13736734992442172432tmp -Dmaven.javadoc.skip=true clean process-test-classes in /home/jenkins/agent/workspace/Tools_bom_master/pct-work/script-security failed with exit status 1
[2023-03-30T21:32:37.395Z] 	at org.jenkins.tools.test.maven.ExternalMavenRunner.run(ExternalMavenRunner.java:106)
[2023-03-30T21:32:37.395Z] 	at org.jenkins.tools.test.PluginCompatTester.testPluginAgainst(PluginCompatTester.java:381)
[2023-03-30T21:32:37.395Z] 	at org.jenkins.tools.test.PluginCompatTester.testPlugins(PluginCompatTester.java:182)
[2023-03-30T21:32:37.395Z] 	at org.jenkins.tools.test.PluginCompatTesterCli.call(PluginCompatTesterCli.java:198)
[2023-03-30T21:32:37.395Z] 	at org.jenkins.tools.test.PluginCompatTesterCli.call(PluginCompatTesterCli.java:43)
[2023-03-30T21:32:37.395Z] 	at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
[2023-03-30T21:32:37.395Z] 	at picocli.CommandLine.access$1500(CommandLine.java:148)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
[2023-03-30T21:32:37.396Z] 	at picocli.CommandLine.execute(CommandLine.java:2170)
[2023-03-30T21:32:37.396Z] 	at org.jenkins.tools.test.PluginCompatTesterCli.main(PluginCompatTesterCli.java:203)
script returned exit code 1

Reproduction steps

Run a lot of jenkinsci/bom builds without the skip-artifact-caching-proxy label and you are bound to see it within a few attempts.

@basil basil added the triage Incoming issues that need review label Mar 30, 2023
@dduportal dduportal added this to the infra-team-sync-2023-04-04 milestone Apr 1, 2023
@dduportal dduportal removed the triage Incoming issues that need review label Apr 1, 2023
@dduportal
Copy link
Contributor

Quick report of the actions performed Friday 30 of March by the Jenkins Infra team:

  • Checked the service metrics in datadog (ACP service, 2 replicas in the DigitalOcean cluster, during the 30 minutes before and after the time indicated in the error message):
    • %Used CPU, memory usage (%RSS) and I/O read and write were quite constant in their behaviors
    • Network outbound shows tiny "teeth" corresponding to the sent files
    • No OOM or restarted pods/container.process
  • Same for the collected logs: no error message at all in this time windows

=> there is no "easy and obvious" cause

I heard @lemeurherve mentioning the builds were back to normal despite jenkinsci/bom#1921 but I'm not sure: I'll let him report here monday (as the current main build of the BOM seems to succeed: we can wait for the week-end to pass).

Next steps:

@basil
Copy link
Collaborator Author

basil commented Apr 4, 2023

there is no "easy and obvious" cause

Did we consider the possibility of file descriptor exhaustion? Not sure what RLIMIT_NOFILE is on the nginx(8) process, but worker_connections (which is bounded by RLIMIT_NOFILE) is set to 1024. Maven's aether.connector.basic.threads defaults to 5, meaning that a proxy with 1024 workers could handle ~200 simultaneous Maven builds. That seems to be in the same order of magnitude as the number of concurrent PCT jobs we run. See https://nginx.org/en/docs/http/ngx_http_stub_status_module.html for how to get this information from nginx(8), or simply run ls -1 /proc/${PID}/fd | wc -l against the nginx(8) worker (not primary!) process during a busy BOM build for a crude approximation. If file descriptors are exhausted you could increase RLIMIT_NOFILE and/or worker_connections, or apply backpressure by passing e.g. -Daether.connector.basic.threads=2 to Maven to halve the number of concurrent requests.

I heard @lemeurherve mentioning the builds were back to normal despite jenkinsci/bom#1921 but I'm not sure: I'll let him report here monday (as the current main build of the BOM seems to succeed: we can wait for the week-end to pass).

I did not see any report today, but I am assuming the reasoning is that the Maven upgrade restored retries and therefore stability. That is possible; it could be confirmed by reverting jenkinsci/bom#1921 and seeing if the problem returns. I have never seen a failure with the old version of Maven and repo.jenkins-ci.org (Artifactory) though, only with the artifact caching proxy. In any case, the Maven retry behavior is simply a workaround on top of the real problem: it should not be necessary to retry downloads in the first place. They should suceed on the first try. To phrase this another way, the artifact caching proxy is a server built and hosted by us entirely for our own needs (and not accessible to anyone else), so we ought to be able to engineer it to meet our needs perfectly on the first try. This can be achieved using suitable hardware & networking resources combined with OS and userland configuration, either scaling horizontally (e.g., multiple proxies with a load balancer) or vertically (e.g., a single proxy with sufficient resources to meet demand).

Closing as the "unreliable" behavior (which is BOM-only) is tracked in #3481

A theory of file descriptor exhaustion is consistent with the pathology being only observable in BOM, as BOM is a pretty good stress test for the web server.

@dduportal
Copy link
Contributor

I'm not sure about the RLIMIT_NOFILE : the worker_connections is set to 1024, that is true, but each ACP instance has 4 workers so 4096, and we have 2 ACP replicas per "cloud" with RR load balanced requests.
Also, Nginx logs this error when it happens (faced it before many times) so I'm surprised that we did not see any error about this.

However, I realize we did NOT check for the ingress (which happen to be Nginx) configuration that might be the culprit (we do not hit ACP directly).

Anyway that could be worth it to:

@dduportal
Copy link
Contributor

Also, there are fundamentals questions at stake:

  • Why does it only happen on DigitalOcean and not on AWS? Searching the differences might help (different Kubernetes clusters, different block storage implementation, different kernels)
  • Why does it only happen with Maven ?
  • Why does it only happen after ~ 90 min ?

@dduportal
Copy link
Contributor

Please note that with #3487, investing too much time in DigitalOcean might be a waste as we are not sure if we'll get more credits.

@dduportal
Copy link
Contributor

Metrics on the ACP side confirms that the 30th of March, the cumulated amount of active connections on the whole ACP for DigitalOcean never crossed the threshold of 400 simultaneously.

However, it seems there is a lot of connections "waiting" which goes into the direction of jenkins-infra/helm-charts#471

Capture d’écran 2023-04-04 à 09 17 39

@timja
Copy link
Member

timja commented Apr 4, 2023

Is it worth trying an artifactory instance running on each cloud provider as a proxy instead of nginx?

I assume nginx was used as that's what was used previously?

@dduportal
Copy link
Contributor

Is it worth trying an artifactory instance running on each cloud provider as a proxy instead of nginx?

I assume nginx was used as that's what was used previously?

Yes, but not only:

  • Nginx (as a caching proxy) can easily scale horizontally: it is currently in HA mode to allow infrastructure to operate without breaking builds. Artifactory can do HA (https://jfrog.com/help/r/jfrog-installation-setup-documentation/high-availability) but it is clearly rough to deploy
  • The resource usage of artifactory is through the roof (memory + CPU)
  • The maintenance of Artifactory is a lot of time to spend on. We can't compare to repo.jenkins-ci.org which has engineering team behind and a decade worth of tuning (and a cloud bill through the roof...).

I'm really bothered by this issue: there is nothing pointing finger at ACP specifically (otherwise the issue would have been seen in AWS as well since quite some time) so it really looks a weird issue.

@timja
Copy link
Member

timja commented Apr 4, 2023

The maintenance of Artifactory is a lot of time to spend on. We can't compare to repo.jenkins-ci.org which has engineering team behind and a decade worth of tuning (and a cloud bill through the roof...).

We've ran it in a pod on Kubernetes for about 4 years and can count on one hand the number of times we've had to touch it.
What maintenance are you thinking of?

@dduportal
Copy link
Contributor

dduportal commented Apr 4, 2023

The maintenance of Artifactory is a lot of time to spend on. We can't compare to repo.jenkins-ci.org which has engineering team behind and a decade worth of tuning (and a cloud bill through the roof...).

We've ran it in a pod on Kubernetes for about 4 years and can count on one hand the number of times we've had to touch it. What maintenance are you thinking of?

  • Security updates of the container image and the app (at leats once a month)
  • Node pool image migration
  • Cloud billing rebalancing (when we run in/out of credits)

these are the maintenance I'm thinking of.

(edit)

  • Artifactory could be a solution to discuss and eventually try though, after we're fixed about Digital Ocean billing for the upcoming months.
  • Let's try a bit more of the Nginx optimization with what @basil mentionned and what I found.
  • Also, let's try (if it does not work):
    • Remove the ingress component and use a direct SVC LB to remove layers adding latency and un-neede requests
    • Switching from PVC to emptyDir could be experimented (more frequent cache cleanup but might have different I/O behavior

@basil
Copy link
Collaborator Author

basil commented Apr 5, 2023

I tried enabling the artifact caching proxy in ATH and got the same error, even with Maven 3.9.1. So Maven 3.9.1 is not the solution.

[2023-04-05T01:40:59.059Z] + mvn -B -V -e -ntp -s /home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/config4855491279108139757tmp test -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T01:40:59.314Z] Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
[2023-04-05T01:40:59.314Z] Maven home: /opt/maven
[2023-04-05T01:40:59.314Z] Java version: 11.0.18, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
[2023-04-05T01:40:59.314Z] Default locale: en_US, platform encoding: ANSI_X3.4-1968
[2023-04-05T01:40:59.314Z] OS name: "linux", version: "5.15.0-1035-azure", arch: "amd64", family: "unix"
[2023-04-05T01:41:01.871Z] [ERROR] Error executing Maven.
[2023-04-05T01:41:01.871Z] org.apache.maven.cli.internal.ExtensionResolutionException: Extension io.jenkins.tools.incrementals:git-changelist-maven-extension:1.6 or one of its dependencies could not be resolved: The following artifacts could not be resolved: org.apache.httpcomponents:httpclient:jar:4.3.6 (absent): Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:182)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T01:41:01.871Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T01:41:01.871Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: org.apache.httpcomponents:httpclient:jar:4.3.6 (absent): Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:456)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:261)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:352)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveInternal (DefaultPluginDependenciesResolver.java:212)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveCoreExtension (DefaultPluginDependenciesResolver.java:148)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:174)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T01:41:01.871Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T01:41:01.871Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:44)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:417)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T01:41:01.871Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T01:41:01.871Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T01:41:01.871Z]     at java.lang.Thread.run (Thread.java:829)
[2023-04-05T01:41:01.871Z] Caused by: java.net.SocketException: Connection reset
[2023-04-05T01:41:01.871Z]     at java.net.SocketInputStream.read (SocketInputStream.java:186)
[2023-04-05T01:41:01.871Z]     at java.net.SocketInputStream.read (SocketInputStream.java:140)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.read (SSLSocketInputRecord.java:484)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.readFully (SSLSocketInputRecord.java:467)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord (SSLSocketInputRecord.java:243)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.decode (SSLSocketInputRecord.java:181)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLTransport.decode (SSLTransport.java:111)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl.decode (SSLSocketImpl.java:1510)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl.readApplicationRecord (SSLSocketImpl.java:1477)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl$AppInputStream.read (SSLSocketImpl.java:1066)
[2023-04-05T01:41:01.871Z]     at org.apache.http.impl.io.SessionInputBufferImpl.streamRead (SessionInputBufferImpl.java:137)
[2023-04-05T01:41:01.872Z]     at org.apache.http.impl.io.SessionInputBufferImpl.read (SessionInputBufferImpl.java:197)
[2023-04-05T01:41:01.872Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:176)
[2023-04-05T01:41:01.872Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-04-05T01:41:01.872Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:163)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:94)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:96)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:565)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:374)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:319)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:63)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:482)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:414)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T01:41:01.872Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T01:41:01.872Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T01:41:01.872Z]     at java.lang.Thread.run (Thread.java:829)

@basil
Copy link
Collaborator Author

basil commented Apr 5, 2023

Seeing the same error in ATH even in Azure with Maven 3.9.1 in https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/PR-1097/10/ -- giving up on enabling artifact caching proxy for ATH.

[2023-04-05T02:54:23.756Z] + eval export BROWSER_DISPLAY=:42 export DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ export BROWSER_DISPLAY=:42 export DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ BROWSER_DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ DISPLAY=:42
[2023-04-05T02:54:23.756Z] + java -version
[2023-04-05T02:54:23.756Z] openjdk version "11.0.18" 2023-01-17
[2023-04-05T02:54:23.756Z] OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04)
[2023-04-05T02:54:23.756Z] OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu122.04, mixed mode, sharing)
[2023-04-05T02:54:23.756Z] + run.sh firefox lts -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T02:54:23.756Z] + BROWSER=firefox
[2023-04-05T02:54:23.756Z] + JENKINS_WAR=target/jenkins-war.war
[2023-04-05T02:54:23.756Z] + mvn -B -V -e -ntp -s /home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/config11205400074458065414tmp test -Plts -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T02:54:24.022Z] Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
[2023-04-05T02:54:24.022Z] Maven home: /opt/maven
[2023-04-05T02:54:24.022Z] Java version: 11.0.18, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
[2023-04-05T02:54:24.022Z] Default locale: en_US, platform encoding: ANSI_X3.4-1968
[2023-04-05T02:54:24.022Z] OS name: "linux", version: "5.15.0-1035-azure", arch: "amd64", family: "unix"
[2023-04-05T02:54:26.050Z] [ERROR] Error executing Maven.
[2023-04-05T02:54:26.050Z] org.apache.maven.cli.internal.ExtensionResolutionException: Extension io.jenkins.tools.incrementals:git-changelist-maven-extension:1.6 or one of its dependencies could not be resolved: The following artifacts could not be resolved: com.google.inject:guice:jar:5.1.0 (absent): Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:182)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T02:54:26.050Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T02:54:26.050Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: com.google.inject:guice:jar:5.1.0 (absent): Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:456)
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:261)
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:352)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveInternal (DefaultPluginDependenciesResolver.java:212)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveCoreExtension (DefaultPluginDependenciesResolver.java:148)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:174)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T02:54:26.050Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T02:54:26.051Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:44)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:417)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T02:54:26.051Z]     at java.lang.Thread.run (Thread.java:829)
[2023-04-05T02:54:26.051Z] Caused by: java.net.SocketException: Connection reset
[2023-04-05T02:54:26.051Z]     at java.net.SocketInputStream.read (SocketInputStream.java:186)
[2023-04-05T02:54:26.051Z]     at java.net.SocketInputStream.read (SocketInputStream.java:140)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.read (SSLSocketInputRecord.java:484)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.readFully (SSLSocketInputRecord.java:467)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord (SSLSocketInputRecord.java:243)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.decode (SSLSocketInputRecord.java:181)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLTransport.decode (SSLTransport.java:111)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl.decode (SSLSocketImpl.java:1510)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl.readApplicationRecord (SSLSocketImpl.java:1477)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl$AppInputStream.read (SSLSocketImpl.java:1066)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.SessionInputBufferImpl.streamRead (SessionInputBufferImpl.java:137)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.SessionInputBufferImpl.read (SessionInputBufferImpl.java:197)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:176)
[2023-04-05T02:54:26.051Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-04-05T02:54:26.051Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:163)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:94)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:96)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:565)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:374)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:319)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:63)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:482)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:414)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T02:54:26.051Z]     at java.lang.Thread.run (Thread.java:829)
script returned exit code 1

@basil
Copy link
Collaborator Author

basil commented Apr 5, 2023

Closing as the "unreliable" behavior (which is BOM-only) is tracked in #3481

I am not sure why scare quotes were used. When trying to enable the artifact caching proxy in ATH, I got failures on both DigitalOcean and Azure, but not when I removed the artifact caching proxy (using Maven 3.9.1 in all three cases). The artifact caching proxy is clearly unreliable, and this is clearly not BOM-only.

@dduportal
Copy link
Contributor

About the ACP Azure problem encountered with ATH:

  • The error message is different than the failure in Digital Ocean. Now it is a client side Connection Reset.
  • First glance at the metrics/logs on the ACP side:

Capture d’écran 2023-04-05 à 08 57 29

Capture d’écran 2023-04-05 à 08 58 22

Capture d’écran 2023-04-05 à 08 59 47

Nothing obvious at first sight.

Currently checking the metrics of the VMs used as agent by the ATH to see if there has been an error on these machines

@dduportal
Copy link
Contributor

dduportal commented Apr 5, 2023

Extracted the list of agent names from https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/PR-1097/10/ with the command grep -iIr 'running on ubuntu' consoleText | cut -d' ' -f4 | sort -u , adapted it to the following Datadog filter:
host:ubuntu-20-ba8a00 OR host:ubuntu-20-highmem0d07c0 OR host:ubuntu-20-highmem0d07c1 OR host:ubuntu-20-highmem165be1 OR host:ubuntu-20-highmem1b0380 OR host:ubuntu-20-highmem2b8f80 OR host:ubuntu-20-highmem2b8f81 OR host:ubuntu-20-highmem34e3e1 OR host:ubuntu-20-highmem4a1770 OR host:ubuntu-20-highmem5a9190 OR host:ubuntu-20-highmem5a9191 OR host:ubuntu-20-highmem5a9192 OR host:ubuntu-20-highmem63e5d0 OR host:ubuntu-20-highmem826dc0 OR host:ubuntu-20-highmem97a160 OR host:ubuntu-20-highmembe07f0 OR host:ubuntu-20-highmembf7da0 OR host:ubuntu-20-highmemcff7d0 OR host:ubuntu-20-highmemdc8fe0 OR host:ubuntu-20-highmemfc8d80 OR host:

=> it appears that only one of these machines reported metrics during the past 24h, and only for a few minutes:

Capture d’écran 2023-04-05 à 09 11 08

while there is a build of the ATH (https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/722/) currently being build and its VM agents are reporting metrics:

Capture d’écran 2023-04-05 à 09 14 06

Azure status does not report any incident though: https://azure.status.microsoft/en-us/status/ and https://azure.status.microsoft/en-us/status/history/.

I'm not sure where to go from here: again, there is nothing pointing at the ACP particularly, but it is hard to diagnose so any idea to help is welcome.

Feels like a network issue, which might be caused by the IP overlap in the legacy networks (ref. #3351).
Proposal: setting the network of the ci.jenkins.io closer to the ACP's to decrease the latency and in a not-overlapped network

@dduportal
Copy link
Contributor

Closing as the "unreliable" behavior (which is BOM-only) is tracked in #3481

I am not sure why scare quotes were used. When trying to enable the artifact caching proxy in ATH, I got failures on both DigitalOcean and Azure, but not when I removed the artifact caching proxy (using Maven 3.9.1 in all three cases). The artifact caching proxy is clearly unreliable, and this is clearly not BOM-only.

I can edit my comment if the quotes bother you. There is nothing to understand from this quotes. It was written by reflex by someone who is not a native English speaker.

Yes, you have issues when enabling the ACP. Now that you reported the issue with the ATH we are looking on it.

@basil
Copy link
Collaborator Author

basil commented Apr 5, 2023

Scare quotes are used to elicit doubt, so I wanted to be clear that there is a problem. I am happy to step back and let the infrastructure team investigate. I do not have the context and access to dig deeper anyway, and I do not debug problems like this with dashboards and log files — I am more old-school and debug problems like this by doing experiments and running system-level commands to observe the effects. Nevertheless I have never met a debugging problem that I could not eventually solve.

@lemeurherve
Copy link
Member

lemeurherve commented Apr 5, 2023

FWIW since the merge of your PR introducing ACP on ATH, there has been 3 builds on master and 3 builds on PRs without any ACP error.

@basil
Copy link
Collaborator Author

basil commented Apr 5, 2023

FWIW since your PR introducing ACP on ATH, there has been 3 builds on master and 3 builds on PRs without any ACP error.

That is expected, because I removed the lines introducing the artifact caching proxy from that PR:

giving up on enabling artifact caching proxy for ATH.

I would have kept them, if it were not for the proxy being unreliable as described in this ticket.

@dduportal
Copy link
Contributor

Hello 👋 It's worth retrying ATH artifact caching proxy as #3535 (comment) was implemented : that should remove a lot of the network errors seen in the past months

@lemeurherve
Copy link
Member

I've opened a PR at jenkinsci/acceptance-test-harness#1279

@lemeurherve lemeurherve self-assigned this Jul 12, 2023
@lemeurherve
Copy link
Member

lemeurherve commented Jul 25, 2023

Since the PR merge two weeks ago I didn't notice particular ATH failures directly related to the artifact caching proxy.

It seems to me we can close this issue, WDYT @basil ?

@lemeurherve
Copy link
Member

No objection, closing this issue.

@basil
Copy link
Collaborator Author

basil commented Jul 31, 2023

Sorry, I missed this notification last week. In any case I have not seen any ACP failures. Thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants