Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[🐛 Bug]: Error while creating session, because of java.util.concurrent.TimeoutException #2373

Closed
zsbutosi opened this issue Aug 28, 2024 · 6 comments · Fixed by SeleniumHQ/selenium#14584

Comments

@zsbutosi
Copy link

zsbutosi commented Aug 28, 2024

What happened?

Hi!

We have been using selenium grid docker image in kubernetes setup for almost 2 years now.
But since the 4.23.0 version, we get org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Response code 500. Message: Could not start a new session. java.util.concurrent.TimeoutException errors.
I can reproduce it locally with docker container too.
I have tried the docker image 4.23.0-20240727, 4.23.1-20240820 and the nightly build too, with the same result.

For testing framework, we are using selenide, with kotlin and java21.
Currently we are using selenide version 7.4.2, which is using selenium 4.23.1.
The browser config for chrome is the following:

            addArguments("--no-sandbox")
            addArguments("--disable-dev-shm-usage")
            addArguments("--disable-search-engine-choice-screen")
            addArguments("--unsafely-disable-devtools-self-xss-warnings")
            addArguments("window-size=1920,1080")

and we also set Configuration.headless as true for the selenide config.

Command used to start Selenium Grid with Docker (or Kubernetes)

The compose file is this:

  chrome:
    image: selenium/node-chromium:4.23.1-20240820
    shm_size: 3gb
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=4
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true
      - SE_START_XVFB=false
      - SE_START_VNC=false
      - SE_START_NO_VNC=false
      - SE_NODE_SESSION_TIMEOUT=30
      - SE_SCREEN_WIDTH=1920
      - SE_SCREEN_HEIGHT=1080
      - SE_ENABLE_TRACING=false
      - SE_STRUCTURED_LOGS=false
      - SE_ENABLE_TLS=false
      - SE_SUPERVISORD_LOG_LEVEL=info
      - SE_HTTP_LOGS=false

  selenium-hub:
    image: selenium/hub:4.23.1-20240820
    container_name: selenium-hub
    environment:
      - SE_SESSION_REQUEST_TIMEOUT=90
      - SE_SESSION_RETRY_INTERVAL=10
      - SE_ENABLE_TRACING=false
      - SE_ENABLE_TLS=false
      - SE_SUPERVISORD_LOG_LEVEL=debug
      - SE_STRUCTURED_LOGS=false
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"


### Relevant log output

```shell
From node-chormium:

2024-08-28 17:32:05 15:32:05.505 WARN [DriverServiceSessionFactory.apply] - Error while creating session with the driver service. Stopping driver service: java.util.concurrent.TimeoutException
2024-08-28 17:32:05 Build info: version: '4.23.1', revision: '656257d8e9'
2024-08-28 17:32:05 System info: os.name: 'Linux', os.arch: 'aarch64', os.version: '6.10.0-linuxkit', java.version: '17.0.12'
2024-08-28 17:32:05 Driver info: driver.version: unknown
2024-08-28 17:32:05 org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
2024-08-28 17:32:05 Build info: version: '4.23.1', revision: '656257d8e9'
2024-08-28 17:32:05 System info: os.name: 'Linux', os.arch: 'aarch64', os.version: '6.10.0-linuxkit', java.version: '17.0.12'
2024-08-28 17:32:05 Driver info: driver.version: unknown
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:418)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:374)
2024-08-28 17:32:05     at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:54)
2024-08-28 17:32:05     at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:89)
2024-08-28 17:32:05     at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:75)
2024-08-28 17:32:05     at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:61)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.config.DriverServiceSessionFactory.apply(DriverServiceSessionFactory.java:162)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.config.DriverServiceSessionFactory.apply(DriverServiceSessionFactory.java:71)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.local.SessionSlot.apply(SessionSlot.java:147)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.local.LocalNode.newSession(LocalNode.java:469)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.NewNodeSession.execute(NewNodeSession.java:50)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route$TemplatizedRoute.handle(Route.java:192)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
2024-08-28 17:32:05     at org.openqa.selenium.grid.security.RequiresSecretFilter.lambda$apply$0(RequiresSecretFilter.java:62)
2024-08-28 17:32:05     at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:87)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:360)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
2024-08-28 17:32:05     at org.openqa.selenium.grid.node.Node.execute(Node.java:270)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:360)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Route.execute(Route.java:69)
2024-08-28 17:32:05     at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
2024-08-28 17:32:05     at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
2024-08-28 17:32:05     at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)
2024-08-28 17:32:05     at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
2024-08-28 17:32:05     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
2024-08-28 17:32:05     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2024-08-28 17:32:05     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2024-08-28 17:32:05     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2024-08-28 17:32:05     at java.base/java.lang.Thread.run(Thread.java:840)
2024-08-28 17:32:05 Caused by: java.util.concurrent.TimeoutException
2024-08-28 17:32:05     at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
2024-08-28 17:32:05     at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
2024-08-28 17:32:05     at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:401)
2024-08-28 17:32:05     ... 33 more




Previous attempts with log-level FINE:

2024-08-27 09:35:54 07:35:54.244 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.pauseReadEvent] - [HttpClient-12-SelectorManager] [67s 233ms] SocketTube(12) pausing read event
2024-08-27 09:35:54 07:35:54.244 DEBUG [HttpClientImpl$SelectorAttachment.register] - [HttpClient-12-SelectorManager] [67s 233ms] SelectorAttachment Registering jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent@ed18cc4 for 0 (false)
2024-08-27 09:35:54 07:35:54.244 DEBUG [SocketTube$InternalReadPublisher$ReadSubscription.signalCompletion] - [HttpClient-12-SelectorManager] [67s 233ms] SocketTube(12) completing subscriber
2024-08-27 09:35:54 07:35:54.244 DEBUG [ConnectionPool.cleanup] - [HttpClient-12-SelectorManager] [67s 233ms] ConnectionPool(12) SocketTube(12) : ConnectionPool.cleanup(null)
2024-08-27 09:35:54 07:35:54.244 DEBUG [PlainHttpConnection.close] - [HttpClient-12-SelectorManager] [67s 233ms] PlainHttpConnection(SocketTube(12)) Closing channel: channel registered with selector, key.interestOps=1, sa.interestOps=1
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.signalError] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) signal read error: java.io.IOException: connection closed locally
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.signalError] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) got read error: java.io.IOException: connection closed locally
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.read] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) Stopping read scheduler
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube.debugState] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) leaving read() loop after EOF:  Reading: [ops=0, demand=0, stopped=true], Writing: [ops=0, demand=1]
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.read] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) Read scheduler stopped
2024-08-27 09:35:54 07:35:54.245 DEBUG [HttpClientImpl$SelectorAttachment.resetInterestOps] - [HttpClient-12-SelectorManager] [67s 234ms] SelectorAttachment key cancelled for java.nio.channels.SocketChannel[closed]
2024-08-27 09:35:54 07:35:54.245 DEBUG [ExternalProcess$Builder.lambda$start$0] - completed to copy the output of process 1675
2024-08-27 09:35:54 07:35:54.245 DEBUG [SocketTube$SocketFlowEvent.abort] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) ReadEvent abort: java.io.IOException: java.nio.channels.CancelledKeyException
2024-08-27 09:35:54 07:35:54.246 DEBUG [SocketTube$InternalReadPublisher$ReadEvent.signalError] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) signalError to jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription@691c2895 (java.io.IOException: java.nio.channels.CancelledKeyException)
2024-08-27 09:35:54 07:35:54.246 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.signalError] - [HttpClient-12-SelectorManager] [67s 234ms] SocketTube(12) signal read error: java.io.IOException: java.nio.channels.CancelledKeyException

Operating System

Ubuntu

Docker Selenium version (image tag)

4.23.1-20240820

Selenium Grid chart version (chart version)

N/A

Copy link

@zsbutosi, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@VietND96
Copy link
Member

VietND96 commented Sep 8, 2024

we also set Configuration.headless as true for the selenide config

I saw headless is mentioned, but I am not sure selenide pass which argument? --headless or --headless=new
You can try to disable that configure and use addArguments("--headless=new") to see any help

@zsbutosi
Copy link
Author

zsbutosi commented Sep 9, 2024

@VietND96 Thank you for the response.
I checked the selenide code and its sending --headless=new.
And I also made changes, and now I am adding the --headless=new argument to the chrome configuration from my code.
But unfortunately I am getting the same timeout exception.
I tested locally with the 4.24.0-20240907 release.

As I see, the problem is when I try to run multiple browser sessions in one chrome-node.
As long as I am running only 1 test, i don't see any problem.
But if multiple tests are executed in parallel, the first one got passed, and the other tests got this timeout exception.
Like if after closing the first session, it shuts down something in the background, which prevents any other session to start or work properly.

@zsbutosi
Copy link
Author

zsbutosi commented Oct 9, 2024

@VietND96 I think I found a solution.
Now I am using the latest docker images for hub and chrome-node (tag: 4.25.0-20240922).
When a session starts, it spawn multiple chrome processes. And one of them took an entire core in processing. That is why I got the timeout exceptions.
After quite a few trial I reenabled the xvfb, because that was turned off by default, as we always use headless mode in GRID.
And the timeouts resolved.
I think for one of the chrome process, xvfb is required.

So currently it seems to be working with the following env variables:
SE_START_XVFB - true
SE_START_VNC - false
SE_START_NO_VNC - false
SE_NODE_MAX_SESSIONS - 4
SE_NODE_OVERRIDE_MAX_SESSIONS - true

So it seems that xvfb is a must have to use GRID.

Also an other weird thing is that the chrome-node info in GRID states the following:

{"browserName":"chrome","browserVersion":"129.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:containerName":"","se:noVncPort":7900,"se:vncEnabled":true}

But as I said before, vnc and novnc is disabled.

Thx

@VietND96
Copy link
Member

Thanks for your feedback, I think a note on README also mentioned a similar issue https://github.com/SeleniumHQ/docker-selenium?tab=readme-ov-file#headless
Regarding caps "se:vncEnabled":true is present when SE_START_VNC and SE_START_NO_VNC are false, let me check again.

Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants