[🐛 Bug]: Session not being created sometimes when running parallel tests through selenium grid. #2309
Comments
@kjjnygres, thank you for creating this issue. We will troubleshoot it as soon as we can. |
I think you can retest once this PR is released in 4.23: SeleniumHQ/selenium#14272 |
Has the issue been identified? What is the issue here? |
Are you using Grid autoscaling in K8s? If so, this could be related to another issue, described in SeleniumHQ/selenium#14282 |
No. I am spinning up and destroying containers on demand. |
Yes, so it could be related. When destroying the containers, I think the drain node API endpoint is used to detach the Node from the Hub. The node will have a chance to switch status from UP to DRAINING (for some time until the container stops completely). |
Oh. Waiting for the new release then. Thanks for explaining |
I am waiting for this release as well, since I encountered issues when running parallel tests on Kubernetes with KEDA autoscaling. I hope it fixes my issue. Thanks |
I also have an issue when running parallel tests: the session (session ID) of an ongoing test gets hijacked by incoming tests, and that session is closed even though the test that owns it is still running. Unable to find session with ID: b77e7338533a3ccbd9c70eee277c156d |
I also encountered this issue. I currently work around it by scheduling my tests so they do not run in parallel. Waiting for the release as well. Thank you. |
Images tag |
Issue still persists for me. Node is created, but is waiting in the queue. Session never starts. Message: Could not start a new session. New session request timed out Host info: host: '801aca98ceec', ip: '172.19.0.2' Build info: version: '4.23.0', revision: '77010cd' System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11' Driver info: driver.version: unknown Stacktrace:
|
@kjjnygres, can you share all the capabilities you pass from the client binding? I suspect the session request does not match any Node stereotype, so no session can start. |
OK, I see you mentioned that the scenario sets custom capabilities to match specific Nodes. |
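To check the suspicion above before sending a request, the requested capabilities can be compared with a Node stereotype by hand. The following is a deliberately simplified sketch, not Grid's actual slot matcher: the function name, the list of compared keys, and the case-insensitive string comparison are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, Tuple

# Keys compared by this sketch (an assumption; Grid's real matcher has its own rules).
DEFAULT_KEYS = (
    "browserName",
    "browserVersion",
    "platformName",
    "nodename:applicationName",
)


def _normalize(value: Any) -> Any:
    """Compare strings case-insensitively; leave other values untouched."""
    return value.lower() if isinstance(value, str) else value


def mismatched_capabilities(
    requested: Dict[str, Any],
    stereotype: Dict[str, Any],
    keys: Iterable[str] = DEFAULT_KEYS,
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (requested_value, stereotype_value)} for every compared key that differs."""
    mismatches = {}
    for key in keys:
        if key not in requested:
            continue  # the client did not ask for this key, so nothing to compare
        wanted = requested[key]
        offered = stereotype.get(key)
        if _normalize(wanted) != _normalize(offered):
            mismatches[key] = (wanted, offered)
    return mismatches
```

An empty result only means that the compared keys agree; it does not prove that the Grid will accept the request.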
I think the issue is still what you mentioned before: after destroying the container and spinning it up again with the same name, there might be a problem with its status, which is causing this issue. The stacktrace in my last comment was my mistake; I was sending the wrong capabilities. Please find the updated stacktrace below: Message: Could not start a new session. New session request timed out Host info: host: '801aca98ceec', ip: '172.19.0.2' Build info: version: '4.23.0', revision: '77010cd' System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11' Driver info: driver.version: unknown Stacktrace:
|
I am using capabilities like this and it is working fine for me. I am using this capability from my code: SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"AccountCreation1"}' |
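Hand-written JSON inside shell quotes is easy to get wrong, so one option is to generate the stereotype string instead. This is a minimal sketch that mirrors the keys quoted in this thread; the helper names `build_stereotype` and `stereotype_env_arg` are hypothetical.

```python
import json
import shlex


def build_stereotype(application_name: str, browser_version: str = "122.0") -> str:
    """Return the Node stereotype as compact JSON, using the keys quoted in this thread."""
    stereotype = {
        "browserName": "chrome",
        "browserVersion": browser_version,
        "goog:chromeOptions": {"binary": "/usr/bin/google-chrome"},
        "platformName": "linux",
        "se:noVncPort": 7900,
        "se:vncEnabled": True,
        "nodename:applicationName": application_name,
    }
    return json.dumps(stereotype, separators=(",", ":"))


def stereotype_env_arg(application_name: str) -> str:
    """Return a shell-quoted SE_NODE_STEREOTYPE=... argument for `docker run -e`."""
    return shlex.quote("SE_NODE_STEREOTYPE=" + build_stereotype(application_name))
```

For example, `"docker run -d -e " + stereotype_env_arg("AccountCreation1") + " ..."` yields a single, correctly quoted argument.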
OK, the Node status switches from UP to DRAINING only when the drain node API endpoint is called (see https://www.selenium.dev/documentation/grid/advanced_features/endpoints/#drain). |
So maybe I have to wait between docker stop and docker run for the same container? |
Let me try to add a graceful shutdown mechanism for the Node when it is deployed via docker or docker-compose. |
@kjjnygres By the way, do you have a publicly available script that I can use to test your scenario against the implementation? |
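A drain call could be issued before the container is stopped. The sketch below uses the Node-level drain endpoint from the linked documentation page (`POST /se/grid/node/drain` with an `X-REGISTRATION-SECRET` header). The Node URL, the empty default secret, and the helper names are assumptions, and the Node port must be reachable from wherever this runs.

```python
import urllib.request


def build_drain_request(node_url: str, registration_secret: str = "") -> urllib.request.Request:
    """Build the POST request for the Node drain endpoint."""
    url = node_url.rstrip("/") + "/se/grid/node/drain"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint only needs the POST itself
        method="POST",
        headers={"X-REGISTRATION-SECRET": registration_secret},
    )


def drain_node(node_url: str, registration_secret: str = "", timeout: float = 10.0) -> int:
    """Send the drain request and return the HTTP status code."""
    request = build_drain_request(node_url, registration_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Usage would look like `drain_node("http://localhost:5555")`, where the address is a placeholder for the Node's own URL.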
I'm afraid not. What I do is this: before running a container, I check whether a container with the same name is already up. If so, I destroy that container and then run a new one with the same name. There is not much time between these two steps. I think you can reproduce it with a small script that runs a container, destroys it, and runs it again. Here is how I do it in my code: sh script: "docker stop ${job} || true && docker rm ${job} || true" sh script: "docker run -d --net grid --name ${job} -e SE_EVENT_BUS_HOST=selenium-hub -e SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"${job}"}' --cap-add=CAP_AUDIT_WRITE --shm-size="2g" -e SE_EVENT_BUS_PUBLISH_PORT=4442 -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443 localhost/node-image-xx" |
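If the quick stop-and-rerun is the trigger, one way to test that hypothesis is to wait until the old Node no longer appears in the hub's `/status` response before starting the new container. This is a sketch only: the response shape (`value.nodes[].slots[].stereotype`) is an assumption about the status payload, and the function names, timeout, and poll interval are illustrative.

```python
import json
import time
import urllib.request

CAPABILITY_KEY = "nodename:applicationName"  # custom capability used in this thread


def registered_application_names(status: dict) -> set:
    """Collect the custom application names of all Nodes listed in a /status payload."""
    names = set()
    for node in status.get("value", {}).get("nodes", []):
        for slot in node.get("slots", []):
            name = slot.get("stereotype", {}).get(CAPABILITY_KEY)
            if name is not None:
                names.add(name)
    return names


def fetch_status(hub_url: str) -> dict:
    """GET <hub_url>/status and decode the JSON body."""
    with urllib.request.urlopen(hub_url.rstrip("/") + "/status", timeout=10) as response:
        return json.load(response)


def wait_until_deregistered(hub_url, application_name, timeout=60.0,
                            poll_interval=2.0, fetch=fetch_status) -> bool:
    """Poll the hub until no Node advertises application_name; False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if application_name not in registered_application_names(fetch(hub_url)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Calling `wait_until_deregistered("http://localhost:4444", job_name)` between `docker rm` and `docker run` would show whether a stale registration is involved; `job_name` here stands for the value of `${job}`.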
Just an update: it is now happening very rarely. I almost forgot about this issue :) |
I also encountered the same issue using a standalone Docker configuration. The tests are triggered in parallel and create browser sessions. However, the test executions of the sessions override each other. `version: "3" selenium-sessions: selenium-session-queue: selenium-distributor: selenium-router: chrome: networks: By the way, I'm using a simple Robot Framework script with the Browser library. `*** Settings *** *** Test Cases *** command to execute in terminal SELENIUM_REMOTE_URL=http://localhost:4444 robot -v TS.robot |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
What happened?
Context:
I am trying to run parallel tests with the help of Selenium Grid. My grid and all of its nodes run on the same machine.
Currently, when I run a test suite, a docker container is spun up for that suite, executes the tests, and then destroys itself when the suite execution is complete. Every docker container here is a node connected to the hub.
Which container/node should run which test suite is managed by desired capabilities:
SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"Container1"}'
Problem:
Sometimes the suites execute just fine; sometimes one of the suites fails; sometimes none of the tests start, giving the error below:
"Driver info: driver.version: unknown
Stacktrace:
System info:
os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11'
Command used to start Selenium Grid with Docker (or Kubernetes)
Relevant log output
Operating System
Linux
Docker Selenium version (image tag)
4.22.0
Selenium Grid chart version (chart version)
No response