Performance issues #116
Comments
Hi @joao-valente, performance issues may well arise under high concurrency or when many tests run in parallel. Our normal scenarios don't run more than 4 tests in parallel, which is why we have not seen anything relevant yet. We have seen some things that we can improve:
Maybe you can help us with some more information, perhaps a timeline of how things happen. From the beginning where everything is running fine, and then what happens afterwards to make the performance go down. Also, what hardware specs are you using? How many containers are you starting at the beginning? With more info from your side we could come up with more ideas. |
Hi @diemol Reusing containers does make sense. In my case, I need static nodes whose internal IP I can get, so that I can allocate one user account per IP. When nodes are dynamic, I have to use an extra database to manage these accounts and make sure each test gets a unique account. Because I can't create thousands of test accounts, I have to lock and unlock them in the database during every test run. Your improvement plan is highly appreciated. |
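To illustrate the lock/unlock pattern described in the comment above, here is a minimal in-memory sketch. It is not the commenter's actual setup: the `AccountPool` class and the account names are hypothetical, and a real multi-machine run would need a shared store (such as the database mentioned above) rather than an in-process queue.

```python
import queue
from contextlib import contextmanager


class AccountPool:
    """Hypothetical thread-safe pool that gives each test a unique account."""

    def __init__(self, accounts):
        self._free = queue.Queue()
        for account in accounts:
            self._free.put(account)

    @contextmanager
    def lease(self, timeout=None):
        # "Lock": block until an account is free.
        # Raises queue.Empty if none is released within `timeout` seconds.
        account = self._free.get(timeout=timeout)
        try:
            yield account
        finally:
            # "Unlock": always return the account, even if the test failed.
            self._free.put(account)

    def available(self):
        # Approximate number of accounts that are currently unlocked.
        return self._free.qsize()
```

A test would wrap its body in `with pool.lease(timeout=30) as account:` so the account is released however the test ends. The pool size then caps how many tests can run at once, regardless of how many containers are started.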
@diemol Executing simultaneously on 10 to 20 containers doesn't yield stable results. Tests sometimes hang, and I see interrupt and null pointer exceptions in the logs. I can share the logs tomorrow if required. |
@SrinivasanTarget could you please also share your HW specifications? Logs are helpful as well. |
@diemol Yup, I was running in a 16 GB VM with Ubuntu 16.x. I don't have the logs at hand now, but I will share them tomorrow. I was trying to execute around 200 tests with 20 containers spun up. The same execution on elgalu/docker-selenium was fine. |
Thanks @SrinivasanTarget, logs will be useful. Perhaps you can also share with us:
All this info will be very helpful for us :) |
@diemol Please find the zalenium logs here: https://gist.github.com/SrinivasanTarget/a88aa39274717d31af46d01056408175
docker run --rm -ti --name zalenium -p 4444:4444 -p 5555:5555
Results were the same even when executed via docker-compose.
data-provider-thread-count="15", but the results are the same even when the count is reduced to 4 or 8.
Yeah it is through docker-compose https://github.com/elgalu/docker-selenium/blob/master/docker-compose.yml. |
@diemol Do you have any updates on this? Do you need any other information? |
We need more time to check it. We ran 16 parallel tests on a Linux machine with 16 GB and it worked OK; the same number of threads on a Mac with 16 GB didn't work so well. We'll check whether something can be improved or whether it is just a matter of HW. |
@diemol we also have the same issue, when we bring up more than 15 containers |
Hi, regarding performance issues: we found that with the vanilla containers (from Selenium), RAM was not an issue (14 GB is more than sufficient for 20 containers). To stabilize test runs we needed to upgrade from 2 to 4 cores (on Azure), and we even moved to a 4-core machine on an improved processor family (30% more processing capacity). We work over a swarm network, and scaling from nothing to 60 containers on 3 machines takes less than a minute, including node registration (I would risk saying about 30 seconds for all nodes to register).
Another side note: when testing heavy load on the same setup, it is easy to pass the point where you overload the machine with containers and tests start failing because the grid doesn't respond in time. The same setup we now use to run 60 browsers will run 200 without complaint, but the test results will be flaky.
Another point: because of the overhead of requesting browser instances from the Selenium hub, scaling from, say, 15 to 20 browsers may not really be worth it when running 100-200 tests in the same (parallel) test run, since going from 0 to X browsers up and running takes too long. We gained 2 minutes out of 15 going from 15 to 20 browsers. Using 60 parallel browsers for the same run brought it down to only 9 minutes.
My point is that you can surely work out the performance issues, as they must have something to do with your own customizations. I would love to see these features become more stable. Nonetheless, you should take note of these facts, which I worked out in order to differentiate between the performance issues you can address and the nature of Selenium Grid. These are my 2 cents; I hope they are helpful. |
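The timings quoted in the comment above can be turned into a rough scaling estimate. The sketch below assumes one reading of those numbers: 15 minutes with 15 browsers, 13 minutes with 20 (the "2 minutes out of 15" gained), and 9 minutes with 60. The `scaling_report` helper is hypothetical and only does the arithmetic.

```python
def scaling_report(baseline, runs):
    """Compare each run against a (browsers, minutes) baseline.

    Returns tuples of (browsers, speedup, efficiency), where efficiency
    is the measured speedup divided by the ideal linear speedup.
    """
    base_browsers, base_minutes = baseline
    report = []
    for browsers, minutes in runs:
        speedup = base_minutes / minutes
        ideal = browsers / base_browsers
        report.append((browsers, round(speedup, 2), round(speedup / ideal, 2)))
    return report


# Assumed reading of the figures in the comment above.
for row in scaling_report((15, 15), [(20, 13), (60, 9)]):
    print(row)
# (20, 1.15, 0.87)
# (60, 1.67, 0.42)
```

Under that reading, quadrupling the browsers from 15 to 60 gives about a 1.67x speedup, or roughly 42% of the ideal linear gain, which is consistent with the commenter's point that per-browser startup overhead limits what extra parallelism buys.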
Hi, I also ran into the same (probably) problem.
I reproduced this problem on Mac's Docker too. I have not experienced this kind of problem when I ran the tests with the official Selenium Docker image ( https://github.com/SeleniumHQ/docker-selenium ) while I ran the eight tests in parallel. In addition, I got some other problems, such as OS: CentOS 7.2.1511 docker's log: https://gist.github.com/katryo/d2c588554d1ace8583ccaa3e755bfb98 I hope this helps you. |
Thank you @SrinivasanTarget, @tacf, @katryo, @saikrishna321 for all the info and detailed logs. Right now we are spending time reading the logs you submitted and also running Zalenium in debug mode to spot where the main bottleneck occurs when many tests are executed at the same time. What we plan to achieve is:
I am not sure how long this will take, but we are investing time in this because we think that if we are able to fix these performance issues (and add the Kubernetes feature), Zalenium could become very successful. |
@diemol Thanks for your response :) Thanks for this wonderful project 👍 I would like to share a few observations from my end. I hope you are aware of https://github.com/aerokube/selenoid. I have been trying all the available Docker Selenium solutions. Based on my attempts, I was able to execute up to about 200 tests in 13-15 containers using Zalenium/docker-selenium/elgalu's docker-selenium images on a 16 GB Ubuntu machine. I executed the same 200 or so scripts on a 16 GB Ubuntu machine with 30 containers (CPU usage was 85-90%) using Selenoid successfully, and I got stable results from Selenoid on each execution. Though I love the idea of on-demand containers in Zalenium, I see that Selenoid spins up slightly fewer containers and seems to reuse containers to an extent. I think it would be great if Zalenium also reused containers instead of killing, relaunching, and registering nodes for each test. I accept that Kubernetes, Docker Swarm, or powerful AWS instances might be a long-term solution.
Looking forward to it :)
Are we planning to support Docker Swarm as well? Both Kubernetes and Docker Swarm now support self-healing. |
@SrinivasanTarget thanks for this info!! It is really helpful.
Do you think you could send us a PR to add an "Alternatives" section to the README.md listing all available working alternatives (with links)? I think this will be very useful to us and to our users. Ideally we would differentiate each project per use case, so people reading it can decide what fits them best without having to try them all. It should be short and concise; if it gets too long, a blog post might be a better place. |
@elgalu Sure, there are still a couple of solutions left for me to try. I will raise a PR after those attempts. |
@SrinivasanTarget That's a good piece of work on researching in terms of stability. |
Thanks for the comments @SrinivasanTarget. Continuing with the topic: right now we are in the process of breaking Zalenium apart into pieces to detect where it gets slow when many tests are added in parallel. While doing this, we found a few things yesterday that may lead to improvements. We'll work on changing the network mode and also the way containers are created, until we reach the point where the only limit is the grid itself. We'll keep you posted. |
@SrinivasanTarget you may also want to check https://github.com/seleniumkit/gridrouter as someone pointed out in another issue |
Yes it is in my list @elgalu :) |
As Diego mentioned, we tested Selenoid yesterday with great results! I was able to run, without VNC enabled, 50 tests in parallel within 1 minute on my laptop (8 cores, 16 GB)! Diego is looking into Zalenium performance issues as we speak :) |
@SrinivasanTarget regarding GridRouter, please try the newer implementation: http://github.com/aerokube/ggr It is also written in Go and has been tested enough in production. |
Just to put all eggs in one basket :) here are some recently posted articles about ggr and Selenoid: |
@vania-pooh I did read all the histories today. Interesting and a long journey. Great Stuff 👍 |
@vania-pooh Do you want to submit a paper on this for the upcoming Selenium Conference in Berlin ? |
@manoj9788: already submitted a talk about scalable Selenium. |
Oh! yeah! I see that. Thanks. |
Btw, regarding Selenium server performance I found several places in code that could be optimized:
|
Thanks for the comments @vania-pooh, and hopefully we meet at SeleniumConf! I mostly agree with the three points you mention. The thing is that we are using the grid as it is; we are not compiling our own grid (yet, though I don't rule out doing it in the future). I'll look into them, so maybe we find a way to improve the grid. We already found ways to improve Zalenium's performance by tuning some of the parameters passed to the grid and by changing the way we create the containers on the fly. We are still testing those changes, but it looks promising. It won't be as fast as Selenoid :), but at least it runs several threads in parallel in a stable way and in a decent time. More details to come soon. |
Hi all, We just released version 3.3.1i, where we have improved a few things. Taking the list of improvements that I mentioned in a previous comment, I can give you an update:
We have worked to improve Zalenium and also created a basic document with our findings. In addition, the pending tasks are tracked in separate issues. Please check the document and try the new version we have released. Thank you very much for all the input you gave us. For now, I would like to close this issue since there are too many things in it. If you find new bugs or performance problems, please create a new issue and we will work on it. We invite you to contribute your own performance data to the linked document, so more people can benefit from it. |
@diemol Can you add the tag for 3.3.1l? |
Hi @felippenardi, this was released with tag 3.3.1i, but more improvements were made in subsequent releases. The current release is 3.3.1k; 3.3.1l is still under development. |
Oh got you! Thanks :) |
Hi all,
While using Zalenium, I've had some performance issues with the Docker containers. Containers are so slow to start and launch that all the tests begin timing out. Even when I use a fixed number of containers, some of them time out and are shut down, and waiting for the others to start ends up being very slow. For reference, I'm usually running about 10 to 20 tests in parallel from a test run of over 150 tests; the machine slows down considerably, and all the tests past the first one usually fail.
Is there any way to speed up this process? Either a feature that is planned or already available, or a hack.
Thank you in advance.