Install Linux perf on some Linux machines #1274
Comments
@rvagg would it make sense to have it installed in the images supporting the containerized builds ? @mmarchini which of these two is it:
|
The first one (if the PR lands) |
Are we talking about the broken I don't think we're going to be able to jam this into the containers unfortunately, even though that's a great place for these kinds of tests. A better approach might just be installing perf on one of our Ubuntu LTS hosts, maybe 18.04, and let the tests detect its presence and run if it's there and skip if it's not. We can lock this in to our Ansible scripts so we always get linux-tools-generic installed on some of those hosts. |
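The "detect its presence and run if it's there, skip if it's not" idea can be sketched roughly as follows. This is only an illustration, not the code from nodejs/node#20783: the function names `find_perf` and `perf_usable` are hypothetical, and the assumption is that a working install answers `perf --version` with exit code 0.

```python
import shutil
import subprocess


def find_perf(which=shutil.which):
    """Return the path to a `perf` executable on PATH, or None."""
    return which("perf")


def perf_usable(which=shutil.which, run=subprocess.run):
    """Return True only if perf is on PATH and `perf --version` exits with 0.

    Hypothetical helper: `which` and `run` are injectable so the check can be
    exercised on machines that do not have perf installed.
    """
    path = find_perf(which)
    if path is None:
        return False
    try:
        result = run(
            [path, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The file exists but could not be executed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("run perf tests" if perf_usable() else "skip perf tests")
```

Checking the exit code rather than only the presence of the binary means a host where the packaged wrapper refuses to run is treated the same as a host without perf.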
That's the theory but in practice you can just mix them. For example: I ran stock perf 3.13 on Ubuntu 14.04 for a while against the latest, continually upgraded mainline kernel and it worked just fine. (Yes, I run mainline kernels. I even test -rc's.) |
I usually just clone the Linux master and cd into the perf source to compile it when I need perf on my servers. Does not seem to cause problems when I upgrade kernels from time to time. |
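A rough sketch of the build-from-source approach described above, assuming the mainline repository on git.kernel.org and a shallow clone. The helper names, the `workdir` layout and the `dry_run` switch are illustrative only, and the build dependencies that `tools/perf` needs are not handled here.

```python
import subprocess
from pathlib import Path

# Assumption: mainline kernel tree; any other Linux tree would work the same way.
LINUX_REPO = "https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git"


def build_commands(workdir):
    """Return the commands that clone the kernel tree and build tools/perf."""
    src = Path(workdir) / "linux"
    return [
        ["git", "clone", "--depth", "1", LINUX_REPO, str(src)],
        ["make", "-C", str(src / "tools" / "perf")],
    ]


def build_perf(workdir, dry_run=True):
    """Run the build commands, or only return them when dry_run is True."""
    commands = build_commands(workdir)
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in build_perf("/tmp/perf-build"):
        print(" ".join(cmd))
```

With the defaults this only prints the two commands; passing `dry_run=False` would actually clone and build, leaving the `perf` binary under `linux/tools/perf` in the chosen directory.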
Can we use the benchmark CI machine for this? It would be nice to have perf there, I am thinking about using |
Yes. It will be unbroken once V8 6.7 lands and the tests would help us to know in advance if changes on V8 break it again.
Yes. The proposed test can be seen here: nodejs/node#20783
The proposed test already does that.
+1 for that, I don't think we need to run this test on containers (at least not for now)
Looks like a good idea, as long as we have |
Installing on the benchmarking machine sounds good to me (although I think it would be the 2 additional benchmark machines as opposed to the ones used for the nightly runs). The only challenge is that if we only have it installed on a single machine, then we'd have to run the test nightly as opposed to as part of every regression test (not that we'd want to, but that is probably all we'd be able to do). If nightly is OK, then we can easily set up a job to run once a day pinned to the benchmarking machine, just like the benchmark jobs. Ideally, we'd really like it to be run on every PR that upgrades V8 as well. If we had the nightly job, then it would just be a matter of ensuring that it was added to the list of what to run to validate when updating V8. |
Let me get linux-tools-generic on some Ubuntu machines we already have in CI. The test that's in the WIP should pick it up and skip otherwise, so I think that'll solve this. Having perf on the benchmarking machines would be a good idea regardless; I'll look at that at the same time. The Intel/NearForm ones mostly go through Ansible, so I think we can just put the changes in there. |
@rvagg getting it on some of the Ubuntu machines is good, but I think we want to be sure the test runs regularly, and I don't think we can rely on it landing on those machines by chance during the regular regression runs. Once it is installed on some of the machines, we can set up a job that only runs on that subset and make sure it runs at some interval. If we have it on enough machines, we could make that job (the one that runs on the subset) run as part of the regular regression job as opposed to at some interval. |
I still think it's more important to run on every test run at nodejs/node-v8. Usually we open the PR to update V8 after the version branch-cut, so if the update breaks perf it might be already too late to fix it (remember: perf is not officially supported by the V8 team). Ideally the test should also run on the PR, but the priority should be to make sure it runs on nodejs/node-v8. |
Looking at nodejs/node#20783, I think it is also possible to write the test as a benchmark, similar to how the http/http2 benchmarks are structured (they also have external dependencies like wrk), and write a test for the benchmark that controls the parameters so that it does not take much time to run. Or, if we can get the benchmark job to run on an arbitrary base and PR, we can simply run the benchmark for V8 updates (which is also worth doing regardless of the perf test). |
Totally agree, running benchmarks on v8 updates would be a good idea.
Because it's easier to set up the infra for benchmarks, or because there are concerns about the test speed? The test takes only a few seconds to run (should be below 10s even on slower machines). BTW, I'm open to help install |
@mmarchini Mostly because it depends on external tools for stats, similar to how the HTTP/HTTP2 benchmarks work. Also the benchmark machines are supposed to only run one job at a time for the stability of the results, so no benchmarks should be run when running tests on them. Although come to think of it, maybe we could put the post-mortem and perf tests in a new directory under |
I like this idea. I'll add a commit to nodejs/node#20783 moving those tests to a new directory. |
I'm wary about this one. I've done some experimenting and Debian is a bit of a mess, but Ubuntu maintains linux-tools-generic pretty nicely. However, it's strict about matching kernel version to perf version; they have to be exactly the same down to tags:
^ in this case it's because the server is running 4.4.0-119 but it's had one or two kernel updates without reboot since that point and is up to 4.4.0-127, ready to run once a reboot happens. So linux-tools-generic installs linux-tools-4.4.0-127 and gives that error. Reboot and it's all fine because we have 4.4.0-127 everywhere. Note that it's not actually just a "warning"; it's fatal with a non-zero exit code and it doesn't run anything useful. We don't currently tie rebooting to software updates. It's very common for a new kernel to be installed but not activated until a reboot, and those reboots may not happen for long periods of time. Updates may be done manually or as part of another Ansible run against the machine, but we still end up in that awkward state if there's a kernel update but no reboot. I think we're going to run into the same thing, but maybe worse, in Docker given the host/container mismatch problem and the longer caching of container layers for building. Perhaps this error is only for packaged |
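The mismatch described above (running 4.4.0-119 while the installed tools are for 4.4.0-127) can be detected up front with a small check. This is a sketch under assumptions: it takes the version strings as plain inputs rather than querying dpkg, the helper names are hypothetical, and it assumes Ubuntu-style release strings such as `4.4.0-119-generic`.

```python
import platform
import re


def kernel_abi(release):
    """Strip the flavour suffix: '4.4.0-119-generic' -> '4.4.0-119'."""
    match = re.match(r"^(\d+\.\d+\.\d+-\d+)", release)
    return match.group(1) if match else release


def tools_match_kernel(running_release, installed_tools_versions):
    """Return True if some installed linux-tools version matches the running kernel."""
    wanted = kernel_abi(running_release)
    return wanted in {kernel_abi(v) for v in installed_tools_versions}


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    running = platform.release()
    # Hypothetical input: in practice this list would come from the package manager.
    print(running, tools_match_kernel(running, ["4.4.0-127"]))
```

A CI job could use a check like this to report "kernel updated, reboot pending" instead of failing with the fatal error from the packaged wrapper.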
I think that is the case. From my server
The |
Ansible role to install Linux perf on Ubuntu by cloning the Linux source code and building tools/perf to avoid Kernel mismatch errors. Ref: nodejs#1274
Install Linux perf on Ubuntu 16.04 machines through jenkins/worker/create playbook. Ref: nodejs#1274
Fixed by #1321 |
Linux perf has been broken on V8 since the TurboFan/Ignition pipeline became the default compiler. Recently, in V8 6.7, we got it working again (through a flag). Since there are some Node.js tools and some huge Node.js deployments relying on Linux perf (and other external profilers), having tests will help keep these tools more stable.
For those tests to work, we need Linux perf available on at least some Linux machines. We could start on Ubuntu 16.04 machines and, if we want, install it on other machines later. The package can be installed with:
apt install linux-tools-generic
Is it feasible?
Ref: nodejs/node#20783