Reduce CPU utilization in executors when scanning for PIDs. #5832

Closed
picoDoc opened this issue Jun 13, 2019 · 8 comments · Fixed by #5951 or #6008

Comments

@picoDoc

picoDoc commented Jun 13, 2019

Nomad version

Nomad v0.9.2 (028326684b9da489e0371247a223ef3ae4755d87)

Operating system and Environment details

$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.2.1511 (Core)
Release:        7.2.1511
Codename:       Core

Issue

I've found that when running large numbers of nomad jobs per host (>100), the CPU overhead of the nomad executor processes becomes a major problem. This seems to be due to each nomad executor frequently scanning the process tree (via collectPids). Reducing this frequency greatly improves the situation for us (see reproduction below), but the interval is currently hard-coded to 5 seconds. Could this be made configurable? As far as I can tell, the collection of pids is only required for telemetry collection. Any help here would be greatly appreciated!
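For context, my rough mental model of what each executor is doing is a loop along the lines of the sketch below. This is illustrative only and not copied from the Nomad source; collectPids and pidScanInterval are the real names from the code, everything else is paraphrased:

package executor

import (
	"context"
	"time"
)

// pidScanInterval mirrors the hard-coded 5 second interval in the executor.
const pidScanInterval = 5 * time.Second

// scanLoop is an illustrative stand-in for the executor's pid collection:
// on every tick it walks the host's process table looking for the task's
// descendants, so its cost grows with the total number of PIDs on the host.
func scanLoop(ctx context.Context, collectPids func()) {
	ticker := time.NewTicker(pidScanInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			collectPids()
		case <-ctx.Done():
			return
		}
	}
}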

Reproduction steps

Running the nomad agent with the config file below, and starting up 10 instances of our test job (just a simple sleep command) using nomad job run test.nomad, the executors settle down to using about 0.3% CPU each:

$ ps -eo pid,%cpu,cmd|grep executor
  PID %CPU CMD
25275  0.3 nomad executor {"LogFile":"/home/...
25292  0.3 nomad executor {"LogFile":"/home/...
25309  0.3 nomad executor {"LogFile":"/home/...
25340  0.3 nomad executor {"LogFile":"/home/...
25366  0.3 nomad executor {"LogFile":"/home/...
25370  0.3 nomad executor {"LogFile":"/home/...
25374  0.3 nomad executor {"LogFile":"/home/...
25381  0.3 nomad executor {"LogFile":"/home/...
25414  0.3 nomad executor {"LogFile":"/home/...
25446  0.3 nomad executor {"LogFile":"/home/...

On the other hand if I repeat this with the count in the job config below increased to 500, each executor now uses something closer to 2% CPU:

$ ps -eo pid,%cpu,cmd|grep executor
  PID %CPU CMD
11006  1.8 nomad executor {"LogFile":"/home/...
14985  1.8 nomad executor {"LogFile":"/home/...
16390  1.8 nomad executor {"LogFile":"/home/...
13952  1.8 nomad executor {"LogFile":"/home/...
14714  1.8 nomad executor {"LogFile":"/home/...
11164  1.8 nomad executor {"LogFile":"/home/...
16467  1.8 nomad executor {"LogFile":"/home/...
 9901  1.8 nomad executor {"LogFile":"/home/...
...

This is an issue for us because in this case the executors start to use the majority of CPU cycles on our hosts. I assume this is because each executor spawns a large number of go threads and so increases the overall pid count on the system, which each executor is scanning every 5 seconds. I tried changing the pidScanInterval to 120s and recompiling, and this brought the CPU usage per executor down to less than when there were only 10 processes, so if we were able to tweak this it would solve our issue.
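(Back-of-envelope: with 500 executors and a 5 second interval, that is on the order of 100 full process-table scans per second across the host, versus about 2 per second with 10 executors, and each scan is also walking a much larger table, so the total work grows roughly quadratically with the number of tasks.)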

Nomad config

{
  "client": {"enabled":true,"options": {"driver.raw_exec.enable": "1"}},
  "server": {"enabled":true, "bootstrap_expect": 1},
  "log_level": "DEBUG",
  "telemetry": {"collection_interval": "5m", "disable_dispatched_job_summary_metrics": true}
}

Job file

$ cat test.nomad
job "test" {
  datacenters = ["dc1"]
  type = "service"
  group "test" {
    count = 10
    ephemeral_disk {
      size = "101"
    }
    task "sleep" {
      driver = "raw_exec"
      config {
        command = "sleep"
        args    = ["3600"]
      }
      resources {
        memory = 10
        cpu    = 20
        network {
          mbits = 10
          port "db" {}
        }
      }
    }
  }
}
@preetapan
Contributor

@picoDoc Thanks for the detailed report and repro steps.

I tried replicating what you saw. I do see higher CPU utilization with pidScanInterval left at 5 seconds than when it was modified to 120 seconds (it spiked initially to 1.8% and then stabilized to around 0.7%), though I am not seeing it go as high as the 2% you reported above. Regardless, this seems like a valid improvement suggestion, because by increasing the pidScanInterval I did see a progressive reduction in CPU utilization on the executor.

This is very much an internal implementation detail of the executor and is not currently designed to be overridden. I will discuss internally and report back with our concrete plans.

@preetapan
Contributor

We discussed this internally, and rather than making pidScanInterval configurable we think there are other optimization approaches for the code that scans PIDs; it can be done much more efficiently than our current approach. We will target this in a future release.

@preetapan preetapan changed the title Could pidScanInterval be made configurable? Reduce CPU utilization in executors when scanning for PIDs. Jun 24, 2019
@picoDoc
Author

picoDoc commented Jul 1, 2019

Ok cool, thanks. Any idea on a timescale for this improvement? This currently has quite a large impact for us.

@picoDoc
Author

picoDoc commented Jul 22, 2019

Since the changes in #5951, the test described above no longer launches any jobs. All jobs fail with a timeline something like:

Recent Events:
Time                       Type            Description
2019-07-22T05:32:26-04:00  Killing         Sent interrupt. Waiting 5s before force killing
2019-07-22T05:31:59-04:00  Not Restarting  Error was unrecoverable
2019-07-22T05:31:59-04:00  Driver Failure  failed to launch command with executor: rpc error: code = Unknown desc = mkdir /sys/fs/cgroup/freezer/nomad: permission denied
2019-07-22T05:31:58-04:00  Task Setup      Building Task Directory
2019-07-22T05:31:58-04:00  Received        Task received by client

Could this be because this change assumes cgroups are being used? In our case I think they are not used, because nomad is not launched as root.

Can we re-open this ticket? @langmartin @preetapan

@langmartin
Contributor

@picoDoc thanks very much for the followup; I left a testing gap around non-root raw_exec use.

In order to take advantage of the fix implemented in #5951, you'll need to run nomad with cgroup creation privileges. This should be possible outside of nomad if you use a root script to create a cgroup that allows the nomad user to create cgroups.

Without that permission, #5991 will allow nomad to start, but won't improve your CPU usage.
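
For a rough sense of why the cgroup approach is so much cheaper: once the task's processes are in their own freezer cgroup, the kernel already maintains the PID list, and collecting it is a single small file read instead of a walk of the whole process table. The sketch below is illustrative only (not the literal Nomad code); the path mirrors the /sys/fs/cgroup/freezer/nomad layout from your error message:

package executor

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readCgroupPids returns the PIDs listed in a cgroup's cgroup.procs file,
// e.g. /sys/fs/cgroup/freezer/nomad/<task>/cgroup.procs. Reading this one
// small file is far cheaper than scanning every entry in the process table.
func readCgroupPids(cgroupDir string) ([]int, error) {
	data, err := os.ReadFile(cgroupDir + "/cgroup.procs")
	if err != nil {
		return nil, fmt.Errorf("reading cgroup.procs: %w", err)
	}
	var pids []int
	for _, field := range strings.Fields(string(data)) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			continue // ignore anything that is not a PID
		}
		pids = append(pids, pid)
	}
	return pids, nil
}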

@picoDoc
Author

picoDoc commented Jul 24, 2019

Awesome thank you!

For reference, to give nomad the appropriate cgroup permissions I had to run:

sudo mkdir /sys/fs/cgroup/freezer/nomad
sudo chown -R ven_mdoherty /sys/fs/cgroup/freezer/nomad

After this I re-ran the tests above using Nomad v0.9.4-rc1 (4999923574186583ce7a007865f745ecbc55ceb9) with 500 executors and the resource usage looks much better:

  PID %CPU CMD
21113  0.0 nomad executor {"LogFile":"/home/...
21116  0.0 nomad executor {"LogFile":"/home/...
21124  0.0 nomad executor {"LogFile":"/home/...
21129  0.0 nomad executor {"LogFile":"/home/...
21192  0.0 nomad executor {"LogFile":"/home/...
21193  0.0 nomad executor {"LogFile":"/home/...
...

So thanks, I appreciate the help! Would it be worth noting in the documentation that you need to set up cgroups appropriately to take advantage of this optimization?

@langmartin
Contributor

Great, glad to hear it! This should be documented; I've just added some docs.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022