Reduce CPU utilization in executors when scanning for PIDs. #5832

Closed
picoDoc opened this issue Jun 13, 2019 · 8 comments · Fixed by #5951 or #6008

Comments

@picoDoc

picoDoc commented Jun 13, 2019

Nomad version

Nomad v0.9.2 (028326684b9da489e0371247a223ef3ae4755d87)

Operating system and Environment details

$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.2.1511 (Core)
Release:        7.2.1511
Codename:       Core

Issue

I've found that when running large numbers of nomad jobs per host (>100), the CPU overhead of the nomad executor processes becomes a major problem. This seems to be due to each nomad executor frequently scanning the process tree (via collectPids). Reducing this frequency greatly improves the situation for us (see reproduction below), but the interval is currently hard-coded to 5 seconds. Could this be made configurable? As far as I can tell, the collection of pids is only required for telemetry collection. Any help here would be greatly appreciated!
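For context, my rough mental model of what each executor is doing is a loop along the lines of the sketch below. This is illustrative only and not copied from the Nomad source; collectPids and pidScanInterval are the real names from the code, everything else is paraphrased:

package executor

import (
	"context"
	"time"
)

// pidScanInterval mirrors the hard-coded 5 second interval in the executor.
const pidScanInterval = 5 * time.Second

// scanLoop is an illustrative stand-in for the executor's pid collection:
// on every tick it walks the host's process table looking for the task's
// descendants, so its cost grows with the total number of PIDs on the host.
func scanLoop(ctx context.Context, collectPids func()) {
	ticker := time.NewTicker(pidScanInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			collectPids()
		case <-ctx.Done():
			return
		}
	}
}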

Reproduction steps

Running the nomad agent with the config file below, and starting up 10 instances of our test job (just a simple sleep command) using nomad job run test.nomad, the executors settle down to using about 0.3% CPU each:

$ ps -eo pid,%cpu,cmd|grep executor
  PID %CPU CMD
25275  0.3 nomad executor {"LogFile":"/home/...
25292  0.3 nomad executor {"LogFile":"/home/...
25309  0.3 nomad executor {"LogFile":"/home/...
25340  0.3 nomad executor {"LogFile":"/home/...
25366  0.3 nomad executor {"LogFile":"/home/...
25370  0.3 nomad executor {"LogFile":"/home/...
25374  0.3 nomad executor {"LogFile":"/home/...
25381  0.3 nomad executor {"LogFile":"/home/...
25414  0.3 nomad executor {"LogFile":"/home/...
25446  0.3 nomad executor {"LogFile":"/home/...

On the other hand if I repeat this with the count in the job config below increased to 500, each executor now uses something closer to 2% CPU:

$ ps -eo pid,%cpu,cmd|grep executor
  PID %CPU CMD
11006  1.8 nomad executor {"LogFile":"/home/...
14985  1.8 nomad executor {"LogFile":"/home/...
16390  1.8 nomad executor {"LogFile":"/home/...
13952  1.8 nomad executor {"LogFile":"/home/...
14714  1.8 nomad executor {"LogFile":"/home/...
11164  1.8 nomad executor {"LogFile":"/home/...
16467  1.8 nomad executor {"LogFile":"/home/...
 9901  1.8 nomad executor {"LogFile":"/home/...
...

This is an issue for us because in this case the executors start to use the majority of CPU cycles on our hosts. I assume this is because each executor spawns a large number of go threads and so increases the overall pid count on the system, which each executor is scanning every 5 seconds. I tried changing the pidScanInterval to 120s and recompiling, and this brought the CPU usage per executor down to less than when there were only 10 processes, so if we were able to tweak this it would solve our issue.
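(Back-of-envelope: with 500 executors and a 5 second interval, that is on the order of 100 full process-table scans per second across the host, versus about 2 per second with 10 executors, and each scan is also walking a much larger table, so the total work grows roughly quadratically with the number of tasks.)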

Nomad config

{
  "client": {"enabled":true,"options": {"driver.raw_exec.enable": "1"}},
  "server": {"enabled":true, "bootstrap_expect": 1},
  "log_level": "DEBUG",
  "telemetry": {"collection_interval": "5m", "disable_dispatched_job_summary_metrics": true}
}

Job file

$ cat test.nomad
job "test" {
  datacenters = ["dc1"]
  type = "service"
  group "test" {
    count = 10
    ephemeral_disk {
      size = "101"
    }
    task "sleep" {
      driver = "raw_exec"
      config {
        command = "sleep"
        args    = ["3600"]
      }
      resources {
        memory = 10
        cpu    = 20
        network {
          mbits = 10
          port "db" {}
        }
      }
    }
  }
}
@preetapan
Contributor

@picoDoc Thanks for the detailed report and repro steps.

I tried replicating what you saw. I do see higher CPU utilization with pidScanInterval left at 5 seconds than when it was modified to 120 seconds (it spiked initially to 1.8% and then stabilized to around 0.7%), though I am not seeing it go as high as the 2% you reported above. Regardless, this seems like a valid improvement suggestion, because by increasing the pidScanInterval I did see a progressive reduction in CPU utilization on the executor.

This is very much an internal implementation detail of the executor and is not currently designed to be overridden. I will discuss internally and report back with our concrete plans.

@preetapan
Contributor

We discussed this internally, and rather than making pidScanInterval configurable we think there are other optimization approaches for the code that scans PIDs; it can be done much more efficiently than our current approach. We will target this in a future release.

@preetapan preetapan changed the title Could pidScanInterval be made configurable? Reduce CPU utilization in executors when scanning for PIDs. Jun 24, 2019
@picoDoc
Author

picoDoc commented Jul 1, 2019

Ok cool, thanks. Any idea on a timescale for this improvement? This currently has quite a large impact for us.

@picoDoc
Author

picoDoc commented Jul 22, 2019

Since the changes in #5951, the test described above no longer launches any jobs. All jobs fail with a timeline something like:

Recent Events:
Time                       Type            Description
2019-07-22T05:32:26-04:00  Killing         Sent interrupt. Waiting 5s before force killing
2019-07-22T05:31:59-04:00  Not Restarting  Error was unrecoverable
2019-07-22T05:31:59-04:00  Driver Failure  failed to launch command with executor: rpc error: code = Unknown desc = mkdir /sys/fs/cgroup/freezer/nomad: permission denied
2019-07-22T05:31:58-04:00  Task Setup      Building Task Directory
2019-07-22T05:31:58-04:00  Received        Task received by client

Could this be because this change assumes cgroups are being used? In our case I think they are not used, because nomad is not launched as root.

Can we re-open this ticket? @langmartin @preetapan

@langmartin
Contributor

@picoDoc thanks very much for the followup; I left a testing gap around non-root raw_exec use.

In order to take advantage of the fix implemented in #5951, you'll need to run nomad with cgroup creation privileges. This should be possible outside of nomad if you use a root script to create a cgroup that allows the nomad user to create cgroups.

Without that permission, #5991 will allow nomad to start, but won't improve your CPU usage.
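
For a rough sense of why the cgroup approach is so much cheaper: once the task's processes are in their own freezer cgroup, the kernel already maintains the PID list, and collecting it is a single small file read instead of a walk of the whole process table. The sketch below is illustrative only (not the literal Nomad code); the path mirrors the /sys/fs/cgroup/freezer/nomad layout from your error message:

package executor

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readCgroupPids returns the PIDs listed in a cgroup's cgroup.procs file,
// e.g. /sys/fs/cgroup/freezer/nomad/<task>/cgroup.procs. Reading this one
// small file is far cheaper than scanning every entry in the process table.
func readCgroupPids(cgroupDir string) ([]int, error) {
	data, err := os.ReadFile(cgroupDir + "/cgroup.procs")
	if err != nil {
		return nil, fmt.Errorf("reading cgroup.procs: %w", err)
	}
	var pids []int
	for _, field := range strings.Fields(string(data)) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			continue // ignore anything that is not a PID
		}
		pids = append(pids, pid)
	}
	return pids, nil
}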

@picoDoc
Author

picoDoc commented Jul 24, 2019

Awesome thank you!

For reference, to give nomad the appropriate cgroup permissions I had to run:

sudo mkdir /sys/fs/cgroup/freezer/nomad
sudo chown -R ven_mdoherty /sys/fs/cgroup/freezer/nomad

After this I re-ran the tests above using Nomad v0.9.4-rc1 (4999923574186583ce7a007865f745ecbc55ceb9) with 500 executors and the resource usage looks much better:

  PID %CPU CMD
21113  0.0 nomad executor {"LogFile":"/home/...
21116  0.0 nomad executor {"LogFile":"/home/...
21124  0.0 nomad executor {"LogFile":"/home/...
21129  0.0 nomad executor {"LogFile":"/home/...
21192  0.0 nomad executor {"LogFile":"/home/...
21193  0.0 nomad executor {"LogFile":"/home/...
...

So thanks, I appreciate the help! Would it be worth noting in the documentation that you need to set up cgroups appropriately to take advantage of this optimization?

@langmartin
Contributor

Great, glad to hear it! This should be documented; I've just added some docs.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022