
Nomad task got OOM killed when it was using only ~70% of its MemoryMB limit #4495

Closed
fho opened this issue Jul 11, 2018 · 9 comments


fho commented Jul 11, 2018

This is the same issue as described in #4491, but filed as a bug report.

Nomad version

Nomad v0.8.3

Issue

We have a Nomad job that runs an application called claimsearch-service with the exec driver.
The memory limit is set to 50 MiB in the Nomad job file.
The application got OOM killed when it was only using 35.35 MB RSS.

The memory cgroup contained the following processes with the following RSS usage:

Process               RSS
nomad                 13.75 MiB
claimsearch-service   35.36 MiB
grpc-health-check      4.93 MiB

Expected behaviour

  • The task is not OOM killed as long as it uses less RSS memory than configured in the MemoryMB parameter of the resources stanza in the Nomad job file.
  • The configured memory limit applies only to the executed Nomad task.

That the memory consumption of other processes is counted against the task's memory limit is unintuitive, it is not documented, and it makes it difficult to calculate the correct MemoryMB value for a task.
See also: #4491

OOM kill Kernel log

Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: Task in /nomad/ebf75298-ba47-98a8-28e5-a08daf20d60e killed as a result of limit of /nomad/ebf75298-ba47-98a8-28e5-a08daf20d60e
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: memory: usage 51200kB, limit 51200kB, failcnt 16402868
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: Memory cgroup stats for /nomad/ebf75298-ba47-98a8-28e5-a08daf20d60e: cache:36KB rss:51164KB rss_huge:16384KB mapped_file:8KB dirty:0KB writeback:0KB inactive_anon:25624KB active_anon:23492KB inactive_file:0KB active_file:0KB unevictable:0KB
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: [31962]     0 31962    82711     3521      44       5       61             0 nomad
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: [31970]    33 31970    13807     9046      31       5        0             0 claimsearch-ser
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: [30195]    33 30195    28465     1262      19       6        0             0 grpc-health-che
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: Memory cgroup out of memory: Kill process 31970 (claimsearch-ser) score 709 or sacrifice child
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: Killed process 31970 (claimsearch-ser) total-vm:55228kB, anon-rss:36184kB, file-rss:0kB
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: grpc-health-che invoked oom-killer: gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 10 08:56:03 prd-sisu-nomad-client-1.localdomain kernel: grpc-health-che cpuset=ebf75298-ba47-98a8-28e5-a08daf20d60e mems_allowed=0

Job file

The full job file can be found at: http://dpaste.com/05YWFVW
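
The dpaste link may no longer resolve. For context, here is a hypothetical minimal sketch of the relevant parts of such a job file, assuming typical Nomad 0.8 syntax; the job name, paths, and check parameters are illustrative and not taken from the original file:

job "claimsearch" {
  # datacenters value is illustrative
  datacenters = ["dc1"]

  group "claimsearch" {
    task "claimsearch-service" {
      driver = "exec"

      config {
        # path is illustrative
        command = "local/claimsearch-service"
      }

      resources {
        # MemoryMB: hard limit of the task's memory cgroup; as described in this
        # issue, the nomad executor and script checks share this cgroup
        memory = 50
      }

      service {
        name = "claimsearch-service"

        check {
          type     = "script"
          name     = "grpc-health-check"
          # path is illustrative
          command  = "local/grpc-health-check"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}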

@preetapan
Contributor

@fho thanks for the details. We plan to fix executor memory utilization in the upcoming release; 13 MB is rather high.


fho commented Jul 13, 2018

@preetapan
The amount of memory that the nomad executor consumes is not the issue.
As long as the task is in a cgroup together with other processes, the task can get OOM killed even though it uses less
memory than its configured memory limit.

Let's assume the memory consumption of the nomad executor were lowered from 13 MB to 5 MB.
Now I run a task with a low memory footprint of 2 MB via Nomad and configure its memory limit to 5 MB to have some buffer.
The task would still get OOM killed because the cgroup memory limit is reached:
2 MB task memory + 5 MB nomad executor memory > 5 MB memory limit

@preetapan
Contributor

@fho see the comments my co-worker and I already made about why the executor and script checks have to be in the same cgroup.

#4491 (comment)
#4491 (comment)

There's always going to be some amount of overhead from using the executor, and we will address that with a mechanism that is still TBD - we will likely either account for the overhead when creating the container, or use soft limits.
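
Until such a mechanism exists, the practical workaround implied in this thread is to pad the configured limit by the observed overhead. A hypothetical resources stanza based on the numbers reported above (executor ~14 MiB, grpc-health-check ~5 MiB, application ~36 MiB); the exact overhead will vary with the Nomad version and the checks in use:

resources {
  # ~36 MiB application + ~14 MiB nomad executor + ~5 MiB script check,
  # rounded up for some headroom (illustrative values taken from this report)
  memory = 60
}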


fho commented Jul 13, 2018

@preetapan

The executor is responsible for managing the lifecycle of the application, so it's a desired feature to have it be in the same cgroup.
[..]
Allowing script checks to run outside the task's container and resource limits would be a major security and isolation issue.

I don't understand yet why they have to be in the same cgroup.
It would be great if you could elaborate on it.

  • What would be the disadvantages of other solutions, like having each check and each nomad-executor in its own memory cgroup?
  • What are the advantages of having them in the same memory cgroup?
  • What are the concrete security and isolation issues if each check and each nomad-executor is in its own memory cgroup?

thanks a lot


memelet commented Sep 6, 2018

I'm having a hard time understanding this. I have a container with a 600 MB limit and a Java process with heap+non-heap usage of ~360 MB, and it's getting OOM killed every 10 minutes or so. It can't be that the Nomad services are using 240 MB? And if not, how can I tell why the process is getting killed?

@sirkjohannsen

I would just like to add that with Nomad 0.9 the resource footprint of the Nomad processes within the cgroup has increased even more.
Most of our lightweight microservices now need double the resources configured in Nomad compared to 0.8.


notnoop commented May 31, 2019

Wanted to clarify the behavior of Nomad 0.9:


notnoop commented Dec 13, 2019

I'm closing this ticket as the exec driver has changed significantly since 0.8, and I believe the notes here are either addressed or no longer relevant. I'd encourage users experiencing memory issues to create a new issue against 0.10.

Since my May 31 comment, we have made the following changes:

Please let us know of any issues you see and we will follow up. Thanks!

notnoop closed this as completed on Dec 13, 2019