Podman stats showing wrong memory usage #1642

abdelaziz-ouhammou · 2023-09-08T19:48:06Z

Issue Description

$ podman version
Client:       Podman Engine
Version:      4.4.1
API Version:  4.4.1
Go Version:   go1.19.6
Built:        Thu Jun 15 17:39:56 2023
OS/Arch:      linux/amd64

$ rpm -q podman
podman-4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64

Steps to reproduce the issue

Run any command that creates a file inside the container, such as tar or dd.
Example:

$ sudo podman container exec -it api sh
/app # dd if=/dev/zero of=output.dat  bs=24M  count=100
100+0 records in
100+0 records out
/app # sync
/app #

Describe the results you received

When running podman stats, it shows high memory utilisation for the container:

$ sudo podman stats
ID             NAME        CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME           AVG CPU %
ed71asdasdaas  api         4.91%       2.58GB / 8.052GB   32.04%      842.7MB / 1.188GB  55.28MB / 29.39MB  7           15m35.164593502s   4.91%

The process normally uses 48 MB.

The only way to fix this is to manually run the following command on the host:

$ sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches'
$ sudo podman stats
ed71asdasda   api         0.14%       40.98MB / 8.052GB  0.51%       852.7MB / 1.201GB  55.28MB / 29.93MB  7           15m45.419706366s   4.91%

Describe the results you expected

I expect podman stats to be accurate for monitoring the memory usage of containers. However, running a backup or any other I/O operation inside the container skews the output.

podman info output

host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.6-1.module+el8.8.0+18098+9b44df5f.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: 8c4ab5a095127ecc96ef8a9c885e0e1b14aeb11b'
  cpuUtilization:
    idlePercent: 81.92
    systemPercent: 4.36
    userPercent: 13.72
  cpus: 2
  distribution:
    distribution: '"rhel"'
    version: "8.8"
  eventLogger: file
  hostname: [hostname redacted]
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
  kernel: 4.18.0-372.19.1.el8_6.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 4544892928
  memTotal: 8052305920
  networkBackend: cni
  ociRuntime:
    name: runc
    package: runc-1.1.4-1.module+el8.8.0+18060+3f21f2cc.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.1.4
      spec: 1.0.2-dev
      go: go1.19.4
      libseccomp: 2.5.2
  os: linux
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_SYS_CHROOT,CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-2.module+el8.8.0+18060+3f21f2cc.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 8332025856
  swapTotal: 8589930496
  uptime: 1592h 37m 31.00s (Approximately 66.33 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: ~/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: ~/.local/share/containers/storage
  graphRootAllocated: 10726932480
  graphRootUsed: 1256620032
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1002/containers
  transientStore: false
  volumePath: ~/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1686839996
  BuiltTime: Thu Jun 15 17:39:56 2023
  GitCommit: ""
  GoVersion: go1.19.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

giuseppe · 2023-09-11T07:20:45Z

Can you reproduce it with any image?

How did you create the container?

abdelaziz-ouhammou · 2023-09-11T07:41:16Z

@giuseppe Yes, so far all images are affected (postgres image, prometheus image, ...).

The containers were created using native podman commands:

sudo podman container run --restart=always -d --name api -p 4444:4444 api

giuseppe · 2023-09-11T09:28:54Z

That is the expected output: the file you have just created is in the page cache, and the kernel accounts for it in the memory.usage_in_bytes file.

I've tried your same command on cgroup v1 and the cgroup reports the following usage:

# podman stats --no-stream --no-reset
ID            NAME              CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME       AVG CPU %
5db629b04d96  unruffled_darwin  2.21%       2.87GB / 3.843GB   74.68%      190.2MB / 409.4kB  27.49MB / 5.935MB  1           42.101330797s  2.21%
# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.usage_in_bytes 
2869575680

This memory is reclaimed if the container needs more. In fact, you can see that almost all of it is cache:

# cat /sys/fs/cgroup/memory/machine.slice/libpod-5db629b04d960b7b4641928480075c245dc0503dc309fe033eb76096e6adee62.scope/memory.stat
cache 2842775552
rss 675840
rss_huge 0
shmem 0
mapped_file 0
dirty 0
writeback 0
swap 0
pgpgin 2239392
pgpgout 1545190
pgfault 279477
pgmajfault 6
inactive_anon 655360
active_anon 20480
inactive_file 2653802496
active_file 188973056
unevictable 0
hierarchical_memory_limit 9223372036854771712
hierarchical_memsw_limit 9223372036854771712
total_cache 2842775552
total_rss 675840
total_rss_huge 0
total_shmem 0
total_mapped_file 0
total_dirty 0
total_writeback 0
total_swap 0
total_pgpgin 2239392
total_pgpgout 1545190
total_pgfault 279477
total_pgmajfault 6
total_inactive_anon 655360
total_active_anon 20480
total_inactive_file 2653802496
total_active_file 188973056
total_unevictable 0

You can give the kernel a hint to release a file from the cache with posix_fadvise. For example, I've tried the following C program:

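#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* The file is redirected to stdin (fd 0); offset 0 and len 0 cover the whole file. */
    return posix_fadvise(STDIN_FILENO, 0, 0, POSIX_FADV_DONTNEED) ? 1 : 0;
}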

and from the container:

# ./try-release-file-from-cache < output.dat

and after a while:

# podman stats --no-stream --no-reset
ID            NAME              CPU %       MEM USAGE / LIMIT  MEM %       NET IO             BLOCK IO           PIDS        CPU TIME       AVG CPU %
5db629b04d96  unruffled_darwin  1.99%       347.2MB / 3.843GB  9.04%       190.2MB / 409.4kB  27.49MB / 5.935MB  1           42.293771618s  1.99%

I am closing the issue since Podman is just reporting the information it gets from the kernel, but feel free to comment further.

abdelaziz-ouhammou · 2023-09-11T09:49:25Z

@giuseppe Thank you very much for your help. I wrongly assumed that podman behaves the same way as docker. This is an excerpt from the documentation for docker stats:
On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage.

So, in your opinion, @giuseppe, what would be the best way to monitor actual memory usage?

giuseppe · 2023-09-11T10:06:44Z

Thanks for the additional info. I'll take another look and compare with Docker.

abdelaziz-ouhammou · 2023-09-11T10:14:23Z

@giuseppe I just want to add that I checked the Docker source code, and it has the following function:

// calculateMemUsageUnixNoCache calculate memory usage of the container.
// Cache is intentionally excluded to avoid misinterpretation of the output.
//
// On cgroup v1 host, the result is `mem.Usage - mem.Stats["total_inactive_file"]` .
// On cgroup v2 host, the result is `mem.Usage - mem.Stats["inactive_file"] `.
//
// This definition is consistent with cadvisor and containerd/CRI.
// * https://github.com/google/cadvisor/commit/307d1b1cb320fef66fab02db749f07a459245451
// * https://github.com/containerd/cri/commit/6b8846cdf8b8c98c1d965313d66bc8489166059a
//
// On Docker 19.03 and older, the result was `mem.Usage - mem.Stats["cache"]`.
// See https://github.com/moby/moby/issues/40727 for the background.
func calculateMemUsageUnixNoCache(mem types.MemoryStats) float64 {
	// cgroup v1
	if v, isCgroup1 := mem.Stats["total_inactive_file"]; isCgroup1 && v < mem.Usage {
		return float64(mem.Usage - v)
	}
	// cgroup v2
	if v := mem.Stats["inactive_file"]; v < mem.Usage {
		return float64(mem.Usage - v)
	}
	return float64(mem.Usage)
}

This is the link to the file: https://github.com/docker/cli/blob/master/cli/command/container/stats_helpers.go
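
As a rough check of what that logic would report for the container measured earlier in this thread, the sketch below re-implements it in a self-contained form. The memoryStats type is a hypothetical local stand-in holding only the two fields the quoted function reads, and usageNoCache is an illustrative name; the input numbers are the memory.usage_in_bytes and total_inactive_file values posted above.

```go
package main

import "fmt"

// memoryStats is a hypothetical local stand-in for the Docker type used by
// the quoted function; only the two fields it reads are included.
type memoryStats struct {
	Usage uint64
	Stats map[string]uint64
}

// usageNoCache follows the quoted logic: subtract the inactive file cache
// from usage, preferring the cgroup v1 key and falling back to the v2 key.
func usageNoCache(mem memoryStats) uint64 {
	if v, ok := mem.Stats["total_inactive_file"]; ok && v < mem.Usage {
		return mem.Usage - v
	}
	if v := mem.Stats["inactive_file"]; v < mem.Usage {
		return mem.Usage - v
	}
	return mem.Usage
}

func main() {
	mem := memoryStats{
		Usage: 2869575680, // memory.usage_in_bytes
		Stats: map[string]uint64{"total_inactive_file": 2653802496},
	}
	fmt.Println(usageNoCache(mem)) // 215773184
}
```

Under this definition the same container would show about 216 MB instead of 2.87 GB, because active file cache is still counted.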

calculate the memory usage on cgroup v1 using the same logic as cgroup v2. Since there is no single "anon" field, calculate the memory usage by summing the two fields "total_active_anon" and "total_inactive_anon".

Closes: containers#1642

[NO NEW TESTS NEEDED]

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
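
The calculation described in that commit message can be sketched as follows. This is an illustration of the summing step only, not the code from the PR; anonUsageV1 is a hypothetical name, and the values come from the memory.stat output earlier in the thread.

```go
package main

import "fmt"

// anonUsageV1 sums the two anonymous-memory counters from a cgroup v1
// memory.stat map, as the commit message describes. It is an illustration,
// not the code from the PR.
func anonUsageV1(stats map[string]uint64) uint64 {
	return stats["total_active_anon"] + stats["total_inactive_anon"]
}

func main() {
	stats := map[string]uint64{
		"total_active_anon":   20480,
		"total_inactive_anon": 655360,
	}
	fmt.Println(anonUsageV1(stats)) // 675840
}
```

For the container above, the sum equals the rss value of 675840 bytes reported in the same memory.stat output.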

giuseppe · 2023-09-11T11:18:15Z

opened a PR: #1643

rhatdan · 2023-09-11T12:55:29Z

We should probably match Docker's behaviour. Thanks @abdelaziz-ouhammou for diagnosing this.

ahmad-75 · 2024-05-26T14:55:16Z

Hi @giuseppe, since this issue was fixed only recently, I assume it also exists in the earlier podman 3.4.2 version?

calculate the memory usage on cgroup v1 using the same logic as cgroup v2. Since there is no single "anon" field, calculate the memory usage by summing the two fields "total_active_anon" and "total_inactive_anon".

Closes: containers#1642
Closes: https://issues.redhat.com/browse/RHEL-16376

[NO NEW TESTS NEEDED]

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
(cherry picked from commit 1a9d45c)

abdelaziz-ouhammou added the kind/bug label Sep 8, 2023

giuseppe closed this as completed Sep 11, 2023

giuseppe reopened this Sep 11, 2023

giuseppe transferred this issue from containers/podman Sep 11, 2023

giuseppe mentioned this issue Sep 11, 2023

cgroups: fix memory usage on cgroup v1 #1643

Merged

openshift-merge-robot closed this as completed in #1643 Sep 12, 2023

giuseppe mentioned this issue Sep 10, 2024

[v0.51] cgroups: fix memory usage on cgroup v1 #2157

Merged
