Btrfs dedupe images Raw #13894

Closed
lizelive opened this issue Apr 16, 2022 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. stale-issue

Comments

@lizelive

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

What is the correct way to dedupe images?

Steps to reproduce the issue:

  1. sudo dnf -y install duperemove

  2. sudo duperemove -drAh --hashfile=containers.hash ~/.local/share/containers/storage/

  3. Comparison of extent info shows a net change in shared extents of: 15.9M

Describe the results you received:

It deduped 15.9M, but most of the output was a huge number of "Skipping - extents are already deduped." messages.

Describe the results you expected:
I want to safely dedupe images because the images were created by a bunch of different providers and have a lot of overlap. There are also multiple non-root users.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.0.0-rc4
API Version:  4.0.0-rc4
Go Version:   go1.18beta2

Built:      Fri Feb 11 06:51:09 2022
OS/Arch:    linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.24.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 24
  distribution:
    distribution: fedora
    variant: silverblue
    version: "36"
  eventLogger: journald
  hostname: L04
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.17.0-0.rc7.116.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 95162351616
  memTotal: 135059922944
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.2-2.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.2
      commit: f6fbc8f840df1a414f31a60953ae514fa497c748
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 75h 47m 11.48s (Approximately 3.12 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/lizelive/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/lizelive/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /var/home/lizelive/.local/share/containers/storage/volumes
version:
  APIVersion: 4.0.0-rc4
  Built: 1644591069
  BuiltTime: Fri Feb 11 06:51:09 2022
  GitCommit: ""
  GoVersion: go1.18beta2
  OsArch: linux/amd64
  Version: 4.0.0-rc4

Package info (e.g. output of rpm -q podman or apt list podman):

podman-4.0.0-0.6.rc4.fc36.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

I read the guide and also tested on 3.4.4 and with the btrfs graph driver.

Additional environment details (AWS, VirtualBox, physical, etc.):
physical fedora silverblue 36 beta

@openshift-ci openshift-ci bot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 16, 2022
@mheon
Member

mheon commented Apr 16, 2022

Can I ask what you are attempting to do? The storage backend already ensures image layers are not duplicated. Are you trying to deduplicate within layers? Our primary focus is on layer deduplication; I'm not aware of any focus on deduplicating individual files as such.
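
For anyone trying to gauge how much file-level overlap exists across layers (the kind of duplication that layer-level deduplication does not address), the following is a minimal sketch and not part of Podman or containers/storage. The function names `file_digest` and `duplicate_bytes` are hypothetical. It only compares file contents, so it cannot tell whether btrfs already shares the underlying extents; treat the result as an upper bound on what a tool like duperemove could reclaim.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_bytes(root):
    """Estimate bytes occupied by redundant copies of identical files under root.

    Hypothetical helper: hard links to the same inode are counted once,
    and unreadable files are skipped.
    """
    by_size = defaultdict(list)
    seen_inodes = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            # Only regular, non-empty files are candidates.
            if not stat.S_ISREG(st.st_mode) or st.st_size == 0:
                continue
            inode = (st.st_dev, st.st_ino)
            if inode in seen_inodes:
                continue  # hard link to a file already counted
            seen_inodes.add(inode)
            by_size[st.st_size].append(path)

    redundant = 0
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(int)
        for path in paths:
            try:
                by_hash[file_digest(path)] += 1
            except OSError:
                continue
        # Every copy beyond the first in a group is redundant.
        redundant += sum((count - 1) * size for count in by_hash.values())
    return redundant
```

Calling `duplicate_bytes(os.path.expanduser("~/.local/share/containers/storage"))` would scan the rootless graph root reported in the `podman info` output above. Files owned by mapped sub-UIDs may be unreadable from the host user and are silently skipped in this sketch.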

@lizelive
Author

lizelive commented Apr 16, 2022 via email

@giuseppe
Member

There is an attempt to solve this issue in containers/storage: containers/storage#775

Could you try creating images with the zstd:chunked format (more information here: https://www.redhat.com/sysadmin/faster-container-image-pulls)?

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented May 23, 2022

Since we have had no response to the request and the zstd:chunked code has made its way into podman, I am closing this issue.

@rhatdan rhatdan closed this as completed May 23, 2022
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023