Keep package-installed files listing for Debian packages installed in a distroless image #1876

Closed
pombredanne opened this issue May 26, 2021 · 8 comments
Labels
Can Close? Will close in 30 days unless there is a comment indicating why not

Comments

@pombredanne
Contributor

pombredanne commented May 26, 2021

🚀 feature request

Relevant Rules

When a package is installed, only its control metadata is kept; the list of installed files is not saved alongside it.

I have a concern with what happens here:

def add_pkg_metadata(self, metadata_tar, deb):

Description

In a distroless container image, installed .deb packages are recorded without the file lists and md5sums files that a regular Debian install keeps under /var/lib/dpkg/info. As a result, it is not possible to relate an installed package in a distroless image or layer to the set of files that were installed with it.

This data matters for software composition analysis and for the security and license compliance tracking built on top of it.

Describe the solution you'd like

Each installed package should include a listing of its installed files, possibly as a per-package file in the status.d/ directory. On a standard Debian install, this information lives under /var/lib/dpkg/info/<package name>.

This would make distroless images more readily introspectable; otherwise there is no intrinsic way to relate a package (in status.d) to the set of its installed files.
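To illustrate the kind of introspection this would enable, here is a minimal Python sketch. It assumes a hypothetical layout in which each package has a `<package name>.md5sums` file in status.d/ (this is what the issue asks for, not something the images necessarily provide), and the function name `index_files_by_package` is invented for illustration.

```python
import os
import tempfile


def index_files_by_package(status_dir):
    """Map each installed path to the package whose md5sums file lists it.

    Assumes (hypothetically) that status_dir holds one
    "<package name>.md5sums" file per package.
    """
    owners = {}
    for name in sorted(os.listdir(status_dir)):
        if not name.endswith(".md5sums"):
            continue
        package = name[: -len(".md5sums")]
        with open(os.path.join(status_dir, name), encoding="utf-8") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if not line:
                    continue
                # Each line is "<md5 hex digest>  <path relative to />".
                _digest, path = line.split(None, 1)
                owners["/" + path] = package
    return owners


if __name__ == "__main__":
    # Tiny demonstration with a throwaway directory standing in for status.d.
    with tempfile.TemporaryDirectory() as status_dir:
        with open(os.path.join(status_dir, "jq.md5sums"), "w", encoding="utf-8") as f:
            f.write("4805bfbf88146bbb434b248c8548ba9a  usr/bin/jq\n")
        print(index_files_by_package(status_dir))
```

A scanner could use such an index to answer "which package installed this file?" without any database outside the image.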

@tejal29 you committed this originally with @dlorenc ... any insight to share there?

Describe alternatives you've considered

I cannot think of an in-container alternative for keeping track of each package-installed file. Tracking this outside the container would mean maintaining an external database, which does not seem practical.

@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

@github-actions github-actions bot added the Can Close? Will close in 30 days unless there is a comment indicating why not label Nov 25, 2021
@fedemengo

fedemengo commented Dec 23, 2021

@pombredanne I would love to help, do you already have something I can start with?

@github-actions github-actions bot removed the Can Close? Will close in 30 days unless there is a comment indicating why not label Dec 24, 2021
@pombredanne
Contributor Author

@fedemengo sorry for the late reply. I do not have anything done yet, but I would likely either:

  • continue using the non-Debian-like /var/lib/dpkg/status.d/<package name> layout, and add /var/lib/dpkg/status.d/<package name>.md5sums (originally /var/lib/dpkg/info/<package name>.md5sums) and/or create a /var/lib/dpkg/status.d/<package name>.list file for the installed files (originally /var/lib/dpkg/info/<package name>.list)
  • OR just keep and copy over the /var/lib/dpkg/info/ directory as is, which will contain all the original parts of the packages.
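As a rough illustration of the first option, a `.list`-style file could be derived from the md5sums content. This is only a sketch: `md5sums_to_list` is a hypothetical helper, and the result is not equivalent to a real dpkg `.list` file, which also records directories (md5sums covers regular files only).

```python
def md5sums_to_list(md5sums_text):
    """Turn md5sums content into a newline-separated list of absolute paths.

    Caveat: a real /var/lib/dpkg/info/<package name>.list also lists
    directories, which an md5sums file does not mention.
    """
    paths = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        # Each line is "<md5 hex digest>  <path relative to />".
        _digest, path = line.split(None, 1)
        paths.append("/" + path)
    return ("\n".join(paths) + "\n") if paths else ""


if __name__ == "__main__":
    sample = (
        "4805bfbf88146bbb434b248c8548ba9a  usr/bin/jq\n"
        "1745dfca81b4c36132c52fa5e972d6cd  usr/share/doc/jq/copyright\n"
    )
    print(md5sums_to_list(sample), end="")
```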

@fedemengo

After a brief investigation, the metadata tarball passed to add_pkg_metadata still seems to contain the package's file list (the md5sums member). I tested with a few arbitrary .deb packages:
jq_1.6-2.1_amd64.deb

Version: 1.6-2.1
Architecture: amd64
Maintainer: ChangZhuo Chen (陳昌倬) <czchen@debian.org>
Installed-Size: 110
Depends: libjq1 (= 1.6-2.1), libc6 (>= 2.4)
Section: utils
Priority: optional
Multi-Arch: foreign
Homepage: https://github.com/stedolan/jq
Description: lightweight and flexible command-line JSON processor
 jq is like sed for JSON data – you can use it to slice
 and filter and map and transform structured data with
 the same ease that sed, awk, grep and friends let you
 play with text.
 .
 It is written in portable C, and it has minimal runtime
 dependencies.
 .
 jq can mangle the data format that you have into the
 one that you want with very little effort, and the
 program to do so is often shorter and simpler than
 you’d expect.
./md5sums0000644000000000000000000000064613764355465011261 0ustar  rootroot
4805bfbf88146bbb434b248c8548ba9a  usr/bin/jq
5563b04c49c62365021c85daa51b2ea9  usr/share/doc/jq/AUTHORS.gz
c364f0eca2f62a00bdba467b6dcec0c6  usr/share/doc/jq/README
7bbac574353d0a7b979154962e609e9e  usr/share/doc/jq/changelog.Debian.gz
71ebdef08d6145814339da04bbb38ee7  usr/share/doc/jq/changelog.gz
1745dfca81b4c36132c52fa5e972d6cd  usr/share/doc/jq/copyright
f7f27caeb55e22fb67c74a10a03053a1  usr/share/man/man1/jq.1.gz

and pppoe_3.12-1.2_amd64.deb

Source: rp-pppoe
Version: 3.12-1.2
Architecture: amd64
Maintainer: Andreas Barth <aba@not.so.argh.org>
Installed-Size: 239
Depends: libc6 (>= 2.14), ppp (>= 2.3.10-1)
Section: net
Priority: optional
Description: PPP over Ethernet driver
 PPP over Ethernet (PPPoE) is a protocol used by
 many ADSL Internet service providers. This package allows
 you to connect to those PPPoE service providers.
./md5sums0000644000000000000000000000270513400775643011246 0ustar  rootroot
49f16f269e495ac63284930ddb35819d  usr/sbin/pppoe
fecf103f0643fc47f6e2b6ab189ba836  usr/sbin/pppoe-connect
f0c57b276c5f71c1bea0e68d1ed05cc4  usr/sbin/pppoe-relay
fc48502b12c572f651db2c830e1e5023  usr/sbin/pppoe-server
05ca70beff7548aa62f7338526c4de7f  usr/sbin/pppoe-sniff
6fad7c0d267557577956b34d9a8e5ab2  usr/sbin/pppoe-start
b60a26c2098b31466c597b69086481b8  usr/sbin/pppoe-status
08ad72bf1a79f3d5c726216bdd3c7be0  usr/sbin/pppoe-stop
7135c95b9cd1de83278a6ed59967f15e  usr/share/doc/pppoe/README.Debian.gz
273f383c93571467392442a65efb59d3  usr/share/doc/pppoe/changelog.Debian.gz
588f951008f3c8832342e32afbbce587  usr/share/doc/pppoe/changelog.gz
e6d8e774d4c0b4a71cd7c0b407ee51a2  usr/share/doc/pppoe/copyright
3989cc121f314dd71ba995bce0d7cc7a  usr/share/lintian/overrides/pppoe
0f56b077433fa2ae3061b81734c0c3ab  usr/share/man/man5/pppoe.conf.5.gz
00f42cc119e815b6f67a06b66f6bc98e  usr/share/man/man8/pppoe-connect.8.gz
3043945ca5df6fd72ca21175e7363f4c  usr/share/man/man8/pppoe-relay.8.gz
0bc3f0deffb7e56e88629ac4306a7c50  usr/share/man/man8/pppoe-server.8.gz
6ca2fde4e0c47ba3f31b29ebd557e3ad  usr/share/man/man8/pppoe-setup.8.gz
af7d2547aead5aedd657e6dffc9238a1  usr/share/man/man8/pppoe-sniff.8.gz
6737faadbf8d5182b570a2238bf0662f  usr/share/man/man8/pppoe-start.8.gz
6c6185ef482c042a3bb5e72a4eca4a5b  usr/share/man/man8/pppoe-status.8.gz
948141509e725bcdb152e73a16effc38  usr/share/man/man8/pppoe-stop.8.gz
2b99016e346bc73fa7c4be686c0b527b  usr/share/man/man8/pppoe.8.gz

So we must be losing the file information somewhere else. I'll keep digging.
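A check like the one above can be sketched in Python with the standard `tarfile` module. This assumes the control tarball has already been pulled out of the .deb (a .deb is an ar archive, and the Python standard library has no ar reader). Both function names are hypothetical, and the sample tarball is built in memory purely for demonstration.

```python
import io
import tarfile


def list_metadata_members(metadata_tar_bytes):
    """Return the regular-file member names of a control tarball, minus any './'."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(metadata_tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            name = member.name
            if name.startswith("./"):
                name = name[2:]
            names.append(name)
    return names


def build_sample_control_tar(files):
    """Build an uncompressed in-memory tarball from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name="./" + name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = build_sample_control_tar({
        "control": b"Package: jq\nVersion: 1.6-2.1\n",
        "md5sums": b"4805bfbf88146bbb434b248c8548ba9a  usr/bin/jq\n",
    })
    print(list_metadata_members(sample))
```

If `md5sums` shows up in the member list, the file list reached the function and is simply not being copied into the layer.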

pombredanne added a commit to pombredanne/rules_docker that referenced this issue Apr 23, 2022
This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
The md5sums file is extracted from a .deb package and saved side-by-side
with the control under this path:
  var/lib/dpkg/status.d/<package-name>.md5sums

Reference: bazelbuild#1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Contributor Author

@fedemengo I do not think anything is "lost"; rather, in

def add_pkg_metadata(self, metadata_tar, deb):

only one file is extracted, and that is the control file:
PKG_METADATA_FILE = 'control'

We want the md5sums file to be extracted as well.

I pushed a PR here with a test: #2065
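For readers following along, the idea can be sketched roughly as below. This is not the code from the PR or from rules_docker: `collect_pkg_metadata` and `parse_package_name` are hypothetical names, and the sketch only assumes that the control tarball contains a `control` member and, optionally, an `md5sums` member.

```python
import io
import tarfile

STATUS_DIR = "var/lib/dpkg/status.d"


def parse_package_name(control_bytes):
    """Read the Package field from a Debian control file."""
    for line in control_bytes.decode("utf-8").splitlines():
        if line.startswith("Package:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no Package field in control file")


def collect_pkg_metadata(metadata_tar_bytes):
    """Map target layer paths to content for control and, if present, md5sums."""
    # Member name -> suffix appended to the package name in status.d.
    wanted = {"control": "", "md5sums": ".md5sums"}
    found = {}
    with tarfile.open(fileobj=io.BytesIO(metadata_tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            name = member.name[2:] if member.name.startswith("./") else member.name
            if member.isfile() and name in wanted:
                found[name] = tar.extractfile(member).read()
    if "control" not in found:
        raise ValueError("control file missing from metadata tarball")
    package = parse_package_name(found["control"])
    return {
        f"{STATUS_DIR}/{package}{wanted[name]}": data
        for name, data in found.items()
    }
```

Treating md5sums as optional matters because not every .deb ships one; a package without it would still get its control file recorded.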

pombredanne added a commit to pombredanne/rules_docker that referenced this issue Apr 23, 2022
This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
The md5sums file is extracted from a .deb package and saved side-by-side
with the package control file under this path:
  var/lib/dpkg/status.d/<package-name>.md5sums

Reference: bazelbuild#1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit to pombredanne/rules_docker that referenced this issue Apr 23, 2022
This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
If present, we extract the md5sums file and save it side-by-side with
the package control file under this path:
  var/lib/dpkg/status.d/<package-name>.md5sums

Reference: bazelbuild#1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Contributor Author

I would really appreciate some review of #2065 before it goes stale.

gravypod pushed a commit that referenced this issue May 30, 2022
* Add md5sums file list to distroless container

This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
If present, we extract the md5sums file and save it side-by-side with
the package control file under this path:
  var/lib/dpkg/status.d/<package-name>.md5sums

Reference: #1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Remove trailing whitespaces

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
St0rmingBr4in pushed a commit to St0rmingBr4in/rules_docker that referenced this issue Oct 17, 2022
* Add md5sums file list to distroless container

This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
If present, we extract the md5sums file and save it side-by-side with
the package control file under this path:
  var/lib/dpkg/status.d/<package-name>.md5sums

Reference: bazelbuild#1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

* Remove trailing whitespaces

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

@github-actions github-actions bot added the Can Close? Will close in 30 days unless there is a comment indicating why not label Nov 17, 2022
@github-actions

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"
