
overlay: lock staging directories #1916

Merged: 8 commits into containers:main on May 13, 2024

Conversation

giuseppe (Member)

lock any staging directory while it is being used so that another process cannot delete it.

Now the Cleanup() function deletes only the staging directories that are not locked by any other user.

Closes: #1915

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com
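The idea can be illustrated with a minimal, standard-library-only sketch (the actual implementation uses c/storage's pkg/lockfile and the overlay driver's own staging layout; the names, paths, and flat directory layout below are simplifications, not the PR's code):

package staging

import (
	"os"
	"path/filepath"
	"syscall"
)

// lockStagingDir takes an exclusive lock on a lock file inside the staging
// directory and blocks until it is acquired. The returned file must stay
// open for as long as the staging directory is in use.
func lockStagingDir(dir string) (*os.File, error) {
	f, err := os.OpenFile(filepath.Join(dir, "staging.lock"), os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// cleanupStagingDirs removes only the staging directories whose lock can be
// acquired without blocking, i.e. the ones no other process is still using.
func cleanupStagingDirs(stagingRoot string) error {
	entries, err := os.ReadDir(stagingRoot)
	if err != nil {
		return err
	}
	for _, e := range entries {
		dir := filepath.Join(stagingRoot, e.Name())
		lock, err := os.OpenFile(filepath.Join(dir, "staging.lock"), os.O_CREATE|os.O_RDWR, 0o644)
		if err != nil {
			continue
		}
		if err := syscall.Flock(int(lock.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
			// The lock is held by another process: the directory is in use, skip it.
			lock.Close()
			continue
		}
		err = os.RemoveAll(dir)
		lock.Close()
		if err != nil {
			return err
		}
	}
	return nil
}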

openshift-ci bot (Contributor) commented Apr 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@edsantiago (Member)

I vendored in your PR and ran podman tests. Tons of failures, all in seccomp, all timeouts. Too many failures to be a coincidence, I think.

giuseppe added a commit to giuseppe/libpod that referenced this pull request May 2, 2024
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe (Member, Author) commented May 2, 2024

> I vendored in your PR and ran podman tests. Tons of failures, all in seccomp, all timeouts. Too many failures to be a coincidence, I think.

Sorry for wasting your time on this.

I've updated the PR and ran the Podman tests (containers/podman#22573), which are now green.

@edsantiago (Member)

Thanks! podman + composefs test now in progress, containers/podman#22425

@edsantiago (Member)

Same thing: timeouts in seccomp-policy. Common factor seems to be podman-local (not remote) + sqlite (not boltdb) + fedora (not debian).

I also vendored c-common, so maybe the bug is there?

@edsantiago (Member)

Confirmed.

# cat /etc/containers/storage.conf 
[storage]
driver = "overlay"
runroot = "/run/containers/storage"
graphroot = "/var/lib/containers/storage"

[storage.options]
pull_options = {enable_partial_images = "true", use_hard_links = "false", ostree_repos="", convert_images = "true"}

[storage.options.overlay]
use_composefs = "true"

# go mod edit --replace github.com/containers/storage=github.com/giuseppe/storage@a5c63583a5f1ae8d9495716f2fd8f84755c64feb
...
# make
...

# bin/podman run --rm --seccomp-policy '' quay.io/libpod/alpine-with-seccomp:label ls
bin
dev
...
HANG!

@giuseppe (Member, Author) commented May 2, 2024

Ah no, this is another issue in the PR. I've fixed it now. It happens only when the registry rejects the range request, and I am going to debug why quay does that with the quay.io/libpod/alpine-with-seccomp:labels image.

@giuseppe (Member, Author) commented May 2, 2024

Opened a PR in c/image to address it: containers/image#2391

giuseppe added the jira label May 2, 2024
@mtrmac (Collaborator) left a comment

(Note to self: the history is #775 (comment), so this really needs to access other processes’ layers.)

The general intent of this does make sense to me; marking as “request changes” to make sure this is not merged prematurely.

store.go Outdated
if !rlstore.Exists(to) {
return struct{}{}, ErrLayerUnknown
func (s *store) ApplyStagedLayer(args ApplyStagedLayerOptions) (*Layer, error) {
layer, err := writeToLayerStore(s, func(rlstore rwLayerStore) (*Layer, error) {
mtrmac (Collaborator):

How is this code path, updating an existing store, going to be used?

Thinking of c/image pulls, we need the layer, when it is unlocked, to either not exist, or to exist with the full contents. This seems to imply a WIP layer which does not contain contents; a concurrent pull of an image with the same layer would succeed, and allow using the image while the contents are still missing.

Is this for some non-image container-only layers, or something?

giuseppe (Member, Author):

It exists only to be used from the containers-storage tool. applydiff-using-staging-dir was added to mirror applydiff, which applies the diff to an existing layer.

mtrmac (Collaborator):

Isn’t the primary purpose of containers-storage to run tests? (And possibly to inspect existing broken stores, for which read-only operations are sufficient.)

It’s a bit disappointing to maintain a code path for the rare use case; to have no test coverage for the atomic create+apply code path we actually are going to exercise; and to pay for that with an extra lock/presence check on every ApplyStagedLayer call. Most of that is pre-existing, not the fault of this PR…

… but maybe we can leave ApplyDiffFromStagingDirectory around, then?

Or, well, change the applydiff-using-staging-dir and the calling test to create the layer atomically, and hope (?) that there are no external users.

mtrmac (Collaborator):

I’d still prefer the tests to exercise the atomic creation code path, and not to add the test-only one here.

I’m not currently sure that’s blocking for me.

store.go (outdated; resolved)
Comment on lines 2173 to 2174
parentStagingDir := filepath.Dir(stagingDirectory)
if filepath.Dir(parentStagingDir) != d.getStagingDir(id) {
mtrmac (Collaborator):

Intuitively it seems to me that the API of the driver should always return the “top level staging” directory; how it organizes the insides into locks / contents / … is an internal matter of this subpackage.

Or at the very least, the layout needs to be thoroughly documented.

giuseppe (Member, Author):

The staging directory returned now is the directory where the content for the layer will be stored.

A staging directory is something like /var/lib/containers/storage/overlay/staging/12345/dir, and the lock file is /var/lib/containers/storage/overlay/staging/12345/staging.lock.

A caller shouldn't care about anything except the directory where the files are expected to be written.
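For illustration, the mapping from the per-layer content directory to its lock file described above could be expressed like this (a hypothetical helper, not the driver's actual code):

package overlay

import "path/filepath"

// stagingLockPath derives the lock-file path from the per-layer content
// directory, following the layout described above:
//
//	stagingDirectory: .../overlay/staging/12345/dir
//	lock file:        .../overlay/staging/12345/staging.lock
func stagingLockPath(stagingDirectory string) string {
	parent := filepath.Dir(stagingDirectory) // .../overlay/staging/12345
	return filepath.Join(parent, "staging.lock")
}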

mtrmac (Collaborator):

My view is that the caller has no business reading any individual files either way; it essentially gets an ~opaque handle. (Now that the other APIs are being removed, ApplyStagedLayerOptions and CleanupStagedLayer both ask for a complete, presumably unmodified, DriverWithDifferOutput.)

That’s even more the case with ComposeFS, where the structure of the staged contents is no longer the traditional overlayFS filesystem.

Assuming the above way of thinking, I think it’s simpler and clearer for the overlay driver itself to structure it so that the “handle” being passed is the top-level directory; affecting anything in parent directories is unusual and unexpected. “Unusual and unexpected” is not automatically a blocker, but I think overlay is already complex and minimizing complexity is good. Or, at least, documenting the complexity, which is “the very least” thing I wish for.

mtrmac (Collaborator):

Still outstanding, somehow.

pkg/lockfile/lockfile_unix.go (outdated; resolved)
pkg/lockfile/lockfile.go (outdated; resolved)
drivers/overlay/overlay.go (outdated; resolved)
store.go (outdated; resolved)
drivers/overlay/overlay.go (outdated; resolved)
drivers/overlay/overlay.go (outdated; resolved)
return fmt.Errorf("%q is not a staging directory", stagingDirectory)
}

defer func() {
if lock, ok := d.stagingDirsLocks[parentStagingDir]; ok {
mtrmac (Collaborator):

With this, ApplyDiffFromStagingDirectory unlocks the lock on some error return paths, but not all. A caller can’t deal with that.

mtrmac (Collaborator):

… but there are also various error paths on callers. So maybe the staging area should only be unlocked when this succeeds?

Or maybe this does not matter?

Note to self: FIXME investigate.
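As a sketch of the alternative raised here, assuming that failed applies are always routed through a later cleanup call (names and the flock-based locking below are illustrative, not the PR's code):

package overlay

import (
	"os"
	"syscall"
)

// applyStagedLayer applies the staged diff and releases the staging lock
// only on success; on any error the lock stays held, so the decision to
// unlock and remove the staging directory is left to the cleanup path
// rather than to individual error returns.
func applyStagedLayer(lock *os.File, apply func() error) error {
	if err := apply(); err != nil {
		// Keep the lock; the caller is expected to invoke cleanup.
		return err
	}
	// Success: the staging directory has been consumed, release the lock.
	defer lock.Close()
	return syscall.Flock(int(lock.Fd()), syscall.LOCK_UN)
}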

@edsantiago (Member)

Well, FWIW, with this and the c-i PR, podman tests now pass.

@giuseppe (Member, Author) commented May 3, 2024

Pushed a new version that addresses all the comments.

@rhatdan (Member) commented May 4, 2024

@mtrmac waiting on you.

@giuseppe (Member, Author) commented May 6, 2024

@mohanboddu should the jira label automatically create the issue?

@giuseppe (Member, Author) commented May 8, 2024

I'd like to get this into the next release; can we move it forward?

@mtrmac (Collaborator) left a comment

Just to unblock progress, not a careful review I’m afraid.

pkg/lockfile/lockfile.go (resolved)
pkg/lockfile/lockfile_test.go (outdated; resolved)
store.go (resolved)
store.go (resolved)
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
the callers in c/image were already replaced, so simplify the store
API and drop the functions.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
this is a preparatory patch to allow storing a lock file for each
staging directory.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
extend the public API to allow non-blocking usage (see the sketch after the commit list below).

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
lock any staging directory while it is being used so that another
process cannot delete it.

Now the Cleanup() function deletes only the staging directories that
are not locked by any other user.

Closes: containers#1915

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
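A hypothetical shape for the non-blocking addition mentioned in the "extend the public API" commit above (the actual name and signature in pkg/lockfile may differ; the type and flock-based backing here are assumptions for illustration):

package lockfile

import (
	"os"
	"syscall"
)

type fileLock struct {
	f *os.File
}

// TryLock attempts to take the exclusive lock without blocking. It returns
// true if the lock was acquired and false if another process already holds it.
func (l *fileLock) TryLock() (bool, error) {
	err := syscall.Flock(int(l.f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
	if err == nil {
		return true, nil
	}
	if err == syscall.EWOULDBLOCK {
		return false, nil
	}
	return false, err
}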
@giuseppe (Member, Author)

> Just to unblock progress, not a careful review I’m afraid.

Thanks. Fixed the comments and pushed a new version.

@edsantiago (Member)

Successful CI run with composefs in #22425, with latest (this morning's) main. Also a successful run this weekend, but that was before the giant vendor merge. Either way, tentative LGTM from the passing-tests front.

@rhatdan (Member) commented May 13, 2024

/lgtm

openshift-ci bot added the lgtm label May 13, 2024
openshift-merge-bot merged commit 0cea595 into containers:main May 13, 2024
18 checks passed
// If we're the first reference on the lock, we need to open the file again.
fd, err := openLock(l.file, l.ro)
if err != nil {
l.rwMutex.Unlock()
mtrmac (Collaborator):

⚠️ For readLock, this must be RUnlock.

// reader lock or a writer lock.
if err = lockHandle(l.fd, lType, true); err != nil {
closeHandle(fd)
l.rwMutex.Unlock()
mtrmac (Collaborator):

⚠️ For readLock, this must be RUnlock.
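The pattern both warnings point at, as a minimal sketch (the field and helper names are illustrative, not copied from pkg/lockfile): error paths must release the same flavor of in-process lock that was taken.

package lockfile

import "sync"

type lockFile struct {
	rwMutex sync.RWMutex
}

// lock takes the in-process mutex in the requested mode and then runs
// acquireFileLock; if that fails, it releases the mutex with the matching
// call: RUnlock for a read lock, Unlock for a write lock.
func (l *lockFile) lock(readOnly bool, acquireFileLock func() error) error {
	unlock := l.rwMutex.Unlock
	if readOnly {
		l.rwMutex.RLock()
		unlock = l.rwMutex.RUnlock
	} else {
		l.rwMutex.Lock()
	}
	if err := acquireFileLock(); err != nil {
		unlock() // error path: matches the lock flavor taken above
		return err
	}
	return nil
}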

defer s.containerStore.stopWriting()

// putLayer requires the rlstore, rlstores, as well as s.containerStore (even if not an argument to this function) to be locked for write.
func (s *store) putLayer(rlstore rwLayerStore, rlstores []roLayerStore, id, parent string, names []string, mountLabel string, writeable bool, lOptions *LayerOptions, diff io.Reader, slo *stagedLayerOptions) (*Layer, int64, error) {
mtrmac (Collaborator):

The read-only layer stores must not be locked on entry.

@giuseppe (Member, Author)

@mtrmac thanks for the review; I've opened a PR: #1926

openshift-merge-bot added a commit that referenced this pull request May 22, 2024
Fix locking bugs from #1916, and one more