Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chunked: fix reuse of the layers cache #2024

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 23 additions & 13 deletions layers.go
Original file line number Diff line number Diff line change
Expand Up @@ -314,9 +314,15 @@ type rwLayerStore interface {
// applies its changes to a specified layer.
ApplyDiff(to string, diff io.Reader) (int64, error)

// ApplyDiffWithDiffer applies the changes through the differ callback function.
// If to is the empty string, then a staging directory is created by the driver.
ApplyDiffWithDiffer(to string, options *drivers.ApplyDiffWithDifferOpts, differ drivers.Differ) (*drivers.DriverWithDifferOutput, error)
// ApplyDiffWithDifferNoLock applies the changes through the differ callback function.
// If layer is nil, then a staging directory is created by the driver.
// It is the caller's responsibility to call ApplyDiffWithDifferFinalize to store the changes performed by this function.
ApplyDiffWithDifferNoLock(layer *Layer, options *drivers.ApplyDiffWithDifferOpts, differ drivers.Differ) (*drivers.DriverWithDifferOutput, error)

// ApplyDiffWithDifferFinalize stores the changes performed by ApplyDiffWithDifferNoLock.
// ApplyDiffWithDifferNoLock runs without holding the lock, so this function is used to finalize the changes while it has exclusive access
// to the store.
ApplyDiffWithDifferFinalize(id string, output *drivers.DriverWithDifferOutput) error

// CleanupStagingDirectory cleanups the staging directory. It can be used to cleanup the staging directory on errors
CleanupStagingDirectory(stagingDirectory string) error
Expand Down Expand Up @@ -2543,22 +2549,18 @@ func (r *layerStore) applyDiffFromStagingDirectory(id string, diffOutput *driver
return err
}

// Requires startWriting.
func (r *layerStore) ApplyDiffWithDiffer(to string, options *drivers.ApplyDiffWithDifferOpts, differ drivers.Differ) (*drivers.DriverWithDifferOutput, error) {
// It must be called without any c/storage locks held to allow differ to make c/storage calls.
func (r *layerStore) ApplyDiffWithDifferNoLock(layer *Layer, options *drivers.ApplyDiffWithDifferOpts, differ drivers.Differ) (*drivers.DriverWithDifferOutput, error) {
ddriver, ok := r.driver.(drivers.DriverWithDiffer)
if !ok {
return nil, ErrNotSupported
}

if to == "" {
if layer == nil {
output, err := ddriver.ApplyDiffWithDiffer("", "", options, differ)
return &output, err
}

layer, ok := r.lookup(to)
if !ok {
return nil, ErrLayerUnknown
}
if options == nil {
options = &drivers.ApplyDiffWithDifferOpts{
ApplyDiffOpts: drivers.ApplyDiffOpts{
Expand All @@ -2567,14 +2569,22 @@ func (r *layerStore) ApplyDiffWithDiffer(to string, options *drivers.ApplyDiffWi
},
}
}
output, err := ddriver.ApplyDiffWithDiffer(layer.ID, layer.Parent, options, differ)
out, err := ddriver.ApplyDiffWithDiffer(layer.ID, layer.Parent, options, differ)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️

I don’t think this works for pre-existing layers (layer != nil); if the layer is unlocked:

  • Two callers can get here simultaneously, and write over each others’ getStagingDir
  • The layer can be deleted concurrently to this operation!

c/image only ever calls ApplyDiffWithDiffer without the layer ID, that’s the only code path I fully care about. (The other code path is used in cmd/containers-storage and in tests, per comments in store.go.)

It is very tempting to me to have a new specialized PrepareStagedLayer (to match the names of {Apply,Cleanup}StagedLayer) that only supports this simpler care; then several of the operations here could just disappear.

But ApplyDiffWithDiffer is now a stable API; I don’t immediately know how to deal with that. The concurrent layer deletion might be possible to handle by holding a read lock on the layer store only (assuming all operations in pkg/chunked lock the layer store only for reading, and that’s not currently the case), but the exclusion of two concurrent callers, and of any possible concurrent readers, is fundamentally the existing exclusive layer store lock.

Outright refusing to do the operation on existing layers … is code that could be written… but probably shouldn’t be merged. I don’t know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ApplyDiffWithDiffer is now a stable API; I don’t immediately know how to deal with that. The concurrent layer deletion might be possible to handle by holding a read lock on the layer store only (assuming all operations in pkg/chunked lock the layer store only for reading, and that’s not currently the case), but the exclusion of two concurrent callers, and of any possible concurrent readers, is fundamentally the existing exclusive layer store lock.

Outright refusing to do the operation on existing layers … is code that could be written… but probably shouldn’t be merged. I don’t know.

I don't know either how to solve it, unless introducing another per-layer lock file. As you said, this code path is used only by cmd/containers-storage.

Could we just document this behavior, and warn any potential user that using to != "" is not safe as there can be multiple writers, and the layer could potentially be deleted while the operation runs?

if err != nil {
return nil, err
}
return &out, nil
}

// Requires startWriting, must be called after ApplyDiffWithDifferNoLock
func (r *layerStore) ApplyDiffWithDifferFinalize(id string, output *drivers.DriverWithDifferOutput) error {
layer, ok := r.lookup(id)
if !ok {
return ErrLayerUnknown
}
layer.UIDs = output.UIDs
layer.GIDs = output.GIDs
err = r.saveFor(layer)
return &output, err
return r.saveFor(layer)
}

func (r *layerStore) CleanupStagingDirectory(stagingDirectory string) error {
Expand Down
13 changes: 6 additions & 7 deletions pkg/chunked/cache_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,6 @@ type layersCache struct {
refs int
store storage.Store
mutex sync.RWMutex
created time.Time
}

var (
Expand All @@ -83,6 +82,7 @@ func (c *layer) release() {
if err := unix.Munmap(c.mmapBuffer); err != nil {
logrus.Warnf("Error Munmap: layer %q: %v", c.id, err)
}
c.mmapBuffer = nil
}
}

Expand All @@ -107,14 +107,13 @@ func (c *layersCache) release() {
func getLayersCacheRef(store storage.Store) *layersCache {
cacheMutex.Lock()
defer cacheMutex.Unlock()
if cache != nil && cache.store == store && time.Since(cache.created).Minutes() < 10 {
if cache != nil && cache.store == store {
cache.refs++
return cache
}
cache := &layersCache{
cache = &layersCache{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I wonder about making this kind of mistake harder to do … naming the global cacheSingleton, globalCache, or something, might make it less likely to conflict with a local. OTOH that’s very likely an overreaction.)

store: store,
refs: 1,
created: time.Now(),
}
return cache
}
Expand Down Expand Up @@ -781,14 +780,14 @@ func (c *layersCache) findDigestInternal(digest string) (string, string, int64,
return "", "", -1, nil
}

c.mutex.RLock()
defer c.mutex.RUnlock()

binaryDigest, err := makeBinaryDigest(digest)
if err != nil {
return "", "", 0, err
}

c.mutex.RLock()
defer c.mutex.RUnlock()

for _, layer := range c.layers {
if !layer.cacheFile.bloomFilter.maybeContains(binaryDigest) {
continue
Expand Down
39 changes: 34 additions & 5 deletions store.go
Original file line number Diff line number Diff line change
Expand Up @@ -3104,12 +3104,41 @@ func (s *store) CleanupStagedLayer(diffOutput *drivers.DriverWithDifferOutput) e
}

func (s *store) ApplyDiffWithDiffer(to string, options *drivers.ApplyDiffWithDifferOpts, differ drivers.Differ) (*drivers.DriverWithDifferOutput, error) {
return writeToLayerStore(s, func(rlstore rwLayerStore) (*drivers.DriverWithDifferOutput, error) {
if to != "" && !rlstore.Exists(to) {
return nil, ErrLayerUnknown
rlstore, err := s.getLayerStore()
if err != nil {
return nil, err
}

var layer *Layer
if to != "" {
if err := rlstore.startReading(); err != nil {
return nil, err
}
return rlstore.ApplyDiffWithDiffer(to, options, differ)
})
layer, err = rlstore.Get(to)
rlstore.stopReading()
if err != nil {
return nil, err
}
}

out, err := rlstore.ApplyDiffWithDifferNoLock(layer, options, differ)
if err != nil {
return nil, err
}

if layer != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all the error paths from now on need some variant of CleanupStagedLayer?

if err := rlstore.startWriting(); err != nil {
_ = s.CleanupStagedLayer(out)
return nil, err
}
err = rlstore.ApplyDiffWithDifferFinalize(layer.ID, out)
rlstore.stopWriting()
if err != nil {
_ = s.CleanupStagedLayer(out)
return nil, err
}
}
return out, nil
}

func (s *store) DifferTarget(id string) (string, error) {
Expand Down