Skip to content

Commit

Permalink
allocwatcher: don't destroy local allocdir after migration (#18108)
Browse files Browse the repository at this point in the history
When ephemeral disks are migrated from an allocation on the same node,
allocation logs for the previous allocation are lost.

There are two workflows for the best-effort attempt to migrate the allocation
data between the old and new allocations. For previous allocations on other
clients (the "remote" workflow), we create a local allocdir and download the
data from the previous client into it. That data is then moved into the new
allocdir and we delete the allocdir of the previous alloc.

For "local" previous allocations we don't need to create an extra directory for
the previous allocation and instead move the files directly from one to the
other. But we still delete the old allocdir _entirely_, which includes all the
logs!

There doesn't seem to be any reason to destroy the local previous allocdir, as
the usual client garbage collection should destroy it later on when needed. By
not deleting it, the previous allocation's logs are still available for the user
to read.

Fixes: #18034
  • Loading branch information
tgross committed Aug 2, 2023
1 parent 0a50fe6 commit 97a8339
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 9 deletions.
3 changes: 3 additions & 0 deletions .changelog/18108.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
```release-note:bug
migration: Fixed a bug where previous alloc logs were destroyed when migrating ephemeral_disk on the same client
```
10 changes: 1 addition & 9 deletions client/allocwatcher/alloc_watcher.go
Original file line number Diff line number Diff line change
Expand Up @@ -276,15 +276,7 @@ func (p *localPrevAlloc) Migrate(ctx context.Context, dest *allocdir.AllocDir) e

p.logger.Debug("copying previous alloc")

moveErr := dest.Move(p.prevAllocDir, p.tasks)

// Always cleanup previous alloc
if err := p.prevAllocDir.Destroy(); err != nil {
p.logger.Error("error destroying alloc dir",
"error", err, "previous_alloc_dir", p.prevAllocDir.AllocDir)
}

return moveErr
return dest.Move(p.prevAllocDir, p.tasks)
}

// remotePrevAlloc is a prevAllocWatcher for previous allocations on remote
Expand Down

0 comments on commit 97a8339

Please sign in to comment.