Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: 404 when accessing files for GC'ed alloc #18232

Merged
merged 7 commits into from
Aug 21, 2023
Merged

Conversation

lgfa29
Copy link
Contributor

@lgfa29 lgfa29 commented Aug 16, 2023

When an allocation is garbage collected from the client, but not from the servers, the API request is routed to the client and the client does attempt to read the file, but the alloc files have already been deleted, resulting in a 500 error.

This happens because the client GC only destroys the alloc runner (deleting the alloc dir), but it keeps a reference to the alloc runner until the alloc is garbage collected from the servers as well.

This commit adjusts this logic by checking if the alloc runner (and the alloc files) has been destroyed, returning a 404 if so.

Closes #18216

When an allocation is garbage collected from the client, but not from
the servers, the API request is routed to the client and the client
does attempt to read the file, but the alloc dir has already been
deleted, resulting in a 500 error.

This happens because the client GC only destroys the alloc runner
(deleting the alloc dir), but it keeps a reference to the alloc runner
until the alloc is garbage collected from the servers as well.

This commit adjusts this logic by checking if the alloc runner (and the
alloc files) has been destroyed, returning a 404 if so.
@lgfa29 lgfa29 changed the title client: 404 when accessing logs for GC'ed alloc client: 404 when accessing files for GC'ed alloc Aug 17, 2023
@lgfa29 lgfa29 added the backport/1.6.x backport to 1.6.x release line label Aug 17, 2023
if err != nil {
handleStreamResultError(structs.NewErrUnknownAllocation(req.AllocID), pointer.Of(int64(404)), encoder)
return
}
if ar.IsDestroyed() {
handleStreamResultError(
fmt.Errorf("allocation %s destroyed from client", req.AllocID),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the best message to return. I think GC is only one of the reasons the alloc runner may have been destroyed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if something along the lines of "allocation %s not found on client" would be better? Destroyed seems like an internal aspect of the client which operators might find confusing to the idea that an alloc is not found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't happy with the message. You suggestion is good, I just adjusted it to state for allocation %s not found on client because, weirdly, the alloc is still there, it's just that the GC deletes the state (alloc runner)

Copy link
Member

@jrasell jrasell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some minor suggestions and a comment about the message, but nothing blocking.

Edit: do we want to backport this further than just 1.6?

client/fs_endpoint.go Outdated Show resolved Hide resolved
client/fs_endpoint.go Outdated Show resolved Hide resolved
client/fs_endpoint_test.go Outdated Show resolved Hide resolved
client/fs_endpoint_test.go Outdated Show resolved Hide resolved
if err != nil {
handleStreamResultError(structs.NewErrUnknownAllocation(req.AllocID), pointer.Of(int64(404)), encoder)
return
}
if ar.IsDestroyed() {
handleStreamResultError(
fmt.Errorf("allocation %s destroyed from client", req.AllocID),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if something along the lines of "allocation %s not found on client" would be better? Destroyed seems like an internal aspect of the client which operators might find confusing to the idea that an alloc is not found.

@lgfa29
Copy link
Contributor Author

lgfa29 commented Aug 18, 2023

Edit: do we want to backport this further than just 1.6?

Good question, maybe this could be considered a bug?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.4.x backport to 1.4.x release line backport/1.5.x backport to 1.5.x release line backport/1.6.x backport to 1.6.x release line
Projects
None yet
Development

Successfully merging this pull request may close these issues.

alloc logs returns 500 if alloc has been garbage collected
2 participants