client: 404 when accessing files for GC'ed alloc #18232
When an allocation is garbage collected from the client, but not from the servers, the API request is routed to the client and the client does attempt to read the file, but the alloc dir has already been deleted, resulting in a 500 error. This happens because the client GC only destroys the alloc runner (deleting the alloc dir), but it keeps a reference to the alloc runner until the alloc is garbage collected from the servers as well. This commit adjusts this logic by checking if the alloc runner (and the alloc files) has been destroyed, returning a 404 if so.
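The check described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `allocRunner`, `fakeRunner`, `codedError`, and `checkAlloc` are hypothetical names introduced here, and only the `IsDestroyed` method is taken from the diff under review. The error text follows the wording the review thread settles on.

```go
package main

import (
	"fmt"
	"net/http"
)

// allocRunner is a hypothetical stand-in for the client's alloc runner.
// Only IsDestroyed, which appears in the reviewed diff, is modeled.
type allocRunner interface {
	IsDestroyed() bool
}

// fakeRunner is a toy implementation used only for this illustration.
type fakeRunner struct{ destroyed bool }

func (r fakeRunner) IsDestroyed() bool { return r.destroyed }

// codedError pairs an error with an HTTP-style status code (hypothetical type).
type codedError struct {
	Code int
	Err  error
}

func (e *codedError) Error() string { return e.Err.Error() }

// checkAlloc returns nil when the alloc's files can be served. It returns a
// 404 both when the client has no runner for the ID and when the runner is
// still referenced but has already been destroyed (for example by client GC).
func checkAlloc(runners map[string]allocRunner, allocID string) *codedError {
	ar, ok := runners[allocID]
	if !ok {
		return &codedError{
			Code: http.StatusNotFound,
			Err:  fmt.Errorf("unknown allocation %q", allocID),
		}
	}
	if ar.IsDestroyed() {
		// The runner reference outlives the alloc dir, so check it explicitly
		// instead of failing later with a 500 while reading missing files.
		return &codedError{
			Code: http.StatusNotFound,
			Err:  fmt.Errorf("state for allocation %s not found on client", allocID),
		}
	}
	return nil
}
```

With a map holding one live and one destroyed runner, `checkAlloc` returns nil for the live ID and a 404 `codedError` for both the destroyed ID and an ID that is not in the map.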
b914de8 to 23d06cd
client/fs_endpoint.go
Outdated
if err != nil {
	handleStreamResultError(structs.NewErrUnknownAllocation(req.AllocID), pointer.Of(int64(404)), encoder)
	return
}
if ar.IsDestroyed() {
	handleStreamResultError(
		fmt.Errorf("allocation %s destroyed from client", req.AllocID),
Not sure if this is the best message to return. I think GC is only one of the reasons the alloc runner may have been destroyed.
I wonder if something along the lines of "allocation %s not found on client" would be better? Destroyed seems like an internal aspect of the client which operators might find confusing to the idea that an alloc is not found.
Yeah, I wasn't happy with the message. Your suggestion is good. I just adjusted it to "state for allocation %s not found on client" because, weirdly, the alloc is still there; it's just that the GC deletes the state (the alloc runner).
LGTM, some minor suggestions and a comment about the message, but nothing blocking.
Edit: do we want to backport this further than just 1.6?
Good question, maybe this could be considered a bug?
When an allocation is garbage collected from the client, but not from the servers, the API request is routed to the client and the client does attempt to read the file, but the alloc files have already been deleted, resulting in a 500 error.
This happens because the client GC only destroys the alloc runner (deleting the alloc dir), but it keeps a reference to the alloc runner until the alloc is garbage collected from the servers as well.
This commit adjusts this logic by checking if the alloc runner (and the alloc files) has been destroyed, returning a 404 if so.
Closes #18216