-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
surface specific GRPC errors more visibly #4930
Conversation
switch (rpcException.Status.StatusCode) | ||
{ | ||
case StatusCode.Unavailable: | ||
// This can happen when python disconnects. Ignore it to avoid noisy logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This happens roughly 20%-25% of the time when hitting Ctrl-C from python. Not exactly sure why it happens instead of the non-200 status (handled above).
break; | ||
default: | ||
// Other unknown errors. Keep these quiet for now. | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be convinced to log other exceptions here (maybe as Info instead of Error). That would at least help us for future unknown unknowns.
catch | ||
{ | ||
// Fall-through for other error types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about logging here - I don't think we ever hit it, so it would be nice to know when we do, but it's also potentially noisy to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so happy
Proposed change(s)
We want to ignore some GRPC errors but raise others. Currently just opting in to one error (ResourceExhausted, when the payload is too large), but I'm willing to reconsider.
Also not sure how to test this; I did it manually with some breakpoints (to confirm the
StatusCode.Unavailable
case) and modifying GridWorld with a large uncompressed sensor (to confirmStatusCode.ResourceExhausted
),Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
https://jira.unity3d.com/browse/MLA-1719
Types of change(s)
Checklist