Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime error handling #135

Merged
merged 30 commits into from
Dec 4, 2021

Conversation

PhilippPlank
Copy link
Contributor

@PhilippPlank PhilippPlank commented Nov 30, 2021

Solving the issue of deadlock in the runtime if any ProcessModel throws an exception. (Issue #83)
Also passing the exception all the way up to the runtime. (model -> runtime_service -> runtime)

Open for discussion is the usage of existing csp_ports for error messages or if there are better ways to transport the exception to the runtime.

Edit: the unit test currently does not really work as unit test, but shows a use case.

PhilippPlank and others added 25 commits November 12, 2021 14:53
@PhilippPlank PhilippPlank added 1-bug Something isn't working area: magma/runtime Issues with something in lava/magma/runtime area: magma/core Issues with something in lava/magma/core labels Nov 30, 2021
@PhilippPlank PhilippPlank self-assigned this Nov 30, 2021
@PhilippPlank
Copy link
Contributor Author

PhilippPlank commented Dec 1, 2021

Updated initial commit.

I come across the channel size which is 64 at the moment for pypy channels. Why is it 64? I needed to increase it a bit in order to send a longer error message over the channels.
There are other options of course (starting the read at the same time etc.), but I wanted to keep it simple for now.

I think this should work now and stop all running process model threads, if there are one or multiple errors in the process models.

PS: The checks will fails, since I do not know how to assert for multiple Exceptions raised and the 64 is also checked in a unit test.

@awintel
Copy link
Contributor

awintel commented Dec 3, 2021

Wouldn't a simply pipe like in this example be simpler through which you can just push the exception as is?
https://newbedev.com/python-multiprocessing-handling-child-errors-in-parent
If you rethrow it at the Runtime level, it will probably print the usual nice stack trace so you can click on the hyperlinks in your IDE which will take you to the place in the code that caused the problem.

I see no need to manually serialize the exception and push it through CSP channels. Error handling is not performance critical at all.

@PhilippPlank
Copy link
Contributor Author

PhilippPlank commented Dec 3, 2021

I adopted the behavior to your suggestion @awintel

If there is an error is still communicated via csp channels, but the traceback of the exception is stored within the Process of multiprocessing, which we can then fully print out.

This also allows unit tests to actually succeed.
I am quite happy with the new solution, let me know if there is anything else.

Copy link
Contributor

@mgkwill mgkwill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@PhilippPlank PhilippPlank merged commit 4775c80 into lava-nc:main Dec 4, 2021
@PhilippPlank PhilippPlank deleted the runtime_error_handling branch December 4, 2021 23:10
Copy link
Contributor

@awintel awintel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I would just suggest to give a bit more detailed feedback to the user.

actors.join()
if actors.exception:
_, traceback = actors.exception
print(traceback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually identify the problem from which the error came? Perhaps attach this to the Exception object.
You should print the class name, Process name and id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear. Both the processes should identify themselves but also the RuntimeServices in case the error happened in the RuntimeService.
Perhaps you even want to distinguish them clearly in the command line prints.

Perhaps you even want to first print a summary of everything that has thrown any errors at the top of the command line to orient the user because otherwise there could be quite a messy stack trace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The printed traceback leads you to the line of code the exception occurred, e.g., "line 50, in run_spk
raise AssertionError("All the error info")" -> line 50 is within PyProcModel1. So you know exactly which exception happened and on which code line.

These tracebacks are printed for every exception occurred in all ProcessModels.

print(traceback)
error_cnt += 1

raise RuntimeError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could define own Execptions. RuntimeError for actual errors in the Runtime and ProcessModelError for ... you know what.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception is a RuntimeError, due to other Exceptions happened in the ProcessModel. The details have been printed already, this is just to stop the Runtime and tell the user that there have been other Exceptions (+ the number of them).

For all the details the user only has to scroll upwards in the console.

monkin77 pushed a commit to monkin77/thesis-lava that referenced this pull request Jul 12, 2024
* - Initial commit on passing exception from ProcessModels up to the runtime

* - Initial commit on passing exception from ProcessModels up to the runtime

* - Updated commit on passing exception from ProcessModels up to the runtime

* - Updated commit on passing exception from ProcessModels up to the runtime
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1-bug Something isn't working area: magma/core Issues with something in lava/magma/core area: magma/runtime Issues with something in lava/magma/runtime
Projects
None yet
4 participants