Handle tracker not resolvable / reachable #58

Open
derpeter opened this issue Dec 11, 2018 · 6 comments

@derpeter
Contributor

If the tracker is not reachable, we still try to set the ticket to failed, which results in an error.

C3TT.setTicketFailed[b'C3TTException: A OS error occurred\nError message: [Errno -2] Name or service not known',

derpeter added the bug label Dec 11, 2018
derpeter self-assigned this Dec 11, 2018
@a-tze
Contributor

a-tze commented Jan 4, 2019

Maybe the script should follow the behaviour of the Perl stuff: retry every n seconds for a decent amount of time (like 100 times). If the tracker is unreachable, there is no downside to having the script "hang"; it would not get a new workload on the next run anyway.
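
A rough sketch of such a retry loop in Python, just to make the idea concrete. Everything here is an assumption for illustration: `call_with_retry`, its parameters, and the 5 second default are made up, and `OSError` stands in for whatever the tracker client actually raises (the log above shows a `C3TTException` wrapping the OS error).

```python
import time


def call_with_retry(func, attempts=100, delay_seconds=5.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts are used up.

    Hypothetical helper: only OSError (which covers name resolution
    failures such as "[Errno -2] Name or service not known") is treated
    as "tracker unreachable". Any other exception propagates at once.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except OSError as err:
            last_error = err
            # Wait before the next try, but not after the final one.
            if attempt < attempts:
                sleep(delay_seconds)
    # All attempts failed: re-raise the last error for the caller to handle.
    raise last_error
```

With the defaults, the worker would be blocked for a bit over eight minutes before giving up, which is the "hang" described above.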

@derpeter
Contributor Author

derpeter commented Jan 9, 2019

I would prefer that it exits gracefully and gets retriggered by the systemd timer unit.
But independent of how we handle the retry, trying to set the ticket to failed is useless behavior that I want to fix.

@a-tze
Contributor

a-tze commented Jan 9, 2019

Yes, setting the ticket to failed is useless because it will probably not work anyway. The ticket will not be retriggered either way: if setting the ticket to failed succeeds, it is not given to the worker again. If nothing is done with the ticket, it stays in state releasing, owned by the worker, and will not be given to any worker until someone fixes this manually.
Self-healing in a retry loop does have the disadvantage that the worker is blocked for some time, but if the tracker is unreachable, that doesn't matter much.

Does producing an error have an impact on the systemd service/timer? Does systemd suspend the timer or something?

@derpeter
Contributor Author

derpeter commented Jan 9, 2019

At the point where this error happens in the code, the script has no ticket yet, so setting it to failed cannot work in any case :-)
It has no impact on systemd, but I see no harm in trying again at each timer interval.
I can check if there is a way to tell systemd to suspend a timer for a specific time if the exit code is non-zero.
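
Roughly what I have in mind, as a sketch only. The names `run_once`, `fetch_ticket`, `process_ticket` and `set_ticket_failed` are placeholders and not the real script or C3TT API, `OSError` stands in for the actual exception, and returning 0 on an unreachable tracker is an assumption that could just as well be a dedicated non-zero code.

```python
def run_once(fetch_ticket, process_ticket, set_ticket_failed, log=print):
    """Run one worker cycle and return a process exit code.

    Hypothetical structure: the three callables are injected so the
    sketch stays independent of the real tracker client.
    """
    try:
        ticket = fetch_ticket()
    except OSError as err:
        # No ticket is held at this point, so there is nothing to mark
        # as failed. Log, exit cleanly, and let the timer retrigger us.
        log(f"tracker unreachable, waiting for next timer run: {err}")
        return 0

    if ticket is None:
        # Tracker reachable, but no work available.
        return 0

    try:
        process_ticket(ticket)
    except Exception as err:
        # Here a ticket is held, so reporting the failure makes sense.
        set_ticket_failed(ticket, str(err))
        return 1
    return 0
```

The point is only that the failed state is reported when a ticket is actually held, never when fetching the ticket itself failed.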

@a-tze
Contributor

a-tze commented Jan 9, 2019

Ahhh okay. Then it makes perfect sense to exit gracefully! I thought you were referring to the tracker communication at the end of the script run, when it tries to call "setTicketDone"!

@derpeter
Contributor Author

derpeter commented Jan 9, 2019

Sorry if this was misleading.

derpeter added this to the 35c3 milestone Oct 30, 2019
derpeter added the c3tt label Oct 30, 2019