Lately, the rate of pipeline failures caused by system failures has been increasing. To give a concrete example from today, here is a pipeline which should be ✔️ but failed because of system failures:
Whenever something like that happens on a PR, users have two options:
Close and reopen the PR, to trigger the creation of a new merge commit and a new set of pipelines
Comment @spackbot run pipeline, to re-run all the pipelines
If the failure rate is high enough, there is a fair chance the procedure needs to be repeated a few times to get a ✔️ CI mark. Each repetition multiplies the resources we need to run pipelines, in particular for "generate" jobs, which are always re-run.
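To get a rough feel for that multiplying factor, here is a minimal sketch. It assumes that each full re-run hits a system failure independently and with the same probability, which is a simplification and not something measured on our CI. Under that assumption the number of runs follows a geometric distribution, and the expected number of runs until the first ✔️ is 1 / (1 - p). The function name and the sample rates are hypothetical.

```python
def expected_runs(p_system_failure: float) -> float:
    """Expected number of full pipeline runs until one succeeds.

    Assumes independent runs with a constant system-failure probability
    (geometric distribution). This is an illustrative model only.
    """
    if not 0.0 <= p_system_failure < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return 1.0 / (1.0 - p_system_failure)


# Hypothetical failure rates, not measured values.
for p in (0.1, 0.3, 0.5):
    print(f"p={p:.1f}: {expected_runs(p):.2f} runs on average")
```

With a 50% system failure rate, this model says we would pay for two full pipeline runs on average per ✔️, including the "generate" jobs each time.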
I guess the best solution from a user's perspective would be having a low failure rate but, in the absence of that, I wonder if we could add a new command [1]:
@spackbot re-run failed pipelines
that re-runs only pipelines that failed due to system errors. This should:
Reduce the chance of a possible new failure
Reduce the resources we need to get a given CI run ✔️
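One possible way to decide what to re-run is sketched below, at the level of individual jobs. It assumes the bot can obtain, for each job, a status and a failure reason. The field names and reason strings are modeled on what GitLab reports for a job (`status`, `failure_reason`), but which reasons should count as "system errors" is an assumption, and `jobs_to_retry` is a hypothetical helper, not an existing spackbot function.

```python
# Failure reasons treated as system failures. The exact set is an assumption
# and would need to be tuned against the failures we actually observe.
SYSTEM_FAILURE_REASONS = {
    "runner_system_failure",
    "stuck_or_timeout_failure",
    "scheduler_failure",
}


def jobs_to_retry(jobs):
    """Return the ids of failed jobs whose failure looks like a system error.

    `jobs` is a list of dicts with at least "id" and "status" keys, and a
    "failure_reason" key for failed jobs (hypothetical input shape).
    """
    return [
        job["id"]
        for job in jobs
        if job.get("status") == "failed"
        and job.get("failure_reason") in SYSTEM_FAILURE_REASONS
    ]


# Made-up jobs for illustration only.
jobs = [
    {"id": 1, "status": "success"},
    {"id": 2, "status": "failed", "failure_reason": "script_failure"},
    {"id": 3, "status": "failed", "failure_reason": "runner_system_failure"},
]
print(jobs_to_retry(jobs))
```

Only job 3 is selected here: job 2 failed because of its own script, so re-running it would not help and would only consume resources.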
Footnotes
[1] Naming is tentative; any better choice is welcome.
On the same day, we ran two pipelines on develop. They are both ❌ due to system failures.