-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move asoctl:ci target to run sequentially #2744
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2744 +/- ##
=======================================
Coverage 56.01% 56.01%
=======================================
Files 970 970
Lines 348950 348950
=======================================
+ Hits 195472 195475 +3
+ Misses 122666 122664 -2
+ Partials 30812 30811 -1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
- crossplane:ci | ||
cmds: | ||
- task: asoctl:ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not convinced that this is the issue. If you look at an example stuck run:
https://github.com/Azure/azure-service-operator/actions/runs/4249026647/jobs/7388790440
The final logs are:
[generator:unit-tests-cover] go: downloading golang.org/x/text v0.7.0
[generator:unit-tests-cover] go: downloading github.com/josharian/intern v1.0.0
[generator:unit-tests-cover] go: downloading golang.org/x/sys v0.5.0
task: [controller:generate-genruntime-deepcopy] find ./pkg/genruntime -type f -name "zz_generated.*" -delete
task: [controller:generate-genruntime-deepcopy] cd pkg/genruntime && controller-gen object:headerFile=/workspace/v2/boilerplate.go.txt paths=./...
task: [controller:generate-genruntime-deepcopy] cd pkg/genruntime && gofmt -l -s -w .
[controller:generate-genruntime-deepcopy] conditions/zz_generated.deepcopy.go
[controller:generate-genruntime-deepcopy] zz_generated.deepcopy.go
Error: The operation was canceled.
To me that looks like either controller:generate-genruntime-deepcopy
or generator:unit-tests-cover
are getting stuck, not asoctl
, which can't even be building at the time those steps run as it has a dependency on controller:generate-crds
which hasn't happened yet.
I think we should add some logs to the end of the controller:generate-genruntime-deepcopy
and generator:unit-tests-cover
steps to see if they are actually finishing. Then when things get stuck again we can determine if the issue is either of those commands.
We may also want to stick a log at the beginning of produce-markdown-summary
as that task is deferred til after the generator:unit-tests-cover
finishes -- so it should run immediately after, but obviously isn't.
For reference here's a successful run to compare that stuck run to: https://github.com/Azure/azure-service-operator/actions/runs/4249442170/jobs/7389599661 |
Actually it looks like this might be related to: It sounds like it's possible to deadlock go-task if we use |
This has been discussed over on the Task Discord - best hypothesis was that something was trying to acquire the lock to select the next task to run before the lock was released for the previous one. |
If that's the case, rather than force asoctl to run sequentially maybe we should just drop our usage of |
We added Have a look at #2475 (Limit concurrency of CI builds). The information I've seen about a potential deadlock in If we want to get rid of the |
What this PR does / why we need it:
Moving asoctl:ci target to run sequentially. As we're experiencing timeouts randomly this could be because of a race in the
generator:ci
andasoctl:ci
trying to modify the files parallelly.