Update publish-ci to continue on certain failures #5191

Merged (1 commit) on Feb 23, 2024

devinrsmith
Member

An error was observed during the v0.33.0 release: the deephaven-server artifact was too large for PyPI. There are other failure scenarios we could run into here as well, such as network connectivity problems, expired tokens, and temporary service interruptions. Given that all of the release artifacts have already been successfully uploaded as a GitHub artifact by that point, we should continue publishing to the remaining sources even if one of the sources fails.

Member

@rcaudy rcaudy left a comment

Approving as code-owner. Please defer to @stanbrub for a review before merge.

@stanbrub
Contributor

stanbrub commented Feb 23, 2024

I assume this would still show up as a failed run so that it doesn't get overlooked? There is a section in the README.md to verify that things like wheels have been published, so that is covered somewhat.

The question is what process we have if one thing fails and everything else gets published. Do we stop and fix that? Do we continue with the release and risk forgetting to fix it? What if "fixing it" means waiting 3 weeks to get more disk space?

@devinrsmith
Member Author

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepscontinue-on-error

Prevents a job from failing when a step fails. Set to true to allow a job to pass when this step fails.

So, I read this as "the step will still fail, but the job will not". I don't know if the top level job will have some sort of warning state, but we should still be able to see the individual steps as failed. Our release doc does mention:

Once the workflow job is done, ensure all publication sources have the new artifacts.

We can be mindful during release to double-check the step states and manually check the publication sources.
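
For illustration, here is a rough sketch of how a publish job could keep going past a failed step and still end up red so the failure is not overlooked. This is not the actual publish-ci workflow: the job name, step ids, and script names are all hypothetical placeholders. It relies on the documented `steps` context, where `steps.<id>.outcome` is the step result before `continue-on-error` is applied and `steps.<id>.conclusion` is the result after.

```yaml
# Hypothetical sketch only: job name, step ids, and scripts are placeholders,
# not the real publish-ci workflow.
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - name: Publish to PyPI
        id: publish_pypi
        # A failure here does not stop the job; later steps still run.
        continue-on-error: true
        run: ./publish-pypi.sh    # placeholder command

      - name: Publish to another source
        id: publish_other
        continue-on-error: true
        run: ./publish-other.sh   # placeholder command

      # outcome keeps the real result ('failure'), even though conclusion
      # is reported as 'success' because of continue-on-error.
      - name: Fail the job if any publish step failed
        if: ${{ steps.publish_pypi.outcome == 'failure' || steps.publish_other.outcome == 'failure' }}
        run: exit 1
```

Without a final check like this, the job would pass and the failure would only be visible on the individual step, so the manual verification in the release doc would still be the main safeguard.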

@devinrsmith devinrsmith merged commit e9323cf into deephaven:main Feb 23, 2024
25 checks passed
@devinrsmith devinrsmith deleted the publish-continue-on-error branch February 23, 2024 21:21
@github-actions github-actions bot locked and limited conversation to collaborators Feb 23, 2024
3 participants