Reduce impact of networking slowdowns #374

Merged
merged 1 commit into from
Aug 12, 2024

@cevich cevich commented Aug 9, 2024

Previously, if a repository server, the internet, or the execution
environment experienced some kind of networking slowdown, it could lead
to a timeout failure during a package install or update. Increase
resiliency in these situations with additional retries, timeouts, and
lowered minimum rates. Also increase the timeout on the related Cirrus-CI tasks.
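
For readers unfamiliar with the pattern, the retry idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the change made in this PR: the `retry` helper, its parameters, and the `flaky_install` stand-in are all hypothetical names invented for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, delay: float = 1.0,
          backoff: float = 2.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Hypothetical helper: call `operation` until it succeeds or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            # TimeoutError and ConnectionError are both subclasses of OSError.
            if attempt == attempts:
                raise  # Out of attempts: surface the last network error.
            sleep(delay)      # Wait before trying again.
            delay *= backoff  # Wait longer after each failure.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_install() -> str:
        # Stand-in for a package install that times out twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TimeoutError("slow mirror")
        return "installed"

    print(retry(flaky_install, attempts=5, delay=0.01))
```

The growing delay gives a slow mirror time to recover between attempts, at the cost of a longer worst-case run, which is why a task-level timeout usually has to be raised alongside the retry count.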

@cevich cevich requested a review from edsantiago August 9, 2024 16:29

cevich commented Aug 9, 2024

@edsantiago PTAL when you have a moment; this isn't critical.

github-actions bot commented Aug 9, 2024

Cirrus CI build successful. Found built image names and IDs:

| Stage | Image Name | IMAGE_SUFFIX |
| --- | --- | --- |
| base | debian | do-not-use |
| base | fedora | do-not-use |
| base | fedora-aws | do-not-use |
| base | fedora-aws-arm64 | do-not-use |
| base | image-builder | do-not-use |
| base | prior-fedora | do-not-use |
| cache | build-push | c20240809t162032z-f40f39d13 |
| cache | debian | c20240809t162032z-f40f39d13 |
| cache | fedora | c20240809t162032z-f40f39d13 |
| cache | fedora-aws | c20240809t162032z-f40f39d13 |
| cache | fedora-netavark | c20240809t162032z-f40f39d13 |
| cache | fedora-netavark-aws-arm64 | c20240809t162032z-f40f39d13 |
| cache | fedora-podman-aws-arm64 | c20240809t162032z-f40f39d13 |
| cache | fedora-podman-py | c20240809t162032z-f40f39d13 |
| cache | prior-fedora | c20240809t162032z-f40f39d13 |
| cache | rawhide | c20240809t162032z-f40f39d13 |
| cache | win-server-wsl | c20240809t162032z-f40f39d13 |

@edsantiago edsantiago left a comment

Seems fine, but do you have any examples of when either of these has failed? My recollection (unsubstantiated) is that jobs flake because they run into the Cirrus timeout. Should that be bumped as well?

cevich commented Aug 12, 2024

> do you have any examples of when either of these has failed

I don't; this is purely speculative. I do recall that, on occasion, an entire repo mirror (or a single package) will flake irrecoverably. In those cases you just need to wait and try again later, sometimes the next day.

> that jobs flake because they run into the Cirrus timeout

That's my recollection as well, and yes, I should probably increase that timeout too.

Previously if a repository server, the internet, or the execution
environment experienced some kind of networking slowdown, it could lead
to a package install or update timeout failure.  Increase resiliency in
these situations with additional retries, timeouts, and lowered minimum
rates.  Also increase the timeout on the related Cirrus-CI tasks.

Signed-off-by: Chris Evich <cevich@redhat.com>
@cevich cevich force-pushed the rm_network_flakes branch from 77f48bd to 0a1e3db Compare August 12, 2024 15:00

cevich commented Aug 12, 2024

force-push: Add/increase timeouts for the Cirrus-CI tasks.

@edsantiago In case it's not clear, the primary target is the nested-virt Fedora base image builds. But I'm not married to this PR; it's really just speculation about increasing resiliency to build problems and flakes. If you feel it's unnecessary, adds too much complexity, or whatever, I'll just close it without any hard feelings.

@edsantiago edsantiago left a comment

LGTM

Cirrus CI build successful. Found built image names and IDs:

| Stage | Image Name | IMAGE_SUFFIX |
| --- | --- | --- |
| base | debian | do-not-use |
| base | fedora | do-not-use |
| base | fedora-aws | do-not-use |
| base | fedora-aws-arm64 | do-not-use |
| base | image-builder | do-not-use |
| base | prior-fedora | do-not-use |
| cache | build-push | c20240812t145931z-f40f39d13 |
| cache | debian | c20240812t145931z-f40f39d13 |
| cache | fedora | c20240812t145931z-f40f39d13 |
| cache | fedora-aws | c20240812t145931z-f40f39d13 |
| cache | fedora-netavark | c20240812t145931z-f40f39d13 |
| cache | fedora-netavark-aws-arm64 | c20240812t145931z-f40f39d13 |
| cache | fedora-podman-aws-arm64 | c20240812t145931z-f40f39d13 |
| cache | fedora-podman-py | c20240812t145931z-f40f39d13 |
| cache | prior-fedora | c20240812t145931z-f40f39d13 |
| cache | rawhide | c20240812t145931z-f40f39d13 |
| cache | win-server-wsl | c20240812t145931z-f40f39d13 |

@edsantiago
Identical to #375 (comment)

@cevich cevich merged commit b162196 into containers:main Aug 12, 2024
39 checks passed
2 participants