
intermittent failure in ext.rpm-ostree.destructive.container-image #4567

Open · cgwalters opened this issue on Aug 30, 2023 · 1 comment
Labels: area/ci-flake (Tracker for CI flakes or bugs), triaged (This issue was triaged)

@cgwalters (Member) commented:
```
[2023-08-30T15:00:16.554Z] Aug 30 14:59:54 qemu0 kola-runext-container-image[1134]: + rpm-ostree rebase ostree-unverified-image:containers-storage:localhost/fcos-derived
[2023-08-30T15:00:16.554Z] Aug 30 14:59:54 qemu0 kola-runext-container-image[1957]: Pulling manifest: ostree-unverified-image:containers-storage:localhost/fcos-derived
[2023-08-30T15:00:16.554Z] Aug 30 14:59:54 qemu0 kola-runext-container-image[1957]: Importing: ostree-unverified-image:containers-storage:localhost/fcos-derived (digest: sha256:db32a3de020c5f7c74191e50690ad520c2158ccae3b99734cd59f15e0d9b73da)
[2023-08-30T15:00:16.554Z] Aug 30 14:59:54 qemu0 kola-runext-container-image[1957]: ostree chunk layers needed: 1 (1.5 GB)
[2023-08-30T15:00:16.554Z] Aug 30 14:59:54 qemu0 kola-runext-container-image[1957]: custom layers needed: 1 (24.7 MB)
[2023-08-30T15:00:16.554Z] Aug 30 15:00:15 qemu0 kola-runext-container-image[1957]: error: Importing: Parsing layer blob sha256:00623c39da63781bdd3fb00fedb36f8b9ec95e42cdb4d389f692457f24c67144: Failed to invoke skopeo proxy method FinishPipe: remote error: write |1: broken pipe
[2023-08-30T15:00:16.554Z] Aug 30 15:00:15 qemu0 systemd[1]: kola-runext.service: Main process exited, code=exited, status=1/FAILURE
[2023-08-30T15:00:16.554Z] Aug 30 15:00:15 qemu0 systemd[1]: kola-runext.service: Failed with result 'exit-code'.
[2023-08-30T15:00:16.554Z] Aug 30 15:00:15 qemu0 systemd[1]: kola-runext.service: Consumed 36.730s CPU time.
```

This one is concerning because it seems to have been happening more frequently lately. I also suspect we may be hitting something related to https://github.com/ostreedev/ostree-rs-ext/blob/bd77743c21280b0089c7146668e4c72f4d588143/lib/src/container/unencapsulate.rs#L143, which is masking the real error.
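For context, the masking hazard looks roughly like this: the layer-parsing future and the pipe-driving future run concurrently, and when both fail, whichever error the code propagates first is the one the user sees, so a consequent broken-pipe error can shadow the root cause. A minimal self-contained sketch of the pattern, using `tokio` and `anyhow` with hypothetical stand-in futures (this is not the actual ostree-rs-ext code):

```rust
use anyhow::{anyhow, Result};

#[tokio::main]
async fn main() {
    // Hypothetical stand-ins: the parser fails with the root cause, while
    // the pipe driver later fails with a consequent broken-pipe error.
    let parse_layer = async { Err::<(), _>(anyhow!("unexpected EOF in tar stream")) };
    let drive_pipe = async { Err::<(), _>(anyhow!("FinishPipe: write |1: broken pipe")) };

    let (parse_res, drive_res) = tokio::join!(parse_layer, drive_pipe);

    // Propagating the driver's error first masks the parser's root cause:
    // Result::and returns the first Err, so "broken pipe" wins here.
    let report: Result<()> = drive_res.and(parse_res);
    if let Err(e) = report {
        eprintln!("error: {e}"); // the real failure never reaches the user
    }
}
```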

cgwalters added the triaged and area/ci-flake labels on Aug 30, 2023
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this issue Aug 31, 2023
I don't think we're hitting this in coreos/rpm-ostree#4567, but it'd be useful to have a trace message just in case.
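Such a trace point might look like the following sketch using the `tracing` crate (the helper, message, and placement are hypothetical illustrations, not the actual commit; depends on the tracing and tracing-subscriber crates):

```rust
use tracing::{trace, Level};

// Hypothetical helper: record that a layer's stream was fully consumed and
// FinishPipe is about to be invoked, so CI journals reveal the ordering.
fn note_layer_done(digest: &str) {
    trace!(digest, "layer stream consumed; invoking FinishPipe");
}

fn main() {
    // Send trace-level events to stderr for this demonstration.
    tracing_subscriber::fmt().with_max_level(Level::TRACE).init();
    note_layer_done("sha256:00623c39da63781bdd3fb00fedb36f8b9ec95e42cdb4d389f692457f24c67144");
}
```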
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this issue Aug 31, 2023
I'm hoping this will help us debug coreos/rpm-ostree#4567
```
[2023-08-30T15:00:16.554Z] Aug 30 15:00:15 qemu0 kola-runext-container-image[1957]: error: Importing: Parsing layer blob sha256:00623c39da63781bdd3fb00fedb36f8b9ec95e42cdb4d389f692457f24c67144: Failed to invoke skopeo proxy method FinishPipe: remote error: write |1: broken pipe
```

I haven't been able to reproduce it outside of CI yet, but we had
a prior ugly hack for this in
ostreedev@a27dac8

As the comments say, the goal is to hold the input stream open
for as long as feasibly possible.
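A synchronous, self-contained sketch of that idea (the helper name is hypothetical, and the real ostree-rs-ext path is async): rather than dropping the reader as soon as parsing stops, drain it to EOF so the peer never sees our read end close mid-write.

```rust
use std::io::{self, Read};

/// Hypothetical helper: consume any trailing bytes before the reader is
/// dropped, so the peer's remaining writes succeed and FinishPipe can
/// report the *original* failure instead of a consequent broken pipe.
fn drain_then_drop<R: Read>(mut reader: R) -> io::Result<u64> {
    // io::copy reads until EOF; sink() discards the bytes.
    io::copy(&mut reader, &mut io::sink())
    // `reader` is dropped here, after the stream is fully consumed.
}

fn main() -> io::Result<()> {
    // Usage example with an in-memory stream standing in for the pipe.
    let drained = drain_then_drop(&b"trailing tar padding"[..])?;
    println!("drained {drained} trailing bytes before closing");
    Ok(())
}
```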
@cgwalters (Member, Author) commented on Sep 1, 2023:

Discoveries so far:

- It's definitely a race condition; I can sometimes reproduce it by running `ostree refs --delete ostree/container` and then re-running the rebase.
- Also of note: kola defaults to a uniprocessor VM, which I think makes this race more likely to surface.
- I'm quite certain it has something to do with the relative scheduling of our closing the pipe versus calling FinishPipe.
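For illustration, here is a minimal, self-contained sketch of the failure mechanics (this is not the rpm-ostree code path; `yes` merely stands in for the proxy's blob writer): if the reading side closes its end of a pipe while the writer is still writing, the writer's next write fails with EPIPE, matching the `write |1: broken pipe` that FinishPipe reports.

```rust
use std::io::Read;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    // `yes` stands in for skopeo writing layer bytes into the pipe.
    let mut child = Command::new("yes").stdout(Stdio::piped()).spawn()?;

    let mut reader = child.stdout.take().expect("piped stdout");
    let mut buf = [0u8; 4096];
    reader.read_exact(&mut buf)?; // consume only part of the stream
    drop(reader); // close our read end while the writer is still active

    // The writer's next write now fails with EPIPE ("broken pipe");
    // `yes` dies from SIGPIPE, mirroring the error FinishPipe surfaces.
    let status = child.wait()?;
    println!("writer exited with: {status}");
    Ok(())
}
```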
