test flakes tracker #579

Open
cgwalters opened this issue Jun 1, 2024 · 10 comments
Labels
area/ci Issues related to our own CI

Comments

@cgwalters (Collaborator) commented Jun 1, 2024

Parsing layer blob: Broken pipe

stderr: "\e[31mERROR\e[0m Switching: Pulling: Importing: Parsing layer blob sha256:4367367aae6325ce7351edb720e7e6929a7f369205b38fa88d140b7e3d0a274f: Broken pipe (os error 32)"

This one is my nemesis! I have a tracker for it over at coreos/rpm-ostree#4567 too.

@henrywang (Contributor)

But anyway, I think the larger problem pointed out by the AWS error message is that the script hardcodes a security group in a specific AZ, when it could really be targeting any AZ, right?

There's only one zone we can use because RHEL needs internal network access to install podman to run the bootc install command. IT only configured one subnet in one zone.

We already fetch the available zone for the non-RHEL tests: https://gitlab.com/fedora/bootc/tests/bootc-workflow-test/-/blob/2bebcdd18f4e0ff9639aff59e2fdfdfcec70f450/playbooks/deploy-aws.yaml#L55.
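
For illustration, a rough AWS CLI equivalent of that dynamic lookup would look something like this (a sketch only; VPC_ID is a placeholder, and the linked playbook does the real lookup in ansible):

```bash
# Sketch: discover a usable subnet/AZ at runtime instead of hardcoding one.
# VPC_ID is a placeholder, not a value from our setup.
subnet_id=$(aws ec2 describe-subnets \
  --filters "Name=vpc-id,Values=${VPC_ID}" "Name=state,Values=available" \
  --query 'Subnets[0].SubnetId' --output text)
az=$(aws ec2 describe-subnets --subnet-ids "${subnet_id}" \
  --query 'Subnets[0].AvailabilityZone' --output text)
echo "provisioning in ${az} via subnet ${subnet_id}"
```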

A few things on this. First, it seems like a lot of this script is basic "provision an EC2 instance" code that could probably be shared and live outside this repo? Maybe we could fetch this stuff from a container or a distinct repo?

Those are the things I'd like to talk with you about in Monday's QE sync meeting.

@cgwalters (Collaborator, Author)

There's only one zone we can use because RHEL needs internal network access to install podman to run the bootc install command. IT only configured one subnet in one zone.

OK, got it. Well...per the other discussion, what if we focused only on fedora:40 and centos:stream9 for PR testing by default, and did RHEL integration testing both post-merge (I'll get the -dev images re-spun up, which build relevant things from git main) and also as part of dist-git merges to https://gitlab.com/redhat/centos-stream/rpms/bootc/?

@henrywang (Contributor)

OK, got it. Well...per the other discussion, what if we focused only on fedora:40 and centos:stream9 for PR testing by default

I agree.

and did RHEL integration testing both post-merge (I'll get the -dev images re-spun up, which build relevant things from git main) and also as part of dist-git merges to https://gitlab.com/redhat/centos-stream/rpms/bootc/?

As you mentioned above, a rhel-bootc-dev repo can be added just like centos-bootc-dev, and the -dev image can be saved in a GitLab repo (repos under https://gitlab.com/redhat/rhel/bifrost should be private?). I can add a test job in this repo without adding test code, only running a pipeline with the https://gitlab.com/fedora/bootc/tests/bootc-workflow-test code. The -dev image can be built daily, and the tests will run daily as well.
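
A daily job like that could be as simple as kicking the test pipeline via the GitLab trigger API (hypothetical sketch; PROJECT_ID, TRIGGER_TOKEN and the TEST_OS variable are placeholders, not values from this thread):

```bash
# Hypothetical daily cron job: trigger the bootc-workflow-test pipeline
# against the freshly built -dev image. All values below are placeholders.
curl --request POST \
  --form "token=${TRIGGER_TOKEN}" \
  --form "ref=main" \
  --form "variables[TEST_OS]=rhel" \
  "https://gitlab.com/api/v4/projects/${PROJECT_ID}/trigger/pipeline"
```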

I'd suggest not adding testing in https://gitlab.com/redhat/centos-stream/rpms/bootc/, to avoid blocking releases. From my perspective, all tests should run before release, not at release.

@cgwalters added the area/ci label Jun 4, 2024
@henrywang (Contributor)

Recently, say in the last week, this error has shown up more often. Automation added a 3-retry workaround in the ansible playbook. Let's see what happens with the retries.
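
For illustration only, the pattern is roughly this as a plain shell loop (the actual workaround uses ansible retries/until, and the image reference here is a placeholder):

```bash
# Retry the flaky step up to 3 times before giving up.
# quay.io/example/bootc:latest is a placeholder image, not our real target.
for attempt in 1 2 3; do
    if bootc switch quay.io/example/bootc:latest; then
        break
    fi
    echo "attempt ${attempt} hit the broken-pipe flake; retrying" >&2
    sleep 10
done
```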

@cgwalters (Collaborator, Author)

In a different run, we somehow ended up with

Creating root filesystem (xfs) on device /dev/loop0p2 (size=512M)

Which seems related but different from the other one:

Creating root filesystem (xfs) on device /dev/loop0p1 (size=1M)

Actually, having it be 1M sometimes and 512M others looks very much like we're getting partitions swapped.
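
If anyone wants to catch this in the act, a debugging sketch would be to dump the partition layout right before mkfs (the loop device name and expected sizes are taken from the logs above; the "which partition should be which" reading is an assumption):

```bash
# Debugging sketch for the suspected p1/p2 swap: settle udev, then show
# the loop partition sizes before the filesystem gets created.
udevadm settle
lsblk --bytes --output NAME,SIZE,TYPE /dev/loop0
# Expectation (assumed from the logs): loop0p1 is the tiny ~1M partition
# and loop0p2 is the ~512M root target; reversed sizes would confirm a race.
```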

@henrywang (Contributor) commented Jul 3, 2024

The test has recently been hitting Installing to filesystem: Creating ostree deployment: Performing deployment: Importing: Parsing layer blob sha256:9536e521dd6b076e09fa076feb4428e4b94e5330c6d6b3ab1e235a54be3d88b7: Failed to invoke skopeo proxy method FinishPipe: remote error: write |1: broken pipe when running bootc install to-existing-root.

@cgwalters (Collaborator, Author)

@henrywang, anything we can do to fix/improve this?

[13:43:01] [E] [CentOS-Stream-9:x86_64:/plans/e2e/to-disk] guest provisioning failed: Guest couldn't be provisioned: Artemis resource ended in 'error' state
As seen on e.g. https://artifacts.dev.testing-farm.io/4fec6905-15b7-49d6-aff5-2bad9d78a12e/

Having basically permanently-red CI is a mental overhead: each time we have to check which specific jobs are failing.

@henrywang (Contributor)

Yes, we have https://issues.redhat.com/browse/TFT-2691 to track it.

@cgwalters (Collaborator, Author)

Actually, having it be 1M sometimes and 512M others looks very much like we're getting partitions swapped.

I didn't try to stress test this much, but I think #698 is going to help. At the very least, if we are still racing somehow, we'll get a clearer error message.

@cgwalters (Collaborator, Author)

I didn't try to stress test this much, but I think #698 is going to help. At the very least, if we are still racing somehow, we'll get a clearer error message.

I think that fixed the install flake; I haven't seen it since.
