macOS test runners intermittently fail to allocate local ephemeral ports (binding to localhost, port 0) within a reasonable amount of time (under 10 seconds).
Background
The Open Policy Agent project has had flaky macOS test runs for the last year or longer, which we've traced to GitHub's macOS runners not allocating ephemeral ports on localhost within a reasonable timeout window.
We reduced the frequency of test failures by setting our test timeouts dramatically higher, but even that has not completely eliminated the issue.
Platforms affected
Azure DevOps
GitHub Actions - Standard Runners
GitHub Actions - Larger Runners
Runner images affected
Ubuntu 18.04
Ubuntu 20.04
Ubuntu 22.04
macOS 11
macOS 12
Windows Server 2019
Windows Server 2022
Image version and build link
Runner Image Version
Image: macos-11
Version: 20221204.1
Included Software: https://github.com/actions/runner-images/blob/macOS-11/20221204.1/images/macos/macos-11-Readme.md
Image Release: https://github.com/actions/runner-images/releases/tag/macOS-11%2F20221204.1
Failed Build Links
Below are all of the failed builds I could find in the last 50+ pages of our Actions tab. There are more failures before that, but we lack logs for those, so I couldn't filter out the relevant macOS-specific failures any further back than mid-September of this year.
We are aware of this problem and are working to improve the macOS agents, but for now we have some hardware limitations and nothing can be done from the images side :(. Feel free to reach out again if you have any further questions.
Is it regression?
Unknown, but likely not a new regression.
Expected behavior
macOS test runners provide ephemeral/dynamic localhost ports to our Go tests in under 10 seconds.
Actual behavior
macOS test runners occasionally fail to provide a port within a 10-second window, causing our tests to crash.
Repro steps
The failures are intermittent, and seem to be almost completely random. The failures may be tied to faulty hardware, as referenced in #3885.
I will periodically add new failures to the list above if that would be helpful.