fix(reaper): fix race condition when reusing reapers #1904

Merged
merged 4 commits into from
Nov 8, 2023

Conversation


@lefinal lefinal commented Nov 5, 2023

Reaper reuse/creation logic has been adjusted to facilitate the reuse of already running reapers. This includes using a specific naming convention based on the session id, and handling failed container creation attempts due to name conflicts by retrieving the already running reaper container. This fixes a race condition that occurs when tests in multiple packages run in parallel, a situation in which global locks are ineffective.

What does this PR do?

Reaper container now uses a deterministic name based on the session id.
If container creation fails due to name conflict, we reuse the existing container.
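
The two points above can be sketched roughly as follows. This is a minimal illustration against an in-memory stand-in, not the code from reaper.go: `fakeDaemon`, `errNameConflict`, `reaperName`, `createOrReuseReaper`, and the `reaper_` name format are all hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNameConflict stands in for the daemon's "name already in use" error (hypothetical).
var errNameConflict = errors.New("container name already in use")

// fakeDaemon is a hypothetical in-memory stand-in for the container runtime.
type fakeDaemon struct {
	mu     sync.Mutex
	byName map[string]string // container name -> container ID
	nextID int
}

func newFakeDaemon() *fakeDaemon {
	return &fakeDaemon{byName: make(map[string]string)}
}

// create registers a container under name, or fails if the name is taken.
func (d *fakeDaemon) create(name string) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, taken := d.byName[name]; taken {
		return "", errNameConflict
	}
	d.nextID++
	id := fmt.Sprintf("c%d", d.nextID)
	d.byName[name] = id
	return id, nil
}

// lookup returns the ID of the container registered under name, if any.
func (d *fakeDaemon) lookup(name string) (string, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	id, ok := d.byName[name]
	return id, ok
}

// reaperName derives a deterministic name from the session id (format assumed).
func reaperName(sessionID string) string {
	return "reaper_" + sessionID
}

// createOrReuseReaper tries to create the reaper and falls back to the
// existing container when creation fails with a name conflict.
// It returns the container ID and whether an existing reaper was reused.
func createOrReuseReaper(d *fakeDaemon, sessionID string) (string, bool, error) {
	name := reaperName(sessionID)
	id, err := d.create(name)
	if err == nil {
		return id, false, nil
	}
	if !errors.Is(err, errNameConflict) {
		return "", false, err
	}
	existing, ok := d.lookup(name)
	if !ok {
		return "", false, fmt.Errorf("reaper %q conflicts by name but was not found", name)
	}
	return existing, true, nil
}

func main() {
	d := newFakeDaemon()
	ids := make([]string, 4)
	var wg sync.WaitGroup
	for i := range ids {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			id, _, err := createOrReuseReaper(d, "session-1")
			if err != nil {
				panic(err)
			}
			ids[i] = id
		}(i)
	}
	wg.Wait()
	fmt.Println(ids) // all four callers end up with the same container ID
}
```

The point of the design is that the name check happens inside the daemon, so it also holds across processes that cannot share an in-process lock.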

Why is it important?

If an application uses testcontainers in multiple packages, go test runs each package's tests in a separate process.
Global locks such as the ones in reaper.go only apply within a single process, so they cannot prevent a race condition in which multiple reapers are created.
However, all of these processes use the same session id, because they share the same parent process.
As a result, one reaper may remove containers that tests in other packages are still using.

There is already logic in reaper.go that reduces the risk of race conditions by introducing a small random delay before searching for a running reaper container.
However, this does not fully avoid the race condition, as there is a small delay between a container being created and it becoming visible in ContainerList.
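
A deterministic sketch of that check-then-create race, for illustration only: the `daemon` type and its methods are hypothetical stand-ins and do not model the real Docker API. Every simulated process lists reapers first and only then acts on its now stale snapshot, which is the interleaving the random delay cannot rule out.

```go
package main

import "fmt"

// daemon is a hypothetical stand-in for the container runtime.
type daemon struct {
	reapers []string
}

// listReapers returns a snapshot of the reapers visible right now.
func (d *daemon) listReapers() []string {
	return append([]string(nil), d.reapers...)
}

// createReaper starts an unnamed reaper; nothing prevents duplicates.
func (d *daemon) createReaper(owner string) {
	d.reapers = append(d.reapers, "reaper-from-"+owner)
}

// simulateInterleaving lets every process take its snapshot before any
// process creates a container, then lets each one act on that snapshot.
// It returns the number of reapers that exist afterwards.
func simulateInterleaving(d *daemon, processes []string) int {
	snapshots := make(map[string][]string, len(processes))
	for _, p := range processes {
		snapshots[p] = d.listReapers()
	}
	for _, p := range processes {
		if len(snapshots[p]) == 0 {
			d.createReaper(p)
		}
	}
	return len(d.reapers)
}

func main() {
	d := &daemon{}
	n := simulateInterleaving(d, []string{"pkg/a", "pkg/b", "pkg/c"})
	fmt.Println("reapers created:", n) // prints "reapers created: 3"
}
```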

How to test this PR

Tests have been added to reaper_test.go.
You can test the functionality by simply discarding the changes in reaper.go and running the tests.

@lefinal lefinal requested a review from a team as a code owner November 5, 2023 18:39

netlify bot commented Nov 5, 2023

Deploy Preview for testcontainers-go ready!

Name Link
🔨 Latest commit 181e7bc
🔍 Latest deploy log https://app.netlify.com/sites/testcontainers-go/deploys/654abb0db3b30e0008471325
😎 Deploy Preview https://deploy-preview-1904--testcontainers-go.netlify.app

@lefinal lefinal force-pushed the fix-ryuk-race-condition branch from 4bb3a7b to 37d3093 Compare November 5, 2023 18:41
@lefinal lefinal changed the title fix(reaper): fix race condition for reusing reapers fix(reaper): fix race condition when reusing reapers Nov 5, 2023
@mdelapenya mdelapenya self-assigned this Nov 6, 2023
@mdelapenya mdelapenya added the bug An issue with the library label Nov 6, 2023
mdelapenya previously approved these changes Nov 6, 2023

@mdelapenya mdelapenya left a comment

Hey @lefinal, thanks for this fix. You're doing a great job investigating the potential race conditions in the project, which I enormously appreciate.

I took the test code and put it into the main branch and it failed, so it demonstrated the bug. Same test in this branch does pass 👏

In any case, there are a few comments to be addressed, but I'm approving this PR already. LGTM

The log message related to the reaper container in the reaper.go file has been updated for better clarity. The redundant phrase "Canceling creation -" has been removed as it does not provide additional relevant information, aiming to improve log readability.
Further comments were added to the error handling in the reaper.go file to better account for possible race conditions. This includes the case where container creation fails due to a name conflict but no containers are visible in list requests. Additionally, a possible scenario where the container may have died between requests has been covered.
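
One possible way to handle the two cases named in this commit message is a bounded lookup after the name conflict. The sketch below is an assumption for illustration, not the logic in reaper.go: `laggyDaemon`, `find`, and `lookupAfterConflict` are hypothetical, and a real implementation would wait between attempts rather than retry immediately.

```go
package main

import (
	"errors"
	"fmt"
)

// laggyDaemon is a hypothetical runtime in which a named container exists
// but only shows up in list requests after a number of calls.
type laggyDaemon struct {
	id        string
	hiddenFor int  // number of list requests that still miss the container
	died      bool // true if the container exited between requests
}

// find simulates a list request filtered by the reaper name.
func (d *laggyDaemon) find() (string, bool) {
	if d.died {
		return "", false
	}
	if d.hiddenFor > 0 {
		d.hiddenFor--
		return "", false
	}
	return d.id, true
}

// lookupAfterConflict retries the lookup a bounded number of times.
// It fails if the container never becomes visible, which covers both
// "not listed yet" (too few attempts) and "died between requests".
func lookupAfterConflict(d *laggyDaemon, attempts int) (string, error) {
	for i := 0; i < attempts; i++ {
		if id, ok := d.find(); ok {
			return id, nil
		}
		// A real implementation would sleep or back off here.
	}
	return "", errors.New("name conflict reported, but no reaper container is visible")
}

func main() {
	id, err := lookupAfterConflict(&laggyDaemon{id: "r1", hiddenFor: 2}, 3)
	fmt.Println(id, err)

	_, err = lookupAfterConflict(&laggyDaemon{id: "r1", died: true}, 3)
	fmt.Println(err)
}
```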
@lefinal lefinal force-pushed the fix-ryuk-race-condition branch from a006d8d to 181e7bc Compare November 7, 2023 22:32

@mdelapenya mdelapenya left a comment

LGTM, thanks for your patience during the review 🙇

@mdelapenya mdelapenya merged commit fc966d5 into testcontainers:main Nov 8, 2023
111 checks passed
mdelapenya added a commit to mdelapenya/testcontainers-go that referenced this pull request Nov 8, 2023
* main:
  fix(reaper): fix race condition when reusing reapers (testcontainers#1904)
mdelapenya added a commit to kuisathaverat/testcontainers-go that referenced this pull request Nov 20, 2023
* main: (31 commits)
  feat: support for executing commands in a container with user, workDir and env (testcontainers#1914)
  fix(modules.kafka): Switch to MaxInt for 32-bit support (testcontainers#1923)
  docs: fix code snippet for image substitution (testcontainers#1918)
  Add database driver note to SQL Wait strategy docs (testcontainers#1916)
  Reduce flakiness in ClickHouse tests (testcontainers#1902)
  lint: enable nonamedreturns (testcontainers#1909)
  chore: deprecate BindMount APIs (testcontainers#1907)
  fix(reaper): fix race condition when reusing reapers (testcontainers#1904)
  feat: Allow the container working directory to be specified (testcontainers#1899)
  chore: make rabbitmq examples more readable (testcontainers#1905)
  chore(deps): bump github.com/twmb/franz-go and github.com/twmb/franz-go/pkg/kadm in /modules/redpanda (testcontainers#1896)
  Fix - respect ContainerCustomizer in neo4j module (testcontainers#1903)
  chore(deps): bump github.com/nats-io/nkeys and github.com/nats-io/nats.go in /modules/nats (testcontainers#1897)
  chore: add tests for withNetwork option (testcontainers#1894)
  chore(deps): bump google.golang.org/grpc and cloud.google.com/go/firestore in /modules/gcloud (testcontainers#1891)
  chore(deps): bump github.com/aws/aws-sdk-go and github.com/aws/aws-sdk-go-v2/config in /modules/localstack (testcontainers#1892)
  chore(deps): bump Github actions (testcontainers#1890)
  chore(deps): bump github.com/shirou/gopsutil/v3 from 3.23.9 to 3.23.10 (testcontainers#1858)
  chore(deps): bump github.com/hashicorp/consul/api in /examples/consul (testcontainers#1863)
  chore(deps): bump github.com/IBM/sarama in /modules/kafka (testcontainers#1874)
  ...
mdelapenya added a commit to mdelapenya/testcontainers-go that referenced this pull request Nov 30, 2023
* main: (100 commits)
  fix: fallback matching of registry authentication config (testcontainers#1927)
  feat: support customizing the Docker build command (testcontainers#1931)
  docs: include MongoDB's username and password options into the docs (testcontainers#1930)
  feat: support for custom registry prefixes at the configuration level (testcontainers#1928)
  Add username and password functions to mongodb (testcontainers#1910)
  chore: skip TestContainerLogWithErrClosed as flaky on rootless docker (testcontainers#1925)
  docs: add some Vault module examples (testcontainers#1825)
  feat: support for executing commands in a container with user, workDir and env (testcontainers#1914)
  fix(modules.kafka): Switch to MaxInt for 32-bit support (testcontainers#1923)
  docs: fix code snippet for image substitution (testcontainers#1918)
  Add database driver note to SQL Wait strategy docs (testcontainers#1916)
  Reduce flakiness in ClickHouse tests (testcontainers#1902)
  lint: enable nonamedreturns (testcontainers#1909)
  chore: deprecate BindMount APIs (testcontainers#1907)
  fix(reaper): fix race condition when reusing reapers (testcontainers#1904)
  feat: Allow the container working directory to be specified (testcontainers#1899)
  chore: make rabbitmq examples more readable (testcontainers#1905)
  chore(deps): bump github.com/twmb/franz-go and github.com/twmb/franz-go/pkg/kadm in /modules/redpanda (testcontainers#1896)
  Fix - respect ContainerCustomizer in neo4j module (testcontainers#1903)
  chore(deps): bump github.com/nats-io/nkeys and github.com/nats-io/nats.go in /modules/nats (testcontainers#1897)
  ...