Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flakiness by synchronising the State #188

Merged
merged 2 commits into from
Aug 2, 2022

Conversation

Pldi23
Copy link
Contributor

@Pldi23 Pldi23 commented Jul 20, 2022

Description:

Cloudbees CI reported a flake in
org.jenkinsci.plugins.workflow.support.steps.StageStepTest.serializability

Error Log:
stderr.txt

How to reproduce:
I wasn't managed to reproduce it locally without code changes, but if we add TimeUnit.MILLISECONDS.sleep(200); before this line the StageStepTest.serializability started to flake regularly.

Assumption:
My assumption that there is a deadlock possibility.

Solution:
As a solution I want to suggest to synchronize the State local variable (the same synchronisation mechanism is used here)

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@Pldi23 Pldi23 changed the title synchronize state Fix flakiness by synchronising state Jul 20, 2022
@Pldi23 Pldi23 changed the title Fix flakiness by synchronising state Fix flakiness by synchronising the State Jul 20, 2022
} else {
System.err.println("Planning to unblock " + k + " as failure");
s.errors.put(k, error);
synchronized (s) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not notice some tests that failed because of the race condition here, but since we are synchronizing the success method, then we need to synchronize here too.

@Pldi23
Copy link
Contributor Author

Pldi23 commented Jul 26, 2022

Waiting for a review from maintainers

@Pldi23
Copy link
Contributor Author

Pldi23 commented Aug 1, 2022

@jglick could you please review

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC to match

synchronized (s) {
while (!s.started.contains(k)) {
if (b != null && !b.isBuilding()) {
throw new AssertionError(JenkinsRule.getLog(b));
}
s.wait(1000);
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants