-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add log for ShardLockException in InternalTestCluster #13632
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Hailong Cui <ihailong@amazon.com>
❌ Gradle check result for c6960c2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@@ -2770,6 +2770,7 @@ public synchronized void assertAfterTest() throws Exception { | |||
try { | |||
env.shardLock(id, "InternalTestCluster assert after test", TimeUnit.SECONDS.toMillis(5)).close(); | |||
} catch (ShardLockObtainFailedException ex) { | |||
logger.error("Obtained shard lock failed", ex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The below fail function should effectively fail the tests with the needed logging already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fail function will log something like java.lang.AssertionError: Shard [.plugins-ml-config][0] is still locked after 5 sec waiting
, it will have shard information, but lack of why it still been locked and who hold the lock? that's the main purpose of add this log. The exception will have these information.
org.opensearch.env.ShardLockObtainFailedException: [.plugins-ml-config][0]: obtaining shard lock for [InternalTestCluster assert after test] timed out after [5000ms], lock already held for [starting shard] with age [20365ms]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe improve the fail logging instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, changed to improve existing fail assertion message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, you may merge the latest main branch into your branch to rerun the gradle checks, make sure all checks can pass.
❌ Gradle check result for 300ecc9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Hailong Cui <ihailong@amazon.com>
❌ Gradle check result for 35df6aa: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #13632 +/- ##
============================================
+ Coverage 71.42% 71.56% +0.14%
- Complexity 59978 61158 +1180
============================================
Files 4985 5059 +74
Lines 282275 287522 +5247
Branches 40946 41646 +700
============================================
+ Hits 201603 205773 +4170
- Misses 63999 64792 +793
- Partials 16673 16957 +284 ☔ View full report in Codecov by Sentry. |
This PR is stalled because it has been open for 30 days with no activity. |
This PR is stalled because it has been open for 30 days with no activity. |
Description
log out shardLockException for InternalTestCluster, it will helps to troubleshot failed IT test cases.
Related Issues
#13628
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.