[Backport 2.x] Segment Replication - Update /_cat/segment_replication API with backp… #6741

Merged 1 commit on Mar 18, 2023

@Poojita-Raj Poojita-Raj commented Mar 17, 2023

…ressure metrics. (#6674)

  • Segment Replication - Update Segment Replication API with backpressure metrics.

This change updates the existing /_cat/segment_replication API to include backpressure metrics. It does this by returning stats from each primary shard for its tracked replication group and merging them with the metrics returned from replicas. Primary-captured metrics now appear by default; per-sync replica events are shown when detailed=true is set.

  • PR Feedback.

  • Fixed current_lag header alias.
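
As a rough illustration of the two modes described above (default output versus `detailed=true`), the following sketch only builds the request URL for the endpoint. The helper name `segment_replication_url` and the `http://localhost:9200` base URL are assumptions made for this example and are not part of the PR; `v` is the usual cat API flag for printing column headers.

```python
from urllib.parse import urlencode


def segment_replication_url(base_url: str, detailed: bool = False) -> str:
    """Build a URL for the /_cat/segment_replication endpoint.

    Hypothetical helper for illustration only; it is not part of OpenSearch.
    """
    # "v" asks the cat API to include column headers in the output.
    params = {"v": "true"}
    if detailed:
        # Per the PR description, detailed=true adds per-sync replica events.
        params["detailed"] = "true"
    return f"{base_url.rstrip('/')}/_cat/segment_replication?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local cluster address; adjust for a real deployment.
    print(segment_replication_url("http://localhost:9200"))
    print(segment_replication_url("http://localhost:9200", detailed=True))
```

The resulting URL can be passed to any HTTP client; the columns actually returned depend on the OpenSearch version in use.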


Description

Manual backport, rebased.

Issues Resolved

Resolves #4478

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…ressure metrics. (opensearch-project#6674)


Signed-off-by: Marc Handalian <handalm@amazon.com>
@Poojita-Raj Poojita-Raj changed the title Segment Replication - Update /_cat/segment_replication API with backp… [Backport 2.x] Segment Replication - Update /_cat/segment_replication API with backp… Mar 17, 2023
Rishikesh1159 commented Mar 17, 2023

Gradle Check (Jenkins) Run Completed with:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.snapshots.SearchableSnapshotIT.testPruneFileCacheOnIndexDeletion" -Dtests.seed=BC8DED9028DD8814 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=fr-CA -Dtests.timezone=Canada/Saskatchewan -Druntime.java=17

org.opensearch.snapshots.SearchableSnapshotIT > testPruneFileCacheOnIndexDeletion FAILED
    java.lang.AssertionError: timed out waiting for green state
        at __randomizedtesting.SeedInfo.seed([BC8DED9028DD8814:64E504583C56F5B5]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.opensearch.test.OpenSearchIntegTestCase.ensureColor(OpenSearchIntegTestCase.java:1007)
        at org.opensearch.test.OpenSearchIntegTestCase.ensureGreen(OpenSearchIntegTestCase.java:938)
        at org.opensearch.test.OpenSearchIntegTestCase.ensureGreen(OpenSearchIntegTestCase.java:927)
        at org.opensearch.snapshots.SearchableSnapshotIT.createIndexWithDocsAndEnsureGreen(SearchableSnapshotIT.java:279)
        at org.opensearch.snapshots.SearchableSnapshotIT.testPruneFileCacheOnIndexDeletion(SearchableSnapshotIT.java:534)

The Gradle check failure is unrelated to this PR. An issue is already open for this failure: #6738

@Poojita-Raj

Gradle Check (Jenkins) Run Completed with:

* **RESULT:** FAILURE ❌

* **URL:** https://build.ci.opensearch.org/job/gradle-check/12605/

* **CommitID:** [abe1ac8](https://github.com/opensearch-project/OpenSearch/commit/abe1ac813bee99203138ceb86e0f6463bbb91420)
  Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
  Is the failure [a flaky test](https://github.com/opensearch-project/OpenSearch/blob/main/DEVELOPER_GUIDE.md#flaky-tests) unrelated to your change?

Failing tests:
org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests
org.opensearch.indices.replication.SegmentReplicationAllocationIT

Both suites are known flaky tests.


@Poojita-Raj

Known flaky test (#6287): org.opensearch.cluster.shards.ClusterShardLimitIT.testCreateIndexWithMaxClusterShardSetting

@github-actions

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationRelocationIT.testFlushAfterRelocation

@codecov-commenter

Codecov Report

Merging #6741 (abe1ac8) into 2.x (70707fe) will decrease coverage by 0.05%.
The diff coverage is 30.21%.


@@             Coverage Diff              @@
##                2.x    #6741      +/-   ##
============================================
- Coverage     70.49%   70.45%   -0.05%     
+ Complexity    59517    59434      -83     
============================================
  Files          4805     4806       +1     
  Lines        285248   285295      +47     
  Branches      41467    41472       +5     
============================================
- Hits         201086   200996      -90     
- Misses        67448    67555     +107     
- Partials      16714    16744      +30     

| Impacted Files | Coverage Δ |
|---|---|
| ...lication/SegmentReplicationShardStatsResponse.java | 0.00% <0.00%> (ø) |
| ...earch/index/SegmentReplicationPressureService.java | 80.00% <0.00%> (-1.64%) ⬇️ |
| ...org/opensearch/index/seqno/ReplicationTracker.java | 68.38% <ø> (+0.10%) ⬆️ |
| ...cation/TransportSegmentReplicationStatsAction.java | 9.80% <3.03%> (-1.01%) ⬇️ |
| ...s/replication/SegmentReplicationStatsResponse.java | 13.79% <22.22%> (+0.45%) ⬆️ |
| ...opensearch/index/SegmentReplicationShardStats.java | 36.58% <33.33%> (+4.23%) ⬆️ |
| ...st/action/cat/RestCatSegmentReplicationAction.java | 45.09% <47.27%> (-5.93%) ⬇️ |
| ...nsearch/index/SegmentReplicationPerGroupStats.java | 32.00% <60.00%> (+3.42%) ⬆️ |
| .../org/opensearch/index/SegmentReplicationStats.java | 15.38% <100.00%> (ø) |
| ...ensearch/index/SegmentReplicationStatsTracker.java | 95.45% <100.00%> (+0.21%) ⬆️ |

... and 502 files with indirect coverage changes
