Fix fs info reporting negative available size #11573

Merged
merged 6 commits into from
Jun 24, 2024

Conversation

bugmakerrrrrr
Contributor

Description

Currently, fs info can report a negative available size. This happens when all of the following conditions are met:

  1. A node has both data and search roles.
  2. Some space is reserved for file cache.
  3. The reserved file cache space is being used by other files, primarily local index files.
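
To make the arithmetic behind these conditions concrete, here is a minimal sketch. It assumes the reported available size is the disk's free space minus the unused part of the file cache reservation; the description above does not spell out the formula, so that is an assumption. The class, method names, and byte values are hypothetical and are not taken from the OpenSearch code. Clamping at zero is shown as one plausible way to avoid a negative result, not necessarily the change made in this PR.

```java
// Hypothetical sketch: names and numbers are illustrative, not OpenSearch code.
final class FsAvailableSketch {

    /** Assumed formula: free disk space minus the unused part of the file cache reservation. */
    public static long naiveAvailable(long diskAvailable, long cacheReserved, long cacheUtilized) {
        return diskAvailable - (cacheReserved - cacheUtilized);
    }

    /** Same formula, but never reports fewer than zero bytes. */
    public static long clampedAvailable(long diskAvailable, long cacheReserved, long cacheUtilized) {
        return Math.max(0L, naiveAvailable(diskAvailable, cacheReserved, cacheUtilized));
    }

    public static void main(String[] args) {
        // Local index files have eaten into the space reserved for the file cache,
        // so the disk has less free space than the unused reservation.
        long diskAvailable = 1_000L;
        long cacheReserved = 5_000L;
        long cacheUtilized = 500L;

        System.out.println(naiveAvailable(diskAvailable, cacheReserved, cacheUtilized));   // -3500
        System.out.println(clampedAvailable(diskAvailable, cacheReserved, cacheUtilized)); // 0
    }
}
```

When the free space exceeds the unused reservation, both methods return the same value, so the clamp only affects the failure case described above.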

Related exception:

java.lang.IllegalArgumentException: Values less than -1 bytes are not supported: -1781760b
	at org.opensearch.core.common.unit.ByteSizeValue.<init>(ByteSizeValue.java:78) ~[classes/:?]
	at org.opensearch.core.common.unit.ByteSizeValue.<init>(ByteSizeValue.java:73) ~[classes/:?]
	at org.opensearch.monitor.fs.FsInfo$Path.getAvailable(FsInfo.java:141) ~[classes/:?]
	at org.opensearch.cluster.InternalClusterInfoService.fillDiskUsagePerNode(InternalClusterInfoService.java:452) ~[classes/:?]
	at org.opensearch.cluster.InternalClusterInfoService$1.onResponse(InternalClusterInfoService.java:265) ~[classes/:?]
	at org.opensearch.cluster.InternalClusterInfoService$1.onResponse(InternalClusterInfoService.java:260) [classes/:?]
	at org.opensearch.action.LatchedActionListener.onResponse(LatchedActionListener.java:58) [classes/:?]
	at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:113) ~[classes/:?]
	at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:107) [classes/:?]
	at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74) [classes/:?]
	at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [classes/:?]
	at org.opensearch.common.util.concurrent.OpenSearchExecutors$DirectExecutorService.execute(OpenSearchExecutors.java:341) [classes/:?]
	at org.opensearch.action.support.nodes.TransportNodesAction$AsyncAction.finishHim(TransportNodesAction.java:315) [classes/:?]
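
The trace shows the failure surfacing when the negative byte count is wrapped in a `ByteSizeValue`. The stand-in class below reproduces only the validation implied by the exception message (values below -1 are rejected). `ByteSizeSketch` is a hypothetical class written for illustration; it is not the real `org.opensearch.core.common.unit.ByteSizeValue`, whose constructor may do more than this.

```java
// Hypothetical stand-in that mirrors only the check implied by the exception message.
final class ByteSizeSketch {
    private final long bytes;

    ByteSizeSketch(long bytes) {
        // -1 is accepted; anything below -1 is rejected, matching the message in the trace.
        if (bytes < -1) {
            throw new IllegalArgumentException("Values less than -1 bytes are not supported: " + bytes + "b");
        }
        this.bytes = bytes;
    }

    long getBytes() {
        return bytes;
    }

    public static void main(String[] args) {
        try {
            // The negative available size taken from the stack trace above.
            new ByteSizeSketch(-1_781_760L);
        } catch (IllegalArgumentException e) {
            // prints "Values less than -1 bytes are not supported: -1781760b"
            System.out.println(e.getMessage());
        }
    }
}
```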

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

github-actions bot commented Dec 11, 2023

Compatibility status:

Checks if related components are compatible with change 0ac9db3

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/performance-analyzer.git]

Contributor

❌ Gradle check result for cf2596f: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kotwanikunal
Member

Thanks for the changes @bugmakerrrrrr. Can you please add in a changelog entry in the 2.x section?

Contributor

✅ Gradle check result for 35e1721: SUCCESS

Contributor

❕ Gradle check result for e354303: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationDuringQueryPhase
      1 org.opensearch.http.SearchRestCancellationIT.classMethod
      1 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Contributor

❌ Gradle check result for 1bae5b8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for 1bae5b8: SUCCESS

Signed-off-by: Andrew Ross <andrross@amazon.com>
Contributor

❌ Gradle check result for 5514e03: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for 5514e03: SUCCESS

@jed326 jed326 merged commit 1da19d3 into opensearch-project:main Jun 24, 2024
31 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 24, 2024
* fix fs info reporting negative available size

Signed-off-by: panguixin <panguixin@bytedance.com>

* change log

Signed-off-by: panguixin <panguixin@bytedance.com>

* fix test

Signed-off-by: panguixin <panguixin@bytedance.com>

* fix test

Signed-off-by: panguixin <panguixin@bytedance.com>

* spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit 1da19d3)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross added a commit that referenced this pull request Jun 25, 2024
(cherry picked from commit 1da19d3)

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this pull request Jul 12, 2024
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this pull request Jul 24, 2024
…) (opensearch-project#14521)

(cherry picked from commit 1da19d3)

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: kkewwei <kkewwei@163.com>
wdongyu pushed a commit to wdongyu/OpenSearch that referenced this pull request Aug 22, 2024
Labels
backport 2.x Backport to 2.x branch
5 participants