Fix Bug BZ 2271067 | NSFS | Change Mapping in Pool Server #7936

Merged
merged 1 commit into from
Apr 3, 2024

@shirady shirady commented Mar 31, 2024

Explain the changes

  1. In pool_server, inside calc_namespace_resource_mode, remove the mapping of ENOENT to storage_not_exist.
  2. In namespace_monitor, replace the error code (whether it comes from a thrown error or from an error that we created) with a code that we choose, and log the error (the log line can be seen in the endpoint logs).
  3. Add a config, NS_MAX_ALLOWED_IO_ERRORS, that is used in calc_namespace_resource_mode.
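
To illustrate changes 1 and 3, here is a minimal sketch of how the mode calculation could look once ENOENT is no longer mapped to storage_not_exist. It is not the actual pool_server code: the function name, the shape of the issues report, the placeholder threshold value, the AUTH_FAILED and OPTIMAL mode names, and the strict "greater than" comparison are all assumptions made for illustration.

```javascript
'use strict';

// Placeholder value for illustration only; the real default is defined in config.js.
const NS_MAX_ALLOWED_IO_ERRORS = 9;

// Hypothetical mapping from reported error codes to an error type.
// ENOENT is intentionally absent, so it falls through to io_errors.
const ERR_CODE_TO_TYPE = new Map([
    ['NoSuchBucket', 'storage_not_exist'],
    ['AccessDenied', 'auth_failed'],
]);

// issues_report is assumed to be an array of { error_code } entries.
function calc_namespace_resource_mode_sketch(issues_report, max_io_errors = NS_MAX_ALLOWED_IO_ERRORS) {
    const counts = { storage_not_exist: 0, auth_failed: 0, io_errors: 0 };
    for (const issue of issues_report) {
        const error_type = ERR_CODE_TO_TYPE.get(issue.error_code) || 'io_errors';
        counts[error_type] += 1;
    }
    // A single storage or auth failure changes the mode immediately.
    if (counts.storage_not_exist > 0) return 'STORAGE_NOT_EXIST';
    if (counts.auth_failed > 0) return 'AUTH_FAILED';
    // Assumption: IO errors change the mode only once they exceed the threshold.
    if (counts.io_errors > max_io_errors) return 'IO_ERRORS';
    return 'OPTIMAL';
}

const enoent_issue = { error_code: 'ENOENT' };
console.log(calc_namespace_resource_mode_sketch([enoent_issue]));
console.log(calc_namespace_resource_mode_sketch(Array(4).fill(enoent_issue), 3));
```

In this sketch a single ENOENT leaves the mode unchanged, while four of them against a threshold of 3 produce IO_ERRORS.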

Issues: Fixed BZ 2271067

  1. Currently, when someone sends a head-object request for an object that doesn't exist in NSFS, the error is ENOENT, which is mapped to storage_not_exist, so the namespace mode becomes STORAGE_NOT_EXIST (instead of IO_ERRORS).
  2. GAP (more info here) - with the suggested change we would not be able to know whether a NoSuchBucket or AccessDenied error code originated from the namespace monitor or from requests that were updated in the issues report. We would only have the log line in the endpoint logs to indicate that it came from the namespace monitor, for example:

[Endpoint/98] [ERROR] core.server.bg_services.namespace_monitor:: test_nsfs_resource: got error: [Error: No such file or directory] { code: 'ENOENT' } { fs_root_path: '/nsfs/fs1' }
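
The namespace_monitor change can be sketched roughly as follows: log the original error together with fs_root_path, then report a code that we choose. This is only an illustration under assumptions. The function name and the specific mapping of file system codes (ENOENT, EACCES, EPERM) to NoSuchBucket and AccessDenied are hypothetical and are not taken from the actual namespace_monitor code.

```javascript
'use strict';

// Hypothetical mapping; which fs error codes map to which chosen code is an assumption.
const FS_CODE_TO_REPORTED_CODE = new Map([
    ['ENOENT', 'NoSuchBucket'],
    ['EACCES', 'AccessDenied'],
    ['EPERM', 'AccessDenied'],
]);

// Logs the original error and returns a new error carrying the chosen code.
// Codes without a mapping are passed through unchanged.
function to_reported_error_sketch(err, fs_root_path, log = console.error) {
    log('namespace_monitor sketch: test_nsfs_resource: got error:', err, { fs_root_path });
    const reported = new Error(err.message);
    reported.code = FS_CODE_TO_REPORTED_CODE.get(err.code) || err.code;
    return reported;
}

const fs_err = Object.assign(new Error('No such file or directory'), { code: 'ENOENT' });
const reported_err = to_reported_error_sketch(fs_err, '/nsfs/fs1');
console.log(reported_err.code);
```

Because only the chosen code reaches the issues report, the log line is the one place where the original ENOENT remains visible, which is the gap described above.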

Testing Instructions:

Manual Tests:

General:

  1. Deploy noobaa on a local cluster (Minikube or Rancher Desktop) (see guide).
  2. Deploy NSFS on a local cluster (Based on the instructions here).

Mapping Change:
Use the AWS CLI to run the head-object command on a noobaa bucket in NSFS with a non-existing key.

  1. Before: the namespace phase was changed from Ready to Rejected with mode STORAGE_NOT_EXIST after 1 failure.
  2. After: the namespace phase was changed from Ready to Rejected with mode IO_ERRORS after a couple of failures.
    You can use a loop for this, for example: for i in {1..10}; do s3-nb-user-1 s3api head-object --bucket fs1-jenia-bucket --key non_exist.txt; done (s3-nb-user-1 is an alias: alias s3-nb-user-1='AWS_ACCESS_KEY_ID=<access_key_nsfs_account> AWS_SECRET_ACCESS_KEY=<secret_access_key_nsfs_account> aws --no-verify-ssl --endpoint-url https://localhost:12443').

Updating the env:
If you wish to update NS_MAX_ALLOWED_IO_ERRORS, you need to:

  1. kubectl exec statefulset/noobaa-core -c core -- printenv | grep IO_ERRORS (should be empty).
  2. kubectl set env statefulset/noobaa-core CONFIG_JS_NS_MAX_ALLOWED_IO_ERRORS=3 (use a number different from the one defined in the config; it should output statefulset.apps/noobaa-core env updated, and the noobaa-core-0 pod will be terminated and start running again). Please notice that to override a config value through an env, the prefix CONFIG_JS_ must be added to the variable name.
  3. kubectl exec statefulset/noobaa-core -c core -- printenv | grep IO_ERRORS (should be CONFIG_JS_NS_MAX_ALLOWED_IO_ERRORS=3).
  4. You can run the loop mentioned above with for i in {1..7}; and see that the namespace phase was changed from Ready to Rejected with mode IO_ERRORS.
  • Doc added/updated
  • Tests added
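
For the "Updating the env" steps above, here is a minimal sketch of how a CONFIG_JS_-prefixed env variable could override a config value. It is an assumption-laden illustration, not the actual config.js logic: the function name, the type handling, and the placeholder default value are hypothetical.

```javascript
'use strict';

const ENV_PREFIX = 'CONFIG_JS_';

// Applies every CONFIG_JS_<NAME> env variable to an existing config key <NAME>.
// Env variables without the prefix, or naming unknown keys, are ignored.
function apply_env_overrides_sketch(config, env = process.env) {
    for (const [env_name, raw_value] of Object.entries(env)) {
        if (!env_name.startsWith(ENV_PREFIX)) continue;
        const key = env_name.slice(ENV_PREFIX.length);
        if (!Object.prototype.hasOwnProperty.call(config, key)) continue;
        if (typeof config[key] === 'number') {
            // Keep numeric config values numeric; skip values that do not parse.
            const parsed = Number(raw_value);
            if (Number.isFinite(parsed)) config[key] = parsed;
        } else {
            config[key] = raw_value;
        }
    }
    return config;
}

// The value 9 is a placeholder default, used only for this example.
const example_config = apply_env_overrides_sketch(
    { NS_MAX_ALLOWED_IO_ERRORS: 9 },
    { CONFIG_JS_NS_MAX_ALLOWED_IO_ERRORS: '3', NS_MAX_ALLOWED_IO_ERRORS: '5' }
);
console.log(example_config.NS_MAX_ALLOWED_IO_ERRORS); // 3
```

The unprefixed NS_MAX_ALLOWED_IO_ERRORS entry is ignored in this sketch, which mirrors the note that the CONFIG_JS_ prefix is required for an override to take effect.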

@shirady shirady requested review from liranmauda and romayalon March 31, 2024 13:11
@shirady shirady self-assigned this Mar 31, 2024
@pull-request-size pull-request-size bot added size/M and removed size/S labels Apr 1, 2024
@shirady shirady requested a review from romayalon April 1, 2024 08:45
@shirady shirady force-pushed the nsfs-bz-namespacestore branch from 587c307 to 63ef435 Compare April 1, 2024 11:46
@pull-request-size pull-request-size bot added size/S and removed size/M labels Apr 1, 2024
@shirady shirady force-pushed the nsfs-bz-namespacestore branch from 63ef435 to 86d7c6e Compare April 1, 2024 11:48
@shirady shirady requested review from romayalon and liranmauda April 1, 2024 12:18
@shirady shirady force-pushed the nsfs-bz-namespacestore branch 3 times, most recently from a3d1184 to 81fd35e Compare April 2, 2024 09:28
@shirady shirady force-pushed the nsfs-bz-namespacestore branch from 81fd35e to 051a489 Compare April 2, 2024 10:42
@shirady shirady requested a review from romayalon April 2, 2024 11:00
@shirady shirady force-pushed the nsfs-bz-namespacestore branch 2 times, most recently from 95d337e to 209c375 Compare April 3, 2024 06:37
1. In pool_server inside calc_namespace_resource_mode remove the mapping of ENOENT to storage_not_exist.
2. In the namespace_monitor change the error code either from a thrown error or an error that we created to a code that we choose and add log printing (could be seen in the endpoint logs).
3. Add config NS_MAX_ALLOWED_IO_ERRORS that we will use in the calc_namespace_resource_mode.

Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
@shirady shirady force-pushed the nsfs-bz-namespacestore branch from 209c375 to 07b195c Compare April 3, 2024 06:56
@shirady shirady merged commit dc7d623 into noobaa:master Apr 3, 2024
10 checks passed
@shirady shirady deleted the nsfs-bz-namespacestore branch May 20, 2024 12:55
3 participants