Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write to UFS when failed to write to Alluxio Worker #18017

Merged
merged 2 commits into from
Sep 15, 2023

Conversation

JiamingMai
Copy link
Contributor

  1. Use addSuppressed instead of creating a new exception to throw the original exception.
  2. Go ahead to write data to UFS when it encountered exception during the time writing data to Alluxio Worker.

@JiamingMai JiamingMai added the type-code-quality code quality improvement label Aug 18, 2023
@JiamingMai JiamingMai self-assigned this Aug 18, 2023
if (throwable != null) {
e.addSuppressed(throwable);
}
LOG.error("Failed to write data to alluxio worker. ", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comments here:

    //writing to worker over netty failed. But Alluxio client continues writing to UFS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we abort trying to write to worker for the following requests? And delete already written pages from worker cache.

The reasons are:

  1. The subsequent write request might fail, too.
  2. The worker will receive some data but there is already a hole (from the previous failed request). I am not sure if the worker/cache manager will handle this. I am afraid not. The cache manager on worker expects contiguous write. The write request does not have a POSITION for current data.
  3. suppose for a 1MB page, the alluxio client has written 4K data at [0, 4096). And then failure for next [4096, 8192) write. And then a successful write of 4K at [8192, 12288). So, the page file on worker will have a hole, not filled with actually data. Subsequent reading from paging store will get all ZEROES for such hole. This is not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I submitted an issue #18157 to make this be a future work.

@JiamingMai
Copy link
Contributor Author

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit 84dc05f into Alluxio:main Sep 15, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-code-quality code quality improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants