Free task descriptor memory for finished tasks #12478

Merged
merged 1 commit into from
May 19, 2022

losipiuk

@losipiuk losipiuk commented May 19, 2022

Description

In execution mode with task retries, we need to keep task descriptors, which
(among other things) wrap the set of input splits, until the matching tasks finish.
This is needed so that we can restart a task in case of a failure.
Previously, we kept the descriptor in memory even after the task successfully
finished, which was wasteful. It could result in crossing the descriptor
storage size limit, which caused query failures with the
EXCEEDED_TASK_DESCRIPTOR_STORAGE_CAPACITY error code.

With this commit, we drop a task descriptor from storage as soon as we
observe the matching task complete successfully.
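
To illustrate the idea, here is a minimal sketch of a size-bounded descriptor store that frees an entry once its task is reported as finished. It is not the code from this PR: the class `TaskDescriptorStorageSketch`, its nested types, and the method names (`put`, `get`, `taskFinished`) are hypothetical and chosen for illustration only, and the sketch assumes descriptors are keyed by a partition id and report their own retained size.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of a bounded task descriptor store.
// All names here are assumptions for illustration, not the actual Trino API.
final class TaskDescriptorStorageSketch {
    // Stand-in for a task descriptor; a real one wraps input splits, among other things.
    public static final class TaskDescriptor {
        final int partitionId;
        final long retainedSizeInBytes;

        public TaskDescriptor(int partitionId, long retainedSizeInBytes) {
            this.partitionId = partitionId;
            this.retainedSizeInBytes = retainedSizeInBytes;
        }
    }

    // Models the failure that surfaces as EXCEEDED_TASK_DESCRIPTOR_STORAGE_CAPACITY.
    public static final class StorageCapacityExceededException extends RuntimeException {
        public StorageCapacityExceededException(String message) {
            super(message);
        }
    }

    private final long maxBytes;
    private long reservedBytes;
    private final Map<Integer, TaskDescriptor> descriptors = new HashMap<>();

    public TaskDescriptorStorageSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Store a descriptor so the task can be restarted if it fails.
    public synchronized void put(TaskDescriptor descriptor) {
        if (descriptors.containsKey(descriptor.partitionId)) {
            throw new IllegalStateException("descriptor already stored for partition " + descriptor.partitionId);
        }
        if (reservedBytes + descriptor.retainedSizeInBytes > maxBytes) {
            throw new StorageCapacityExceededException("task descriptor storage capacity exceeded");
        }
        descriptors.put(descriptor.partitionId, descriptor);
        reservedBytes += descriptor.retainedSizeInBytes;
    }

    // Look up a descriptor, for example when retrying a failed task.
    public synchronized Optional<TaskDescriptor> get(int partitionId) {
        return Optional.ofNullable(descriptors.get(partitionId));
    }

    // Drop the descriptor as soon as the matching task is observed to finish
    // successfully; it is no longer needed for a retry.
    public synchronized void taskFinished(int partitionId) {
        TaskDescriptor removed = descriptors.remove(partitionId);
        if (removed != null) {
            reservedBytes -= removed.retainedSizeInBytes;
        }
    }

    public synchronized long getReservedBytes() {
        return reservedBytes;
    }
}
```

In this sketch, calling `taskFinished` right after a task succeeds returns its bytes to the budget, so a later `put` that would otherwise have exceeded the limit can succeed.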

Is this change a fix, improvement, new feature, refactoring, or other?

improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

core engine

How would you describe this change to a non-technical end user or system administrator?

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# General
* Improve coordinator memory management for fault-tolerant execution (`retry-mode=TASK`),
  decreasing the chance that queries fail with the `EXCEEDED_TASK_DESCRIPTOR_STORAGE_CAPACITY`
  error code. ({issue}`12478`)

@losipiuk

CI: #12413

@losipiuk losipiuk merged commit 223a11b into trinodb:master May 19, 2022
@github-actions github-actions bot added this to the 382 milestone May 19, 2022