Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rdb loader tasks continue running on failure #567

Closed
dranikpg opened this issue Dec 16, 2022 · 1 comment
Closed

Rdb loader tasks continue running on failure #567

dranikpg opened this issue Dec 16, 2022 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@dranikpg
Copy link
Contributor

Currently, the rdb loader collects items to be inserted and schedules them for insertion inside the shard with FlushShardAsync. Because those tasks are added to the shard_set and not executed immediately, it might happen, that RdbLoader::Load exits early with an error and those tasks will continue running. They can access the loaders fields and will crash if the loader was already deleted after reporting an error. We should make sure on every return that those tasks finished executing, like it is done on the very bottom of the function.

I would also suggest replacing the stop_early_ flag with Cancellation from commons 🙂

@dranikpg dranikpg added the bug Something isn't working label Dec 16, 2022
@romange
Copy link
Collaborator

romange commented Dec 16, 2022

I think a simpler thing would be to wrap the final part of the function, i.e.

  fibers_ext::BlockingCounter bc(shard_set->size());
  for (unsigned i = 0; i < shard_set->size(); ++i) {
    // Flush the remaining items.
    FlushShardAsync(i);

    // Send sentinel callbacks to ensure that all previous messages have been processed.
    shard_set->Add(i, [bc]() mutable { bc.Dec(); });
  }
  bc.Wait();  // wait for sentinels to report.

  absl::Duration dur = absl::Now() - start;
  load_time_ = double(absl::ToInt64Milliseconds(dur)) / 1000;
  keys_loaded_ = keys_loaded;

with absl::Cleanup

adiholden added a commit that referenced this issue Dec 18, 2022
Signed-off-by: adi_holden <adi@dragonflydb.io>
romange pushed a commit that referenced this issue Dec 18, 2022
Signed-off-by: adi_holden <adi@dragonflydb.io>

Signed-off-by: adi_holden <adi@dragonflydb.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants