Support huge values in RdbLoader #3760
In more detail: `LoadKeyValPair` loads all the blobs into memory (`OpaqueObj`) but does not create Valkey objects (i.e. hashes, strings etc.). Instead, it delegates the object creation to the shard thread via `FlushShardAsync`. It works well, but with huge objects (sets, for example) we need to load all their blobs into `OpaqueObj`, which takes up lots of memory, and then the process later stalls inside `LoadItemsBuffer` just because its CPU is busy creating millions of set fields from the `OpaqueObj`. The suggestion here is to break the creation flow into parts. Nothing in the high-level flow should change, I think. Note that the logic today treats "duplicate" keys during load as a logical error, which of course should change if an entry is created in parts. To summarize: we have code that loads series of blobs into memory (`ReadObj`, `ReadSet`, `ReadHMap` etc.) and we have code that creates Dragonfly data structures from them. The first milestone for this issue would be to convert …
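For illustration, here is a minimal C++ sketch of what "breaking the creation flow into parts" could mean on the shard side. The names (`PartialEntry`, `ApplyPartial`, `ShardStore`) are hypothetical, not Dragonfly's actual API; the point is only that a key seen again during load is treated as a continuation to append to rather than a duplicate-key error:

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical chunk of a single key's payload, produced by the loader in parts.
struct PartialEntry {
  std::string key;
  std::vector<std::string> members;  // e.g. decoded set members for this part
  bool is_first;                     // true only for the first part of the key
};

// Toy stand-in for a shard's keyspace: key -> set of members.
using ShardStore = std::unordered_map<std::string, std::unordered_set<std::string>>;

// Shard-thread handler: creates the set on the first part and appends on later
// parts. Today a repeated key during load is treated as a logical error; with
// partial creation a repeated key must mean "continue filling the same object".
void ApplyPartial(ShardStore& store, const PartialEntry& part) {
  auto it = store.find(part.key);
  if (it == store.end()) {
    it = store.emplace(part.key, std::unordered_set<std::string>{}).first;
  } else if (part.is_first) {
    // A key that already exists when its *first* part arrives is still a real duplicate.
    throw std::runtime_error("duplicate key during load: " + part.key);
  }
  it->second.insert(part.members.begin(), part.members.end());
}
```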
fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](#3760) changed this area of the code, which caused this regression as it only handled RDB / replication streams.

Fixes #4143
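Purely as a hedged illustration of what "only handled RDB / replication streams" implies (hypothetical names, not the actual code or fix): the decision of whether a value may be deserialized in parts has to account for every caller of the utility, not just the streaming ones.

```cpp
// Hypothetical classification of the callers that deserialize values.
enum class LoadSource {
  kRdbFile,            // snapshot load
  kReplicationStream,  // full/stable sync from a master
  kRestoreCommand,     // RESTORE payload
  kSlotMigration,      // cluster slot migration
  kRename,             // RENAME across shards
};

// Assumed decision point: streamed sources can deliver a huge entry in parts,
// while one-shot payloads must still be materialized as a whole. A change that
// forgets the non-streaming callers breaks exactly the RESTORE / migration /
// RENAME paths described above.
bool AllowPartialLoad(LoadSource src) {
  switch (src) {
    case LoadSource::kRdbFile:
    case LoadSource::kReplicationStream:
      return true;
    default:
      return false;
  }
}
```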
Currently `RdbLoader::LoadKeyValPair` loads all the data from a single entry, even if it's huge (set, map, zset, list etc.). This creates the following problems:

* All of the entry's blobs must be held in memory at once.
* Creating the huge object in a single hop stalls `FlushShardAsync` and will block replication. This might trigger `DflyCmd::BreakStalledFlowsInShard` on the master side, effectively cancelling a possibly correct full sync. If the replica were more responsive, the whole chain would disappear.
Suggestion:
Change the loading code to streaming in order to support huge sets/lists etc. Allow adding entries in multiple parts. While the total CPU time for loading a huge entry will stay the same, the replica won't stall for dozens of seconds if the `LoadTrace` size is capped.
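A rough sketch of the capping idea, again with made-up names rather than the real `LoadTrace` / `FlushShardAsync` code: the reader accumulates decoded blobs only up to a fixed limit and hands each batch to the shard thread before reading more, so neither the loader's memory nor the length of a single shard hop grows with the entry size.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical reader-side loop: decode `total` elements of one huge entry but
// never buffer more than `max_batch` of them before flushing to the shard
// thread (think of `flush` as a stand-in for FlushShardAsync).
void LoadEntryInParts(size_t total, size_t max_batch,
                      const std::function<std::string()>& read_next,
                      const std::function<void(std::vector<std::string> batch, bool first)>& flush) {
  std::vector<std::string> batch;
  batch.reserve(max_batch);
  bool first = true;
  for (size_t i = 0; i < total; ++i) {
    batch.push_back(read_next());
    if (batch.size() == max_batch || i + 1 == total) {
      flush(std::move(batch), first);  // one bounded hop per batch
      batch.clear();                   // moved-from vector: clear() makes it valid and empty
      batch.reserve(max_batch);
      first = false;
    }
  }
}
```

As noted above, the total CPU cost of loading the entry stays the same; what changes is that each hop is bounded by `max_batch`.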