Support huge values in RdbLoader #3760

Closed
romange opened this issue Sep 22, 2024 · 2 comments
Labels: enhancement (New feature or request)

romange (Collaborator) commented Sep 22, 2024

Currently RdbLoader::LoadKeyValPair loads all the data of a single entry, even if it's huge (set, map, zset, list, etc.).
This creates the following problems:

  1. A huge spike in memory during the load process (or on the replica), because we need to hold the load traces before we apply them on a shard.
  2. The shard thread gets stuck for many seconds while processing these traces (inside LoadItemsBuffer()), which in turn blocks subsequent calls to FlushShardAsync and blocks replication.
  3. If full-sync replication stalls on the replica, the stall eventually propagates to the master and blocks its snapshotting flow. At that point we cannot distinguish a deadlocked replication from a merely slow one.

This might trigger DflyCmd::BreakStalledFlowsInShard on the master side, effectively cancelling a possibly correct full sync.
If the replica were more responsive, this whole chain of failures would disappear.

Suggestion:
Change the loading code to a streaming model to support huge sets/lists etc., allowing a single entry to be added in multiple parts. The total CPU time for loading a huge entry stays the same, but if the LoadTrace size is capped, the replica no longer stalls for dozens of seconds.
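
A minimal sketch of the capping idea. All names here (`Blob`, `EntryReader`, `FlushPartialToShard`, `kMaxBlobsPerTrace`) are hypothetical stand-ins for illustration, not Dragonfly's real types or API; the real batching goes through FlushShardAsync:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not Dragonfly's real API.
using Blob = std::string;

struct EntryReader {
  size_t remaining = 0;  // elements left in the current entry
  Blob ReadNextBlob() { --remaining; return Blob(); }  // stub reader
};

// Stand-in for handing a bounded batch to the shard thread
// (the role FlushShardAsync plays today).
void FlushPartialToShard(std::vector<Blob> chunk, bool is_last) {
  (void)chunk;
  (void)is_last;
}

constexpr size_t kMaxBlobsPerTrace = 1024;  // assumed cap on a LoadTrace

// Streams a huge entry in bounded chunks instead of materializing it whole,
// so neither memory usage nor shard-thread latency grows with entry size.
void LoadHugeEntry(EntryReader& reader) {
  std::vector<Blob> trace;
  while (reader.remaining > 0) {
    trace.push_back(reader.ReadNextBlob());
    if (trace.size() == kMaxBlobsPerTrace || reader.remaining == 0) {
      FlushPartialToShard(std::move(trace), /*is_last=*/reader.remaining == 0);
      trace.clear();  // moved-from vector: reset to a known-empty state
    }
  }
}
```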

romange added the enhancement label on Sep 22, 2024
romange (Collaborator, Author) commented Sep 22, 2024

Related to this PR: #3759, and created based on the learnings from issue #3726.

romange (Collaborator, Author) commented Sep 23, 2024

In more detail: LoadKeyValPair loads all the blobs of an entry into memory (OpaqueObj) but does not create Valkey objects (i.e. hashes, strings, etc.). Instead, it delegates object creation to the shard thread via the FlushShardAsync function, which takes batches of Items and creates an object from each of them inside LoadItemsBuffer.

This works well, but with huge objects (sets, for example) we need to load all their blobs into OpaqueObj, which takes up lots of memory; later, the process stalls inside LoadItemsBuffer simply because its CPU is busy creating millions of set fields from the OpaqueObj. The sketch below shows this current, non-streaming shape.
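
For orientation, a heavily simplified sketch of that current flow; `RdbSource`, `ReadLen`, `ReadString`, and `LoadKeyValPairSketch` are illustrative stand-ins, not the real rdb_load.cc interfaces:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-ins; the real reader lives in rdb_load.cc.
struct RdbSource {
  size_t ReadLen() { return 0; }           // number of elements (stub)
  std::string ReadString() { return {}; }  // next blob (stub)
};

struct Item {
  std::string key;
  std::vector<std::string> blobs;  // stand-in for the OpaqueObj contents
};

// Rough shape of today's LoadKeyValPair: the entire entry is materialized
// in memory before FlushShardAsync hands it to LoadItemsBuffer, which is
// exactly what hurts for a set with millions of members.
Item LoadKeyValPairSketch(RdbSource& src) {
  Item item;
  item.key = src.ReadString();
  size_t n = src.ReadLen();  // can be millions for a huge set
  item.blobs.reserve(n);
  for (size_t i = 0; i < n; ++i)
    item.blobs.push_back(src.ReadString());  // memory grows with the entry
  return item;
}
```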

The suggestion here is to break the creation flow into parts. Nothing in the high-level flow should change, I think, but RdbLoader::Item should be able to describe that it carries not the whole object but only a part of it, and then LoadItemsBuffer will be able to "append" the additional parts to the already existing object.

Note that the current logic treats "duplicate" keys during load as a logical error; this should of course change when an Item is the continuation of an object that was broken down into parts, as in the sketch below.
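
A sketch of how the duplicate-key rule could be relaxed. The `is_partial` flag, `Entry`, `ShardTable`, and `ApplyItem` are hypothetical names for illustration; the real Item and shard table look different:

```cpp
#include <cstdio>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical shapes for illustration; the real Item and table differ.
struct Item {
  std::string key;
  std::vector<std::string> blobs;  // one part of the object's elements
  bool is_partial = false;         // true if more parts of this key follow
};

struct Entry {
  std::vector<std::string> obj;  // stand-in for the materialized object
  bool awaiting_more_parts = false;
};

using ShardTable = std::unordered_map<std::string, Entry>;

// Simplified core of what LoadItemsBuffer would do per item: a duplicate
// key is an error only when the existing object is already complete.
void ApplyItem(ShardTable& table, Item item) {
  auto [it, inserted] = table.try_emplace(item.key);
  if (!inserted && !it->second.awaiting_more_parts) {
    std::fprintf(stderr, "duplicate key during load: %s\n", item.key.c_str());
    return;
  }
  auto& obj = it->second.obj;  // create on first part, append on continuations
  obj.insert(obj.end(), std::make_move_iterator(item.blobs.begin()),
             std::make_move_iterator(item.blobs.end()));
  it->second.awaiting_more_parts = item.is_partial;
}
```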

To summarize: we have code that loads series of blobs into memory (ReadObj, ReadSet, ReadHMap, etc.) and code that creates Dragonfly data structures (RdbLoaderBase::OpaqueObjLoader).

The first milestone for this issue would be to convert ReadSet, for example, into a streaming variant that emits one or more Items, and to make sure that OpaqueObjLoader can handle this correctly.
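
One possible shape for that milestone, as a sketch only: the real ReadSet signature differs, and `kChunkSize`, `RdbSource`, and the `emit` callback are assumptions:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for the RDB reader; not the real interface.
struct RdbSource {
  size_t ReadLen() { return 0; }           // set cardinality (stub)
  std::string ReadString() { return {}; }  // next member blob (stub)
};

constexpr size_t kChunkSize = 512;  // assumed cap on elements per emitted Item

// A streaming ReadSet variant: instead of returning one OpaqueObj with all
// members, it emits bounded chunks; the caller wraps each chunk in an Item
// whose partial/final flag mirrors `is_last`.
void ReadSetStreaming(
    RdbSource& src,
    const std::function<void(std::vector<std::string> chunk, bool is_last)>& emit) {
  size_t remaining = src.ReadLen();
  do {
    size_t n = std::min(remaining, kChunkSize);
    std::vector<std::string> chunk;
    chunk.reserve(n);
    for (size_t i = 0; i < n; ++i)
      chunk.push_back(src.ReadString());
    remaining -= n;
    emit(std::move(chunk), /*is_last=*/remaining == 0);
  } while (remaining > 0);
}
```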

chakaz added a commit that referenced this issue Nov 19, 2024

fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](#3760) changed this area of the code, which caused this regression as it only handled RDB / replication streams.

Fixes #4143
chakaz added a commit that referenced this issue Nov 20, 2024
fix: Huge entries fail to load outside RDB / replication (same message as above)
adiholden pushed a commit that referenced this issue Nov 20, 2024
fix: Huge entries fail to load outside RDB / replication (same message as above)