(NBS) *Prevent* noms sync from actively *hurting* chunk locality #2968
hmmm...I think this works, yes. There's a nagging concern about this in-memory list of hashes blowing up RAM, but it's probably OK to TODO that for now, yes?
Yup. I think a TODO is fine. Also, it might be literally just as much work to add a feature to nbs to just iterate all chunks in "store" (insertion) order.
Now that I think about this more, it might actually have the happy side effect of improving the perf of writing over http (even if the server is backed by ldb), because the tmp ldb store is currently doing a ton of pointless compactions as we write.
NBS benefits from related chunks being near one another. Initially, let's use write-order as a proxy for "related". This patch contains a pretty heinous hack to allow sync to continue putting chunks into httpBatchStore top-down without breaking server-side validation. Work to fix this is tracked in attic-labs#2982. This patch fixes attic-labs#2968, at least for now.
…2983) NBS benefits from related chunks being near one another. Initially, let's use write-order as a proxy for "related". This patch contains a pretty heinous hack to allow sync to continue putting chunks into httpBatchStore top-down without breaking server-side validation. Work to fix this is tracked in #2982. This patch fixes #2968, at least for now.

* Introduces PullWithFlush() to allow noms sync to explicitly pull chunks over and flush directly after. This allows UpdateRoot to behave as before. Also clears out all the legacy batch-put machinery. Now, Flush() just directly calls sendWriteRequests().
Re-opened and modified the bug title to reflect what's left of this. I think sync still effectively randomizes chunks within each height level.
Here's half an idea for this. Sync progressively writes chunks of lower and lower heights. That means that as soon as a chunk of a given height X is seen, we can learn nothing more about the locality of chunks of greater height. Thus, it seems reasonable to:

- Keep a buffer of incoming chunks

This doesn't explain how we organize the buffered chunks yet, but it's a start (rough sketch below).
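To make that buffering idea concrete, here's a minimal Go sketch. All names and types here are hypothetical stand-ins, not noms APIs: it keeps incoming chunks bucketed by height and flushes a height level as soon as a lower-height chunk proves that level is complete.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for a noms chunk plus its ref height.
type Chunk struct {
	Hash   string
	Height int
}

// heightBuffer holds incoming chunks per height. Because sync writes
// chunks of strictly descending height, seeing a chunk at height h
// means every buffered level above h is complete and can be flushed.
type heightBuffer struct {
	pending map[int][]Chunk
	flush   func(height int, cs []Chunk)
}

func (b *heightBuffer) Add(c Chunk) {
	// Flush every buffered level strictly above c.Height.
	for h, cs := range b.pending {
		if h > c.Height {
			b.flush(h, cs)
			delete(b.pending, h)
		}
	}
	b.pending[c.Height] = append(b.pending[c.Height], c)
}

// Close flushes whatever remains once sync finishes.
func (b *heightBuffer) Close() {
	for h, cs := range b.pending {
		b.flush(h, cs)
	}
}

func main() {
	b := &heightBuffer{
		pending: map[int][]Chunk{},
		flush:   func(h int, cs []Chunk) { fmt.Println("flush height", h, cs) },
	}
	b.Add(Chunk{"root", 3})
	b.Add(Chunk{"a", 2})
	b.Add(Chunk{"b", 2})
	b.Add(Chunk{"leaf", 1}) // triggers flush of the completed height-2 level
	b.Close()
}
```

This still leaves open how to order chunks *within* a buffered level, which is exactly the unsolved part noted above.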
The new version of this tool now estimates the locality of a DB written using the "grandchild" strategy implemented by types.ValueStore. It does so by dividing each level of the graph up into groups that are roughly the size of the branching factor of that level, and then calculating how many physical reads are needed to read each group. In the case of perfect locality, each group could be read in a single physical read, so that's what the tool uses as its estimate of the optimal case. Toward attic-labs#2968
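As a rough illustration of that grouping logic, here's a Go sketch of such an estimator. The names and the read-coalescing rule are assumptions on my part, not the tool's actual code:

```go
package main

import (
	"fmt"
	"sort"
)

// estimateReads is a hypothetical sketch of the estimator described
// above. level holds one graph level's chunk hashes in read order;
// offsets maps hash -> physical byte offset in the table file; offsets
// within maxGap bytes of each other coalesce into one physical read.
func estimateReads(level []string, offsets map[string]uint64, branching int, maxGap uint64) (actual, optimal int) {
	for i := 0; i < len(level); i += branching {
		end := i + branching
		if end > len(level) {
			end = len(level)
		}
		// Gather and sort this group's physical offsets.
		offs := make([]uint64, 0, end-i)
		for _, h := range level[i:end] {
			offs = append(offs, offsets[h])
		}
		sort.Slice(offs, func(a, b int) bool { return offs[a] < offs[b] })
		// Count runs of nearby offsets as single physical reads.
		reads := 1
		for j := 1; j < len(offs); j++ {
			if offs[j]-offs[j-1] > maxGap {
				reads++
			}
		}
		actual += reads
		optimal++ // perfect locality: each group needs exactly one read
	}
	return actual, optimal
}

func main() {
	level := []string{"a", "b", "c", "d"}
	offsets := map[string]uint64{"a": 0, "b": 4096, "c": 1 << 20, "d": 8192}
	a, o := estimateReads(level, offsets, 2, 64*1024)
	fmt.Printf("actual=%d optimal=%d\n", a, o) // actual=3 optimal=2
}
```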
This patch implements a new strategy for Pull() that pulls the chunks from a given level of the graph over in the order they'll be encountered by clients reading the graph. Fixes attic-labs#2968
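The level-order traversal behind that Pull() strategy can be sketched like this in Go (function and parameter names are hypothetical, not the actual noms API): walk the graph one level at a time, emitting each level's chunks in the order a reader traversing the graph would encounter them.

```go
package main

import "fmt"

// pullInReadOrder sketches the strategy: for each level, visit parents
// in their own read order and enumerate their children left-to-right,
// so the whole level ships in the order readers will want it.
func pullInReadOrder(root string, children func(string) []string, send func(level []string)) {
	level := []string{root}
	for len(level) > 0 {
		send(level) // ship this whole level, already in read order
		seen := map[string]bool{}
		var next []string
		for _, h := range level {
			for _, c := range children(h) {
				if !seen[c] {
					seen[c] = true
					next = append(next, c)
				}
			}
		}
		level = next
	}
}

func main() {
	kids := map[string][]string{"root": {"a", "b"}, "a": {"x", "y"}, "b": {"y", "z"}}
	pullInReadOrder("root",
		func(h string) []string { return kids[h] },
		func(level []string) { fmt.Println("send:", level) })
	// Output: send: [root], send: [a b], send: [x y z]
}
```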
#2880 focused on offering guarantees about good chunk locality (related chunks being stored near each other in NBS), but it's now clear that two mechanisms in noms are actively working against this goal: (1) writing over http and (2) sync.
While we work on a design for retaining and enforcing locality throughout noms (including over http and sync), I think there's some low-hanging fruit that's compatible with the existing wire protocol: start by preventing writing over http from destroying locality.
If we can do this, then at least the common case of writing to a remote server for the first time will end up with decent locality.
Right now, writing to a remote server involves temporarily storing chunks in an ldb instance and then shipping them all over on commit. The trouble is that the code currently stores them in ldb ordered by ascending ref height and then ascending hash. This has the effect of essentially shuffling all chunks of a given height on their way over to the server.
So here's my idea: just for the case of writing to a remote noms server, simply buffer the chunks in the order they were written (e.g. to a file), and then send them up to the server in exactly that order. The in-process validation will already retain the one required wire invariant: that no chunk shall be written before the server has seen all referenced chunks.
Implementation idea for this: essentially use a local nbs store rather than ldb. As we write, keep an in-memory list of hashes written. On commit, just read them out in that order (which will be the ideal read-order for nbs) as we ship them over to the server.
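A minimal Go sketch of that idea, with hypothetical names (this is not the nbs API): the store remembers each hash the first time it's Put, and commit replays the chunks in exactly that order.

```go
package main

import "fmt"

// writeOrderStore stands in for the local store: it records the hash
// of every chunk in the order it was first written.
type writeOrderStore struct {
	backing map[string][]byte // stand-in for the local nbs store
	order   []string          // hashes in write order
}

func newWriteOrderStore() *writeOrderStore {
	return &writeOrderStore{backing: map[string][]byte{}}
}

func (s *writeOrderStore) Put(hash string, data []byte) {
	if _, ok := s.backing[hash]; !ok {
		s.order = append(s.order, hash) // remember first-write order only
	}
	s.backing[hash] = data
}

// Commit replays the chunks in exactly the order they were written,
// which (per the idea above) should also be good read order for nbs
// on the server side.
func (s *writeOrderStore) Commit(send func(hash string, data []byte)) {
	for _, h := range s.order {
		send(h, s.backing[h])
	}
}

func main() {
	s := newWriteOrderStore()
	s.Put("c1", []byte("leaf"))
	s.Put("c2", []byte("root"))
	s.Commit(func(h string, d []byte) { fmt.Println("send", h, string(d)) })
}
```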
If this works, I think something similar, though slightly more complicated, could work for the case of sync.
cc @cmasone-attic