Modify httpBatchStore so that writing values maintains some locality #2983
Conversation
NBS benefits from related chunks being near one another. Initially, let's use write-order as a proxy for "related". This patch contains a pretty heinous hack to allow sync to continue putting chunks into httpBatchStore top-down without breaking server-side validation. Work to fix this is tracked in attic-labs#2982. This patch fixes attic-labs#2968, at least for now.
PTAL.
Now that I've gone all the way through the patch, I'm not sure that having NomsBlockCache be distinct from NomsBlockStore makes a lot of sense. On one hand, it's a nice place to put the code that creates and deletes the temp directory that the cache data lives in.
I'm attempting a little bit of further cleanup, but I broke something while doing so. I'll continue trying to chase that down, but please go ahead and review.
BTU.
In general, I think this is good. If you want to address some of these issues subsequently, I'm ok with that.
I think there's the opportunity to simplify things here quite a bit.
Unless I'm missing something, I don't think the Insert-returns-whether-it-inserted semantics are needed any more, since we only ever send one write batch.
Also, it's a minor bummer that local sync (i.e. sync) still uses the ordered_put_cache.
I guess my temptation would be to go ahead and just replace the ordered_put_cache implementation with what is here called the NomsBlockCache, which just uses the normal NBS APIs (the extract functions would need to become public).
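To make the suggestion concrete, here's a rough sketch of what such an NBS-backed put cache could look like if the extract machinery were public. All names below (chunk, blockStore, ExtractChunks, etc.) are placeholders, not the actual nbs API; the only real point is the temp-directory ownership plus insertion-order extraction.

```go
package nbscache

import (
	"io/ioutil"
	"os"
)

// chunk and blockStore are placeholders standing in for chunks.Chunk and a
// (hypothetically public) subset of the NBS API; they are not the real types.
type chunk []byte

type blockStore interface {
	Put(c chunk)
	ExtractAll(ch chan<- chunk) // assumed: emits chunks in insertion order
	Close() error
}

// chunkCache owns a temp directory plus an NBS-backed store inside it, which
// is roughly what NomsBlockCache currently adds on top of NomsBlockStore.
type chunkCache struct {
	store blockStore
	dir   string
}

func newChunkCache(open func(dir string) blockStore) (*chunkCache, error) {
	dir, err := ioutil.TempDir("", "noms-block-cache")
	if err != nil {
		return nil, err
	}
	return &chunkCache{store: open(dir), dir: dir}, nil
}

// Insert stores c; callers no longer learn whether c was already present.
func (cc *chunkCache) Insert(c chunk) {
	cc.store.Put(c)
}

// ExtractChunks streams cached chunks out in insertion order, preserving the
// write-order locality this patch is after.
func (cc *chunkCache) ExtractChunks(ch chan<- chunk) {
	cc.store.ExtractAll(ch)
}

// Destroy closes the store and removes the cache's backing files.
func (cc *chunkCache) Destroy() error {
	if err := cc.store.Close(); err != nil {
		return err
	}
	return os.RemoveAll(cc.dir)
}
```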
}
return
}
for i := len(ts.chunkSources) - 1; i >= 0; i-- { |
suggestion: i find reverse iteration easier to visually parse as
count := len(thing)
for count-- > 0 {
// do something at index |count|
}
so would I!
Go, however, apparently doesn't. Increment and decrement are statements in Go, not expressions, so you can't use them inline like that :-/
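For the record, a minimal sketch contrasting the suggested form with what Go actually allows:

```go
package main

import "fmt"

func main() {
	thing := []string{"a", "b", "c"}

	// The suggested form doesn't compile: in Go, `count--` is a statement,
	// not an expression, so it can't appear in a loop condition.
	//
	//   count := len(thing)
	//   for count-- > 0 { ... } // syntax error
	//
	// The closest legal spelling keeps the decrement in the post statement:
	for i := len(thing) - 1; i >= 0; i-- {
		fmt.Println(thing[i]) // do something at index i
	}
}
```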
:-)
d.Chk.True(uint64(n) == chunkLen)

sendChunk := func(i uint32) {
localOffset := tr.offsets[i] - tr.offsets[0] |
this is an interesting point. i think we have code elsewhere which assumes that chunk 0 is at offset zero in the file. this is kind of a nice defensive move. should we look for other places that are calculating physical offsets in the table?
Opened #2986
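For illustration, a tiny sketch of the defensive calculation under discussion: offsets within the data that was actually read are computed relative to the first chunk's offset, rather than assuming chunk 0 lives at physical offset 0. This is a hypothetical helper, not the real tableReader code.

```go
package main

import "fmt"

// localOffsetFor returns where chunk i begins within the byte range that was
// read, given each chunk's absolute offset in the table file. Subtracting
// offsets[0] avoids baking in the assumption that chunk 0 starts at byte 0.
func localOffsetFor(offsets []uint64, i int) uint64 {
	return offsets[i] - offsets[0]
}

func main() {
	// Suppose the table has a header, or the read started mid-file:
	offsets := []uint64{128, 640, 1024}
	fmt.Println(localOffsetFor(offsets, 2)) // 896, not 1024
}
```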
| Prefix Map | Sizes | Suffixes |
+------------+-------+----------+
+------------+---------+----------+
| Prefix Map | Lengths | Suffixes | |
good catch
@@ -79,6 +79,20 @@ func (mt *memTable) getMany(reqs []getRecord, wg *sync.WaitGroup) (remaining bool) {
return
}

func (mt *memTable) extract(order EnumerationOrder, chunks chan<- extractRecord) {
sort.Sort(hasRecordByOrder(mt.order)) // ensure "insertion" order for extraction |
this is a little surprising. how could mt.order not be in insertion order on invocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be in insertion order, but we don't document that as a requirement, so I was programming defensively. I can just require that that be an invariant
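A simplified sketch of what sorting by insertion order amounts to, if the invariant stays explicit rather than assumed. The record type here is a stand-in, not the real nbs hasRecord, and the comparator mirrors the role of hasRecordByOrder.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a stand-in for the nbs hasRecord type; order is the position at
// which the chunk was inserted into the memTable.
type record struct {
	hash  string
	order int
}

// byInsertionOrder restores records to the order they were inserted in,
// making the extraction-order invariant explicit.
type byInsertionOrder []record

func (r byInsertionOrder) Len() int           { return len(r) }
func (r byInsertionOrder) Less(i, j int) bool { return r[i].order < r[j].order }
func (r byInsertionOrder) Swap(i, j int)      { r[i], r[j] = r[j], r[i] }

func main() {
	recs := []record{{"c", 2}, {"a", 0}, {"b", 1}}
	sort.Sort(byInsertionOrder(recs))
	fmt.Println(recs) // [{a 0} {b 1} {c 2}]
}
```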
dbDir string
}

// Insert stores c in the cache. If c is successfully added to the cache, |
I don't actually think these semantics are provided. addChunk returns true if the chunk was added to or already contained in the current memTable, but doesn't tell you anything about whether the chunk was contained in the store.
you're right. Maybe I don't actually need these semantics if I can get rid of all the goroutine machinery
return
}

bhcs.requestWg.Add(1)
bhcs.writeQueue <- writeRequest{c.Hash(), hints, false}
bhcs.writeQueue <- writeRequest{hints, false} |
I'm wondering whether the machinery around the write queue (the channel and the persistent goroutines servicing the write queue and the flush queue) is just unneeded complexity now. this stuff is a holdover from the before time, when we didn't have any validation and thus were writing in larger batches, but we haven't done that in a while.
i think it's possible now to just have SchedulePut put into the nbs (not caring about whether the chunk was contained) and add its hints to the hint set, all inside the mutex, and have Flush just send everything over.
i'm probably forgetting about some critical thing, right?
No, I think this will work, as long as AddHints() uses that mutex also.
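A stripped-down sketch of the shape being proposed. Placeholder types and plain strings stand in for real chunks and hints; the point is only that puts, hint bookkeeping, and flushing all happen under one mutex, with no write queue or persistent goroutines in between.

```go
package main

import (
	"fmt"
	"sync"
)

// batchStore is a toy stand-in for httpBatchStore, not the real thing.
type batchStore struct {
	mu      sync.Mutex
	pending []string            // chunks written since the last flush
	hints   map[string]struct{} // hints accumulated since the last flush
}

func newBatchStore() *batchStore {
	return &batchStore{hints: map[string]struct{}{}}
}

// SchedulePut records the chunk and its hints under the mutex; it doesn't
// care whether the chunk was already present anywhere.
func (bs *batchStore) SchedulePut(chunk string, hints ...string) {
	bs.mu.Lock()
	defer bs.mu.Unlock()
	bs.pending = append(bs.pending, chunk)
	for _, h := range hints {
		bs.hints[h] = struct{}{}
	}
}

// AddHints uses the same mutex, which is what makes the scheme safe.
func (bs *batchStore) AddHints(hints ...string) {
	bs.mu.Lock()
	defer bs.mu.Unlock()
	for _, h := range hints {
		bs.hints[h] = struct{}{}
	}
}

// Flush sends everything accumulated so far in one shot (the equivalent of
// calling sendWriteRequests directly) and resets the buffers.
func (bs *batchStore) Flush() {
	bs.mu.Lock()
	defer bs.mu.Unlock()
	fmt.Printf("sending %d chunks with %d hints\n", len(bs.pending), len(bs.hints))
	bs.pending = bs.pending[:0]
	bs.hints = map[string]struct{}{}
}

func main() {
	bs := newBatchStore()
	bs.SchedulePut("chunk-a", "hint-1")
	bs.SchedulePut("chunk-b")
	bs.AddHints("hint-2")
	bs.Flush() // sending 2 chunks with 2 hints
}
```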
@@ -448,8 +470,7 @@ func (bhcs *httpBatchStore) Root() hash.Hash {
// UpdateRoot flushes outstanding writes to the backing ChunkStore before updating its Root, because it's almost certainly the case that the caller wants to point that root at some recently-Put Chunk.
func (bhcs *httpBatchStore) UpdateRoot(current, last hash.Hash) bool {
// POST http://<host>/root?current=<ref>&last=<ref>. Response will be 200 on success, 409 if current is outdated.
bhcs.Flush()

bhcs.flushInternal(nbs.InsertOrder) |
So is it the case that the sync user always sends over chunks by calling Flush, and the simple write user always calls UpdateRoot()?
If so, I guess I'm ok with this, but it is a bit mysterious. Let's comment it somewhere and maybe put more detail in the bug you opened, noting that the specific problem is that the API of http_batch_store is now kind of wonky.
As an alternative, maybe BatchStore gets a temporary method called "MarkFlushReverseOrder", or something, that only sync calls before it starts writing top-down?
This is the thing I was working on changing after I sent this patch, because I also don't like that this relies on making noms sync the only thing that calls BatchStore.Flush(). This patch made that so, but there's no guarantee at all that it would remain that way.
Something mysteriously broke when I changed things that way, so I'll need to chase that down.
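For what it's worth, a sketch of the MarkFlushReverseOrder alternative floated above. Everything except the method name the reviewer suggested is an assumption about how it might hang together, not existing noms API.

```go
package sketch

// flushOrder mirrors the idea behind choosing between insertion order and
// its reverse when draining the put cache.
type flushOrder int

const (
	insertOrder flushOrder = iota
	reverseOrder
)

// batchStore is a placeholder for httpBatchStore.
type batchStore struct {
	nextFlushOrder flushOrder
}

// MarkFlushReverseOrder would be called only by sync, just before it starts
// writing chunks top-down, so that the next Flush sends them in reverse
// insertion order and server-side validation still sees referenced chunks
// before the chunks that reference them.
func (bs *batchStore) MarkFlushReverseOrder() {
	bs.nextFlushOrder = reverseOrder
}

// Flush honors the mark once, then reverts to the default insertion order.
func (bs *batchStore) Flush() {
	order := bs.nextFlushOrder
	bs.nextFlushOrder = insertOrder
	bs.flushInternal(order)
}

func (bs *batchStore) flushInternal(order flushOrder) {
	// extract chunks from the put cache in the requested order and send them
}
```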
Nice. LGTM
}
return
}
for i := len(ts.chunkSources) - 1; i >= 0; i-- { |
:-)
Force-pushed from c8b9c58 to c076224:
pull chunks over and flush directly after. This allows UpdateRoot to behave as before. Also clears out all the legacy batch-put machinery. Now, Flush() just directly calls sendWriteRequests().
Force-pushed from c076224 to 88712ee
Actually, can you look again? I've squashed so that the second commit includes all the changes since yesterday, which include getting rid of the old batch-put machinery.
Still LG