slogSender using positions in file instead of mmap #8889
Conversation
The approach and architecture look good!
I don't know what your plan is for what you want to do before you take this PR out of draft, so please forgive me if I'm pointing out what you consider to be already obvious.
To get it out of draft I want:
I haven't reviewed or read any comments yet, but while the send is sync, there is an async flush operation which can be used to guarantee previous writes succeeded.
Thanks for pointing that out. I'm convinced I can make the writes async without any API changes. But it does add a bit of work, so please let me know whether you think it's necessary.
I am slightly concerned by the perf impact of the sync fs API for something enabled by default by all validators. I would not be as concerned if the slog sender agent defaulted to a subprocess, which we should consider doing instead. If not, we do have utilities around to build async queues.
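For illustration, an async write queue along these lines can be built with a simple promise chain. This is a hypothetical sketch (`makeSerialJobs` is not the repo's actual utility), just to show the shape: each job waits for the previous one, so writes stay ordered without blocking the caller.

```javascript
// Hypothetical sketch of an async job queue: serialize jobs by chaining
// each one onto a tail promise, so they run strictly in enqueue order.
const makeSerialJobs = () => {
  let tail = Promise.resolve();
  return {
    // Enqueue a job (sync or async function); returns its result promise.
    enqueue(job) {
      const result = tail.then(job);
      // Keep the chain alive even if a job rejects.
      tail = result.catch(() => {});
      return result;
    },
  };
};
```

A flush then falls out naturally: `enqueue(() => {})` resolves only after all previously enqueued writes have settled.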
I did a first pass reviewing commit-by-commit, spread over the day when I had time. As such some earlier comments may no longer be accurate.
The request for changes is for the queueing mechanism and the params of writeJSON.
@@ -31,10 +31,14 @@ const I_ARENA_START = 4 * BigUint64Array.BYTES_PER_ELEMENT;

const RECORD_HEADER_SIZE = BigUint64Array.BYTES_PER_ELEMENT;

/**
 * Initializes a circular buffer with the given size, creating the buffer file if it doesn't exist or is not large enough.
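Initialization of that sort might be sketched as follows. This is an assumption-laden illustration, not the PR's code: `openCircularBuffer` is a hypothetical name, and only `I_ARENA_START` comes from the diff. The idea is to create the file if missing and extend it with zeros when it is smaller than header plus arena.

```javascript
import fs from 'fs';

// From the diff: the arena starts after a 4-slot BigUint64 header.
const I_ARENA_START = 4 * BigUint64Array.BYTES_PER_ELEMENT;

// Hypothetical sketch: open (creating if needed) and grow the buffer file
// so that positional reads inside the arena never hit EOF.
const openCircularBuffer = (path, arenaSize) => {
  const fd = fs.openSync(path, fs.constants.O_RDWR | fs.constants.O_CREAT);
  const { size } = fs.fstatSync(fd);
  const wantedSize = I_ARENA_START + arenaSize;
  if (size < wantedSize) {
    // ftruncate extends with zero bytes when growing.
    fs.ftruncateSync(fd, wantedSize);
  }
  return fd;
};
```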
FYI, I've observed that with the current mmap implementation, resizing does not actually work (it crashes, if I remember correctly; I forgot to write down my findings in #8425).
Ah, more reason for this PR I suppose. Worth writing it there now?
For good measure I'll force integration, as that causes more stress on the flight-recorder.
Force-pushed from a859475 to 3612f98.
Looks great so far.
I'd like to re-do a full pass after a rebase, especially if you can split the commit titled feat: simple CircularBuffer with fs offsets
fs.readSync(file.fd, outbuf, {
  length: firstReadLength,
  position: Number(readStart) + I_ARENA_START,
});
if (firstReadLength < outbuf.byteLength) {
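For context, the two-part read this snippet leads into might look like the following sketch. Variable names mirror the excerpt, but `readWrapped` and its parameter list are hypothetical: if the requested span runs past the end of the arena, a second read picks up the remainder from the arena's start.

```javascript
import fs from 'fs';

// Hypothetical sketch of a circular read: first read up to the arena end,
// then wrap around to the arena start for any remaining bytes.
const readWrapped = (fd, outbuf, arenaSize, arenaStart, readStart) => {
  const firstReadLength = Math.min(outbuf.byteLength, arenaSize - readStart);
  fs.readSync(fd, outbuf, {
    length: firstReadLength,
    position: readStart + arenaStart,
  });
  if (firstReadLength < outbuf.byteLength) {
    // Wrap: read the remainder from the beginning of the arena.
    fs.readSync(fd, outbuf, {
      offset: firstReadLength,
      length: outbuf.byteLength - firstReadLength,
      position: arenaStart,
    });
  }
  return outbuf;
};
```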
What do you think about this suggestion?
Force-pushed from 3612f98 to e7b2d46.
Thanks for refactoring this!
Approving, but I removed the automerge label in case you want to address the read bytes / length checks.
arena.set(record.subarray(0, firstWriteLength), Number(circEnd));
if (firstWriteLength < record.byteLength) {
  // Write to the beginning of the arena.
  arena.set(record.subarray(firstWriteLength, record.byteLength), 0);
}
header.setBigUint64(
  I_CIRC_END,
  (circEnd + BigInt(record.byteLength)) % arenaSize,
);

return writeRecord(record, firstWriteLength, circEnd);
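The wrap-around arithmetic in that hunk can be shown in isolation. This is a simplified sketch, not the PR's code: `writeWrapped` is hypothetical, the header update is reduced to the returned end position, and the arena is a plain `Uint8Array` standing in for the file region.

```javascript
// Hypothetical sketch of a wrap-around write into a circular arena.
const writeWrapped = (arena, circEnd, record) => {
  const arenaSize = arena.length;
  // How much fits before the end of the arena.
  const firstWriteLength = Math.min(record.length, arenaSize - circEnd);
  arena.set(record.subarray(0, firstWriteLength), circEnd);
  if (firstWriteLength < record.length) {
    // Spill the remainder to the beginning of the arena.
    arena.set(record.subarray(firstWriteLength), 0);
  }
  // New end position, modulo the arena size.
  return (circEnd + record.length) % arenaSize;
};
```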
I note that technically the arena header is now updated before the record is written, but that is not consequential.
// Allocate for the data and a header
const record = new Uint8Array(RECORD_HEADER_SIZE + data.byteLength);
// Set the data, after the header
record.set(data, RECORD_HEADER_SIZE);
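Filling in the elided header write, record framing of this shape might look like the sketch below. `frameRecord` is a hypothetical name and the little-endian byte order is an assumption; only `RECORD_HEADER_SIZE` and the two commented lines come from the excerpt.

```javascript
// From the diff: an 8-byte record header.
const RECORD_HEADER_SIZE = BigUint64Array.BYTES_PER_ELEMENT;

// Hypothetical sketch: frame a payload as [length header][data].
const frameRecord = data => {
  // Allocate for the data and a header.
  const record = new Uint8Array(RECORD_HEADER_SIZE + data.byteLength);
  // Store the payload length in the header (byte order is an assumption).
  new DataView(record.buffer).setBigUint64(0, BigInt(data.byteLength), true);
  // Set the data, after the header.
  record.set(data, RECORD_HEADER_SIZE);
  return record;
};
```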
I never realized we were copying the data so many times, but that falls under the future-improvement category.
const bytesRead = fs.readSync(file.fd, outbuf, {
  length: firstReadLength,
  position: Number(readStart) + I_ARENA_START,
});

if (bytesRead < outbuf.byteLength) {
OK, so I went back to read the caller's logic. Basically, we can use either bytesRead or firstReadLength here, because we should technically be checking that bytesRead === firstReadLength in the first place.
Since we're pre-allocating the file, the data should be available, but there are technically signal cases where the read may still return less data than was requested.
None of the reads in this implementation seem to check the actual number of bytes read.
Thanks for pointing out that bytesRead could come up short. This never had a way to handle that, so I'll just assert that it didn't happen.
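Such an assertion might look like the following sketch (`readExactSync` is a hypothetical helper, not the PR's code). It relies on `fs.readSync` returning the number of bytes actually read, and fails loudly on a short read.

```javascript
import fs from 'fs';

// Hypothetical sketch: positional read that asserts the full length arrived.
// A short read is possible in rare cases, so surface it rather than
// silently consuming partial data.
const readExactSync = (fd, outbuf, length, position) => {
  const bytesRead = fs.readSync(fd, outbuf, { length, position });
  if (bytesRead !== length) {
    throw Error(`short read: got ${bytesRead}, wanted ${length}`);
  }
  return bytesRead;
};
```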
// TS saying options bag not available
0,
firstWriteLength,
Number(circEnd) + header.byteLength,
Nit: to be consistent, this should also be I_ARENA_START (or revert the others and use header.byteLength everywhere).
Force-pushed from e7b2d46 to 515ef12.
refs: #8636
Description
On the path to Node 20 compatibility, we need to drop the bufferfromfile dependency. (It doesn't work in Node 20, and fixing it is prohibitive: agoric-labs/BufferFromFile#2.) We decided to give up on mmap and just use file offsets.
This refactors flight-recorder.js to abstract the file IO and implements an alternate IO using sync FS methods. Then it changes the code using the mmap-ed IO to use the fs IO.
Finally, it removes the mmap-ed version. The factoring could be simplified more after that, but IMO it helps to have the separate concerns in separate functions.
Security Considerations
n/a
Scaling Considerations
Writes are async. Clients can call forceFlush when it matters. The tests do this to ensure they're reading after the writes.
Documentation Considerations
Nothing relevant.
Testing Considerations
I think the existing test coverage is enough.
Upgrade Considerations
n/a