Add checksum and other metadata to document prefix #72
Comments
Making the sequence number writer choosable would allow for consistent reads over multiple partitions even without an index, and hence allow rewriting indexes (see #24).
Possible document header structure:
Optimal would be a block of 16 bytes, or a multiple (see #71).
Regarding timestamp resolution:
So if TrueTime-like external consistency is to be used, more than 16 bytes are required for the document header. Otherwise, there is only room for either an external global sequence number or a commit position. In the latter case, this could only be the file offset.
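To make the 16-byte budget concrete, here is a minimal sketch of one hypothetical layout. The field names, their order, and their widths are assumptions for illustration only and are not the layout the project settled on. It shows how a checksum, one 32-bit counter, and a 64-bit timestamp already fill the block, so a second counter (or a second timestamp) would not fit.

```javascript
// Hypothetical 16-byte document header; field order and widths are assumptions.
const HEADER_SIZE = 16;

function writeHeader(buffer, { checksum, sequenceNumber, time }, offset = 0) {
  buffer.writeUInt32BE(checksum, offset);           // bytes 0-3: CRC32 of the document
  buffer.writeUInt32BE(sequenceNumber, offset + 4); // bytes 4-7: external sequence number
  buffer.writeDoubleBE(time, offset + 8);           // bytes 8-15: time relative to the partition epoch
  return offset + HEADER_SIZE;
}

function readHeader(buffer, offset = 0) {
  return {
    checksum: buffer.readUInt32BE(offset),
    sequenceNumber: buffer.readUInt32BE(offset + 4),
    time: buffer.readDoubleBE(offset + 8),
  };
}

const header = Buffer.alloc(HEADER_SIZE);
writeHeader(header, { checksum: 0xcbf43926, sequenceNumber: 42, time: 1234567.25 });
console.log(readHeader(header));
```

Swapping the sequence number for a commit position would keep the size at 16 bytes; carrying both would push the header to the next multiple.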
Some benchmarking and testing of how best to write/read a 64-bit timestamp to/from a buffer.
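One candidate worth including in such a comparison is Node's built-in 64-bit buffer accessors. The sketch below assumes the timestamp is a non-negative BigInt (the sample value is made up) and checks that the built-in accessors produce the same bytes as splitting the value into two 32-bit halves. It makes no claim about which variant is faster.

```javascript
// Assumption: the timestamp is a non-negative BigInt below 2^64 (e.g. microseconds).
const sampleTime = 1572724800000000n;
const half = 0x100000000n;

// Variant A: split into two 32-bit halves, big-endian.
const split = Buffer.alloc(8);
split.writeUInt32BE(Number(sampleTime / half), 0);
split.writeUInt32BE(Number(sampleTime % half), 4);

// Variant B: built-in 64-bit accessors (available in Node.js 12 and later).
const native = Buffer.alloc(8);
native.writeBigUInt64BE(sampleTime, 0);

console.log(split.equals(native));                     // true
console.log(native.readBigUInt64BE(0) === sampleTime); // true
```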
// `time` is assumed to be a BigInt timestamp, `buffer` a Buffer of at least 8 bytes.
const scale = 0x100000000n;
// Write the high and low 32-bit halves in big-endian order.
buffer.writeUInt32BE(Number(time / scale), 0);
buffer.writeUInt32BE(Number(time % scale), 4);
// Read both halves back and recombine them.
const high = BigInt(buffer.readUInt32BE(0));
const low = BigInt(buffer.readUInt32BE(4));
const time2 = high * scale + low;
But since ECMAScript timestamps ignore leap seconds (https://tc39.es/ecma262/#sec-time-values-and-time-range), it is safe to choose the
Since hrtime has an arbitrary epoch
// The epoch of the partition read from metadata, e.g.
const US_PER_SEC = 1e6;
const epoch = (new Date('2019-11-02T20:00:00')).getTime() * 1000.0;
// Epoch of current process execution
const timeStart = process.hrtime();
const currentEpoch = Date.now() * 1000.0;
// Microseconds since the partition epoch, advanced by the monotonic hrtime delta
function time() {
  const delta = process.hrtime(timeStart);
  return (currentEpoch - epoch) + (delta[0] * US_PER_SEC + delta[1] / 1000);
}
Now considering some form of "external consistency" à la Spanner, we need to store the accuracy of the timestamp, so that during a write we can wait at least as long as the timestamp's inaccuracy before actually inserting a document.
Since the document timestamp is stored relative to the partition epoch, it has a relatively low magnitude, which leaves spare precision to store the timestamp accuracy, e.g. in the decimal part of the timestamp.
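A minimal sketch of that idea, under stated assumptions: the timestamp is a whole number of microseconds since the partition epoch and stays below 2^46 (roughly 2.2 years), which leaves 7 of a double's 53 mantissa bits for the fraction. The function names and the meaning of an "accuracy level" (for example, one level per millisecond of uncertainty) are hypothetical.

```javascript
// Assumptions: `micros` is whole microseconds since the partition epoch and
// stays below 2^46, so 7 fractional bits remain exactly representable.
const ACCURACY_LEVELS = 128; // 2^7 distinct accuracy values
const MAX_MICROS = 2 ** 46;

function encodeTime(micros, accuracyLevel) {
  if (!Number.isInteger(micros) || micros < 0 || micros >= MAX_MICROS) {
    throw new RangeError('micros out of range');
  }
  if (!Number.isInteger(accuracyLevel) || accuracyLevel < 0 || accuracyLevel >= ACCURACY_LEVELS) {
    throw new RangeError('accuracyLevel out of range');
  }
  // The accuracy occupies the fractional part of the double.
  return micros + accuracyLevel / ACCURACY_LEVELS;
}

function decodeTime(value) {
  const micros = Math.floor(value);
  return { micros, accuracyLevel: (value - micros) * ACCURACY_LEVELS };
}

const encoded = Buffer.alloc(8);
encoded.writeDoubleBE(encodeTime(63072000000000, 5), 0); // two years in microseconds
console.log(decodeTime(encoded.readDoubleBE(0)));
// { micros: 63072000000000, accuracyLevel: 5 }
```

Because both parts are exact binary fractions within the mantissa, decoding recovers them without rounding; beyond 2^46 microseconds the fractional bits start to be lost.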
Can't use the partition file creation time as the epoch for a monotonic clock, because *NIX filesystems don't consistently store a file creation time. See https://stackoverflow.com/questions/17552017/get-file-created-date-in-node
Hence the epoch will be stored in the partition metadata, and a clock can then measure monotonic time, which will work as a cross-partition ordering field.
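As a rough sketch of how such a clock could be built from the metadata: the `epoch` property name and its unit (milliseconds since the Unix epoch) are assumptions, as is the `createClock` helper itself. The wall clock is sampled once at startup, and all later readings advance by the monotonic `process.hrtime.bigint()` delta.

```javascript
// Assumption: partition metadata carries the epoch as milliseconds since the
// Unix epoch; the `epoch` property name is hypothetical.
function createClock(metadata) {
  const startWall = Date.now();              // wall-clock time at clock creation (ms)
  const startMono = process.hrtime.bigint(); // monotonic reference (ns, arbitrary origin)
  let last = 0;
  return function now() {
    const elapsedMicros = Number(process.hrtime.bigint() - startMono) / 1000;
    const current = (startWall - metadata.epoch) * 1000 + elapsedMicros;
    // Never report a value below zero or below one already returned.
    last = Math.max(last, current);
    return last;
  };
}

const clock = createClock({ epoch: Date.parse('2019-11-02T20:00:00Z') });
const first = clock();
const second = clock();
console.log(first <= second);
```

Within one process the readings cannot go backwards; ordering across process restarts still depends on the wall clock sampled at startup.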
As of #80 the document header now contains an external sequence number as well as a (single) monotonic timestamp relative to the partition epoch, which is stored in the partition metadata.
Note: Spanner TrueTime uses two timestamps to make the uncertainty of the timestamp explicit. It basically encodes a range of valid times the data was written at. In most cases, this uncertainty is very low in relation to the actual timestamp, e.g. a few ms for timestamps that may grow as old as a couple of years. Hence, it suffices to encode the uncertainty in the decimal part of a double precision float that denotes the amount of
A checksum (CRC32) in the document should help against corruption and torn writes (see #31 (comment)). An additional field for the document sequence number and commit position would allow 2PC protocols for potential cross-partition transactions with the partition storage. See #71.
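A minimal sketch of the checksum idea, assuming the standard CRC-32 polynomial and a hypothetical framing of a 4-byte checksum followed by the payload (the `frame` and `isIntact` helpers are made up for illustration). The CRC is implemented by hand so the example does not depend on a particular Node.js version.

```javascript
// Table-driven CRC-32 (IEEE 802.3 polynomial, reflected form 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(buffer) {
  let crc = 0xffffffff;
  for (const byte of buffer) {
    crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Hypothetical framing: 4-byte checksum followed by the document payload.
function frame(payload) {
  const out = Buffer.alloc(4 + payload.length);
  out.writeUInt32BE(crc32(payload), 0);
  payload.copy(out, 4);
  return out;
}

function isIntact(framed) {
  return framed.readUInt32BE(0) === crc32(framed.subarray(4));
}

const framed = frame(Buffer.from('{"type":"example"}'));
// Simulate a torn write: the last three bytes never reached the disk.
const torn = Buffer.from(framed);
torn.fill(0, torn.length - 3);

console.log(isIntact(framed)); // true
console.log(isIntact(torn));   // false
```

On read, a failed check would mark the document (and anything after it in the partition) as suspect rather than returning corrupted data.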