-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Snapshot Isolation OCC to DBTransaction #19
Conversation
Bringing in the new changes from TypeScript-Demo-Lib-Native. But without the application builds because this is a pure library. And also removing deployment jobs. |
Ok it's time to finally bring in the leveldb source code and start hacking C++. |
I'm also going to solve the problem with key path here and probably prepare it for merging by cherry picking into staging, this can go along with a number of other CI/CD changes too. |
@emmacasolin I'm changing the /**
* Iterator options
* The `keyAsBuffer` property controls
* whether DBIterator returns KeyPath as buffers or as strings
* It should be considered to default to true
* The `valueAsBuffer` property controls value type
* It should be considered to default to true
*/
type DBIteratorOptions = {
gt?: KeyPath | Buffer | string;
gte?: KeyPath | Buffer | string;
lt?: KeyPath | Buffer | string;
lte?: KeyPath | Buffer | string;
limit?: number;
keys?: boolean;
values?: boolean;
keyAsBuffer?: boolean;
valueAsBuffer?: boolean
};
/**
* Iterator
*/
type DBIterator<K extends KeyPath | undefined, V> = {
seek: (k: KeyPath | string | Buffer) => void;
end: () => Promise<void>;
next: () => Promise<[K, V] | undefined>;
[Symbol.asyncIterator]: () => AsyncGenerator<[K, V]>;
}; This means now |
This also allows the key returned by the iterator to be later used by seek or the range options. This will impact downstream EFS and PK usage though. But find and replace should be sufficient. |
I've updated the |
I've also created a The |
c6731ec
to
fbf8ab3
Compare
If the problem is double encoding then couldn't this be solved by having clear barriers to where and when the encoding is applied? We need the encoded form internally but the the user needs the un-encoded form. Doesn't this mean encoding conversion only needs to happen when we pass to and from the user? |
There are clear barriers. It's only applied when user passes input into the system. The problem is only for the iterator. Because iterator returns your the "key" that you're supposed to use later for other operations. Then the solution is not encode the key then again while inside the iterator. Solving the double escaping problem. |
Something I recently discovered the effect of the empty key. That is if you have a Therefore if you use the empty key, in To solve this, I had to change the default iterator option to instead of using For example: if (options_.gt == null && options_.gte == null) {
options_.gte = utils.levelPathToKey(levelPath);
} This now ensures that the empty key shows up within the level during iteration. |
Extra test cases were added for empty keys now, and this actually resolves another bug involving empty keys. describe('utils', () => {
const keyPaths: Array<KeyPath> = [
// Normal keys
['foo'],
['foo', 'bar'],
// Empty keys are possible
[''],
['', ''],
['foo', ''],
['foo', '', ''],
['', 'foo', ''],
['', '', ''],
['', '', 'foo'],
// Separator can be used in key part
['foo', 'bar', Buffer.concat([utils.sep, Buffer.from('key'), utils.sep])],
[utils.sep],
[Buffer.concat([utils.sep, Buffer.from('foobar')])],
[Buffer.concat([Buffer.from('foobar'), utils.sep])],
[Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep])],
[Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep, Buffer.from('foobar')])],
[Buffer.concat([Buffer.from('foobar'), utils.sep, Buffer.from('foobar'), utils.sep]),
],
// Escape can be used in key part
[utils.esc],
[Buffer.concat([utils.esc, Buffer.from('foobar')])],
[Buffer.concat([Buffer.from('foobar'), utils.esc])],
[Buffer.concat([utils.esc, Buffer.from('foobar'), utils.esc])],
[Buffer.concat([utils.esc, Buffer.from('foobar'), utils.esc, Buffer.from('foobar')])],
[Buffer.concat([Buffer.from('foobar'), utils.esc, Buffer.from('foobar'), utils.esc])],
// Separator can be used in level parts
[Buffer.concat([utils.sep, Buffer.from('foobar')]), 'key'],
[Buffer.concat([Buffer.from('foobar'), utils.sep]), 'key'],
[Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep]), 'key'],
[Buffer.concat([utils.sep, Buffer.from('foobar'), utils.sep, Buffer.from('foobar')]), 'key'],
[Buffer.concat([Buffer.from('foobar'), utils.sep, Buffer.from('foobar'), utils.sep]), 'key'],
// Escape can be used in level parts
[Buffer.concat([utils.sep, utils.esc, utils.sep]), 'key'],
[Buffer.concat([utils.esc, utils.esc, utils.esc]), 'key'],
];
test.each(keyPaths.map(kP => [kP]))(
'parse key paths %s',
(keyPath: KeyPath) => {
const key = utils.keyPathToKey(keyPath);
const keyPath_ = utils.parseKey(key);
expect(keyPath.map((b) => b.toString())).toStrictEqual(
keyPath_.map((b) => b.toString()),
);
}
);
}); |
Needs rebase on top of staging now. |
Time to rebase. |
0bfd7db
to
a8ba7b0
Compare
👇 Click on the image for a new way to code review
Legend |
bd9ca76
to
7b69ba9
Compare
I was looking at 2 codebases to understand how to integrate leveldb. It appears the classic-level is still using leveldb 1.20 which is 5 years old. The rocksdb is bit more recent. The leveldb codebase has a bit more of a complicated build process. The top level
The Note that the #include <leveldb/db.h> These headers are not specified by the The
These are all organised under I think for us, we should just copy the structure of So it seems that leveldb has alot of legacy aspects, rocksdb compilation is alot cleaner. In fact lots of tooling has stopped using gyp file. We can explore that later. |
The presence of the snappy submodule means cloning has to be done now with |
While porting over the The However both I'm not sure if this is true for non-linux platforms. The upstream classic-level does not bother with |
I found out what
So This is required due to: https://github.com/nodejs/node-addon-api/blob/main/doc/setup.md. The reason this is required is documented here: nodejs/node-addon-api#460 (comment). This should go into TS-Demo-Lib-Native too. |
731d1e0
to
af0e92a
Compare
…t demonstrating race condition thrashing
5bf710c
to
58a62f2
Compare
I've added a test Note that since the The main thing for implementing task 26, is to integrate
|
Task 17 can only be done after merging to staging. |
Ok to solve task 26, we need to introduce the It has to be shared between all the transactions, which means it's going to be stored on Each transaction will then expose a So it may look like: t1 = withF([db.transaction()], async ([tran]) => {
await tran.lock(['counter', Lock], ['someotherkey', Lock]);
});
t2 = withF([db.transction()], async ([tran]) => {
await tran.lock(['counter', Lock]);
});
await Promise.allSettled([t1, t2]); Note that However I'm wondering if its worth simplifying this. By convention we should be locking based on key paths, but really anything could be used. If we simplify things, one has to then be able to take just a string, and then the default lock constructor can be Furthermore if Most important is that the locks are only released after you commit or rollback. No deadlock detection, this can be addressed later. As for re-entrant locking, that's something that should also be done, but can also be done later. So future work:
But I do want to add in default lock constructor if it is not specified, it should just be |
Yea so I reckon this is a bit incorrect, since locks are not a separate resource within the transaction, instead transaction locks are properties of the transaction itself. Therefore This would allow one to do things like:
In addition to this, the To be able to do |
Ok turns out I have some problems with the design of In our I originally programmed this by adding the ability to However it turns out that this doesn't work if the This means the tracking of imperative unlocking has to be done by the user/owner of the Since locked keys are isolated to each transaction, then we can have each However the issue is that the Furthermore I found that it's not a good idea to use |
Ok It's all done, PCC locking is now available in |
e4471c7
to
a12afb4
Compare
a1a1fa6
to
567526b
Compare
Subsequent work must take place on the |
This should trigger v5 of js-db. |
Derived from #18
Description
This implements the snapshot isolation DB transaction.
This means DBTransaction will be automatically snapshot isolated, which means most locking will be unnecessary.
Instead when performing a transaction, there's a chance for a
ErrorDBTransactionConflict
exception which means there was a write conflict with another transaction.Users can then decide on their discretion to retry the operation if they need to (assuming any non-DB side-effects are idempotent, noops or can be compensated). This should reduce the amount of locking overhead we need to do in Polykey. We may bubble up the conflict exception to the user, so the user can re-run their command, or in some cases, in-code we will automatically perform a retry. The user in this case can be the PK client, or the another PK agent or the PK GUI.
There is still one situation where user/application locks are needed, and that's where there may be a write-skew. See snapshot isolation https://en.wikipedia.org/wiki/Snapshot_isolation for more details and also https://www.cockroachlabs.com/blog/what-write-skew-looks-like/.
In the future we may upgrade to SSI (serializable snapshot isolation) which will eliminate this write-skew possibility.
Additionally this PR will also enable the- already mergedkeyAsBuffer
andvalueAsBuffer
options on the iterators, enabling easier usage of the iterators without having to usedbUtils.deserialize<T>(value)
where it can be configured ahead of time.See this https://www.fluentcpp.com/2019/08/30/how-to-disable-a-warning-in-cpp/ as to how to disable warnings in C++ cross platform.
Also see: https://nodejs.github.io/node-addon-examples/special-topics/context-awareness/
Issues Fixed
keyPathToKey
to escape key parts #22Tasks
[x] 1. Added additional overloads to iterator for- already merged in stagingkeyAsBuffer
andvalueAsBuffer
DBTransaction
lifecyclesnapshotLock
to ensure mutual exclusion when using the snapshot iteratorgetSnapshot
as the last-resort getter forDBTransaction.get
DBTransaction.iterator
to use the snapshot iterator[x] 6. Upgraded dependencies to match TypeScript-Demo-Lib (excluding node v16 upgrade)- already merged in staging[x] 7. Add tests for- already merged into stagingkeyAsBuffer
andvalueAsBuffer
usage on the iterator, expectstring
andV
types. Existing tests should still work. Since by default iterator returns buffers. This is different fromget
which by default does not return raw buffers.[ ] 8. Maintain the last updated timestamp for each key being updated in transactions and also in the main DB- relying on rocksdb SI directly nowErrorDBTransactionConflict
sync
option when committingDBTransaction
by default, allow it to be set off[ ] 11. See if- no need for this, because we may have more interesting state in the db pathdb_destroy
can replace our removal of the db path[ ] 12. Catch- To be addressed later in Catch CORRUPTION error code and attempt automatic repair and reopen, or at least provide users the ability to directly repair #35CORRUPTION
error code and attempt automatic repair and reopen, or at least provide users the ability to call the command with a special exception for this - Introduce Snapshot Isolation OCC to DBTransaction #19 (comment)LevelDB
LevelDBP
DB.ts
directly[ ] 17. Update the CI/CD to auto-build the prebuilds, and to do auto-release as part of prerelease and release- to be done in staging ci: merge staging to master #38DBIterator
to maintain async lifecycle of the iterator - Introduce Snapshot Isolation OCC to DBTransaction #19 (comment) (it was a good thing we caught this during benchmarking, a serious performance regression that led us to discover a resource leak)[ ] 19. UseErrorDBLiveReference
to mean that an iterator or transaction object is still alive whendb.stop()
is called, and therefore it must prevent any stopping, we must there maintain a weakset for every transaction/iterator objects that we create. This can also be used for subdatabases, where you create a prefixed DB in the future.WeakSet
does not support size or length, there's no way to know how many of these objects are still aliveDB
and subtract an allocation counter instead, or just maintain reference to objects and remove themSet
then, anddestroy
removes them from the setErrorDBLiveReference
, just auto-closes the same as in C++transaction_*
native functions:transactionInit
transactionCommit
transactionRollback
transactionGet
transationGetForUpdate
transactionPut
transactionDel
transactionMultiGet
transactionClear
transactionIteratorInit
transactionSnapshot
transactionMultiGetForUpdate
GetForUpdate
, this "materializes the conflict" in a write skew so that it becomes a write write conflict. Although rocksdb calls this a read write conflcit. SSI is not provided by rocksdb, it is however available in cockroachdb and badgerdb, but hopefully someone backports that to rocksdb.DBIterator
doesn't seem to work when no level path is specified, it iterates over no records at all, it's possible that our iterator options don't make sense when thedata
sublevel isn't used.levelPath
andoptions
parameters, becauselevelPath
is a far more used thanoptions
, this does imply an API break for EFS... etc, but it should be a quick find and replaceFinal checklist