-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass advisory lock calls to filesystem #15070
Comments
Sounds useful indeed. Yes please send a PR! |
cc @tlively @ethanalee - this looks relevant to #15041. In particular the first part about perf sounds like it supports a block-based design, and also we should make sure that the new system supports locking in the way needed here. @jlongster Thanks for the writeup! I agree, this sounds good for a PR. It sounds like this would need to have JS call into C code to do the lock, though, is that right - basically calling from JS into the musl streams code? For the new filesystem, I think we may have a lock per file in the file system itself (the new code that replaces the current JS code). That would make this a lot easier IIUC. However, interestingly, it would require that even a single-threaded build support file locks - which would have some overhead questions. |
Looking at the man page, F_SETLK and friends are much more complex than just having a lock per file because they let you set read/write locks on byte ranges within a file. This would be another great use case for backend composition in the new modular file system design; a lock management layer could be inserted in front of any other backend and not included in builds that don't need it. |
Hello! I meant to file this issue a while back.
Context
I wrote a custom filesystem that conforms to emscripten's filesystem API, and it is specially built for SQLite (using the sql.js project). It takes reads/writes from SQLite and persists them into IndexedDB. The difference between this and most other IDB filesystems is that it actually writes the data down in blocks, so you don't have to load the whole thing in memory to use it. It basically makes IDB a real-ish filesystem (in the way that filesystems read/write in blocks).
My project is here: https://github.com/jlongster/absurd-sql. This research led to some very interesting results, particularly that using SQLite with this backend is way faster than using raw IDB directly. The reason is simple: it naturally batches reads/writes to the number of times it calls out out to IDB is way lower. Since IDB is super slow, this makes it really really fast.
I wrote a post explaining the project and the results here: https://jlongster.com/future-sql-web
Problem
One of the critical things for this to work is locking. If you open the db in several different tabs, each tab has the potential to corrupt the database by overwriting other writes with out-of-date data. To solve this, SQLite uses posix adivsory locking. Each connection locks the db before writing (the whole process is explained in Atomic Commit In SQLite).
We need to respect these locks on the web, otherwise dbs will get corrupted.
The way my project works is it hooks into the filesystem calls. I found this to be an intuitive and easy way to hook SQLite up to IDB; more so than a custom SQLite vfs. Almost everything fell into place nicely (opening a db, handing short reads, etc).
There's one thing that didn't: locking. And the only reason it didn't is because emscripten doesn't expose the locking APIs for filesystems to handle. To solve this we had to extend the unix vfs and install our lock/unlock hooks, but this is painful because it requires users to call an initialization function and we need to track file pointers. That's all a bunch of implementation detail; basically we had to do a bunch of work to handle lock/unlock.
We shouldn't have to though! All we want to do is handle
fcntl(fd, F_SETLK, pLock)
. If we could handle that in the filesystem, we don't have to do anything special.(Read my blog post for more information about how we implement locks. In short, we leverage IDB transaction semantics for locking semantics. Locking is super important for us!)
The problem is emscripten right now assumes locking is successful:
emscripten/src/library_syscall.js
Line 1139 in 2126d3c
F_SETLK
andF_GETLK
.If we could have that it would simplify our SQLite use cases a bunch! Would you all be open to a PR?
The text was updated successfully, but these errors were encountered: