
Reader connection pool in native-driver #2125

Closed
wants to merge 10 commits

Conversation

@andersio (Contributor) commented Dec 23, 2020

Reader pool

Implement a reader connection pool in the NativeSqlDriver as discussed in #1986.

This is an opt-in feature — it emulates the current SQLDelight default (one reader connection), unless maxConcurrentReader > 1 is specified when instantiating NativeSqlDriver. As a "bonus", if you always wrap everything in a transaction, no reader connection will be instantiated.

The new Pool<T> is implemented using a mix of lock-free atomics, a pthread mutex and a pthread condition variable. In summary:

  • borrowing is lock-free in the general case;
  • when no entry is available, the borrowing thread is suspended (via pthread_cond_wait()); and
  • one blocked thread (if any) is woken up every time a borrow is released (via pthread_cond_signal()).

Pool<T> is used for both the reader pool (configurable capacity), and the transaction/writer pool (hardcoded capacity of 1). In other words, SinglePool<T> has been removed.

Note that the native-driver module has been restructured into four source sets: commonMain, commonTest, nativeDarwinMain and mingwMain. Almost all code previously in nativeMain/nativeTest is now common code, which, as a side effect, gives it a working IDE experience.

The source-set reshuffling makes space for a small set of platform-specific expect/actual declarations, since divergence is now required between Windows and Darwin/UNIX due to discrepancies in the imported API signatures of pthreads.

Runtime read-only enforcements

Two runtime checks have been enabled on reader connections:

  • PRAGMA query_only = 1: This instructs SQLite to reject any data-manipulation statements.
  • ThreadConnection.isReadOnly: This allows SQLDelight to catch internal misuse of reader connections in accessConnection().

Future directions

@kpgalligan (Collaborator)

Will take a look at this soon-ish. I'm in the middle of a rework of the native driver, but starting at the sqliter level currently. There are a few other performance improvements that have been discussed but never implemented that I want to explore.

@andersio (Contributor, author) commented Dec 30, 2020

While working on this, I did wonder if thread-local connections are a simpler choice for today's Kotlin/Native, which also dodges the headaches around the cost of shared data structures in the current strict memory model.

With thread pools absent from the ecosystem, likely until the new memory model is ready, users will probably stick to the library/language defaults in their K/N applications, which means 1-2 threads at most. So thread-local connections don't seem a bad choice for the time being, putting aside caveats when using connections in native callbacks.

@kpgalligan (Collaborator)

Most will be in main and default threads, but not everybody. I think some form of reader pool makes sense, but the implementation with Stately collections for statement caching, etc, is way overdue for an update. We had coded up some optimizations in the past that just never made it into the code base, but I really want to rethink how shared data works in the driver. I've had a lot of time to think on how KN does things since the first version of Stately. There are much better ways to manage sharing state. Not to rail on my own library, but those atomic collections are not great.

@andersio (Contributor, author) commented Dec 31, 2020

I fear we are going off topic here. 😛

But since you mentioned the data structures: having poked around a bit more, I see some potential for removing or replacing the hot data structures in today's ThreadConnection, e.g.:

  • SQLiter adopting sqlite_close_v2 might render tracking active SqlCursors in ThreadConnection unnecessary.
  • We could have an insert-only statement cache, and track usage on a per-entry basis. This way the cache would have a fairly low write rate (once per statement per app invocation), and hence be suitable for the simpler copy-on-write approach.

@kpgalligan (Collaborator)

sqlite_close_v2 is interesting, but I need to understand it quite a bit better. To review.

I'd need to look at your cache changes more closely, but for write statements there's no need to have more than one per write connection, and only one connection can be used for writing anyway, so the adding/removing causes a lot of unnecessary churn. We coded something for this almost two years ago, while measuring the driver performance: #1226. The changes never made it into the driver, though.

@ivyspirit (Contributor) commented Jan 25, 2021

If WAL mode is enabled, writes and reads are supposed to be able to happen at the same time, even when reads are wrapped in a transaction. That is the common case for us: we wrap lots of our reads in transactions for different reasons. With the current design, a read transaction cannot start until the previous transaction is committed.

> While having worked on this, I did wonder if thread-local connections are a simpler choice for today's Kotlin/Native, which also dodges the headaches around the cost of shared data structures in the current strict memory model.
>
> With thread pools being absence from the ecosystem, likely until the new memory model is ready, users are likely sticking to the library/language defaults in their K/N applications, which means 1-2 threads at maximum. So thread local doesn't seem a bad choice for the time being, putting aside caveats when using connections in native callbacks.

In terms of the data structures: based on profiling, the bottleneck right now really is the safePut function. The data structures used to cache statements, frozenLinkedList and frozenHashMap, are very expensive to modify. One proposal here is:

  1. Since we expect writes to always happen on a single thread, can we make the write connection not shared by multiple threads?
  2. Support passing a dedicated write thread to the driver to create the write connection.
  3. In execute, check the current thread and throw an exception if the execution does not happen on the dedicated write thread. It is the client's responsibility to make sure that write transactions and write executions happen on the write thread.
  4. Replace frozenLinkedList and frozenHashMap with normal data structures, which will significantly help performance.
  5. Allow creating a transaction from any connection, so starting a read transaction will not be blocked by a write. If there is a write in the transaction, the check in 3) above will catch it.

Sorry if this is not the right place for the discussion. I can move this to a different issue. Let me know what you think. @kpgalligan @andersio

@andersio (Contributor, author) commented Jan 25, 2021

@ivyspirit

> Which is most of the case for us: we wrap lots of our reads in transactions for different reasons.

You mean lack of read transaction support, right? This is indeed a missing corner in the SQLDelight API.

I think it should be tackled independently since it impacts the API instead of just reader concurrency of the native-driver (i.e. this PR).

@ivyspirit (Contributor) commented Jan 25, 2021

Yeah, we can move to a separate discussion. But I don't think it is necessary to change the API. If we do the thread check in execute, a read transaction should be able to be started on any connection. But if there is a write, it has to start on the write thread, which we map to the write-thread connection; otherwise the check in execute would throw an exception.

@kpgalligan (Collaborator)

I've been working on merging this with other planned changes. It's still a bit of a work in progress, but connection management has changed somewhat. The isolated write thread is going away (unless there's a really good reason to keep it), data structures are changing (no frozen on apple clients), write statements aren't removed from the cache, etc. Should be posting something soon.

@andersio (Contributor, author) commented Feb 17, 2021

@kpgalligan

> The isolated write thread is going away

It is alright not to have a dedicated connection for writes. But I think it is still valuable to keep the application-level locking for any write (single statement or transaction), so people need not worry about transactions failing to start (edit: or upgrade) with SQLITE_BUSY in single-process setups.

@kpgalligan (Collaborator)

> It is alright not to have a dedicated connection for writes. But I think it is still valuable to keep the application-level locking for any write (single statement or transaction), so people need not worry about transactions failing to start with SQLITE_BUSY in single-process setups.

Totally agree

kpgalligan@038afc4#diff-fdb8f8a5153034bba1cdc1adbb25f25280324223f01bc660b6507804cfce0925R24

My statement "Should be posting something soon" got a little derailed with other priorities but still intend to update soon.

@andersio (Contributor, author) commented Mar 4, 2021

> (unless there's a really good reason to keep it)

I found a reason not to keep it.

I recently had problems with tests using in-memory DBs being intermittently stuck at a write operation. It turns out the in-memory SQLite DB had been locked by a parallel reader when the writer attempted to write, because in-memory DBs don't support WAL and its concurrency model. So the 1 reader + 1 writer setup can cause issues when application/test code does go parallel.

So allowing exactly one connection can help with in-memory DB usage, especially as a substitute for a WAL database in tests. The same goes for rollback-journal databases, I suppose, though they are probably rare on Apple platforms these days.

@AlecKazakova (Collaborator)

Closing and moving progress into #2303

4 participants