Commit_Concurrency
Concurrency in ZODB is provided via transactions. Separate logical threads of execution (processes, threads, or even greenlets) execute independently and synchronize their data at transaction boundaries.
Transactions are committed using a two-phase commit protocol.
Traditionally, storages committed only one transaction at a time. This was ensured using commit locks that were acquired at the beginning of the commit process (tpc_begin) and released at the end (tpc_abort or tpc_finish). This is wasteful: it limits transaction throughput and increases latency.
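As a rough illustration of the traditional scheme, the sketch below shows a storage that holds a single commit lock from tpc_begin until tpc_finish or tpc_abort. The class name SerializedStorage and its attributes are hypothetical and are not taken from any real ZODB storage. The method names mirror the two-phase commit calls mentioned above.

```python
import threading


class SerializedStorage:
    """Hypothetical storage that allows only one commit at a time."""

    def __init__(self):
        self._commit_lock = threading.Lock()
        self._committing = None  # transaction that currently holds the lock

    def tpc_begin(self, txn):
        # Blocks until no other transaction is in the commit process.
        self._commit_lock.acquire()
        self._committing = txn

    def tpc_vote(self, txn):
        # Nothing to lock here: the whole storage is already locked.
        assert txn is self._committing

    def tpc_finish(self, txn):
        assert txn is self._committing
        self._release()

    def tpc_abort(self, txn):
        # Aborting a transaction that never began a commit is a no-op.
        if txn is self._committing:
            self._release()

    def _release(self):
        self._committing = None
        self._commit_lock.release()
```

Every other committer waits in tpc_begin for the full duration of the current commit, including any time spent storing data and voting, which is where the lost throughput and added latency come from.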
ZEO servers provide somewhat finer-level concurrency, holding a lock only in the second phase of 2-phase commit, and NEO takes this further by holding object-level locks. NEO only truly gets a storage-global lock in tpc_finish, when a final transaction id is assigned.
In general, we should assume that multiple transactions can be in flight at once, with object-level locks held in the second phase, and a global lock needed only for tpc_finish.
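The model described here can be sketched as follows. This is a minimal illustration, not how ZEO or NEO is implemented. The class ConcurrentStorage, its attributes, and the local ConflictError are all hypothetical. It also assumes that object-level locks are taken at tpc_vote, and that the transaction id is only assigned under the global lock in tpc_finish.

```python
import itertools
import threading


class ConflictError(Exception):
    """Raised when a voted object is locked by another transaction."""


class ConcurrentStorage:
    """Hypothetical storage allowing several transactions in flight."""

    def __init__(self):
        self._finish_lock = threading.Lock()  # global, held only in tpc_finish
        self._state_lock = threading.Lock()   # protects the bookkeeping below
        self._object_locks = {}               # oid -> transaction holding it
        self._pending = {}                    # transaction -> {oid: data}
        self._committed = {}                  # oid -> (tid, data)
        self._tids = itertools.count(1)

    def tpc_begin(self, txn):
        # No global lock: any number of transactions may begin.
        with self._state_lock:
            self._pending[txn] = {}

    def store(self, oid, data, txn):
        with self._state_lock:
            self._pending[txn][oid] = data

    def tpc_vote(self, txn):
        # Take object-level locks; fail if another transaction holds one.
        with self._state_lock:
            oids = self._pending[txn]
            busy = [oid for oid in oids
                    if self._object_locks.get(oid, txn) is not txn]
            if busy:
                raise ConflictError(busy)
            for oid in oids:
                self._object_locks[oid] = txn

    def tpc_finish(self, txn):
        # The only globally serialized step: assign the tid and publish.
        with self._finish_lock:
            tid = next(self._tids)
            with self._state_lock:
                for oid, data in self._pending.pop(txn).items():
                    self._committed[oid] = (tid, data)
                    self._object_locks.pop(oid, None)
        return tid

    def tpc_abort(self, txn):
        with self._state_lock:
            for oid in self._pending.pop(txn, {}):
                if self._object_locks.get(oid) is txn:
                    del self._object_locks[oid]
```

Transactions that touch disjoint objects can vote concurrently and only serialize briefly in tpc_finish. In this sketch, transaction ids reflect finish order rather than begin order.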
This has implications for transaction ids. Ideally, transaction ids wouldn't be set until tpc_finish. Return values from existing APIs that return new serials (store and tpc_vote) aren't useful except in the special case where a storage signals that a conflict was resolved. In the near future, ZODB will stop requiring storages to return new serials for objects in store and tpc_vote, except to indicate conflict resolution.
If there are multiple transactions in flight, we need to be able to keep track of multiple sets of transaction data.
The simplest approach is to just hang data on transaction objects, as suggested in a post to the transaction list.
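A minimal sketch of hanging per-storage data on transaction objects might look like the following. The Transaction and Storage classes here are hypothetical stand-ins written for illustration. The method names set_data and data are assumptions of this sketch and are not presented as the API of any particular package.

```python
class Transaction:
    """Hypothetical stand-in for a transaction object."""

    def __init__(self):
        self._data = {}  # id(owner) -> data kept on behalf of that owner

    def set_data(self, owner, data):
        self._data[id(owner)] = data

    def data(self, owner):
        return self._data[id(owner)]


class Storage:
    """Hypothetical storage that keeps no per-transaction state itself."""

    def tpc_begin(self, txn):
        # Each transaction carries its own pending changes for this storage.
        txn.set_data(self, {})

    def store(self, oid, data, txn):
        txn.data(self)[oid] = data

    def pending(self, txn):
        return dict(txn.data(self))


storage = Storage()
t1, t2 = Transaction(), Transaction()
storage.tpc_begin(t1)
storage.tpc_begin(t2)
storage.store("a", b"one", t1)
storage.store("b", b"two", t2)
```

Because the pending data lives on each transaction, the storage needs no table of in-flight transactions, and the data goes away with the transaction object.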
ZEO defined an API for getting the current transaction. With multiple transactions in flight there is no single current transaction, so this API no longer makes sense. It is used primarily for logging and in tests. ZEO servers should be changed to manage their own data for this purpose, since they control their storage life cycles anyway.