Improve type hints for data store usage #11165
Comments
Yes, see |
Are proposals 1 and 2 mutually exclusive? I.e. could we have both runtime and build-time checks? I think any costs would be one-off setup costs at the point of instantiating the different handlers. |
re: Proposal 1
I think concrete classes are nicer; it feels a bit like an abuse of
We can make the type variable used in the generic have a bound to avoid this, but ideally we'd just do it all at once anyway.

re: Proposal 2
I'm not so much a fan of this, since it seems like we should be able to do it statically rather than relying on our tests triggering |
I'm struggling to imagine what you mean here (and if we can check it statically, I don't see a need for dynamic checks on top, but maybe I misunderstood) |
Just to throw another point out here -- the storage classes frequently call |
Previous discussion, as I remember it:
|
Broadly agree in principle. But we don't yet run mypy across the whole source tree, and I wouldn't want us to lull ourselves into a false sense of typechecked security. In contrast, I think it's relatively easy to get runtime checks that we can be confident in. |
I don't know if this suggestion was brought up before, and although verbose, I thought it might be worth a mention: Instead of inheritance, each store is a field in the main store. We have abstract classes to describe anything that should be common across multiple stores, otherwise the individual stores will depend on a more specific type; stores that work anywhere accept an

This is a bit of a sketch, but I haven't looked too deeply at whether this will fit in or not:

    import abc
    from typing import Tuple


    class AbstractStore(abc.ABC):  # I don't know if this is better as a Protocol or using abstract fields
        # I want to write:
        #     wombat_store: WombatStore
        # but I think only this way works?: (someone please correct me!)
        @property
        @abc.abstractmethod
        def wombat_store(self) -> "WombatStore":
            ...

        @property
        @abc.abstractmethod
        def spqr_store(self) -> "AbstractSpqrStore":
            ...


    class AbstractSpqrStore(abc.ABC):  # this is a store common to both worker and main process, but with different implementations
        @abc.abstractmethod
        def get_spqr_by_blahblah(self, blah: int) -> str:  # as expected, list out the signatures of all the methods we would expect to be common.
            ...


    class WorkerSpqrStore(AbstractSpqrStore):  # implementation of the worker side. You can imagine there'd be a different one for the main process.
        def __init__(self, store: AbstractStore) -> None:
            self.store = store

        def get_spqr_by_blahblah(self, blah: int) -> str:
            return self.store.db_pool.simple_select_one(...)


    class WombatStore:  # this store is agnostic about main process/worker process
        def __init__(self, store: AbstractStore) -> None:
            self.store = store

        def get_wombat_blah_blah(self, x: int) -> str:
            return self.store.db_pool.simple_select_one(...)

        def get_wombat_with_spqr(self, x: int) -> Tuple[str, str]:
            # well you get the point, don't forget the runInteraction magic
            return self.store.spqr_store.get_spqr_by_blahblah(...)


    class WorkerStore(AbstractStore):  # this is the top-level store for workers.
        def __init__(self, hs) -> None:
            # Private attributes back the abstract properties declared on AbstractStore.
            self._wombat_store = WombatStore(self)
            self._spqr_store = WorkerSpqrStore(self)
            self.db_pool = hs.db_pool  # don't look at me as though I know what I'm
            # doing, but I guess something like this happens somewhere

        @property
        def wombat_store(self) -> WombatStore:
            return self._wombat_store

        @property
        def spqr_store(self) -> AbstractSpqrStore:
            return self._spqr_store

In a way, the top-level 'main store' acts as a typing trampoline for all the little stores. |
This sounds similar to what we do for the |
Everywhere in Synapse, `HomeServer.get_datastore()` is annotated as the `DataStore` god class, which is incorrect for workers and allowed #11163 to slip into matrix.org, causing an outage.

It would be a good idea for data store consumers (servlets, handlers, the module API, etc) to declare which aspects of the data store they need, and have CI check that we don't pass them a data store that's missing the required interfaces.

There are three data store classes to consider:

- `DataStore`, which is the data store class used on the main process
- `GenericWorkerSlavedStore`, which is the data store class used on worker processes
- `AdminCmdSlavedStore`, which is the data store class used when running admin commands(?)

`DataStore` and `GenericWorkerSlavedStore` overlap but aren't subtypes of each other. `AdminCmdSlavedStore` is a subset of `GenericWorkerSlavedStore` functionality-wise, but not through inheritance.

It's been suggested by @reivilibre that we define two or three types for use in type hints:

- `WorkerStore`
- `MainStore` (a subtype of `WorkerStore`?)
- `EitherStore = Union[WorkerStore, MainStore]`, if it turns out that `WorkerStore` contains functionality not in `MainStore`

These don't have to be concrete classes and could be `Protocol`s if needed.

We could have more granular store requirements, but it wouldn't catch any extra bugs.
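To illustrate the `Protocol` option, here is a minimal sketch of how the suggested types might look if defined structurally. The method names (`get_user_by_id`, `set_user_displayname`) and the `Fake*` classes are hypothetical and not taken from Synapse, and the sketch assumes `MainStore` does turn out to be a subtype of `WorkerStore`, which the issue leaves open.

```python
from typing import Dict, Protocol, Union, runtime_checkable


@runtime_checkable
class WorkerStore(Protocol):
    """Hypothetical: capabilities assumed to be available on every process."""

    def get_user_by_id(self, user_id: str) -> str: ...


@runtime_checkable
class MainStore(WorkerStore, Protocol):
    """Hypothetical: adds a capability assumed to exist only on the main process."""

    def set_user_displayname(self, user_id: str, name: str) -> None: ...


EitherStore = Union[WorkerStore, MainStore]


class FakeWorkerStoreImpl:
    """Stand-in for a worker store; it never inherits from the protocols."""

    def get_user_by_id(self, user_id: str) -> str:
        return f"user:{user_id}"


class FakeMainStoreImpl(FakeWorkerStoreImpl):
    """Stand-in for the main-process store, with one extra write method."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def set_user_displayname(self, user_id: str, name: str) -> None:
        self.names[user_id] = name


def read_only_consumer(store: WorkerStore) -> str:
    # A consumer that only declares a need for worker-level capabilities.
    return store.get_user_by_id("@alice:example.org")
```

Because the protocols are structural, the existing store classes would not need to change their inheritance to satisfy them; a type checker would only compare method signatures.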
The code is currently structured like this in a lot of places:
Proposal 1: `HomeServer[WorkerStore]`

We could add a covariant type parameter to `HomeServer` to have data store consumers declare which type of data store they need: `HomeServer[WorkerStore]` would mean "a homeserver with a data store that supports at least the capabilities of a `WorkerStore`".

We'd have to do this refactor across the entire codebase at once, otherwise the type hints for data stores would be implicitly degraded to `Any` everywhere.

Proposal 2: `get_worker_datastore()`

@clokep suggests that we add new `get_*_datastore()` methods to `HomeServer` that raise errors when called on the wrong process:

mypy would not flag up issues, but any CI stage which spins up workers would fail.
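As a rough sketch of how the two proposals could coexist, the toy classes below combine a covariant, bounded type parameter (Proposal 1) with a runtime-checked accessor (Proposal 2). None of this is Synapse's real code: the store classes, `get_main_datastore()`, and the two handlers are hypothetical stand-ins, and `MainStore` is assumed to be a subtype of `WorkerStore`.

```python
from typing import Generic, TypeVar


class WorkerStore:
    """Hypothetical stand-in for the store available on every process."""

    def get_thing(self) -> str:
        return "thing"


class MainStore(WorkerStore):
    """Hypothetical stand-in; assumed here to be a subtype of WorkerStore."""

    def write_thing(self) -> str:
        return "written"


# Proposal 1: a covariant type variable, bounded so that an unparameterised
# HomeServer still exposes at least a WorkerStore rather than Any.
S_co = TypeVar("S_co", bound=WorkerStore, covariant=True)


class HomeServer(Generic[S_co]):
    def __init__(self, store: S_co) -> None:
        self._store = store

    def get_datastore(self) -> S_co:
        return self._store

    # Proposal 2: an accessor that fails at runtime on the wrong process.
    def get_main_datastore(self) -> MainStore:
        if not isinstance(self._store, MainStore):
            raise RuntimeError("main data store requested on a worker process")
        return self._store


class ReadHandler:
    """Only needs worker-level capabilities, so any HomeServer is acceptable."""

    def __init__(self, hs: HomeServer[WorkerStore]) -> None:
        self.store = hs.get_datastore()


class WriteHandler:
    """Needs main-process capabilities; a type checker should reject
    WriteHandler(HomeServer(WorkerStore()))."""

    def __init__(self, hs: HomeServer[MainStore]) -> None:
        self.store = hs.get_datastore()
```

Covariance is what lets a `HomeServer[MainStore]` be passed where a `HomeServer[WorkerStore]` is expected. The runtime accessor catches the same mistake in code that the type checker does not cover, at the cost of only failing when that code path is exercised.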