-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal: data-oriented storage for TLib
#797
internal: data-oriented storage for TLib
#797
Conversation
Summary ======= Replace storing `TLib` objects as `ref`s in `TSym` with storing all `TLib` instances belonging to a module in a sequence that a `TSym` then stores an index into. This slightly reduces the size of `TSym` and, more importantly, makes serializing the symbols and associated `TLib` instances easier: instead of storing a `PackedLib` for *each* packed symbol, the packed symbols now only store an index into the attached-to module's packed lib list. Details ======= For simplicity, the list with the libs for a module is stored in a separate `seq` in `ModuleGraph`. The list cannot be stored in `PContext`, as it needs to be accessible after semantic analysis, at which point `PContext` is already gone. Storing the `TLib` objects for a module happens when the module is closed. While not strictly necessary at the moment, this is a preparation for storing a `PSym` in `TLib.name` that is only set up once the module is closed. Aside from reducing the size of rodfiles, not storing a `PackedLib` for each `PackedSym` also fixes another inefficiency, namely that each deserialized got a unique, possibly duplicated, `TLib` instance.
A symbol can now reference the `TLib` belonging to a module that is not the one the symbol is attached to. This works around an issue with aliased structural types. Also, use the ID provided by the module's ID generator instead of the `skModule`s position. While both represent the same thing (a ``FileIndex``), querying the ID generator is more consistent with how it works elsewhere.
A module ID isn't necessarily the same before and after loading/storing a rodfile. Loading and storing of `LibId`s now considers this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the overall approach, I can see it nudging us towards further convergence between existing compiler and IC.
@@ -1615,6 +1615,20 @@ type | |||
name*: Rope | |||
path*: PNode ## can be a string literal! | |||
|
|||
LibId* = object | |||
## Identifies a ``TLib`` instance. The default value means 'none'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding a unknownLibId
or the like as a constant below to further 'reserve' the concept of the default value LibId(module: 0, index: 0)
as none.
Update: reading more of the PR, there is a similar tuple type for IC, so we wouldn't end up using that const/this object type all that much. Seems less worth it practically speaking, mostly there to signal intent.
# the index is 1-based, so we can directly use the length | ||
result = LibId(module: module.int32, index: g.libs[module].len.uint32) | ||
|
||
proc storeLibs*(g: ModuleGraph, module: int) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proc storeLibs*(g: ModuleGraph, module: int) = | |
proc ownTheLibs*(g: ModuleGraph, module: int) = |
jk jk 😆
/merge |
Merge requested by: @saem Contents after the first section break of the PR description has been removed and preserved below:
|
Summary
Replace storing
TLib
objects asref
s inTSym
with storing allTLib
instances belonging to a module in a sequence that aTSym
thenstores an index-like reference into.
This makes serializing the symbols and associated
TLib
instanceseasier: instead of storing a
PackedLib
for each packed symbol, thepacked symbols now only store a module+index tuple.
Details
For simplicity, the list with the libs for a module is stored in a
separate
seq
inModuleGraph
. The list cannot be stored inPContext
, as it needs to be accessible after semantic analysis, atwhich point
PContext
is already gone.While the handle to the
TLib
instance stored withTSym
wouldideally be a single index into the
TLib
list of the module asymbol is attached to, the way symbols of aliased structural types
currently work (the aliased type's
TSym
content is partiallycopied) makes this not possible, meaning that a symbol also has to
store the ID of the module the referenced
TLib
instance is partof.
Storing the
TLib
objects for a module happens when the module isclosed. While not strictly necessary at the moment, this is a
preparation for storing a
PSym
inTLib.name
that is only setupwhen the module is closed.
Aside from reducing the size of rodfiles, not storing a
PackedLib
for each
PackedSym
also fixes another inefficiency, namely thateach deserialized
TSym
got a unique, possibly duplicated,TLib
instance.