Meta-proposal: Comptime memory management reform #5895
Comments
I don't have time to read this over properly; the one thing I will say is that I think there should be a defined, relatively small limit on how much memory can be acquired at comptime.

Updated the proposal with one possible mechanism for this.

I don't think a builtin to update it helps; there should be a finite limit to how much memory a program is allowed to demand at compile time. For instance, under no circumstances should a program ever reach the GiB range at comptime. That's very clearly an abuse of the compile-time mechanisms, and there's no reason to allow it.

I'm hesitant to include such a restriction, as we may disallow as yet unforeseen legitimate use cases -- however, I do recognise the danger of giving the program too much control. Updated with possibly a better solution.

With #7396, the crux of this proposal (eliminating comptime closure) can be assumed, hence the individual concepts presented make sense on their own. On advice of the committee, selected individual proposals have been reopened/created, and this amalgam is obsolete.
Reconstructed from the corpses of #5675, #5718, #5873, #5874, #5881. I'm sorry.
Comptime Memory Management Reform
Currently, comptime is automatically memory managed, and relies on closure for persistent mutable state. This is good for flexibility, but it has a few problems.
I believe we can solve these problems by unifying some aspects of comptime and runtime semantics.
What Should Not Be Allowed
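A minimal sketch of the pattern in question, under the comptime-closure semantics this proposal targets (the names `GiveTwice`, `give`, and `times` come from the discussion below; the exact shape is assumed):

```zig
// `times` lives in GiveTwice's comptime closure, so every instantiation
// with the same T shares the one variable.
fn GiveTwice(comptime T: type) type {
    comptime var times: u32 = 0;
    return struct {
        fn give(comptime value: T) T {
            times += 1;
            if (times > 2) @compileError("already given twice");
            return value;
        }
    };
}
```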
This is currently the only way of bundling comptime state with runtime data, and is used extensively for such applications. The problem is that `GiveTwice` is memoised, which is necessary for generics; however, it means that every call to `give` references the same `times`, and every call after the second fails. The user should not have to think about comptime execution order, and yet this enforces it.

Comptime "superpowers" shall include operations on types and functions, passing anything between functions by value, `anytype` variables/fields/variants, `anytype` arrays (where every element is a different type), operations on `comptime_int` and `comptime_float`, concatenation (`++`), repetition (`**`), and assigning to slices (as long as length is preserved). I think this covers everything currently possible at comptime -- if I've missed anything, let me know.

Crucially, this does not impose all of runtime's restrictions on comptime. Internals of execution may still depend on implicit dynamic allocation -- the key is that this is not exposed to the user, to encourage performant and time-agnostic code.
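For illustration, a few of these superpowers as they exist today:

```zig
// All comptime-only: the operands must be comptime-known.
const a = [_]u8{ 1, 2 } ++ [_]u8{3}; // concatenation
const b = [_]u8{0} ** 4; // repetition
const n: comptime_int = 1 << 100; // arbitrary-precision comptime_int
```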
Alternative Solutions
Misc.
I propose we rename `comptime_int` and `comptime_float` to `int` and `float`, so field and parameter decls aren't so long and repetitive. This is not without precedent -- we have unadorned comptime-only types (`type`, and `fn` types once #1717 comes through) and comptime-only operators (`++`, `**`, and `/` and `%` on signed integers); `int` and `float` would still mean the same thing everywhere, and a `comptime` annotation would still be required in those places.

Mutability
It is useful to be able to call out to a function to mutate some state. I propose that when a function receives a `comptime` parameter with a pointer or slice (reference) type, operations on that reference will be executed at comptime, much like the semantics of `comptime var`. Consider the following code:
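(Reconstructed sketch -- the exact signature of `hybridIncrement` is assumed from the description that follows.)

```zig
// Proposed semantics: operations on the comptime reference `count` execute
// at comptime, while operations on the runtime parameter `x` are generated
// as ordinary code -- the call runs in "hybrid time".
fn hybridIncrement(comptime count: *u32, x: u32) u32 {
    count.* += 1; // comptime: mutates the referenced comptime var
    return x + 1; // runtime: ordinary arithmetic
}

fn example(x: u32) u32 {
    comptime var counter: u32 = 0;
    // Not memoised: the call passes a mutable comptime reference.
    return hybridIncrement(&counter, x);
}
```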
`hybridIncrement` executes partially at comptime, partially at runtime. Comptime or hybrid-time function calls that pass mutable references shall not be memoised, for predictability -- all other comptime function calls shall be memoised. The values of immutable comptime references are reified to static constants at runtime, unless the value is or contains `int` or `float`. References to these values, or the values of mutable comptime references, are forbidden from being read at runtime, and attempted codegen of these will fail compilation -- if the goal is to initialise data with comptime logic, a statically-typed comptime constant must be created, or the values must be explicitly copied over. (Mutable comptime references may have different values throughout the source, so for consistency they would need to be instantiated in the output for every access by runtime -- this is unintuitive and wasteful.)

Bundling Comptime State with Runtime Data
I propose we allow structs to exist in hybrid time, like so:
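(Illustrative sketch -- the field names are assumptions.)

```zig
// `Elem` exists only at comptime; `ptr` and `len` have an ordinary runtime
// layout, so a pointer to List refers to the runtime fields alone.
const List = struct {
    comptime Elem: type = u8,
    ptr: [*]u8,
    len: usize,
};
```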
`comptime` fields may also be `anytype`. Such structs may not be `packed` or `extern`. A pointer to such a struct has a defined runtime representation that leaves out some fields -- thus, it does not make sense to reify these references to static constants, and so values of this type need not be annotated with `comptime` or some hypothetical `hytime` keyword. (For this to be efficient, we would need to implement generic deduplication.)

Acquiring (and Releasing) Memory
At runtime we would leverage syscalls for this, which of course is not possible at comptime -- however, we can take inspiration from them. I propose two new builtins: `@alloc(T: type, len: int) *[len]T` and `@resize(buf: *[_]anytype, len: int) void`. These can only be run at comptime, and produce comptime values: `@alloc` returns a mutable buffer, and `@resize` resizes it (resizing it to 0 frees it). The compiler guarantees that no allocations will overlap. `T` may be `anytype`, in which case every element of the buffer may be a different type -- this of course requires implicit dynamic allocation, but in a controlled, defined way that does not necessitate lifetime tracking.

The comptime allocation quota is by default set to some small value, say a few kilobytes, and is increased by a command line option, say `-Dalloc-quota=3M` or something similar -- this accommodates as yet unforeseen uses of comptime without making it too easy to abuse sneakily. The returned space is always mutable, as immutable state can simply be passed by value; alignment is not considered, as the extra cost of misaligned access can be absorbed at comptime; operations will never fail in a program-visible way, as builds must remain deterministic -- if the compiler cannot perform the operation, compilation will simply fail.

With these, defining and using a simple comptime allocator looks like this:
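(Sketch only -- `@alloc` and `@resize` are the proposed builtins, `*[_]u8` follows the proposal's notation, and `int` follows the rename above; none of this compiles today.)

```zig
// A growable comptime byte stack built on the proposed builtins.
const Stack = struct {
    buf: *[_]u8,
    len: int,

    fn init() Stack {
        return .{ .buf = @alloc(u8, 0), .len = 0 };
    }

    fn push(comptime self: *Stack, byte: u8) void {
        if (self.len == self.buf.len) @resize(self.buf, 2 * self.buf.len + 1);
        self.buf[self.len] = byte;
        self.len += 1;
    }

    fn deinit(comptime self: *Stack) void {
        @resize(self.buf, 0); // resizing to 0 frees the allocation
    }
};

comptime {
    var s = Stack.init();
    s.push(42);
    s.deinit();
}
```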
With these changes, we'll be able to use the same techniques at comptime that we use at runtime in more places, and the resulting application code will be more efficient and easier to manage. Furthermore, the user will have a much easier time reasoning about the semantics of comptime code.
Addendum: The Branch Quota
Currently, there is a single global branch quota that cannot be read. This has two problems.
Firstly, an irresponsible author of a long-running calculation may, instead of reasoning about its behaviour, choose to simply increment the quota in a loop, bypassing the purpose of a quota altogether:
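(Sketch -- `notDone` stands in for some real termination condition.)

```zig
comptime {
    // The quota is raised from inside the very loop it is meant to bound,
    // so it can never actually be exhausted.
    var quota: u32 = 1000;
    var i: usize = 0;
    while (notDone(i)) : (i += 1) {
        quota += 1000;
        @setEvalBranchQuota(quota);
    }
}
```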
Secondly, a responsible author who wishes to increase the quota must take a blind guess as to how many branches have already been taken:
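(Sketch -- `expensiveComptimeWork` is a stand-in, and `5000` is the blind guess.)

```zig
comptime {
    // The author wants 1000 more branches, but cannot read the current
    // count, so must over-estimate what has already been consumed.
    @setEvalBranchQuota(5000 + 1000);
    expensiveComptimeWork();
}
```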
Allowing the quota to be read solves the second problem but worsens the first; restricting `@setEvalBranchQuota` to global scope solves the first but worsens the second.

I propose that we abolish `@setEvalBranchQuota` entirely, in favour of a new builtin: `@addBranchQuota(comptime q: int) void`. This may only be called within linear scope. It imbues the scope with a branch quota of its own, initialised to `q`, counted separately from the global quota, and only valid within that scope. A branch first draws from its own scope's quota; when that is exhausted (or if it does not exist), it draws from the enclosing linear scope's quota if applicable, then from the calling function's quota, and so on up the call stack until it reaches the global quota. A loop repeat draws from the scope surrounding the loop, not from the loop itself, so a loop cannot increment its own quota to infinity; branch debt flows up the dynamic call graph, so the compiler can generate an exact trace of any overdraw. For reproducibility (specifically, order-independence), memoised functions also cache their branch debt, and this is counted for each invocation.

The following actions draw from the quota:
- a `while` loop repeat
- a recursive function call or `inline` expansion

These are the only actions which could potentially loop infinitely -- note especially that `for` loops are not included, as they iterate up to a fixed number of times. A function or inline-expansion recursion draws from the scope containing the earliest invocation -- otherwise a function could call itself and increment its own quota (other actions still draw from the innermost local quota). If the compiler decides to pre-compute values not explicitly marked as `comptime`, or to inline non-`inline` functions or loops, the mechanism for bounding these actions is internal to the compiler and separate from the local quota -- it would be unfair and unpredictable to do otherwise.
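Usage might look like this (sketch of the proposed builtin):

```zig
comptime {
    // This block gets a local quota of 5000 branches; the while repeats
    // draw from it first, then from enclosing scopes, and finally from
    // the global quota.
    @addBranchQuota(5000);
    var i: usize = 0;
    while (i < 4000) : (i += 1) {
        // ... long-running comptime work ...
    }
}
```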
Increasing the global quota cannot be done from within the program -- it requires a command line option, say `-Dbranch-quota=2000` or something similar, just like the acquire quota. (The added complexity of a local quota system does not make sense for allocations -- the concept of memory footprint is an implementation detail at comptime, and the compiler can benefit from freedom of allocation representation to save space, so debiting a specific amount of memory to each scope would be impractical and inefficient.)