mirpasses: make call-argument fixup a MIR pass #818

zerbina · 2023-07-29T19:02:42Z

Summary

Replace the PNode-based fixup for call arguments in ccgcalls with a
MIR pass. This removes a dependency on PNode-based analysis from
cgen and makes it possible to later enable the fixup for the JS and
VM backends, both of which are also affected by the issue.

The analysis stays mostly the same, but an observable evaluation-order
violation is fixed. When a value is passed to both an immutable and a
mutable parameter, the decision to inject a (shallow) copy now takes all
parameters into account, instead of only the parameters to the right of
the immutable one. For example, given:

f(a, a) # proc f(x: var T, y: T)

this fixes mutations through x inside f being visible on y.
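
To make the effect concrete, the following minimal sketch applies the same kind of shallow copy by hand. The type `T`, its field `val`, and the body of `f` are made up for illustration; they are not taken from the test added by this PR.

```nim
type T = object
  val: int

proc f(x: var T, y: T): int =
  x.val = 10      # mutate the argument bound to the `var` parameter
  result = y.val  # read the argument bound to the immutable parameter

var a = T(val: 1)
let tmp = a            # hand-written stand-in for the injected temporary
let seen = f(a, tmp)   # `y` is bound to the copy, not to `a` itself
doAssert seen == 1     # the write through `x` is not visible on `y`
doAssert a.val == 10   # the write still reached `a`
```

With the pass in place, writing `f(a, a)` directly is expected to behave like the call above, since the compiler is the one introducing the temporary.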

Details

The analysis used by the MIR pass works mostly the same as the one
previously used in ccgcalls: for each argument value that is not
explicitly passed by-reference, it is analyzed whether:

- the value is potentially mutated after it is bound to the parameter
  but before the procedure is called
- the value is potentially also passed to a var parameter

If either is the case, the argument is shallow-copied to a temporary
that is then passed to the parameter instead.

The differences compared to the previous analysis are that:

- checking whether the argument value (or something that potentially
  overlaps with it in memory) is also passed to a mutable parameter
  now also considers preceding parameters. Previously, only the
  following parameters were checked
- testing for overlapping values considers the whole path now, instead
  of only the root. In effect, this means that for f(a.x, a.y), where
  the second parameter is mutable, no temporary is (unnecessarily)
  injected for the first parameter
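
The first difference can be sketched as follows. This is a simplified model and not the code from the PR: the `Arg` type and both procedures are hypothetical, and arguments are compared by root name only, which ignores the path refinement described in the second point.

```nim
type
  Arg = object
    root: string  # name of the root location, e.g. "a" (assumption)
    isVar: bool   # whether the argument is bound to a `var` parameter

proc needsCopyOld(args: seq[Arg], i: int): bool =
  ## Models the previous behaviour: only following parameters are checked.
  if args[i].isVar:
    return false
  for j in (i + 1) ..< args.len:
    if args[j].isVar and args[j].root == args[i].root:
      return true

proc needsCopyNew(args: seq[Arg], i: int): bool =
  ## Models the new behaviour: all other parameters are checked.
  if args[i].isVar:
    return false
  for j in 0 ..< args.len:
    if j != i and args[j].isVar and args[j].root == args[i].root:
      return true

# f(a, a) with `proc f(x: var T, y: T)`
let call = @[Arg(root: "a", isVar: true), Arg(root: "a", isVar: false)]
doAssert not needsCopyOld(call, 1)  # the `var` parameter to the left is missed
doAssert needsCopyNew(call, 1)      # now detected, so `y` gets a temporary
```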

For overlap testing, the maybeSameMutableLocation procedure is
introduced, which mirrors the behaviour of dfa.aliases (the routine
used by the previous PNode-based analysis). Since dereferences of
pointer-like values are treated like a normal field access, calls like
f(a[].x, b[].y), where one of the parameters is mutable and a and
b point to the same location, still cause observable evaluation-order
violations.
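
A rough model of path-based overlap testing is shown below. It is only a sketch under stated assumptions: the `Path` type and `mayOverlap` are hypothetical names, paths are reduced to a root plus a list of projections, and the real maybeSameMutableLocation works on MIR nodes rather than strings.

```nim
type
  Path = object
    root: string         # name of the root location, e.g. "a" (assumption)
    fields: seq[string]  # projections applied to the root, e.g. @["x"]

proc mayOverlap(a, b: Path): bool =
  ## Two paths may overlap if they share a root and one path is a prefix
  ## of the other.
  if a.root != b.root:
    return false
  for i in 0 ..< min(a.fields.len, b.fields.len):
    if a.fields[i] != b.fields[i]:
      return false
  result = true

# f(a.x, a.y): same root, different fields, so no temporary is needed
doAssert not mayOverlap(Path(root: "a", fields: @["x"]),
                        Path(root: "a", fields: @["y"]))
# f(a, a.x): `a` contains `a.x`
doAssert mayOverlap(Path(root: "a"), Path(root: "a", fields: @["x"]))
# a dereference is modelled as just another projection, so two different
# roots are reported as non-overlapping even if both point to one location
doAssert not mayOverlap(Path(root: "a", fields: @["[]", "x"]),
                        Path(root: "b", fields: @["[]", "x"]))
```

The last assertion mirrors the limitation described above: in this model nothing tracks where `a` and `b` point, so aliasing through pointer-like values goes undetected.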

Finally, the analysis and temporary injection in ccgcalls are removed,
and a test is added for the fixed issue.


zerbina · 2023-07-29T19:14:08Z

compiler/mir/mirpasses.nim

+  ## specified range, rather it means that the value *could* be mutated.
+  var i = start
+  while i <= last:
+    # all ``mnkTag`` nodes currently imply some sort of mutation/change


I'm increasingly starting to think that generalizing the original mnkModify into mnkTag was a mistake. As the comment mentions, all current value tags represent some form of mutation, but the name mnkTag doesn't really convey that.

I'll think about it some more, but my current opinion is that mnkTag should become mnkMutation (or similar).

Unless the intention is to generalize across other effects, I agree; I think a name indicating mutation/memory makes sense.

saem

very cool

saem · 2023-07-29T21:46:38Z

compiler/mir/mirpasses.nim

+  ## Do note that due to the placement of this pass (it happens after the
+  ## ``injectdestructors`` pass), only *shallow*, non-owning copies of the
+  ## affected arguments are made, meaning that there's the issue of resource-
+  ## like values (refs, seqs, strings, everything else that has a destructor)


I'm increasingly convinced that destructors, as realized, are a misfeature, or one that should be deemphasized. Instead, we should trigger ops on the containing type, which would be better for data-oriented programming and allow for better composition. The op interface might be: =receiveCopy, =takeOwnership, =endOfLife, etc.

What does "instead we should trigger ops on the containing type" mean concretely, since the current hooks are attached to the respective type?

type seq[T] = object # for discussion assume this definition
  data: uncheckedArray[T]
  len: int

proc `=takeOwnership`[T](s: seq[T], t: T) =
  # ... whatever default impl we want
  # `seq` is a container and its 'containers ops' are triggered based on a `T`'s lifetime (like current type ops), but call `seq[T]`'s 'container ops'
  # this also opens the door on specialization based on the contained type and overriding might be feasible

proc `=takeOwnership`(s: seq[Uri], u: Uri) =
  # ... specialized impl

I'm not sure about the actual events/hooks we want, but what I'm somewhat certain about is type ops triggered and called on the same type is 'off'. To clarify, Uri could have container ops of its own under this scheme, but they'd apply to the pile of string fields that are in its type definition and not the Uri type itself.

I had separately started thinking about triggered ops on containers before, coming at it from a DOD perspective. That further corroborated that type ops on the actual type aren't as useful wrt correctness or saving work. Below is a snippet I was using to think about this.

Using the below example, if we specialized the Component types with type ops we'd end up with something "wrong". Those types would start having to know where and how they're stored, plus others couldn't reuse the components and associated systems easily (poor composition).

type
  Position* = object
    x*, y*: int64
  Movement* = object
    dx*, dy*, ddx*, ddy*: float32
  Component* = Position | Movement
  EntityData = object
  Store* = object
    allocator: Allocator
    entities*: seq[EntityData]
    positions*: seq[Position]
    movements*: seq[Movement]

proc `=alloc`[T=Component or EntityData](s: Store, t: typedesc[T])
proc `=dealloc`[T=Component or EntityData](s: Store, a: T)
proc `=transferInto`[T=Component or EntityData](s: Store, a: T)
proc `=copyInto`[T=Component or EntityData](s: Store, a: T)

The last bit I should add is that we effectively do this sort of logic for the stack, the push/pop of activation records, lifting of envs and associated allocs/deallocs, etc. All that is more call stack/container focused, that's what's being bumped, that's the ever present context. If we were actually modelling a call stack, we wouldn't want to put the type ops on each call frame type (that's a lot of object types 😆), we'd want activation record lifecycle events to trigger container ops on the call stack (container).

Anyhow, I hope that makes some sense.


saem · 2023-07-29T22:02:13Z

tests/ccgbugs/targ_lefttoright.nim

+
+  var i = C(i: 1)
+  when nimvm: # XXX: doesn't work yet
+    discard


this gave me an idea, maybe we should add a doAssertKnownIssue that'll work with inverted behaviour.

I like it, I've come across multiple cases where that would have been useful.
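
One possible shape for such a helper, purely as a sketch: nothing named doAssertKnownIssue exists as part of this PR, and the signature and message below are assumptions. The idea is that the check passes while the issue is still present and fails once the expected behaviour starts working, which signals that the test needs updating.

```nim
template doAssertKnownIssue(cond: untyped, msg = "") =
  # Inverted assertion: succeeds while `cond` is false (issue still present),
  # fails once `cond` holds (issue apparently fixed).
  doAssert(not (cond), "known issue seems to be fixed: " & msg)

# passes, because the (placeholder) expected behaviour does not hold yet
doAssertKnownIssue(1 + 1 == 3, "placeholder for a currently failing check")
```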

One bug was that the wrong nodes were compared (`^1` was used instead of `^i`), and while this doesn't necessarily lead to incorrect behaviour, it does lead to overlap being detected where there is none. The second was with the node position for array element comparisons being wrong.

compiler/mir/mirpasses.nim

Co-authored-by: Clyybber <darkmine956@gmail.com>

saem · 2023-07-31T01:16:41Z

/merge

github-actions · 2023-07-31T01:17:08Z

Merge requested by: @saem

Content after the first section break of the PR description has been removed and is preserved below:

Notes for Reviewers

another preparation for compiler: introduce an IR for the code generators #551

the new pass is not enabled for the JavaScript and VM backends yet

zerbina added the bug, compiler/backend, and simplification labels Jul 29, 2023

zerbina added this to the MIR phase milestone Jul 29, 2023

zerbina requested review from saem and Clyybber July 29, 2023 19:02


saem approved these changes Jul 29, 2023


zerbina added 2 commits July 30, 2023 14:41

mirpasses: slightly adjust doc comment wording

487d553

Clyybber approved these changes Jul 30, 2023


refactor: move loops into a common block

7a97752

Co-authored-by: Clyybber <darkmine956@gmail.com>

chore-runner bot added this pull request to the merge queue Jul 31, 2023

Merged via the queue into nim-works:devel with commit df0aaf3 Jul 31, 2023
18 checks passed

zerbina deleted the call-argument-fixup-as-mir-pass branch July 31, 2023 18:19
