Skip to content

Commit

Permalink
prevent RVO and in-place construction via an MIR pass (#815)
Browse files Browse the repository at this point in the history
## Summary

Add a MIR pass that injects write-to-temporaries where necessary, which,
if enabled, guarantees to the C code generator that it can always use
RVO and in-place aggregate construction. This removes the dependency on
`isPartOf` from `cgen`, makes the return-value and in-place-construction
optimization related logic backend-agnostic, and also adds a basic
framework for MIR passes.

In theory, the VM code-generator could also use the in-place aggregate
construction optimization now.

In addition, multiple bugs where the in-place construction optimization
was not correctly prevented when using the C backend are fixed:
```nim
  # in the following statements, the left-hand side is (in terms of
  # behaviour) now fully evaluated first (previously it wasn't)
  arr = [1, arr[0]]
  arr = [1, (let v = arr[0]; v)]
  arr = [1, call(arr[0])]
  obj = Obj(a: 1, b: (var v = obj.a; v))

  # the following doesn't crash with an NPE anymore:
  obj = Obj(prc: ...)
  obj = Obj(prc: nil, other: obj.prc())
```

## Details

In preparation for the introduction of more MIR passes, a dedicated
module with the very basic framework is added. The only public routine
is the `applyPasses` procedure, which runs all passes enabled for the
specified backend. Since the targeted backend can be different from the
one selected by the user (because of compile-time execution), a
dedicated enum is used.

Due to their similarity in processing, the temporary injection for calls
and aggregate constructions is combined into a single pass. The pass
works by searching for assignments where the source operand is the
result of a call or aggregate construction and then running an analysis
for whether the assignment destination is used in a way that prevents
RVO or in-place construction. If the destination does, the source r-
value is assigned to an intermediate temporary first.

In the abstract, the used analysis works similar to `isPartOf`, but it
is, apart from the type-analysis, implemented from scratch for the MIR.

### `transf`

For the purpose of making assignments like `x = (x.a, x.b)` work when
using the C backend, `transf` previously unpacked the literal tuple
construction expression and then repacked it, like so:
```nim
let
  a = x.a
  b = x.b
x = (a, b)
```
While this does what is intended, it introduced an left-to-right
evaluation-order violation when the `x` has side effects. The new MIR
pass taking care of introducing a temporary allows for the removal of
``transformAsgn``, fixing the aforementioned issue and slightly reducing
the amount of work for the `injectdestructors` pass (because there are
less locals to analyze).

The removal does have an unintended side-effect: r-value inputs to
literal tuple construction expressions are not properly destroyed when
an exception is raised (see the changed `ttuple_unpacking.nim` test).
This is a known and documented issue for array and object constructions,
which now also affects tuple constructions.

### `cgen`

Calls that are the source operand to an assignment and return the result
via an out parameter always use the assignment destination as their
`Result` argument now; the `preventNrvo` procedure is renamed to
`reportObservableStore` and the related injection of temporaries
removed.

In `genObjConstr`, a temporary now only has to be created for `ref`s and
when the destination is an unset location.

For literal `seq` construction expressions (e.g., `@[1, 2]`), in-place
construction cannot be used anymore, since `isPartOf` usage is phased
out and the new MIR pass doesn't special-case `mArrToSeq`. However, with
the removal of the legacy GCs, the in-place `seq` construction had only
very little benefit, as using a temporary there only implies an
additional `(int, pointer)` local plus assignment thereof.

### Misc

Two internal bugs in the `mirtree` module are fixed:
- `operand` incorrectly ignored `mnkConsume` nodes
- the `SingleInputNode` set was missing `mnkStdConv`, `mnkPathNamed`,
  `mnkPathPos`, and `mnkPathVariant`. The aforementioned nodes now work
  with `unaryOperand`, as they should
  • Loading branch information
zerbina authored Jul 28, 2023
1 parent 61414fa commit 26178e4
Show file tree
Hide file tree
Showing 14 changed files with 514 additions and 98 deletions.
10 changes: 10 additions & 0 deletions compiler/backend/backends.nim
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import
mirbridge,
mirconstr,
mirgen,
mirpasses,
mirtrees,
sourcemaps
],
Expand Down Expand Up @@ -356,6 +357,15 @@ proc process*(prc: var Procedure, graph: ModuleGraph, idgen: IdGenerator) =
# XXX: ^^ this is a hack. See the documentation of the routine for more
# details

let target =
case graph.config.backend
of backendC: targetC
of backendJs: targetJs
of backendNimVm: targetVm
of backendInvalid: unreachable()

applyPasses(prc.body.tree, prc.body.source, prc.sym, target)

proc process(body: var MirFragment, ctx: PSym, graph: ModuleGraph,
idgen: IdGenerator) =
## Applies all applicable MIR passes to the fragment `body`. `ctx`
Expand Down
52 changes: 21 additions & 31 deletions compiler/backend/ccgcalls.nim
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,10 @@ proc canRaiseDisp(p: BProc; n: PNode): bool =
# we have to be *very* conservative:
result = canRaiseConservative(n)

proc preventNrvo(p: BProc; le, ri: PNode): bool =
proc reportObservableStore(p: BProc; le, ri: PNode) =
## Reports the ``rsemObservableStores`` hint when the called procedure can
## exit with an exception and `le` is something to which an assignment is
## observable in the exception-raised case.
proc locationEscapes(p: BProc; le: PNode; inTryStmt: bool): bool =
var n = le
while true:
Expand All @@ -48,15 +51,11 @@ proc preventNrvo(p: BProc; le, ri: PNode): bool =
# cannot analyse the location; assume the worst
return true

if le != nil:
for i in 1..<ri.len:
let r = ri[i]
if isPartOf(le, r) != arNo: return true
# we use the weaker 'canRaise' here in order to prevent too many
# annoying warnings, see #14514
if canRaise(ri[0]) and
locationEscapes(p, le, p.nestedTryStmts.len > 0):
localReport(p.config, le, reportSem rsemObservableStores)
# we use the weaker 'canRaise' here in order to prevent too many
# annoying warnings, see #14514
if le != nil and canRaise(ri[0]) and
locationEscapes(p, le, p.nestedTryStmts.len > 0):
localReport(p.config, le, reportSem rsemObservableStores)

proc hasNoInit(call: PNode): bool {.inline.} =
result = call[0].kind == nkSym and sfNoInit in call[0].sym.flags
Expand Down Expand Up @@ -96,9 +95,12 @@ proc fixupCall(p: BProc, le, ri: PNode, d: var TLoc,
if typ[0] != nil:
if isInvalidReturnType(p.config, typ[0]):
if params != "": pl.add(~", ")
# beware of 'result = p(result)'. We may need to allocate a temporary:
if d.k in {locTemp, locNone} or not preventNrvo(p, le, ri):
# Great, we can use 'd':
# the destination is guaranteed to be either a temporary or an lvalue
# that can be modified in-place
if true:
if d.k notin {locTemp, locNone}:
reportObservableStore(p, le, ri)

if d.k == locNone: getTemp(p, typ[0], d, needsInit=true)
elif d.k notin {locTemp} and not hasNoInit(ri):
# reset before pass as 'result' var:
Expand All @@ -107,14 +109,6 @@ proc fixupCall(p: BProc, le, ri: PNode, d: var TLoc,
pl.add(~");$n")
line(p, cpsStmts, pl)
exitCall(p, ri[0], canRaise)
else:
var tmp: TLoc
getTemp(p, typ[0], tmp, needsInit=true)
pl.add(addrLoc(p.config, tmp))
pl.add(~");$n")
line(p, cpsStmts, pl)
exitCall(p, ri[0], canRaise)
genAssignment(p, d, tmp)
else:
pl.add(~")")
if isHarmlessStore(p, canRaise, d):
Expand Down Expand Up @@ -419,9 +413,12 @@ proc genClosureCall(p: BProc, le, ri: PNode, d: var TLoc) =
if typ[0] != nil:
if isInvalidReturnType(p.config, typ[0]):
if ri.len > 1: pl.add(~", ")
# beware of 'result = p(result)'. We may need to allocate a temporary:
if d.k in {locTemp, locNone} or not preventNrvo(p, le, ri):
# Great, we can use 'd':
# the destination is guaranteed to be either a temporary or an lvalue
# that can be modified in-place
if true:
if d.k notin {locTemp, locNone}:
reportObservableStore(p, le, ri)

if d.k == locNone:
getTemp(p, typ[0], d, needsInit=true)
elif d.k notin {locTemp} and not hasNoInit(ri):
Expand All @@ -430,13 +427,6 @@ proc genClosureCall(p: BProc, le, ri: PNode, d: var TLoc) =
pl.add(addrLoc(p.config, d))
genCallPattern()
exitCall(p, ri[0], canRaise)
else:
var tmp: TLoc
getTemp(p, typ[0], tmp, needsInit=true)
pl.add(addrLoc(p.config, tmp))
genCallPattern()
exitCall(p, ri[0], canRaise)
genAssignment(p, d, tmp)
elif isHarmlessStore(p, canRaise, d):
if d.k == locNone: getTemp(p, typ[0], d)
assert(d.t != nil) # generate an assignment to d:
Expand Down
36 changes: 13 additions & 23 deletions compiler/backend/ccgexprs.nim
Original file line number Diff line number Diff line change
Expand Up @@ -521,7 +521,7 @@ proc genDeref(p: BProc, e: PNode, d: var TLoc) =
d.storage = OnUnknown # BUGFIX!
else:
internalError(p.config, e.info, "genDeref " & $typ.kind)

if mt == ctPtrToArray and lfEnforceDeref in d.flags:
# we lie about the type for better C interop: 'ptr array[3,T]' is
# translated to 'ptr T', but for deref'ing this produces wrong code.
Expand Down Expand Up @@ -1134,11 +1134,10 @@ proc genObjConstr(p: BProc, e: PNode, d: var TLoc) =
var t = e.typ.skipTypes(abstractInst)
let isRef = t.kind == tyRef

# check if we need to construct the object in a temporary
var useTemp =
isRef or
(d.k notin {locTemp,locLocalVar,locGlobalVar,locParam}) or
(isPartOf(d.lode, e) != arNo)
# a temporary was injected if in-place construction cannot be used,
# meaning that we can always construct in-place here (we still have
# to consider uninitialized and expression locs)
let useTemp = isRef or d.k == locNone

# if the object has a record-case, don't initialize type fields before but
# after initializing discriminators. Otherwise, the type fields in the
Expand Down Expand Up @@ -1208,37 +1207,28 @@ proc genObjConstr(p: BProc, e: PNode, d: var TLoc) =

specializeInitObject(p, r, t, e.info)

proc lhsDoesAlias(a, b: PNode): bool =
for y in b:
if isPartOf(a, y) != arNo: return true

proc genSeqConstr(p: BProc, n: PNode, d: var TLoc) =
var arr, tmp: TLoc
# bug #668
let doesAlias = lhsDoesAlias(d.lode, n)
let dest = if doesAlias: addr(tmp) else: addr(d)
if doesAlias:
getTemp(p, n.typ, tmp)
elif d.k == locNone:
getTemp(p, n.typ, d)
getTemp(p, n.typ, tmp)

let l = intLiteral(n.len)
block:
let seqtype = n.typ
linefmt(p, cpsStmts, "$1.len = $2; $1.p = ($4*) #newSeqPayload($2, sizeof($3), NIM_ALIGNOF($3));$n",
[rdLoc dest[], l, getTypeDesc(p.module, seqtype.lastSon),
[rdLoc tmp, l, getTypeDesc(p.module, seqtype.lastSon),
getSeqPayloadType(p.module, seqtype)])

for i in 0..<n.len:
initLoc(arr, locExpr, n[i], OnHeap)
arr.r = ropecg(p.module, "$1$3[$2]", [rdLoc(dest[]), intLiteral(i), dataField(p)])
arr.r = ropecg(p.module, "$1$3[$2]", [rdLoc(tmp), intLiteral(i), dataField(p)])
arr.storage = OnHeap # we know that sequences are on the heap
expr(p, n[i], arr)
if doesAlias:
if d.k == locNone:
d = tmp
else:
genAssignment(p, d, tmp)

if d.k == locNone:
d = tmp
else:
genAssignment(p, d, tmp)

proc genArrToSeq(p: BProc, n: PNode, d: var TLoc) =
var elem, a, arr: TLoc
Expand Down
1 change: 1 addition & 0 deletions compiler/backend/ccgtypes.nim
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,7 @@ proc isInvalidReturnType(conf: ConfigRef; rettype: PType): bool =
# Arrays and sets cannot be returned by a C procedure, because C is
# such a poor programming language.
# We exclude records with refs too. This enhances efficiency.
# keep synchronized with ``mirpasses.eligibleForRvo``
if rettype == nil: result = true
else:
case mapType(conf, rettype, skResult)
Expand Down
1 change: 0 additions & 1 deletion compiler/backend/cgen.nim
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,6 @@ import
],
compiler/sem/[
rodutils,
aliases,
lowerings,
],
compiler/backend/[
Expand Down
Loading

0 comments on commit 26178e4

Please sign in to comment.