Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transf: lower all unlabeled break statements #918

Merged
merged 11 commits into from
Sep 26, 2023

Conversation

zerbina
Copy link
Collaborator

@zerbina zerbina commented Sep 21, 2023

Summary

Fix the cases where unlabeled break statements were inserted by
transf, which means that unlabeled break statements now no longer
exist after the main transf pass. This allows for:

  • removing unlabeled break support from the MIR and CgNode IR
  • simplifying code generation and MIR processing
  • simplifying the closureiters transformation

Replacing skLabel usage in the code generators also becomes easier.

Details

Since they're directly associated with a block, labeled break
statements are easier to handle for analysis and code generation. For
this reason, transf already transforms all unlabeled break
statements into labeled ones, by:

  • assigning labels to blocks that don't have a user-provided one
  • wrapping while and for statements with named blocks

However, there were two occurrences where unlabeled break statements
((nkBreakStmt (nkEmpty))) were injected by transf:

  • for the exit of for statements of closure iterators (liftForLoop)
  • by the closureiters transformation when turning while cond loops
    into while true loops (although not strictly part of transf)

As a consequence, unlabeled break reached all the way into the code
generators, requiring dedicated support by the MIR, MIR processing, and
code generation.

While mirgen could lower unlabeled break statements, the association
of a while to its surrounding injected block no longer exists, and
mirgen is generally not well suited for non-local (i.e., not applying
to a single node) transformations.

Instead, the occurrences where unlabeled break statements are inserted
are adjusted. For the for statement, the label of the wrapper-block is
passed to liftForLoop, where the injected nkBreakStmt can then use
it. However, for the other occurrence, a more complex solution is
needed.

Since the while lowering the closureiters pass conditionally applies
(which is where the unlabeled break is injected) is the same as the
one mirgen automatically applies when generating repeat statements,
the transformation from mirgen.genWhile is turned into a PNode-based
transformation that is applied in transf.transformWhile. Some extra
adjustments are required:

  • PNode doesn't have a dedicated "scope" construct like the MIR does.
    It is emulated via a nkBlockStmt
  • the closure-iterator transformation doesn't support yield statements
    within if/elif conditions, so directly placing the former while
    condition into an if condition would regress on yield support. Prior
    to placing the expression into the if condition, nkStmtListExprs
    are thus first unpacked into the loop body

Only while true loops exist after the main transf pass now.

  while cond:
    body
  # is transformed into:
  block label:
    while true:
      if not cond:
        break label
      block: # a separate scope is needed for the clean-up semantics
        body

closureiters simplifications

Multiple simplifications are possible now that neither unlabeled
breaks nor while loops with non-true-literal conditions exist
at the point when the closureiters pass is run.

  • remove transformBreaksAndContinuesInWhile, which handled unlabeled
    breaks and continue statements in while loops
  • remove the now-obsolete statement-list lowering logic for
    nkWhileStmts (lowerStmtListExprs)
  • remove the nkWhileStmt condition handling from the to-basic-block
    transformation (transformClosureIteratorBody)
  • remove the nkWhileStmt handling from preprocessing (preprocess)

MIR simplifications

  • remove the "unset" sentinel value support from LabelId; all breaks
    in the MIR are labeled now
  • remove everything related to unlabeled MIR breaks from MIR processing
    (mirexec and cgirgen)

Code generator simplifications

Each code generator required special handling for associating unlabeled
breaks with their targeted construct (while or block), with
cgen and jsgen being affected the most. All of it is removed.

As a side effect, the output of jsgen for loops slightly improves: no
extra Label: is emitted around every while.

`liftForLoop` was emitting an unlabeld `nkBreakStmt` for exiting the
loop. The procedure now gets passed the loop's enclosing block's label,
which allows it to emit a labeled `nkBreakStmt` instead.
Instead of happening in `mirgen`, all `while` loops are now transformed
into roughly:

```nim
while true:
  if not cond:
    break
  body
```

during `transf`. This means that the `closureiters` transformation
doesn't has to, which in turn removes the only source of `nkBreakStmt`
nodes with no label.
The `closureiters` transformation doesn't support `nkYieldStmt`s
appearing within `if`/`elif` statements/expressions, so just placing the
`nkWhileStmt`s condition expression into an `nkIfStmt` introduces a
regression.
With all 'break's coming out of `transf` having labels, the MIR doesn't
have to support unlabeled ones anymore. Most notably, this means that:
- `LabelId` doesn't need a sentinel value
- the control-flow-graph computation becomes simpler
With all 'break' statement coming out of the MIR phase using labels,
the CG IR doesn't have to support unlabeled ones. This simplifies all
three code generators, with the JS and C one being affected the most.

For the JS target, this also has the effect of less code being generated
for `while` statements.
The transformations no longer need to support:
- unlabeled breaks
- `while` statements with complex condition expressions

Since `nkContinueStmt`s and now unlabeled `nkBreakStmt`s no longer reach
`closureiters`, `transformBreaksAndContinuesInWhile` becomes obsolete
and is thus removed.
With unlabeled breaks removed from the MIR, the `while true: break` has
to be updated to use a labeled break.
@zerbina zerbina added compiler/backend Related to backend system of the compiler simplification Removal of the old, unused, unnecessary or un/under-specified language features. labels Sep 21, 2023
@zerbina zerbina added this to the MIR phase milestone Sep 21, 2023
Since all elements in the `blocks` list have a label now, the label can
be stored there directly, instead of going through the `PNode`
indirection.
@zerbina
Copy link
Collaborator Author

zerbina commented Sep 24, 2023

My plan is to eventually turn the closureiters pass into a MIR-based transformation, which, besides simplifying it and fixing the places where yield is not properly supported, would also allow for removing the statement-list-expression unpacking from transf again.

For the extraneous blocks, I have a follow-up planned where mirgen will omit the mnkBlock (but not the mnkScope) for nkBlockStmt/nkBlockExpr nodes where the label doesn't have the sfUsed flag included. This will elide all mnkBlocks for blocks that are not used for control-flow.

@zerbina zerbina marked this pull request as ready for review September 24, 2023 17:49
Copy link
Collaborator

@saem saem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The increased regularity of the code is really good, it's silly looking at how much code was just devoted to carrying around that ambiguity.

compiler/mir/mirgen.nim Outdated Show resolved Hide resolved
scope(c.stmts):
c.gen(n[1])

c.stmts.add endNode(mnkRepeat)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, due to the simpler generation the need for bookending with the endNode drops too, cool.

Co-authored-by: Saem Ghani <saemghani+github@gmail.com>
@saem
Copy link
Collaborator

saem commented Sep 26, 2023

/merge

@github-actions
Copy link

Merge requested by: @saem

Contents after the first section break of the PR description has been removed and preserved below:


To-Do

  • write a proper commit message

Notes for Reviewers

  • this is soft-blocked by sem: reject illegal break/continue statements #915. Without said PR, illegal break statements would now cause crashes where they previously lead to proper errors
  • in addition, this is a preparation for removing/replacing skLabel usage from the code generators

@chore-runner chore-runner bot added this pull request to the merge queue Sep 26, 2023
Merged via the queue into nim-works:devel with commit 11751e9 Sep 26, 2023
18 checks passed
@zerbina zerbina deleted the no-transformed-unlabeled-breaks branch September 29, 2023 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/backend Related to backend system of the compiler simplification Removal of the old, unused, unnecessary or un/under-specified language features.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants