Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Moore] Improve WaitEventOp, lower to LLHD #7518

Merged
merged 1 commit into from
Aug 16, 2024
Merged

Conversation

fabianschuiki
Copy link
Contributor

Rework the moore.wait_event op to be able to accurately model the semantics of SystemVerilog's @... event control statements. The op now has a body region which is executed to detect a relevant change in one or more interesting values. A new moore.detect_event op serves as the mechanism to encode whether a posedge, negedge, both, or any change at all on a value should be detected as an event.

Based on this new moore.wait_event op we can properly convert most of the event control statements in ImportVerilog to a corresponding MLIR op. Delay control like #1ns is not handled yet.

In the MooreToCore conversion this new op allows us to properly generate llhd.wait operations at the right place that suspend process execution until an interesting event has occurred. This now also allows us to support almost all SystemVerilog processes in the lowering. The only missing ones are always_comb and always_latch which require an implicit llhd.wait to be inserted. @maerhart has a version of that lowering almost ready though.

This commit also adds an llhd.final op in order to be able to lower final procedures.

Copy link
Member

@maerhart maerhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first part of my review, I'll finish it later 😅

@@ -51,6 +51,32 @@ def ProcessOp : LLHDOp<"process", [
let assemblyFormat = "attr-dict-with-keyword $body";
}

def FinalOp : LLHDOp<"final", [
NoRegionArguments,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it'd make sense to add the RecursiveMemoryEffects trait?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good idea!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ProcessOp probably also wants this, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point!

include/circt/Dialect/LLHD/IR/LLHDStructureOps.td Outdated Show resolved Hide resolved
include/circt/Dialect/LLHD/IR/LLHDStructureOps.td Outdated Show resolved Hide resolved
include/circt/Dialect/LLHD/IR/LLHDStructureOps.td Outdated Show resolved Hide resolved
include/circt/Dialect/Moore/MooreOps.td Outdated Show resolved Hide resolved
def EventOp : MooreOp<"wait_event", [
HasParent<"ProcedureOp">
def WaitEventOp : MooreOp<"wait_event", [
RecursiveMemoryEffects,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a HasParent<"ProcedureOp">?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because you can write something like this:

task foo(ref logic x);
  @(posedge x);
endtask

module Bar;
  logic y;
  always foo(y);
endmodule

which would probably lower to this:

func.func @foo(%x: !moore.ref<l1>) {
  moore.wait_event {
    %0 = moore.read %x : <l1>
    moore.detect_event posedge %0 : l1
  }
}
hw.module @Bar() {
  %y = moore.variable : <l1>
  moore.procedure always {
    func.call @foo(%y) : (!moore.ref<l1>) -> ()
  }
}

I'm not entirely sure how we'd deal with this… probably some lowering pass on functions to make sure we can suspend in the middle of a function call 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh forgot about tasks 😞
That's a good question indeed.

// TODO: Enable the following once it not longer interferes with @(...)
// event control checks. The introduced dummy variables make the event
// control observe a static local variable that never changes, instead of
// observing a module-wide signal.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this due to mem2reg and not having something like the sample op we talked about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly. The current implementation of SimplifyProcedures introduces shadow variables for any global variable that it accesses. Since we now have regular Control Flow ops and basic blocks, and mem2reg is becoming more solid, we'll have to revisit this and figure out what exactly we want to do in this pass. Some form of mapping assignments of global variables to process outputs would be great.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we originally planned to inline the body of always_comb into the top level/module(hw.module). Since we need to ensure the variable has a correct value(After running SimplifyProcedures, we will run mem2reg to eliminate the shadow variables). However, we would like to lower moore.procedure into llhd.process for the time being, I think maybe we needn't SimplifyProcedures, WDYT 😁?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that sounds fantastic! 😀 I'm pretty sure we can add SimplifyProcedures again in the future to optimize some procedures early on, but I totally agree with going to LLHD processes for now and doing optimizations there. 🥳

test/Conversion/MooreToCore/basic.mlir Show resolved Hide resolved
}];
let arguments = (ins
EdgeAttr:$edge,
UnpackedType:$input,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be restricted to a one-bit type or should we add a test for multi-bit operands?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can have multi-bit operands, and the edge detection truth table will apply per bit. If the result is non-zero you get an event. The current MooreToCore conversion might need a few extensions to work reliably with multi-bit operands. Maybe a follow-up PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erroring out in MooreToCore for now sounds good to me.

// the interesting signals.
rewriter.setInsertionPointToEnd(checkBlock);
if (!triggers.empty()) {
auto triggered = rewriter.createOrFold<comb::OrOp>(loc, triggers, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, no short-circuiting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is short-circuiting between the events in SystemVerilog. Having it be an expression may make lowering a bit easier later 😃

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good to know that there's no short-circuiting in SV. 👍

@cepheus69
Copy link

It is excited to see its taking shape 👍.

Comment on lines +188 to +217
for (auto returnOp :
llvm::make_early_inc_range(body.getOps<ReturnOp>())) {
rewriter.setInsertionPoint(returnOp);
rewriter.replaceOpWithNewOp<llhd::HaltOp>(returnOp);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, the return statement cannot be used in a(n) initial/final. Slang also detects it(return statement is only valid inside task and function blocks).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right! The Moore dialect has its own moore.return operation, which is used to terminated moore.procedure ops. We should probably give it a better name… A return in Verilog is mapped to a func.return in MLIR, which is different from the moore.return inserted implicitly as a terminator for moore.procedures 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I mistook body.getOps<ReturnOp> as the return of function. Thanks for correcting me!

@fabianschuiki fabianschuiki force-pushed the fschuiki/wait-event branch 2 times, most recently from 8ffa907 to 3c8775a Compare August 15, 2024 18:48
Copy link
Member

@maerhart maerhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the rest of my review. Thanks for all these great improvements! 🎉

SmallDenseSet<Value> alreadyObserved;
clonedOp.getBody().walk([&](Operation *operation) {
for (auto value : operation->getOperands()) {
if (value.getParentRegion() == &clonedOp.getBody())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are nested regions within the wait_event op region, we'd have to check if clonedOp.getBody() is an ancestor region of the value defining region.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! I've changed this and added a test for it.

auto triggered = rewriter.createOrFold<comb::OrOp>(loc, triggers, true);
rewriter.create<cf::CondBranchOp>(loc, triggered, resumeBlock, waitBlock);
} else {
rewriter.create<cf::BranchOp>(loc, waitBlock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the resume block would have no predecessors. Do you think it's save to leave this to the conversion framework? We don't have a test for this case to be sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I have added the pattern I've seen failing before to the test with an empty wait_event. It looks like the dialect conversion uses the dominance information collected before the conversion to iterate through ops -- looks like making a block unreachable doesn't exclude it from the conversion.

@maerhart
Copy link
Member

Resolves #7482

Rework the `moore.wait_event` op to be able to accurately model the
semantics of SystemVerilog's `@...` event control statements. The op now
has a body region which is executed to detect a relevant change in one
or more interesting values. A new `moore.detect_event` op serves as the
mechanism to encode whether a posedge, negedge, both, or any change at
all on a value should be detected as an event.

Based on this new `moore.wait_event` op we can properly convert most of
the event control statements in `ImportVerilog` to a corresponding MLIR
op. Delay control like `#1ns` is not handled yet.

In the MooreToCore conversion this new op allows us to properly generate
`llhd.wait` operations at the right place that suspend process execution
until an interesting event has occurred. This now also allows us to
support almost all SystemVerilog processes in the lowering. The only
missing ones are `always_comb` and `always_latch` which require an
implicit `llhd.wait` to be inserted. @maerhart has a version of that
lowering almost ready though.

This commit also adds an `llhd.final` op in order to be able to lower
`final` procedures.

Fixes #7482.

Co-authored-by: Martin Erhart <maerhart@outlook.com>
@fabianschuiki fabianschuiki merged commit d71e6e3 into main Aug 16, 2024
4 checks passed
@fabianschuiki fabianschuiki deleted the fschuiki/wait-event branch August 16, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants