[Moore] Clean up struct ops and add missing tests #7392

fabianschuiki · 2024-07-27T18:59:17Z

Rework the Moore dialect operations that manipulate struct values. These are intended to operate on StructType and UnpackedStructType values directly, but were defined in ODS as operating on references to structs. This was likely a hold-over from early development where we were still figuring out the distinction between ref types and value types in SV.

This commit adjusts the struct ops such that they actually operate on struct values instead of references to structs. It also moves more operand constraints into ODS and simplifies the op verifiers by factoring out some common code into helper functions.

Enhance the struct_inject canonicalizer to also consider struct_create operations as part of the inject chain. This allows an initial struct_create that is modified by a subsequent inject to be canonicalized into a new struct_create with updated values.
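
The effect of folding an inject chain into its initial struct_create can be sketched in plain C++. This is only a toy model under stated assumptions: `Value`, `Field`, `Inject` and `foldInjectChain` are hypothetical names that stand in for IR values and ops, and none of it is the actual CIRCT canonicalizer code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for IR values; none of these are CIRCT types.
using Value = std::string;
using Field = std::pair<std::string, Value>;  // (field name, SSA value name)

// One struct_inject in the chain: overwrite `field` with `value`.
struct Inject {
  std::string field;
  Value value;
};

// Fold a chain of injects into the field list of the initial struct_create.
// Injects are applied in order, so a later inject to the same field wins.
std::vector<Field> foldInjectChain(std::vector<Field> created,
                                   const std::vector<Inject> &chain) {
  for (const Inject &inject : chain) {
    bool found = false;
    for (Field &field : created) {
      if (field.first == inject.field) {
        field.second = inject.value;
        found = true;
        break;
      }
    }
    // In this model an unknown field name is a programming error.
    assert(found && "inject names a field the struct does not have");
    (void)found;
  }
  return created;
}
```

The returned field list corresponds to the operands of the single replacement struct_create; the intermediate injects are no longer needed.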

Add missing basic and error tests for the struct-related ops, and simplify the variable destructuring test by removing unrelated operations.

Also fixes an issue in variable op destructuring where a variable with initial value would have its initial value discarded during destructuring. Initial values now prevent destructuring. Alternatively, we may choose to insert appropriate struct_extract ops to destructure the initial value in the future.
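
The conditions under which a struct variable is left alone can be summarized in a small predicate. This is a simplified sketch, not the dialect's implementation: `VariableInfo`, `UserKind` and `canDestructure` are hypothetical names, and the model assumes only the three blockers that come up in this PR (a parent SVModuleOp, an initial value, and users other than struct_extract_ref such as a whole-variable read).

```cpp
#include <vector>

// Hypothetical, simplified view of the ops that use a struct variable.
enum class UserKind { StructExtractRef, Read, Other };

struct VariableInfo {
  bool parentIsModule;          // declared directly inside an SVModuleOp
  bool hasInitialValue;         // the variable op carries an initial value
  std::vector<UserKind> users;  // ops that use the variable's reference
};

// True if, in this model, SROA may split the variable into one variable
// per field.
bool canDestructure(const VariableInfo &var) {
  if (var.parentIsModule) return false;   // module-level variables are skipped
  if (var.hasInitialValue) return false;  // initial value would be discarded
  for (UserKind user : var.users)
    if (user != UserKind::StructExtractRef) return false;  // e.g. a full read
  return true;
}
```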

mingzheTerapines · 2024-07-29T01:38:58Z

Awesome, I learned a lot from this code.

mingzheTerapines · 2024-07-29T01:47:07Z

lib/Dialect/Moore/MooreOps.cpp

+      return {};
+    fields.push_back(NamedAttribute(member.name, field));
+  }
+  return DictionaryAttr::get(getContext(), fields);


Really elegant

mingzheTerapines · 2024-07-29T02:48:01Z

@fabianschuiki So could StructExtractOp be handled like this? It has the Pure trait, though, so it cannot trigger SROA.

func.func @SplitStructs(%arg0: !moore.i42, %arg1: !moore.i1337) {
  // Named variables

  // CHECK-DAG: %x.a = moore.variable : <i42>
  // CHECK-DAG: %x.b = moore.variable : <i1337>
  // CHECK-NOT: moore.struct_extract
  %x = moore.variable : <struct<{a: i42, b: i1337}>>
  %0 = moore.read %x : <struct<{a: i42, b: i1337}>>
  // CHECK-DAG: [[A:%.+]] = moore.read %x.a : <i42>
  // CHECK-DAG: [[B:%.+]] = moore.read %x.b : <i1337>
  %1 = moore.struct_extract %0, "a" : struct<{a: i42, b: i1337}> -> i42
  %2 = moore.struct_extract %0, "b" : struct<{a: i42, b: i1337}> -> i1337
  // CHECK: moore.blocking_assign %arg0, [[A]]
  // CHECK: moore.blocking_assign %arg1, [[B]]
  moore.blocking_assign %arg0, %1 : !moore.i42
  moore.blocking_assign %arg1, %2 : !moore.i1337
  return
}

hailongSun2000 · 2024-07-29T03:35:35Z

lib/Dialect/Moore/MooreOps.cpp

  if (isa<SVModuleOp>(getOperation()->getParentOp()))
    return {};


It looks like we will tag global variables with struct type as destructurable. For example, struct values exist in func.func.

fabianschuiki · 2024-07-29

I'm wondering if we may want to also allow for variables in SVModuleOp to be destructured by SROA. Would that be a problem?

mingzheTerapines · 2024-07-30

If we destructure in the Moore dialect, the mem2reg pass may eliminate part of a struct variable; for example, %x.a will be eliminated if it is unused. Some struct variables will then be incomplete, which means struct information is lost in this IR. Some optimizations for nested types may be disabled in lower-level IR.
What's more, most operations on struct variables will be transformed into operations on non-struct variables, unless the struct is used as a whole rather than field by field. But the hw dialect still has many struct ops, so we do not need to destructure them in such a high-level IR.
I was concerned about this, so I disabled SROA for global variables.

hailongSun2000 · 2024-07-30

There seems to be no problem. As you mentioned below, when moore.read exists, SROA has nothing to do, and vice versa. So I think maybe we can allow variables in SVModuleOp to be destructured by SROA. Variables with struct type that SROA does not expand can be turned into moore.struct_create (I noticed the canonicalize method can do this), and moore.struct_create can then be lowered directly into hw.struct_create. What do you think?

fabianschuiki · 2024-07-30

Yeah that sounds great. We can't directly go from moore.variable to moore.struct_create, because the variable has ref<struct> type but moore.struct_create creates a struct directly. But in case Mem2Reg replaces the variable with SSA values, we can definitely materialize moore.struct_create. That would be especially nice also with moore.assigned_variable 😃

hailongSun2000 · 2024-07-29T03:37:43Z

test/Dialect/Moore/sroa.mlir

+  // CHECK-DAG: %x.a = moore.variable : <i42>
+  // CHECK-DAG: %x.b = moore.variable : <i1337>
+  // CHECK-NOT: moore.struct_extract_ref
+  %x = moore.variable : <struct<{a: i42, b: i1337}>>
+  %0 = moore.struct_extract_ref %x, "a" : <struct<{a: i42, b: i1337}>> -> <i42>
+  %1 = moore.struct_extract_ref %x, "b" : <struct<{a: i42, b: i1337}>> -> <i1337>
+  // CHECK: moore.blocking_assign %x.a, %arg0
+  // CHECK: moore.blocking_assign %x.b, %arg1
+  moore.blocking_assign %0, %arg0 : !moore.i42
+  moore.blocking_assign %1, %arg1 : !moore.i1337


I'm curious how to handle a moore.extract like this 🤔

%0 = moore.read %x
%1 = moore.extract %0, "a"

When we run --sroa, %x will be erased.

fabianschuiki · 2024-07-29

Having a read %x blocks SROA for %x. For example, the following code without read causes SROA to expand the variable %x:

func.func @Case1() {
  %x = moore.variable : <struct<{a: i42, b: i1337}>>
  moore.struct_extract_ref %x, "a" : <struct<{a: i42, b: i1337}>> -> <i42>
  moore.struct_extract_ref %x, "b" : <struct<{a: i42, b: i1337}>> -> <i1337>
  return
}
// after circt-opt --sroa
func.func @Case1() {
  %x.b = moore.variable : <i1337>
  %x.a = moore.variable : <i42>
  return
}

But if I add a read %x, SROA does not expand the variable:

func.func @Case2() {
  %x = moore.variable : <struct<{a: i42, b: i1337}>>
  moore.struct_extract_ref %x, "a" : <struct<{a: i42, b: i1337}>> -> <i42>
  moore.struct_extract_ref %x, "b" : <struct<{a: i42, b: i1337}>> -> <i1337>
  moore.read %x : <struct<{a: i42, b: i1337}>>
  return
}
// after circt-opt --sroa
func.func @Case2() {
  %x = moore.variable : <struct<{a: i42, b: i1337}>>
  moore.struct_extract_ref %x, "a" : <struct<{a: i42, b: i1337}>> -> <i42>
  moore.struct_extract_ref %x, "b" : <struct<{a: i42, b: i1337}>> -> <i1337>
  moore.read %x : <struct<{a: i42, b: i1337}>>
  return
}

So it looks like SROA is doing the correct thing in case there is a read which it cannot properly expand.

mingzheTerapines · 2024-07-30

Yes, SROA will be enabled only if the destructurable variable is unused after rewiring.

hailongSun2000 · 2024-07-30

In this case, if we have a local variable with struct type declared in the procedure body, we cannot eliminate it using mem2reg. But that's not a problem, since we have decided to lower moore.procedure to llhd.process 😃.

fabianschuiki · 2024-07-30

Sounds good! Maybe we'll find more optimizations later on to help eliminate more of these variables -- in case that's even needed. But as long as we can lower it to LLHD, things should work 🙂

mingzheTerapines · 2024-07-31

@fabianschuiki In what order should SROA and Canonicalize run?
If PR #7341 is applied, no VariableOp of struct type will remain after the canonicalize pass.

fabianschuiki · 2024-08-01

I don't think we can canonicalize variables away everywhere. There are a lot of use cases that involve multiple assignments to variables, potentially from different processes. So I'm pretty sure that canonicalization will not really remove any variables. Something like a mem2reg conversion is fairly involved and requires quite a bit of analysis. I'm pretty sure that can't be done in a canonicalization pattern alone, but requires a dedicated mem2reg pass. Does the one that MLIR already has work for this?

mingzheTerapines · 2024-07-29T03:45:15Z

lib/Dialect/Moore/MooreOps.cpp

+  if (auto index = getStructFieldIndex(type, name))
+    return getStructMembers(type)[*index].type;
+  return {};
+}



Wonderful interface for struct-like types. Would it be better to turn these into an interface for StructLikeType, which would mean moving them to MooreTypes.cpp?

fabianschuiki · 2024-07-29

I'm currently experimenting with creating a common base class for StructType and UnpackedStructType, such that you can do a dyn_cast<AnyStructType>(type) to go from StructType and UnpackedStructType to a common AnyStructType base class. Hopefully I can move this code into that base class in MooreTypes.cpp as you suggest 😃
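
The lookup pattern in the quoted snippet (find the field's index, then return its type, or an empty result if the field does not exist) can be sketched without any MLIR dependencies. This is a minimal illustration only: `Member`, `getFieldIndex` and `getFieldType` are hypothetical stand-ins with types modelled as strings, not the helpers added in this PR.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a struct member; `type` is just a string here.
struct Member {
  std::string name;
  std::string type;
};

// Position of the field called `name`, or nullopt if there is none.
std::optional<unsigned> getFieldIndex(const std::vector<Member> &members,
                                      const std::string &name) {
  for (unsigned i = 0; i < members.size(); ++i)
    if (members[i].name == name) return i;
  return std::nullopt;
}

// Type of the field called `name`, or an empty string if there is none,
// mirroring the `return {}` fallback in the quoted snippet.
std::string getFieldType(const std::vector<Member> &members,
                         const std::string &name) {
  if (auto index = getFieldIndex(members, name))
    return members[*index].type;
  return {};
}
```

Because both helpers take only the member list, the same code would serve packed and unpacked structs alike, which is the motivation for a shared base class or interface.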

fabianschuiki · 2024-07-29T21:25:25Z

@mingzheTerapines in your example

%x = moore.variable : <struct<{a: i42, b: i1337}>>
%0 = moore.read %x : <struct<{a: i42, b: i1337}>>
%1 = moore.struct_extract %0, "a" : struct<{a: i42, b: i1337}> -> i42
%2 = moore.struct_extract %0, "b" : struct<{a: i42, b: i1337}> -> i1337

the SROA pass would not expand %x because there is a read operation that reads the entire variable %x. However, we could add a canonicalization that tries to pull struct_extracts up and in front of read operations. In your example that would result in:

%x = moore.variable : <struct<{a: i42, b: i1337}>>
%0 = moore.struct_extract_ref %x, "a" : <struct<{a: i42, b: i1337}>> -> <i42>
%1 = moore.struct_extract_ref %x, "b" : <struct<{a: i42, b: i1337}>> -> <i1337>
%2 = moore.read %0 : <i42>
%3 = moore.read %1 : <i1337>

In this form, SROA could operate again. This is a difficult canonicalization to get right though: splitting a single read of %x up into separate reads of the fields means that we now have to be careful that there are no writes to %x in between the individual reads. The reads have to be inserted at the exact same location as the old read, which means that the struct_extracts have to be pulled up in front of the reads and converted into a struct_extract_ref.
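
The hazard described here can be shown with a toy C++ model in which a plain struct plays the role of the variable %x. The names `Pair`, `readThenExtract` and `splitReadsAcrossWrite` are made up for this sketch; it only illustrates why the per-field reads must sit exactly where the original read was.

```cpp
// Toy model of a two-field struct variable.
struct Pair {
  int a;
  int b;
};

// Original form: read the whole struct once, then extract fields from the
// snapshot. A later write to the variable cannot affect the result.
Pair readThenExtract(Pair &x, Pair written) {
  Pair snapshot = x;  // read of the entire variable
  x = written;        // intervening write to the variable
  return {snapshot.a, snapshot.b};  // field extracts on the snapshot
}

// Incorrect split: field `a` is read where the old read was, but the read
// of field `b` ends up after the write and observes the new value.
Pair splitReadsAcrossWrite(Pair &x, Pair written) {
  int a = x.a;  // per-field read at the original location
  x = written;  // intervening write to the variable
  int b = x.b;  // per-field read placed too late
  return {a, b};
}
```

With the variable initially holding {1, 2} and a write of {10, 20}, the first function yields {1, 2} while the second yields the torn value {1, 20}.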

I don't know how LLVM handles this. Splitting a struct read up into multiple field reads doesn't feel like it's always the right thing to do: if the struct has 20 small i1 fields, doing a single read of the 20-bit struct is going to be much more efficient than 20 individual i1 reads. I'm wondering if optimizations like these use a cost model in LLVM, or if there is a heuristic that is used to pick whether field reads or struct reads should be preferred. (For example, if >50% of a struct's bits are read it might be more efficient to read the entire struct once and then use struct_extract, instead of reading every individual field through struct_extract_ref.)
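
The ">50% of bits" rule of thumb could look roughly like the following. This is purely a sketch of the example heuristic from the comment above: `preferWholeStructRead` is a hypothetical function, and the threshold is not something CIRCT or LLVM is known to implement.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical heuristic: prefer one read of the whole struct when strictly
// more than half of its bits are read through individual fields.
// `fieldWidths[i]` is the bit width of field i and `fieldIsRead[i]` says
// whether field i is read; both vectors must have the same length.
bool preferWholeStructRead(const std::vector<unsigned> &fieldWidths,
                           const std::vector<bool> &fieldIsRead) {
  unsigned long long totalBits = 0;
  unsigned long long readBits = 0;
  for (std::size_t i = 0; i < fieldWidths.size(); ++i) {
    totalBits += fieldWidths[i];
    if (fieldIsRead[i]) readBits += fieldWidths[i];
  }
  return 2 * readBits > totalBits;
}
```

For the struct used in this thread, reading only the 42-bit field `a` would favor a field read, while reading only the 1337-bit field `b` would already favor reading the whole struct.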

mingzheTerapines · 2024-07-30T01:16:00Z

@fabianschuiki I partly agree with you.
Regarding "splitting a single read of %x up into separate reads of the fields means that we now have to be careful that there are no writes to %x in between the individual reads":
we don't need to split the old ReadOp. We can directly generate a moore.struct_extract_ref op and a new ReadOp in the canonicalization of StructExtractOp, which will use "getInputMutable" of the old ReadOp. Canonicalization will then remove ops that become unused, such as the old ReadOp.
Regarding "it might be more efficient to read the entire struct":
It depends.
In the parallel case, reading field a and writing field b could be done in parallel, which is more efficient, but

1. read the entire struct
2. read field a from 1
3. write field b to the struct

need to execute atomically, in program order.
It could be all right, however, if we
a. read the entire struct
b. read field a from a
c. write field b to a
d. write a back to the struct
That is, read the value, do all reading and writing on this value, and finally write the value back. This may occupy a register for a long time. I am not sure whether this counts as a kind of side effect.

fabianschuiki added the Moore label Jul 27, 2024

fabianschuiki requested review from uenoku, maerhart and hailongSun2000 July 27, 2024 18:59

mingzheTerapines reviewed Jul 29, 2024

hailongSun2000 reviewed Jul 29, 2024

mingzheTerapines reviewed Jul 29, 2024

fabianschuiki force-pushed the fschuiki/moore-struct-fixes branch from 619cd44 to 5a75bb6 Compare July 30, 2024 00:19

hailongSun2000 approved these changes Jul 30, 2024

fabianschuiki merged commit b7b82fd into main Jul 30, 2024
4 checks passed

fabianschuiki deleted the fschuiki/moore-struct-fixes branch July 30, 2024 19:26
