Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[major] Fine-grind parameters and capabilities for memories #96

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

CircuitCoder
Copy link

@CircuitCoder CircuitCoder commented Apr 17, 2023

This PR changes memory statements in FIRRTL to allow more fine-grind control over parameters on memories, allowing:

  • Per-port read / write latency
  • Write ports with no masks

In short, a memory in the syntax proposed in this PR would look like this:

mem mymem:
  data-type => {real:SInt<16>, imag:SInt<16>}
  depth => 256
  read-under-write => undefined
  port r1:
    read => yes
    write => no
    read-latency => 0
  port r2:
    read => yes
    write => no
    read-latency => 0
  port w:
    read => no
    write => with-mask
    write-latency => 1

Note that this will be a breaking change. Although it would be easy to make the old port declaration syntax (e.g. reader => r1 ) a shorthand for the new ones, signal names is also different in this new approach (e.g. wdata vs data for writers). It would add a lot of burden to the spec to have different signal names in ports with different sets of capabilities.

This is a draft PR. The ideas presented here is still in early design stage, and we've not yet added changelogs for it.

Pending problems

Port-pairwise read-under-write policy

In theory, there might be memories that have different read-under-write policies between different pairs of ports. If we would like to characterize such behavior, we would need to allow specifying RUW policies for all (ordered) pairs of ports in the form of, let's say, a matrix.

Though possible, we believe this would be a highly uncommon scenario. So another way of settling this is to allow a special value indicating a non-existing or non-standard global read-under-write policy for the memory instance, and allow compilers and tools to have their own implementation-dependent way of specifying those policies:

mem m1:
  read-under-write: non-standard

Another way is to simply provide no support such memories. Users would need to resort to using extmodules.

Alternative read / write latency syntax

Currently from a pure syntactical POV, we doesn't enforce ports that have read capability to have a read-latency settings. We may make this requirement purely syntactical by requiring users to provide latencies when specifying capabilities, e.g.:

mem m1:
  port rw1:
    read: yes, 1
    write: no-mask, 2

Backward compatibility

To make this new syntax backward compatible, we would have to:

  1. Make the old reader => r1 a shorthand for the new syntax.
  2. Allow read-latency and write-latency in global scope, and specify the validity and semantics of ports that have no read / write latency specified, or two incompatible values specified.
  3. Make ports with different capabilities have different signal names.

@sequencer
Copy link
Member

We may need an additional custom port for mbist/dft. This port will be ignored in the behavioral simulation. But exist in the post synthesis flow.

@CircuitCoder CircuitCoder marked this pull request as ready for review April 21, 2023 07:31
@CircuitCoder
Copy link
Author

@sequencer Is it sufficient to have a extra-type in port instantiations, which will result in a corresponding signal generated in the port?

mem m1:
  port r1:
    extra-type => {meow:SInt<16>, meowmeow:SInt<16>}
  // ...

m1.r1.extra

@CircuitCoder CircuitCoder changed the title Fine-grind parameters and capabilities for memories [major] Fine-grind parameters and capabilities for memories Apr 23, 2023
spec.md Outdated
read => no
write => with-mask
write-latency => 1
custom-port => {a:UInt<4>, flip b:UInt<2>}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this match syntax more? Something like:
port custom: {a:UInt<4>, flip b:UInt<2>}

Copy link
Author

@CircuitCoder CircuitCoder Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Maybe we should use another keyward (e.g. custom-port) to avoid syntax ambiguity when parsing:

custom-port custom_a => {a:Uint<4>, flip b:Uint<2>}

The motivation behind the custom-port => syntax is that there should only be at most one custom port for a single memory. But this may cause implicit naming collision with other ports. So let's also include the port name here.

Using => is to keep in line with other single-value configuration. See below

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in e6311a7 to use custom-port custom => {a:UInt<4>, flip b:UInt<2>}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should there be at most one custom port?

Copy link
Author

@CircuitCoder CircuitCoder May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the custom-port custom => ... syntax, this limitation is lifted, and there can be multiple custom ports now. However it might still be desirable.

If people requires multiple custom ports, they can always aggregate them into a single one. Limiting the number of custom ports may simplify compilers.

spec.md Outdated
corresponding element within the memory holds the new value.

6. A read-under-write flag indicating the behavior when a memory location is
3. A variable number of named ports, each having following parameters:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"each having the following parameters in this order:"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Author

@CircuitCoder CircuitCoder Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e6311a7.

read-under-write => undefined
port r1:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to mix => syntax and : syntax.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation of using : is to distinguish a indented block from a single configuration value. Using a different delimitator might make writing parser a little bit easier?

flip rdata: T,
wdata: T,
wmask: M
}
```

where `M`{.firrtl} is the mask type calculated from the element type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the mask actually help anywhere? It seems it is a feature which really only makes sense assuming a certain compilation strategy (lowering aggregate memories to multiple memories, in which the mask becomes the enable flag).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this is a abstraction over different possible hardware memory implementations (some SRAM allows bit or byte masks). Aggregated memory lowering might be part of memory synthesis, which one might want to delay until backend flow.

Also, this might be good for behavioral simulations performance?

port or a readwrite port that is used to initiate a write operation on a given
clock edge, where `en`{.firrtl} is set and, for a readwriter, `wmode`{.firrtl}
is set. An "active read port" is a read port or a readwrite port that is used to
For the purpose of defining such collisions, an "active write port" is a port
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph made more sense for a single clock. When each port can have a different clock, what is "a given clock edge"?

Copy link
Author

@CircuitCoder CircuitCoder Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point.

This description mostly came from the original read-under-write specification. It also has a separate paragraph after this one specific to independently clocked ports, which has been kept.

After a closer look at the original specification, it seems that it has a very imprecise definition of the semantics of a read or write. The first part of the original specification seems to imply that all ports share a single clock (a clock that's globally agreed on). Also writes to memories are atomic w.r.t. a globally-agreed memory content, and reads are also atomic.

There is also the problem of overlapping writes. The original spec completely omitted such possibility, since it modeled that writes happens atomically after write-latency.

  • If we would like to formally define the semantics of ALL possible overlapping transactions, it might be more convenient and precise to use the language of observability, similar to those used in defining memory orderings.
  • If we are happy with atomic writes, then we can just change mentions of "cycle" to "edge w.r.t. the port's clock", and everything will be fine ™️.

If we settled with the language of observability of read / write transactions, the original specification seems to intend to define the following semantics:

  • If the writer and the reader have the same clock, and there's an overlap timewise (w.r.t. wall time) between the write transaction and the read transaction:
    • If the read-under-write flag is old, and the read transaction's start is before the entire write transaction (w.r.t. wall time), the read SHOULD NOT observe the write. 1
    • If the read-under-write flag is new, and the read transaction's end is after the entire write transaction (w.r.t. wall time), the read SHOULD observe the write.
    • Otherwise, the result of the read is entirely undefined. This is stricter than saying that whether the read observe the write is undefined: it may read part of the write, or some other random bits out of nowhere.
  • If the writer and the reader have different clocks, and there's an overlap timewise (w.r.t. wall time) between the write transaction and the read transaction, the result of the read is entirely undefined.

Would this be a more precise definition of the read under write policy? If we decide to use this definition, we may keep a shorter version of the original paragraph as a more intuitive explanation of the aforementioned formal semantics.

Footnotes

  1. This description differs from the original spec in the sense that if a read starts inside a active write, the original spec says the read SHOULD NOT observe the write (the write happens atomically after write-latency). Here we change it to undefined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not well versed in what real memories prefer to do and how they deal with some of these issues. Things like "what happens when you clock gate in the middle of a write window?" , "is there an internal clock all transactions are referenced too (e.g. 'wall-clock' time above)?". (Hopefully) Obviously firrtl semantics will pick a reasonable subset of all possible behaviors with an eye towards 90% user-case behavior.

Incidentally, talking to some FPGA folks, they classically build their own bypass logic around simpler memories to get the read-under-write behavior they want.

Copy link
Author

@CircuitCoder CircuitCoder May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sequencer What's your thoughts on these?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically there is already an annotation designed for the custom port which makes MBIST work. However the annotation makes it directly export to Top level, this may satisfy an IP vendor, but for those who use chisel to design the full chip, this introduces additional problems.
We all agree that MBIST is necessary for ASIC designers. However it's always provided by vendors, thus a custom port connecting from each memory, and wire to the BIST controller is also necessary.
The main issue is we cannot model it in firrtl since for each memory IP vendor, their MBIST design might be different. Thus I think it's necessary to provide a custom port attribute in FIRRTL, but forbid it to generate a behavior model.
Sorry for the late reply...

@sequencer
Copy link
Member

sequencer commented Apr 25, 2023

another thing to mention is: it's not possible to emulate the behavior for different DFT/MBIST implementation. I think it's necessary to forbid compiler to lower behavior memory with the custom port being configured.

@CircuitCoder
Copy link
Author

@sequencer Then it's more convenient to define the behavior of custom ports during behavioral simulations to be completely undefined.

Updated in e6311a7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants