-
I believe the min clock period is based on internal reg-to-reg paths and not on reg-IO paths as shown above. Using a memory generator is a better idea, as this will only get worse the bigger you make the RAMs.
-
Expecting to build 2.5M-instance blocks quickly will be very difficult in any tool.
-
Trying a different approach... Since all I care about is mocking area and timing (creating realistic SRAMs is a separate concern that I'm not tackling now), it doesn't matter what the Verilog actually does. Below I have reduced the size of the memory while still using all the address bits.
After a few minutes, I have some area and timing for a mock SRAM that should allow me to see what else is going on in this design... If I want more aggressive timing, I can adjust the Verilog to be even simpler. Area can be scaled up and down using the mock_area feature.
After CTS:
-
So with the above approach, I can easily mock large SRAMs, and I also have a way to mock the area. This allows me to set aside the SRAM concerns and investigate what else is going on in the design. Doing so, I can see a lot of logic and high fanout in the timing path for the FpPipeline in the MegaBoom design. What timing closure does this design reach in commercial tools at a small node? It seems like a lot to ask that the PDK and improved post-synthesis stages alone are going to get this to GHz frequencies. Could it be a synthesis problem? Perhaps this version of MegaBoom is missing a pipeline stage or two here? Could there be some structure that has to be specialized in floating point units, just like SRAM must be? From
-
I'm trying to create a mock SRAM for the L2 in https://github.com/The-OpenROAD-Project/megaboom
Here is the behavioral model:
8192 rows x 64 bits = 5*10^5 bits. This yields 2.7*10^6 instances. At CORE_UTILIZATION=40%, this yields the following floorplan: 640um * 1400um.
The build times for this module become prohibitive. Synthesis is ca. 6000s, global placement 30000s, and CTS at least as long. So I really don't want to run through more than the floorplan, and at that point create an abstract.
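For reference, here is a rough back-of-the-envelope check of those numbers in Python. It is only a sketch: it treats the quoted 640um x 1400um floorplan as the core area that CORE_UTILIZATION applies to, which is an assumption, and the variable names are made up for illustration.

```python
# Rough sanity check of the figures quoted above.
# Assumption: the quoted floorplan dimensions are the core area.
ROWS = 8192
BITS_PER_ROW = 64
INSTANCES = 2.7e6          # instance count quoted after synthesis
CORE_UTILIZATION = 0.40    # 40%
FLOORPLAN_W_UM = 640
FLOORPLAN_H_UM = 1400

bits = ROWS * BITS_PER_ROW                       # storage bits in the SRAM
instances_per_bit = INSTANCES / bits             # std cells spent per stored bit
floorplan_area_um2 = FLOORPLAN_W_UM * FLOORPLAN_H_UM
cell_area_um2 = floorplan_area_um2 * CORE_UTILIZATION  # area covered by cells
floorplan_area_per_bit_um2 = floorplan_area_um2 / bits

print(f"bits:                   {bits}")
print(f"instances per bit:      {instances_per_bit:.2f}")
print(f"floorplan area (um^2):  {floorplan_area_um2}")
print(f"cell area (um^2):       {cell_area_um2:.0f}")
print(f"floorplan um^2 per bit: {floorplan_area_per_bit_um2:.2f}")
```

Roughly five standard-cell instances per stored bit is what makes the flop-based model so expensive to push through placement and CTS.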
I have some work in progress where I can mock a smaller area: The-OpenROAD-Project/megaboom#9
I don't have a way to mock a realistic clock period for the .lib file that comes out of the floorplan, nor do I know exactly what is realistic for such an SRAM on ASAP7.
A 90ps minimum clock period for an 8192x64 SRAM seems pretty good...
I can't reconcile the min period of 90ps with what is observed in the timing report:
I wonder if the .lib file that comes out of the floorplan can be useful in architectural exploration.
This raises a rather open ended question...
Q: How does the .lib file that comes out of the floorplan compare to a realistic .lib file?
Is it a "simple matter of scaling" the .lib file, just as I can scale area to get something that is in the range of what I want in my architectural exploration?
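To make the "simple matter of scaling" idea concrete, here is a naive Python sketch that multiplies numbers in Liberty text by a constant. The function names are hypothetical and this is not part of any OpenROAD flow. It assumes the usual Liberty forms `area : <number>;` and `values ("...", "...");`, and it scales every `values` table it finds (delay, transition, constraint, power alike), so whether the result is realistic is exactly the open question.

```python
import re

# Matches plain or scientific-notation numbers such as 0.010, 12, 1e-3.
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
_AREA = re.compile(r"(\barea\s*:\s*)(" + _NUM + r")(\s*;)")
_VALUES = re.compile(r"values\s*\((.*?)\)\s*;", re.S)


def scale_liberty_area(lib_text: str, factor: float) -> str:
    """Hypothetical helper: multiply every `area : N;` attribute by factor."""
    def repl(m):
        return f"{m.group(1)}{float(m.group(2)) * factor:g}{m.group(3)}"
    return _AREA.sub(repl, lib_text)


def scale_liberty_tables(lib_text: str, factor: float) -> str:
    """Hypothetical helper: multiply every number inside values(...) by factor.

    Naive on purpose: it does not distinguish timing tables from power or
    constraint tables, and it leaves table index values untouched.
    """
    def repl(m):
        whole = m.group(0)
        start = m.start(1) - m.start(0)
        end = m.end(1) - m.start(0)
        body = re.sub(_NUM,
                      lambda n: f"{float(n.group(0)) * factor:g}",
                      m.group(1))
        return whole[:start] + body + whole[end:]
    return _VALUES.sub(repl, lib_text)


if __name__ == "__main__":
    sample = (
        'cell (mock_sram) {\n'
        '  area : 1000.0;\n'
        '  cell_rise (tmpl) {\n'
        '    values ("0.010, 0.020", \\\n'
        '            "0.030, 0.040");\n'
        '  }\n'
        '}\n'
    )
    print(scale_liberty_area(sample, 0.5))
    print(scale_liberty_tables(sample, 2.0))
```

A uniform factor like this keeps the shape of every table and only stretches it, so it cannot capture how a real SRAM macro's setup, hold, and clock-to-output times relate to each other.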