Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comparing with OpenROAD-flow-scripts #239

Open
mtdudek opened this issue Dec 8, 2023 · 23 comments
Open

Comparing with OpenROAD-flow-scripts #239

mtdudek opened this issue Dec 8, 2023 · 23 comments
Assignees

Comments

@mtdudek
Copy link
Collaborator

mtdudek commented Dec 8, 2023

Recently I've been comparing Verilog to GDS flow between bazel_rules_hdl and OpenROAD-flow-scripts.

I've been testing this flows using common verilog design and
ASAP7 7.5T rev 28 RVT cells library.
The tested design is generated from DSLX language using XLS.

I've tried using bazel_rules_hdl to get GDS for that design, with clock period of 2.5ns.
Design is failing on the global routing stage with error message
[ERROR GRT-0118] Routing congestion too high. Check the congestion heatmap in the GUI.
P&R parameters:

clock_period = "2500.0",
core_padding_microns = 0,
die_width_microns = 200,
die_height_microns = 200,
placement_density = "0.16",

Then I tried running the same source verilog using ORFS.
With the same clock constraints ORFS easily achieved 2.5 ns,
it was able to reach timings < 1ns using RVT cells.
ORFS parameters:

set clk_name  core_clock
set clk_port_name clk
set clk_period 2500
set clk_io_pct 0.01

set clk_port [get_ports $clk_port_name]

create_clock -name $clk_name -period $clk_period $clk_port

set non_clock_inputs [lsearch -inline -all -not -exact [all_inputs] $clk_port]

set_input_delay  [expr $clk_period * $clk_io_pct] -clock $clk_name $non_clock_inputs
set_output_delay [expr $clk_period * $clk_io_pct] -clock $clk_name [all_outputs]

I later tried running bazel_rules_hdl synthesized Verilog through
ORFS P&R stages. It failed, as ORFS expects tie cells to be
already placed in the synthesized Verilog.
I've modified synthesis script, to use tie cells from openroad_pdk provider
and that was enough to get synthesized Verilog through ORFS,
it also achieved clock period < 1ns using RVT cells.

I've also tested starting ORFS from floorplan stage using flororplan generated
by bazel_rules_hdl. This run gave following error message:
[ERROR GRT-0119] Routing congestion too high. Check the congestion heatmap in the GUI
It is exactly the same error message. 119 means that extra information was being logged to
a file, while 118 means that there is no extra file.

I checked which version of ASAP7 ORFS use. It's 7.5t rev 28 with some extra cells defined.
Extra cells can be removed from the picture. Bazel_rules_hdl synthesized Verilog
only uses standard ASAP7 RVT cells and running ORFS P&R on it gave good timing results.

@mithro
Copy link
Member

mithro commented Dec 10, 2023

Hi @mtdudek,

I have been looking at this recently too - I started mainly looking at the simple examples under https://github.com/hdl/bazel_rules_hdl/blob/main/synthesis/tests/BUILD#L94-L117

Can you share the BUILD files for your example using bazel_rules_hdl?

Sounds like you have found the issue to be something to do with the floorplan?

Tim '@mithro' Ansell

@mtdudek
Copy link
Collaborator Author

mtdudek commented Dec 11, 2023

Hi @mithro,

Here is the link to the BUILD file that uses bazel_rules_hdl rules:
https://github.com/antmicro/xls/blob/pnr_testing/xls/modules/zstd/BUILD#L345-L371

I found that adding -random flag to the place_pins command was enough to get
both bazel_rules_hdl rule and ORFS working.

ORFS still reports better timing values.

Edit:
I've tried and set min_pin_distance to value 1.
After the change, both P&Rs were successful.
In default mode bazel_rules_hdl placed pins 96nm apart,
and that may be to dense for wide buses.

@mtdudek
Copy link
Collaborator Author

mtdudek commented Dec 14, 2023

With all the changes in the #243 PR I still had some timing differences when compared to ORFS.
I've found that ORFS by default use FF corner , while bazel_rules_hdl use SS corner.
Switching ORFS to SS, or bazel_rules_hdl to FF, gave around 3% difference in delay values.

@mithro
Copy link
Member

mithro commented Dec 14, 2023

The OpenROAD team has been working on updating their Yosys version used in OpenROAD-flow-scripts. Hopefully if we can get on identical Yosys versions between bazel_rules_hdl and OpenROAD-flow-scripts we can end up with 0% difference.

@mithro
Copy link
Member

mithro commented Dec 14, 2023

FYI as of today;

  • bazel_rules_hdl is on Yosys commit - 35a05686c4e9987441ac298f5d631f1785e272fd
  • OpenROAD-flow-scripts is on Yosys commit - 2584903a0605cc7ffdfc1996254ccc1f548403f2

The "delta" in Yosys between bazel_rules_hdl and OpenROAD-flow-scripts can be found at The-OpenROAD-Project/yosys@2584903...35a0568

I think we are all going to try and get onto Yosys 0.36 (which is commit 8f07a0d8404f63349d8d3111217b73c9eafbd667) ASAP. The delta between Yosys 0.36 and

@mithro
Copy link
Member

mithro commented Dec 14, 2023

Just FYI - @vvbandeira and @maliberty

@mithro
Copy link
Member

mithro commented Dec 14, 2023

@mithro
Copy link
Member

mithro commented Dec 14, 2023

@mithro
Copy link
Member

mithro commented Dec 15, 2023

FYI - It seems that internal Google3 is on Yosys commit 11ffd7df40260f782c82b43be190a48ec88214a1 (YosysHQ/yosys@11ffd7d).

@oharboe
Copy link

oharboe commented Dec 15, 2023

@mtdudek You may find this interesting, it is a wafer think Bazel layer on top of ORFS that we are using when studying ORFS on designs as large as MegaBoom: https://github.com/The-OpenROAD-Project/megaboom

@mithro
Copy link
Member

mithro commented Dec 19, 2023

The first 7 changes have been landed into both bazel_rules_hdl and the internal Google3 codebase.

I'm going to let them bake for a few days, before I land #253 and #254 which we believe might be more risky.

@mithro
Copy link
Member

mithro commented Dec 19, 2023

On a related note;

  • Upgrade to Yosys version 0.36.0 #246 was landed moving bazel_rules_hdl to Yosys 0.36 (rev 8f07a0d8404f63349d8d3111217b73c9eafbd667).
  • Yosys 0.36 (rev 8f07a0d8404f63349d8d3111217b73c9eafbd667) was imported into the Google3 code base.

This leaves OpenROAD-flow-scripts as the last one to get onto Yosys 0.36

@mithro
Copy link
Member

mithro commented Dec 21, 2023

@mtdudek - It would be good to see how the numbers compare now that your changes have landed. The first point of comparison would be the number and type of standard cells Yosys is producing out of synthesis.

@mtdudek
Copy link
Collaborator Author

mtdudek commented Dec 21, 2023

I've compared synthesis results from the newest main and ORFS@82b4fce7fc8ddedb6ea689894db44822e31ae575
Cell count for bazel_rules_hdl is 6883 and for ORFS is 7122. Both created the same number of DFFHQNs and FAx1s.
hld_rules created much more HAx1s 95 compared to ORFS 68. So there are already some differences after synthesis.

ORFS yosys report
=== __dec_demux__DecoderDemux_0_next ===

   Number of wires:               6830
   Number of wire bits:           7346
   Number of public wires:          94
   Number of public wire bits:     610
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:               7122
     AND2x2_ASAP7_75t_R            107
     AND3x1_ASAP7_75t_R             79
     AND4x1_ASAP7_75t_R             18
     AND5x1_ASAP7_75t_R              3
     AO211x2_ASAP7_75t_R             4
     AO21x1_ASAP7_75t_R             45
     AO21x2_ASAP7_75t_R              1
     AO221x1_ASAP7_75t_R             7
     AO222x2_ASAP7_75t_R            13
     AO22x1_ASAP7_75t_R             22
     AO31x2_ASAP7_75t_R              3
     AO32x1_ASAP7_75t_R            134
     AOI211x1_ASAP7_75t_R           12
     AOI21x1_ASAP7_75t_R            34
     AOI22x1_ASAP7_75t_R             8
     BUFx2_ASAP7_75t_R            1621
     BUFx3_ASAP7_75t_R              18
     BUFx4f_ASAP7_75t_R              3
     BUFx6f_ASAP7_75t_R              4
     DFFHQNx1_ASAP7_75t_R         1148
     FAx1_ASAP7_75t_R               20
     HAxp5_ASAP7_75t_R              68
     INVx1_ASAP7_75t_R             686
     INVx2_ASAP7_75t_R               1
     NAND2x1_ASAP7_75t_R          1385
     NAND2x2_ASAP7_75t_R             2
     NAND3x1_ASAP7_75t_R             4
     NOR2x1_ASAP7_75t_R             30
     NOR3x1_ASAP7_75t_R              7
     OA211x2_ASAP7_75t_R          1036
     OA21x2_ASAP7_75t_R            417
     OA222x2_ASAP7_75t_R             4
     OA31x2_ASAP7_75t_R              2
     OAI21x1_ASAP7_75t_R            29
     OAI22x1_ASAP7_75t_R            17
     OR2x2_ASAP7_75t_R              13
     OR3x1_ASAP7_75t_R              51
     OR4x1_ASAP7_75t_R              11
     OR5x1_ASAP7_75t_R               6
     TIELOx1_ASAP7_75t_R             1
     XNOR2x1_ASAP7_75t_R             1
     XNOR2x2_ASAP7_75t_R            38
     XOR2x2_ASAP7_75t_R              9
Bazel_rules_hdl yosys report
=== __dec_demux__DecoderDemux_0_next ===

   Number of wires:               6618
   Number of wire bits:           7134
   Number of public wires:        6618
   Number of public wire bits:    7134
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:               6883
     AND2x2_ASAP7_75t_R            179
     AND3x2_ASAP7_75t_R             86
     AND4x2_ASAP7_75t_R             12
     AO211x2_ASAP7_75t_R             5
     AO21x2_ASAP7_75t_R             83
     AO221x2_ASAP7_75t_R            18
     AO222x2_ASAP7_75t_R             3
     AO22x2_ASAP7_75t_R             49
     AO32x2_ASAP7_75t_R             66
     AO33x2_ASAP7_75t_R              2
     BUFx2_ASAP7_75t_R             461
     DFFHQNx1_ASAP7_75t_R         1148
     FAx1_ASAP7_75t_R               20
     HAxp5_ASAP7_75t_R              95
     INVx2_ASAP7_75t_R            1290
     NAND2x2_ASAP7_75t_R            27
     NAND3x2_ASAP7_75t_R             1
     NOR2x2_ASAP7_75t_R             15
     OA211x2_ASAP7_75t_R          1105
     OA21x2_ASAP7_75t_R            502
     OA222x2_ASAP7_75t_R             3
     OA22x2_ASAP7_75t_R             13
     OA31x2_ASAP7_75t_R              1
     OA33x2_ASAP7_75t_R              2
     OR2x2_ASAP7_75t_R            1525
     OR3x2_ASAP7_75t_R              44
     OR3x4_ASAP7_75t_R               4
     OR4x2_ASAP7_75t_R               7
     OR5x2_ASAP7_75t_R               8
     TIELOx1_ASAP7_75t_R            67
     XNOR2x2_ASAP7_75t_R            29
     XOR2x2_ASAP7_75t_R             13

@mithro
Copy link
Member

mithro commented Dec 22, 2023

This is likely caused by the different Yosys versions?

I sorted and split out the cells list and got the following;

-   Number of cells:               7122
+   Number of cells:               6883
 
-     AND2x2_ASAP7_75t_R            107
+     AND2x2_ASAP7_75t_R            179
-     AND3x1_ASAP7_75t_R             79
+     AND3x2_ASAP7_75t_R             86
-     AND4x1_ASAP7_75t_R             18
+     AND4x2_ASAP7_75t_R             12
-     AND5x1_ASAP7_75t_R              3
 
-     AO211x2_ASAP7_75t_R             4
+     AO211x2_ASAP7_75t_R             5
-     AO21x1_ASAP7_75t_R             45
-     AO21x2_ASAP7_75t_R              1
+     AO21x2_ASAP7_75t_R             83
-     AO221x1_ASAP7_75t_R             7
+     AO221x2_ASAP7_75t_R            18
-     AO222x2_ASAP7_75t_R            13
+     AO222x2_ASAP7_75t_R             3
-     AO22x1_ASAP7_75t_R             22
+     AO22x2_ASAP7_75t_R             49
-     AO31x2_ASAP7_75t_R              3
-     AO32x1_ASAP7_75t_R            134
+     AO32x2_ASAP7_75t_R             66
+     AO33x2_ASAP7_75t_R              2
-     AOI211x1_ASAP7_75t_R           12
-     AOI21x1_ASAP7_75t_R            34
-     AOI22x1_ASAP7_75t_R             8
 
-     BUFx2_ASAP7_75t_R            1621
+     BUFx2_ASAP7_75t_R             461
-     BUFx3_ASAP7_75t_R              18
-     BUFx4f_ASAP7_75t_R              3
-     BUFx6f_ASAP7_75t_R              4
 
      DFFHQNx1_ASAP7_75t_R         1148
 
      FAx1_ASAP7_75t_R               20
-     HAxp5_ASAP7_75t_R              68
+     HAxp5_ASAP7_75t_R              95
 
-     INVx1_ASAP7_75t_R             686
-     INVx2_ASAP7_75t_R               1
+     INVx2_ASAP7_75t_R            1290
 
-     NAND2x1_ASAP7_75t_R          1385
-     NAND2x2_ASAP7_75t_R             2
+     NAND2x2_ASAP7_75t_R            27
-     NAND3x1_ASAP7_75t_R             4
+     NAND3x2_ASAP7_75t_R             1
 
-     NOR2x1_ASAP7_75t_R             30
+     NOR2x2_ASAP7_75t_R             15
-     NOR3x1_ASAP7_75t_R              7
 
-     OA211x2_ASAP7_75t_R          1036
+     OA211x2_ASAP7_75t_R          1105
-     OA21x2_ASAP7_75t_R            417
+     OA21x2_ASAP7_75t_R            502
+     OA22x2_ASAP7_75t_R             13
-     OA222x2_ASAP7_75t_R             4
+     OA222x2_ASAP7_75t_R             3
-     OA31x2_ASAP7_75t_R              2
+     OA31x2_ASAP7_75t_R              1
+     OA33x2_ASAP7_75t_R              2
-     OAI21x1_ASAP7_75t_R            29
-     OAI22x1_ASAP7_75t_R            17
 
-     OR2x2_ASAP7_75t_R              13
+     OR2x2_ASAP7_75t_R            1525
-     OR3x1_ASAP7_75t_R              51
+     OR3x2_ASAP7_75t_R              44
+     OR3x4_ASAP7_75t_R               4
-     OR4x1_ASAP7_75t_R              11
+     OR4x2_ASAP7_75t_R               7
-     OR5x1_ASAP7_75t_R               6
+     OR5x2_ASAP7_75t_R               8
 
-     TIELOx1_ASAP7_75t_R             1
+     TIELOx1_ASAP7_75t_R            67
 
-     XNOR2x1_ASAP7_75t_R             1
-     XNOR2x2_ASAP7_75t_R            38
+     XNOR2x2_ASAP7_75t_R            29
 
-     XOR2x2_ASAP7_75t_R              9
+     XOR2x2_ASAP7_75t_R             13

@mithro
Copy link
Member

mithro commented Dec 22, 2023

I pull the data into a spreadsheet at https://docs.google.com/spreadsheets/d/1SdI8NbOnJRO_rwhzCbMOW5EQ0-Rh-aIzkAYsk30GOD8/edit#gid=0 - preview image shown below;
Screenshot from 2023-12-22 14-19-13

@mithro
Copy link
Member

mithro commented Dec 22, 2023

One thing that is clear, bazel_rules_hdl is not using any of the drive-strength x1 cells. I'm guessing that the exclude list configurations might be quite different?

@mithro
Copy link
Member

mithro commented Dec 22, 2023

BTW What did you run to get this data? Is it something that I could run as well?

@maliberty
Copy link

Sometime odd about:

TIELO | 1.0 | 1 |   | TIELO | 1.0 | 67

Is this apples to apples?

@mithro
Copy link
Member

mithro commented Dec 22, 2023

I'm guessing there is a splitnets like command being call. We do need to compare Yosys scripts.

@mtdudek
Copy link
Collaborator Author

mtdudek commented Dec 22, 2023

I see what the difference is, in the hilomap pass ORFS use flag singleton, while in bazel_rules_hdl this flag is not provided. For me it made more sense to create multiple tie cells, and reduce number of wires that must travel from singular tie cell.

@maliberty
Copy link

In ORFS we do repair_tie_fanout later which has a -separation flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants