-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential optimization in the RISC-V example #16
Comments
Nice sleuthing! Indeed, this won't speed up actual circuits, because constructing a structure in Verilog doesn't cost anything (it just means remembering that a bunch of wires are bundled together). On the simulation side it's a nice speedup, but we don't really want to "cheat" by rewriting code to make the simulator happier (especially because these sort of code changes then make it harder to say anything sinceere about Cuttlesim vs. other simulators, since we would have modified the code to fit Cuttlesim better). Instead, we should improve the simulator so that it handles this pattern better (It's a common pattern in the bluespec world). I thought I did some experiments with this in the past, but I can't find my notes about them. What happens to performance if you force inlining of that function call at the C++ level (IIRC If that doesn't work, or if it's unreliable, then this would be a really interesting optimization pass to implement in Cuttlesim! We would check whether a function is pure and returns a struct, and if it does we would inline just the relevant computations. Sounds fun, actually :) |
I just took a glance at how the C++ code for |
Great catch, thanks! Fortunately, there's at least an inkling of the issue thanks to the compiler printing a warning ( I pushed a fix just now. |
An occurrence of |
Ugh, thanks a lot! |
I noticed that the
getFields
function of the RISC-V example often gets called just to get the value of a single field (with calls along the line oflet funct3 := get(getFields(inst), funct3)
). Since fetching the values of all the fields just to extract the value of one of them looked a bit overkill, I tried replacinggetFields
with individualgetXxx
functions (e.g.getFunct3
). This provides a decent boost to simulation performance:This results in a speedup of roughly 1.43 for Cuttlesim and of 1.23 for Verilator compared to master. I did not yet check the effects on synthesis but I might do it soon. I assume that this kind of optimization is applied automatically when synthesizing (isn't it?), so I don't expect changes on this side (alas). See commit 04cfdf8 for a demonstration.
The text was updated successfully, but these errors were encountered: