IDIV_C and ISDIV_C instructions #26
Comments
It's not so clear to me. The 64-bit constant needs to be calculated using division and modulo operations. CPUs have to compile the program to machine code anyway, so this calculation is done as part of the compilation step. Having those instructions forces an ASIC to have a hardware divider while avoiding the high cost of division for CPUs in the main loop.
; IDIV_C r5, 624165039
mov rax, 15866829597104432181
mul r13
shr rdx, 29
add r13, rdx
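To make the compile-time step concrete, here is a minimal Python sketch of how such a constant could be derived and what the four-instruction sequence computes. The helper names `magic_for` and `idiv_c` are made up for illustration, and the round-up rule is an assumption that happens to reproduce the constant in the listing; it is not taken from the project's source.

```python
MASK64 = (1 << 64) - 1

def magic_for(divisor: int, shift: int) -> int:
    """Round 2**(64 + shift) / divisor up to the next integer (assumed rule)."""
    return -(-(1 << (64 + shift)) // divisor)

def idiv_c(r13: int, magic: int, shift: int) -> int:
    """Emulate: mov rax, magic / mul r13 / shr rdx, shift / add r13, rdx."""
    rdx = (r13 * magic) >> 64      # mul leaves the high 64 bits of the product in rdx
    rdx >>= shift                  # rdx now holds the quotient r13 // divisor
    return (r13 + rdx) & MASK64    # add wraps modulo 2**64

DIVISOR = 624165039
SHIFT = 29                         # DIVISOR.bit_length() - 1
MAGIC = magic_for(DIVISOR, SHIFT)  # the division happens here, once, at "compile time"

assert MAGIC == 15866829597104432181  # matches the constant in the listing

# The main-loop sequence then needs no divider: one multiply, one shift, one add.
for x in (0, 1, DIVISOR - 1, DIVISOR, 10**18, MASK64):
    assert idiv_c(x, MAGIC, SHIFT) == (x + x // DIVISOR) & MASK64
```

The only true division is inside `magic_for`, which is the point being made: whoever generates the code has to be able to divide, while the hot loop does not.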
@tevador I'm talking about all these excess instructions that come after the multiplication. We can keep the requirement to calculate the 64-bit reciprocal but leave only the multiplication (see my updated comment above). P.S. And we'll be able to use
That's not a bad idea. So it would become something like:
mov rax, 29554273182 ; = 0xffffffffffffffff / 624165039
imul r13, rax
The only potential issue here is that the constant would be 33-34 bits in most cases, which could be optimized by an ASIC. CPUs always do 64x64 multiplication.
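The width concern can be checked quickly. The following is a small Python sketch, with hypothetical helper names, that computes the plain reciprocal from the comment on the `mov` and prints how many bits it occupies for a few 32-bit divisors.

```python
MASK64 = (1 << 64) - 1

def naive_reciprocal(divisor: int) -> int:
    """floor(0xffffffffffffffff / divisor), as in the comment on the mov."""
    return MASK64 // divisor

def imul64(a: int, b: int) -> int:
    """Emulate `imul r13, rax`: keep only the low 64 bits of the product."""
    return (a * b) & MASK64

rcp = naive_reciprocal(624165039)
print(rcp, rcp.bit_length())

# Divisors that use bit 30 or bit 31 give reciprocals of at most 34 bits,
# so the upper 30 bits of the multiplier are always zero for them.
for d in (0x40000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    print(hex(d), naive_reciprocal(d).bit_length())
```

Three quarters of all 32-bit values have bit 30 or bit 31 set, which is consistent with the "33-34 bits in most cases" estimate. A multiplier with that many guaranteed zero bits is what a narrower hardware multiplier could exploit.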
It's easy to fix:
Where you select
@SChernykh Agreed. So this
; IDIV_C r5, 624165039
mov rax, 15866829597104432181
mul r13
shr rdx, 29
add r13, rdx
will become
; IDIV_C r5, 624165039
mov rax, 15866829597104432181
imul r13, rax
and the signed variant can be removed.
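As a rough illustration of the shortened form, here is a Python sketch of what the two remaining instructions compute. The function name `rcp_mul` is hypothetical, and the semantics (destination multiplied by the constant, low 64 bits kept) are my reading of `imul r13, rax`, not a quote from a specification.

```python
MASK64 = (1 << 64) - 1
CONSTANT = 15866829597104432181    # constant from the listing for divisor 624165039

def rcp_mul(r13: int, constant: int = CONSTANT) -> int:
    """Emulate: mov rax, constant / imul r13, rax (low 64 bits kept)."""
    return (r13 * constant) & MASK64

# Bit 63 of this constant is set, so the multiplier spans the full 64 bits.
assert CONSTANT.bit_length() == 64

print(hex(rcp_mul(1)), hex(rcp_mul(3)))
```

Because only the low 64 bits of the product are kept, signed and unsigned multiplication give the same result, which fits the remark that a separate signed variant is no longer needed.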
Perfect! But the instruction should be renamed to RCP_MUL_C or something like this.
What's the point in having them? ISDIV_C compiles to a lot of code:
It can clearly be made faster on an ASIC. And since it's essentially multiplication by a constant, maybe replace these two instructions with an explicit multiplication by a 64-bit constant? Then we won't need to handle all the edge cases, and the code would become something like this:
P.S. Since the constants are only 32-bit, we can define this instruction as multiplication by a reciprocal without further edge case handling: