The multipllier $10^{\kappa}$ does not need to be a power of 10 #63

jk-jeon · 2024-05-10T23:22:44Z

The only place where it is a power of 10 is used is that powers of 10 are stored in the precomputed cache. By allowing more general integers other than $10^{\kappa}$, we may further reduce the chance of having to investigate the fractional part, thus boosting performance. It may also opens up a possibility of eliminating the needs for treating the shorter interval case specially.

jk-jeon · 2024-05-11T10:03:34Z

An alternative would be to use $2^{q-p-2}$ as the multiplier, i.e., $128$ for binary32 and $1024$ for binary64. This has an advantage of simplifying the division and divisibility check by the multiplier, but it reduces the error margin for the approximate multiplication by $10^{k}$ into very small value, which has a serious impact on compressed cache handling.

Currently, clang translates this division + divisibility check into imul + movzx + shr + cmp. If we use $2^{q-p-2}$, then this would become test + sete + shr.

Another alternative would be to consider a mixed power of $2$ and $5$, but it doesn't sound that worth the trouble of rewriting a lot of the paper/implementation.

On the other hand possible elimination of the shorter interval branch is worth investigating, but it seems like a completely independent issue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The multipllier $10^{\kappa}$ does not need to be a power of 10 #63

The multipllier $10^{\kappa}$ does not need to be a power of 10 #63

jk-jeon commented May 10, 2024

jk-jeon commented May 11, 2024

The multipllier $10^{\kappa}$ does not need to be a power of 10 #63

The multipllier $10^{\kappa}$ does not need to be a power of 10 #63

Comments

jk-jeon commented May 10, 2024

jk-jeon commented May 11, 2024