The multiplier $10^{\kappa}$ does not need to be a power of 10 #63

Open
jk-jeon opened this issue May 10, 2024 · 1 comment

jk-jeon commented May 10, 2024

The only place where the fact that it is a power of 10 is used is the precomputed cache, which stores powers of 10. By allowing more general integers than $10^{\kappa}$, we may further reduce the chance of having to inspect the fractional part, thus boosting performance. It may also open up the possibility of eliminating the need to treat the shorter interval case specially.

jk-jeon commented May 11, 2024

An alternative would be to use $2^{q-p-2}$ as the multiplier, i.e., $128$ for binary32 and $1024$ for binary64. This has the advantage of simplifying the division and divisibility check by the multiplier, but it shrinks the error margin for the approximate multiplication by $10^{k}$ to a very small value, which has a serious impact on compressed cache handling.
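As a quick check of the two values quoted above, here is the arithmetic written out. It assumes that $q$ is the carrier width in bits ($32$ and $64$) and $p$ is the number of stored significand bits ($23$ and $52$); that reading is inferred from the numbers and is not spelled out in this thread.

$$2^{q-p-2} = 2^{32-23-2} = 2^{7} = 128 \quad \text{(binary32)}, \qquad 2^{q-p-2} = 2^{64-52-2} = 2^{10} = 1024 \quad \text{(binary64)}.$$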

Currently, clang translates this division + divisibility check into imul + movzx + shr + cmp. If we use $2^{q-p-2}$, then this would become test + sete + shr.
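To make the instruction-count comparison concrete, here is a minimal C++ sketch of the two variants. It is an illustration under stated assumptions, not the library's actual code. It assumes the binary64 case with $\kappa = 2$ (multiplier $100$) and an input bounded by $1000$. The names `DivResult`, `div_check_by_100`, and `div_check_by_1024` are hypothetical. The constants `656` and `16` are chosen only so that the trick is exact on that input range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch; not part of any library API.
struct DivResult {
    std::uint32_t quotient;
    bool divisible;
};

// Division by 100 plus divisibility check with a single multiplication.
// Assumption: n <= 1000. The constants are only valid on that range.
constexpr DivResult div_check_by_100(std::uint32_t n) {
    assert(n <= 1000);
    std::uint32_t const prod = n * 656u;  // 656 = ceil(2^16 / 100)
    // The low 16 bits of the product are below 656 exactly when 100 divides n.
    bool const divisible = (prod & 0xffffu) < 656u;
    return {prod >> 16, divisible};       // high bits hold floor(n / 100)
}

// Same operation for the power-of-2 multiplier 2^10 = 1024:
// a mask test and a shift, with no multiplication.
constexpr DivResult div_check_by_1024(std::uint32_t n) {
    return {n >> 10, (n & 1023u) == 0};
}

static_assert(div_check_by_100(700).quotient == 7 && div_check_by_100(700).divisible, "");
static_assert(div_check_by_100(799).quotient == 7 && !div_check_by_100(799).divisible, "");
static_assert(div_check_by_1024(2048).quotient == 2 && div_check_by_1024(2048).divisible, "");
```

The first function needs a multiply, a mask of the low 16 bits, a shift, and a compare, which lines up with the imul + movzx + shr + cmp sequence mentioned above. The second needs only a mask test and a shift. The exact code generation depends on the compiler and flags.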

Another alternative would be to consider a mixed power of $2$ and $5$, but it doesn't seem worth the trouble of rewriting much of the paper and implementation.

On the other hand, the possible elimination of the shorter interval branch is worth investigating, but it seems like a completely independent issue.
