-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved integer square root. #4403
Conversation
|
Thanks for this proposal. Unfortunately our gas benchmarks are only able to measure non-view functions, so we can't see any improvement in the gas report for this PR. Can you share how much of an improvement you are observing? Please share a few inputs that would allow us to confirm that too. |
The previously linked article discusses integer square root computation within Solidity (in addition to giving proofs). Section 5 discusses gas costs; OZ had a mean gas cost of 948, while the new version had a mean gas cost of 757. I am working on having a public repo where the gas analysis was performed |
The main change of this proposition is that it inlines initial estimation, instead of calling |
Yes, the proposed change inlines a better initialization which requires only 6 Newton iterations rather than 7; the algorithm is provably correct. The current version has no proof of correctness. The gas is reduced from 950 to 749 (mean gas cost from 2048 test values). These gas estimates and more will be in a public repo shortly for verification.
|
The initial estimation looks very similar to what we had in the original sqrt PR The consequence is that, appart from the readability, your ifs behave similarly to the log2 option. The "only" math difference that you PR makes is the Where does this line come from ? When you say:
is that was you are talking about ? If there is a proof of correctness, we should add it to the documentation. Saying that "it exist" is not enough IMO. Also, if adding Edit: I think Edit2: I also measured ~200gas savings. Its distribued betwen:
|
contracts/utils/math/Math.sol
Outdated
result += (result >> 1); | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only count 7 iterations, yet the PDF you linked shows that in some cases, 8 are required. I'm worried about that part. Do you have an example of value that needs 8 iterations that we could run both version of the code with ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, Newton iteration refers to result = (result + a/result) >> 1
. This is the Newton iteration step when computing square roots via Newton's method; see here.
So, result += (result >> 1);
is not a Newton iteration but rather part of the initialization. Yes, you could replace it another Newton iteration, but that is more expensive.
As to your comments about "8 cases are required", you are misunderstanding the comment and figure. If only Newton iterations are performed and the same initialization is used that is currently used by OZ, then it sometimes takes 8 Newton iterations to reach the correct value. Note that this does not apply either in the current OZ code nor the proposed change. That figure was primarily meant to show why a good initialization value is required for efficient computation.
"The "only" math difference that you PR makes is the "Where does this line come from?" (referring to "I'm not 100% but I was under the impression that the math.min includes an interation. I'd love to check the effect of it": please provide a proof that "math.min includes an iteration". " I think result += (result >> 1); is supposed to be an iteration. I'm not sure if that is correct of not. For the sake of readability I'd keep the same syntax as the other ones.": As mentioned above, it is not a Newton iteration, but rather part of initialization. |
The iteration is doign an arithmetic average between |
@PaulRBerg You are the "math done in solidity" expert. WDYT of that initialization change ? |
contracts/utils/math/Math.sol
Outdated
// Take care of easy edge cases | ||
if (a <= 1) { return a; } | ||
// This check ensures no overflow | ||
if (a >= ((1 << 128) - 1)**2) { return (1 << 128) - 1; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk of overflow ? Can you document which part is succeptible ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using this specific method and return logic, it is possible to overflow when computing result**2
during the check result**2 <= a
, because it may be that result == 2**128
; thus, result**2 == 0
because this is unchecked
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interresting. I guess this is not necessary if you do min(result, a/result)
so the current version is good in that regard.
I'd rewrite that as:
if (a >= ((1 << 128) - 1)**2) { return (1 << 128) - 1; } | |
if (a >= uint256(type(uint128).max)**2) { return type(uint128).max; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the cost of that test outweight the cost of the min
at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interresting. I guess this is not necessary if you do
min(result, a/result)
so the current version is good in that regard.
The current version has no proof of correctness. This proposed change does. See Appendix B.4.3. Have you looked at that report?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code was updated with suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, the overflow check (if (a >= uint256(type(uint128).max)**2) { return type(uint128).max; }
) costs 23 gas.
I really think it is better to think of the iteration not as the average of
Yes, it is true at some point that This specific return logic (using |
A repo with additional analysis may be found here. |
We received a lot a gas optimization PRs that are great, but that we don't have time to process right now. |
Yes, I understand needing to take the time for gas optimization and not having time now. With that said, I would like to add that this is not just gas optimization but also replacing an algorithm (which has not been proven correct) with a provably-correct algorithm. If there are questions about the initialization choice or the math, please reach out. |
I made a couple of small improvements to the algorithm. Additionally, please see the extended analysis here. |
For the record, here has been my design philosophy:
Some Background: I first started investigating Integer Square Root algorithms because it was needed for an algebraic sigmoid bonding curve. I needed an integer square root function, so I looked at algorithms online. The current OZ alg was one I found online, and there were others which were very similar. I attempted to prove the algorithm's correctness, yet I ran into trouble with the return logic: the return logic ( Saying
is both not a proof and a weak argument. The Integer Square Root algorithm is a fundamental operation, and the proof of validity should be clear. Yes, the proof may require some knowledge of mathematics, but it's not particularly advanced. As noted above, I asked for a proof because if there was one, I wanted to see it because I could not construct one (as noted above).
Yes, I understand. I wanted to be verbose and then wanted to iterate the discussion to make it concise yet understandable. As to
Great. For the record, I iterated a lot on this algorithm as well, so if you are going to make changes to the code, would you please include Max, Mean, Median, Standard Deviation over 2048 samples? This allows for a systematic investigation of updates to the code and allows for an easy comparison. You could, for example, use the 2048 samples from this repo. The entire repo is devoted to one topic: provably correct efficient integer square root algorithms in Solidity. Please also ensure that all changes don't negate the proof of convergence. Iterating comments are completely different. Yes, I understand not everyone "speaks math", but the comments should paint a clear picture of the algorithm and different parts. Algorithm Overview: The algorithm has 3 parts:
|
We'd appreciate short and straight to the point discussions.
I think we've asked you to help with that and yet you kept expanding the explanation.
We'll keep iterating this. I'm sorry but this is an open source project and not an extension of your research. We'll dig into potential optimizations but I don't think we have time to give you a report with the samples. |
I tested the original version by @chgorman vs the version @Amxx is proposing:
Both @chgorman and @Amxx versions seem like an improvement over the current implementation. However, we discussed how much the following block makes readability more difficult: if (a >= uint256(type(uint128).max) ** 2) {
return type(uint128).max;
} And we concluded that we won't merge something we're not comfortable with even if it's an improvement in gas. In this case, the gas improvement is around ~245 gas units using the original proposal (~200 vs @Amxx's). I still think the PR is good, but I'll leave the decision of whether to keep the original proposal or use the current version to @Amxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM again. Feel free to iterate in comments.
This estimation is with fuzzing numbers in which range? If the inputs are uint128 (don't go all the way to uint256) I don't get the same differences, because the number I put in are never taking the "shortcut" in @chgorman version. I think that realistic usage will probably not be using that shortcut either, as its very rare to see such large number manipulated for math (for cryptography we don't use this function anyway), and I think we should optimize "smaller" values |
I will. I honestly think the current comments are very difficult to read. For those that read french, all the proofs, including convergence, fit into <1 page. We should aim for that level of clarity and simplicity. |
contracts/utils/math/Math.sol
Outdated
result = (3 * result) >> 1; | ||
|
||
// We define the error as ε = result - sqrt(a). Then we know that | ||
// result = 2**e−1 + 2**e−2, and therefore ε0 = 2**e−1 + 2**e−2 - sqrt(n), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a difference of notation between this and the PDF.
Here we do result = 2**e <= sqrt(a) < 2**(e+1))
The PDF does 2**(e−1) ≤ √n < 2**e
(B.2)
This causes much confusion.
contracts/utils/math/Math.sol
Outdated
result = (result + a / result) >> 1; // ε_5 := result - sqrt(a) <= 2**(e-72) | ||
result = (result + a / result) >> 1; // ε_6 := result - sqrt(a) <= 2**(e-144) | ||
|
||
// After 6 iterations, the precision of e is already above 128 (i.e. 144). Meaning that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
precision of e
We computed e at step 0 (using all the ifs). Here we are dealing with the precision of x_n
I've updated the comments in #4902. They are longer, but IMO explain everything clearly without eluding any point or refering to any external ressource. It includes minor variable renaming (for clarity) but no logic change. I'd be willing to merge that version. If @ernestognw and @chgorman can review it, we can then forward this PR to include the commits in 4902, and merge here. |
Actually, that is my goal when iterating. IMO no commit in the history of this PR includes a "clear picture" |
I'm not going to be able to devote time to look at this for at least 36 hours. |
Just reviewed and left my comments in #4902, I think we can move those comments here and merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially felt the comments were too much, and that we shouldn't expect every developer to read the math. However, we agreed the proof is still not immensely large and we will relax the comment length when adding proofs to the Math library, because that also helps auditability.
In regards to @Amxx's comments in regards of gas: Avoiding the case extra if (a >= uint256(type(uint128).max) ** 2)
by bounding the input still gives me more gas consumption in average.
It seems expected because type(uint128).max) ** 2
is leaving a whole range of numbers that are increasing the average. I previoiusly thought that we might not want to optimize for a uniform distribution if the values are frequently in the low uint256's (for example), and the most popular cases I know (eg. Uniswap V2) are in the low uint256 order.
We appreciate the work @chgorman, in fact, challenging the code to be provable has gotten us to this conclusion, and it's a valuable contribution.
Looking good, merging.
A more efficient implementation of integer square root within Solidity; improves upon #3242. Solves Issue #4402.
Proof of validity discussed here.