-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved integer square root: Rewrite comments from #4403 #4902
Conversation
…nd. This optimized gas cost for 'small' values that are not catched by the initial check.
|
contracts/utils/math/Math.sol
Outdated
// In this function, we use Newton's method to get a root of `f(x) := x² - a`. This is also known as | ||
// Heron's method. This involves building a sequence x_n that converges toward sqrt(a). For each | ||
// iteration x_n, we define `ε_n = | x_n - sqrt(a) |`. This represents the error between the current value | ||
// and the result. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo, the mention of the Heron's method here should be at the top, and short. We don't need to prove the method but reference it. Something like:
This method is based on Newton's method for computing square roots using Heron's method; the algorithm is restricted to only using integer operations.
Then I'd remove this block.
contracts/utils/math/Math.sol
Outdated
@@ -339,38 +339,113 @@ library Math { | |||
* @dev Returns the square root of a number. If the number is not a perfect square, the value is rounded | |||
* towards zero. | |||
* | |||
* Inspired by Henry S. Warren, Jr.'s "Hacker's Delight" (Chapter 11). | |||
* This method is based on Newton's method for computing square roots; the algorithm is restricted to only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this is more based on Heron's method than Newton's. I honestly don't see a reason to reference Newton's method if Heron's is more specific.
Note I'm assuming Heron's method is an instance of Newton's method for finding square roots. I don't see how mentioning Newton's improves the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f I understand correctly, this is more based on Heron's method than Newton's. I honestly don't see a reason to reference Newton's method if Heron's is more specific.
Newton's method is a mathematical approach that is very well known and that applies to many function. Heron's method is the name given to Newton's method in the case of one specific function. Heron's method was known long before Newton's method generalized te approach. Most people don't know "Heron's" method by name.
If you think mentionning the two is confusing, than I'd just keep Newton's because that is the more generic and widelly documented of the two. It sound more familiar, and yeld way more search result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we'd be interested in getting the most specific version. But you're right on the search result.
This would justify mentioning both.
contracts/utils/math/Math.sol
Outdated
if (aa >= (1 << 128)) { | ||
aa >>= 128; | ||
xn <<= 64; | ||
} | ||
if (aa >= (1 << 64)) { | ||
aa >>= 64; | ||
xn <<= 32; | ||
} | ||
if (aa >= (1 << 32)) { | ||
aa >>= 32; | ||
xn <<= 16; | ||
} | ||
if (aa >= (1 << 16)) { | ||
aa >>= 16; | ||
xn <<= 8; | ||
} | ||
if (aa >= (1 << 8)) { | ||
aa >>= 8; | ||
xn <<= 4; | ||
} | ||
if (aa >= (1 << 4)) { | ||
aa >>= 4; | ||
xn <<= 2; | ||
} | ||
if (aa >= (1 << 2)) { | ||
xn <<= 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also know that the comments here (e.g. // sqrt(a) >= 2**(e/2)
) are not necessary, but again, the point of the comments is not to write a full proof that "the reader should be able to follow", but rather to write something developers will follow.
If the comments were removed because they were unnecessary, we may perfectly remove all the comments because they're "unnecesary" either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not remove them because I think they are unnecessary. I removed them because they are innacurate and confusing.
e
is a constant. using it to mean a different value at each if
is more confusing than eanything. Also, the line compares aa
(or aAux
, which is a relicate of a
) to some constant values. It does not compare sqrt(a)
contracts/utils/math/Math.sol
Outdated
// We now have x_n such that `x_n = 2**(e-1) ≤ sqrt(a) < 2**e = 2 * x_n`. This implies ε_n ≤ 2**(e-1). | ||
// | ||
// We can refine our estimation by noticing that the the middle of that interval minimizes the error. | ||
// If we move x_n to equal 2**(e-1) + 2**(e-2), then we reduce the error to ε_n ≤ 2**(e-2). | ||
// This is going to be our x_0 (and ε_0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should define the error here instead. It's been ~50 lines since its definition and is not used anywhere else before here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its clearer to define x_n and ε_n together, because they are two vision of the same object, but we could mode that definition here.
contracts/utils/math/Math.sol
Outdated
// From here, we iterate using Heron's method | ||
// x_{n+1} = (x_n + a / x_n) / 2 | ||
// | ||
// One should note that: | ||
// x_{n+1}² - a = ((x_n + a / x_n) / 2)² - a | ||
// = ((x_n² + a) / (2 * x_n))² - a | ||
// = (x_n⁴ + 2 * a * x_n² + a²) / (4 * x_n²) - a | ||
// = (x_n⁴ + 2 * a * x_n² + a² - 4 * a * x_n²) / (4 * x_n²) | ||
// = (x_n⁴ - 2 * a * x_n² + a²) / (4 * x_n²) | ||
// = (x_n² - a)² / (2 * x_n)² | ||
// = ((x_n² - a) / (2 * x_n))² | ||
// ≥ 0 | ||
// Which proves that for all n ≥ 1, sqrt(a) ≤ x_n | ||
// | ||
// This gives us the proof of quadratic convergence of the sequence: | ||
// ε_{n+1} = | x_{n+1} - sqrt(a) | | ||
// = | (x_n + a / x_n) / 2 - sqrt(a) | | ||
// = | (x_n² + a - 2*x_n*sqrt(a)) / (2 * x_n) | | ||
// = | (x_n - sqrt(a))² / (2 * x_n) | | ||
// = | ε_n² / (2 * x_n) | | ||
// = ε_n² / | (2 * x_n) | | ||
// | ||
// For the first iteration, we have a special case where x_0 is known: | ||
// ε_1 = ε_0² / | (2 * x_0) | | ||
// ≤ (2**(e-2))² / (2 * (2**(e-1) + 2**(e-2))) | ||
// ≤ 2**(2*e-4) / (3 * 2**(e-1)) | ||
// ≤ 2**(e-3) / 3 | ||
// ≤ 2**(e-3-log2(3)) | ||
// ≤ 2**(e-4.5) | ||
// | ||
// For the following iterations, we use the fact that, 2**(e-1) ≤ sqrt(a) ≤ x_n | ||
// ε_{n+1} = ε_n² / | (2 * x_n) | | ||
// ≤ (2**(e-k))² / (2 * 2**(e-1)) | ||
// ≤ 2**(2*e-2*k) / 2**e | ||
// ≤ 2**(e-2*k) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the kind of thing that's in Wikipedia. We may as well link an article. I didn't originally because this is the core algorithm, I think the comments should only help follow the algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// For the first iteration, we have a special case where x_0 is known: // ε_1 = ε_0² / | (2 * x_0) | // ≤ (2**(e-2))² / (2 * (2**(e-1) + 2**(e-2))) // ≤ 2**(2*e-4) / (3 * 2**(e-1)) // ≤ 2**(e-3) / 3 // ≤ 2**(e-3-log2(3)) // ≤ 2**(e-4.5)
This part (the thing about log2(3)
). Is not in wikipedia, and its not in the PDF either. I found it nowhere, and yet its the core justification that we can remove an iteration and go from 7 to 6. This is the part that proves that 2**(e-1) + 2**(e-2)
is a good x_0. If we were to keep only one thing in the entire proof, it would probably be that part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's merge this changes to #4403
The more I think about this while reviewing, I'm getting convinced that if we're setting the industry standards in security, it's better for everyone to include a proof if we consider it's valuable and required.
Also, it's easier to keep the proof with the code so they're updated together (if ever), and the only downside is that we're bloating the library with comments, but I think we can relax that requirement for Math algorithms where it also improves auditability.
Closing since it was merged along with #4403 |
Given the amount of changes I think we should do one the comment, and given that I don't want to push all that to #4403 without a discussion, I'm opening this PR to discuss my propositions.
For the comment part I think we should achieve the following goal:
result
for intermediary computation, as it can lead to confusion between the intermediary value and the actual (final) resultI'm happy with what is in this PR right now,
with one exception. There are claims (from #4403) thatε_0 ≤ 2**(e-2)
ε_1 ≤ 2**(e-4.5)
There is not rational logic for this transition. This needs to be resolved.Latest commit explains properly the convergence speed.
Extends: #4403
PR Checklist
npx changeset add
)