Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved integer square root: Rewrite comments from #4403 #4902

Closed
wants to merge 27 commits into from

Conversation

Amxx
Copy link
Collaborator

@Amxx Amxx commented Feb 15, 2024

Given the amount of changes I think we should do one the comment, and given that I don't want to push all that to #4403 without a discussion, I'm opening this PR to discuss my propositions.

For the comment part I think we should achieve the following goal:

  • clarity of notation:
    • do not use result for intermediary computation, as it can lead to confusion between the intermediary value and the actual (final) result
    • all name should only represent a single value, that does not change through computation, unless explicitelly describes as a sequence, in which case the name representes the latest element of the sequence (so far)
  • locality of the explanation:
    • We should avoid refering to external documents if the math can be written down in the comments. I'd rather have 10 lines of clean math here than having to go look somewhere else.
    • If the explananation is too complexe to be inlined, then any external reference should be available online, and we should point to it.
  • no tricks, no shortcut

I'm happy with what is in this PR right now, with one exception. There are claims (from #4403) that

  • ε_0 ≤ 2**(e-2)
  • ε_1 ≤ 2**(e-4.5)
    There is not rational logic for this transition. This needs to be resolved.

Latest commit explains properly the convergence speed.

Extends: #4403

PR Checklist

  • Tests
  • Documentation
  • Changeset entry (run npx changeset add)

Copy link

changeset-bot bot commented Feb 15, 2024

⚠️ No Changeset found

Latest commit: b20585a

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Comment on lines 352 to 355
// In this function, we use Newton's method to get a root of `f(x) := x² - a`. This is also known as
// Heron's method. This involves building a sequence x_n that converges toward sqrt(a). For each
// iteration x_n, we define `ε_n = | x_n - sqrt(a) |`. This represents the error between the current value
// and the result.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo, the mention of the Heron's method here should be at the top, and short. We don't need to prove the method but reference it. Something like:

This method is based on Newton's method for computing square roots using Heron's method; the algorithm is restricted to only using integer operations.

Then I'd remove this block.

contracts/utils/math/Math.sol Outdated Show resolved Hide resolved
@@ -339,38 +339,113 @@ library Math {
* @dev Returns the square root of a number. If the number is not a perfect square, the value is rounded
* towards zero.
*
* Inspired by Henry S. Warren, Jr.'s "Hacker's Delight" (Chapter 11).
* This method is based on Newton's method for computing square roots; the algorithm is restricted to only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this is more based on Heron's method than Newton's. I honestly don't see a reason to reference Newton's method if Heron's is more specific.

Note I'm assuming Heron's method is an instance of Newton's method for finding square roots. I don't see how mentioning Newton's improves the explanation.

Copy link
Collaborator Author

@Amxx Amxx Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f I understand correctly, this is more based on Heron's method than Newton's. I honestly don't see a reason to reference Newton's method if Heron's is more specific.

Newton's method is a mathematical approach that is very well known and that applies to many function. Heron's method is the name given to Newton's method in the case of one specific function. Heron's method was known long before Newton's method generalized te approach. Most people don't know "Heron's" method by name.

If you think mentionning the two is confusing, than I'd just keep Newton's because that is the more generic and widelly documented of the two. It sound more familiar, and yeld way more search result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we'd be interested in getting the most specific version. But you're right on the search result.

This would justify mentioning both.

contracts/utils/math/Math.sol Outdated Show resolved Hide resolved
contracts/utils/math/Math.sol Outdated Show resolved Hide resolved
Comment on lines 368 to 394
if (aa >= (1 << 128)) {
aa >>= 128;
xn <<= 64;
}
if (aa >= (1 << 64)) {
aa >>= 64;
xn <<= 32;
}
if (aa >= (1 << 32)) {
aa >>= 32;
xn <<= 16;
}
if (aa >= (1 << 16)) {
aa >>= 16;
xn <<= 8;
}
if (aa >= (1 << 8)) {
aa >>= 8;
xn <<= 4;
}
if (aa >= (1 << 4)) {
aa >>= 4;
xn <<= 2;
}
if (aa >= (1 << 2)) {
xn <<= 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also know that the comments here (e.g. // sqrt(a) >= 2**(e/2)) are not necessary, but again, the point of the comments is not to write a full proof that "the reader should be able to follow", but rather to write something developers will follow.

If the comments were removed because they were unnecessary, we may perfectly remove all the comments because they're "unnecesary" either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not remove them because I think they are unnecessary. I removed them because they are innacurate and confusing.

e is a constant. using it to mean a different value at each if is more confusing than eanything. Also, the line compares aa (or aAux, which is a relicate of a) to some constant values. It does not compare sqrt(a)

Comment on lines 396 to 400
// We now have x_n such that `x_n = 2**(e-1) ≤ sqrt(a) < 2**e = 2 * x_n`. This implies ε_n ≤ 2**(e-1).
//
// We can refine our estimation by noticing that the the middle of that interval minimizes the error.
// If we move x_n to equal 2**(e-1) + 2**(e-2), then we reduce the error to ε_n ≤ 2**(e-2).
// This is going to be our x_0 (and ε_0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should define the error here instead. It's been ~50 lines since its definition and is not used anywhere else before here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its clearer to define x_n and ε_n together, because they are two vision of the same object, but we could mode that definition here.

Comment on lines 403 to 437
// From here, we iterate using Heron's method
// x_{n+1} = (x_n + a / x_n) / 2
//
// One should note that:
// x_{n+1}² - a = ((x_n + a / x_n) / 2)² - a
// = ((x_n² + a) / (2 * x_n))² - a
// = (x_n⁴ + 2 * a * x_n² + a²) / (4 * x_n²) - a
// = (x_n⁴ + 2 * a * x_n² + a² - 4 * a * x_n²) / (4 * x_n²)
// = (x_n⁴ - 2 * a * x_n² + a²) / (4 * x_n²)
// = (x_n² - a)² / (2 * x_n)²
// = ((x_n² - a) / (2 * x_n))²
// ≥ 0
// Which proves that for all n ≥ 1, sqrt(a) ≤ x_n
//
// This gives us the proof of quadratic convergence of the sequence:
// ε_{n+1} = | x_{n+1} - sqrt(a) |
// = | (x_n + a / x_n) / 2 - sqrt(a) |
// = | (x_n² + a - 2*x_n*sqrt(a)) / (2 * x_n) |
// = | (x_n - sqrt(a))² / (2 * x_n) |
// = | ε_n² / (2 * x_n) |
// = ε_n² / | (2 * x_n) |
//
// For the first iteration, we have a special case where x_0 is known:
// ε_1 = ε_0² / | (2 * x_0) |
// ≤ (2**(e-2))² / (2 * (2**(e-1) + 2**(e-2)))
// ≤ 2**(2*e-4) / (3 * 2**(e-1))
// ≤ 2**(e-3) / 3
// ≤ 2**(e-3-log2(3))
// ≤ 2**(e-4.5)
//
// For the following iterations, we use the fact that, 2**(e-1) ≤ sqrt(a) ≤ x_n
// ε_{n+1} = ε_n² / | (2 * x_n) |
// ≤ (2**(e-k))² / (2 * 2**(e-1))
// ≤ 2**(2*e-2*k) / 2**e
// ≤ 2**(e-2*k)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of thing that's in Wikipedia. We may as well link an article. I didn't originally because this is the core algorithm, I think the comments should only help follow the algorithm.

Copy link
Collaborator Author

@Amxx Amxx Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

       // For the first iteration, we have a special case where x_0 is known:
       // ε_1 = ε_0² / | (2 * x_0) |
       //     ≤ (2**(e-2))² / (2 * (2**(e-1) + 2**(e-2)))
       //     ≤ 2**(2*e-4) / (3 * 2**(e-1))
       //     ≤ 2**(e-3) / 3
       //     ≤ 2**(e-3-log2(3))
       //     ≤ 2**(e-4.5)

This part (the thing about log2(3)). Is not in wikipedia, and its not in the PDF either. I found it nowhere, and yet its the core justification that we can remove an iteration and go from 7 to 6. This is the part that proves that 2**(e-1) + 2**(e-2) is a good x_0. If we were to keep only one thing in the entire proof, it would probably be that part.

ernestognw
ernestognw previously approved these changes Feb 16, 2024
Copy link
Member

@ernestognw ernestognw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's merge this changes to #4403

The more I think about this while reviewing, I'm getting convinced that if we're setting the industry standards in security, it's better for everyone to include a proof if we consider it's valuable and required.

Also, it's easier to keep the proof with the code so they're updated together (if ever), and the only downside is that we're bloating the library with comments, but I think we can relax that requirement for Math algorithms where it also improves auditability.

@ernestognw
Copy link
Member

Closing since it was merged along with #4403

@ernestognw ernestognw closed this Feb 16, 2024
@Amxx Amxx deleted the math/sqrt-root-comments branch February 20, 2024 09:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants