Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve primitive biginteger operations #204

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ The main features of this release are:
- Adding new traits for basic curve cycles and pairing based curve cycles

### Breaking changes
- #20 (ark-poly) Move univariate DensePolynomial and SparsePolynomial into a
- #20 (ark-poly) Move univariate DensePolynomial and SparsePolynomial into a
univariate sub-crate. Make this change by:
find w/ regex `ark_poly::(Dense|Sparse)Polynomial`, and replace with `ark_poly::univariate::$1Polynomial`.
- #36 (ark-ec) In Short-Weierstrass curves, include an infinity bit in `ToConstraintField`.
Expand All @@ -31,12 +31,12 @@ The main features of this release are:
- #129 (ark-ff) Move `ark_ff::{UniformRand, test_rng}` to `ark_std::{UniformRand, test_rng}`.
Importing these from `ark-ff` is still possible, but is deprecated and will be removed in the following release.
- #144 (ark-poly) Add `CanonicalSerialize` and `CanonicalDeserialize` trait bounds for `Polynomial`.
- #160 (ark-serialize, ark-ff, ark-ec)
- #160 (ark-serialize, ark-ff, ark-ec)
- Remove `ConstantSerializedSize`; users should use `serialized_size*` (see next).
- Add `serialized_size_with_flags` method to `CanonicalSerializeWithFlags`.
- Add `serialized_size_with_flags` method to `CanonicalSerializeWithFlags`.
- Change `from_random_bytes_with_flags` to output `ark_serialize::Flags`.
- Change signatures of `Flags::from_u8*` to output `Option`.
- Change `Flags::from_u8*` to be more strict about the inputs they accept:
- Change `Flags::from_u8*` to be more strict about the inputs they accept:
if the top bits of the `u8` value do *not* correspond to one of the possible outputs of `Flags::u8_bitmask`, then these methods output `None`, whereas before they output
a default value.
Downstream users other than `ark-curves` should not see breakage unless they rely on these methods/traits explicitly.
Expand Down Expand Up @@ -80,9 +80,10 @@ The main features of this release are:
- #166 (ark-ff) Add a `to_bytes_be()` and `to_bytes_le` methods to `BigInt`.
- #169 (ark-poly) Improve radix-2 FFTs by moving to a faster algorithm by Riad S. Wahby.
- #171, #173, #176 (ark-poly) Apply significant further speedups to the new radix-2 FFT.
- #188 (ark-ec) Make Short Weierstrass random sampling result in an element with unknown discrete log
- #188 (ark-ec) Make Short Weierstrass random sampling result in an element with unknown discrete log.
- #190 (ark-ec) Add curve cycle trait and extended pairing cycle trait for all types of ec cycles.
- #201 (ark-ec, ark-ff, ark-test-curves, ark-test-templates) Remove the dependency on `rand_xorshift`
- #201 (ark-ec, ark-ff, ark-test-curves, ark-test-templates) Remove the dependency on `rand_xorshift`.
- #204 (ark-ff) Improve biginteger arithmetic with unrolling and intrinsics.

### Bug fixes
- #36 (ark-ec) In Short-Weierstrass curves, include an infinity bit in `ToConstraintField`.
Expand Down
4 changes: 2 additions & 2 deletions ff-asm/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ pub fn x86_64_asm_mul(input: TokenStream) -> TokenStream {
let inner_ts: Expr = syn::parse_str(&impl_block).unwrap();
let ts = quote::quote! {
let a = &mut #a;
let b = #b;
let b = &#b;
#inner_ts
};
ts.into()
Expand Down Expand Up @@ -290,7 +290,7 @@ fn generate_impl(num_limbs: usize, is_mul: bool) -> String {
let mut ctx = Context::new();
ctx.add_declaration("a", "r", "a");
if is_mul {
ctx.add_declaration("b", "r", "&b");
ctx.add_declaration("b", "r", "b");
}
ctx.add_declaration("modulus", "r", "&P::MODULUS.0");
ctx.add_declaration("0", "i", "0u64");
Expand Down
82 changes: 68 additions & 14 deletions ff/src/biginteger/macros.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,39 +13,84 @@ macro_rules! bigint_impl {
const NUM_LIMBS: usize = $num_limbs;

#[inline]
#[ark_ff_asm::unroll_for_loops]
fn add_nocarry(&mut self, other: &Self) -> bool {
let mut carry = 0;

for (a, b) in self.0.iter_mut().zip(other.0.iter()) {
*a = adc!(*a, *b, &mut carry);
for i in 0..$num_limbs {
#[cfg(not(use_asm))]
{
self.0[i] = adc!(self.0[i], other.0[i], &mut carry);
}

#[cfg(use_asm)]
#[allow(unsafe_code)]
unsafe {
carry = core::arch::x86_64::_addcarryx_u64(
carry,
self.0[i],
other.0[i],
&mut self.0[i],
);
}
}

carry != 0
}

#[inline]
#[ark_ff_asm::unroll_for_loops]
fn sub_noborrow(&mut self, other: &Self) -> bool {
let mut borrow = 0;

for (a, b) in self.0.iter_mut().zip(other.0.iter()) {
*a = sbb!(*a, *b, &mut borrow);
for i in 0..$num_limbs {
#[cfg(not(use_asm))]
{
self.0[i] = sbb!(self.0[i], other.0[i], &mut borrow);
}

#[cfg(use_asm)]
#[allow(unsafe_code)]
unsafe {
borrow = core::arch::x86_64::_subborrow_u64(
borrow,
self.0[i],
other.0[i],
&mut self.0[i],
);
}
}

borrow != 0
}

#[inline]
#[ark_ff_asm::unroll_for_loops]
fn mul2(&mut self) {
let mut last = 0;
for i in &mut self.0 {
let tmp = *i >> 63;
*i <<= 1;
*i |= last;
last = tmp;
let mut carry = 0;

for i in 0..$num_limbs {
#[cfg(not(use_asm))]
#[allow(unused_assignments)]
{
self.0[i] = adc!(self.0[i], self.0[i], &mut carry);
}

#[cfg(use_asm)]
#[allow(unsafe_code, unused_assignments)]
unsafe {
carry = core::arch::x86_64::_addcarryx_u64(
carry,
self.0[i],
self.0[i],
&mut self.0[i],
);
}
}
}

#[inline]
#[ark_ff_asm::unroll_for_loops]
fn muln(&mut self, mut n: u32) {
if n >= 64 * $num_limbs {
*self = Self::from(0);
Expand All @@ -54,8 +99,8 @@ macro_rules! bigint_impl {

while n >= 64 {
let mut t = 0;
for i in &mut self.0 {
core::mem::swap(&mut t, i);
for i in 0..$num_limbs {
core::mem::swap(&mut t, &mut self.0[i]);
}
n -= 64;
}
Expand Down Expand Up @@ -119,8 +164,14 @@ macro_rules! bigint_impl {
}

#[inline]
#[ark_ff_asm::unroll_for_loops]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to benchmark this change on larger fields as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are probably right. This will currently involve changing a lot of Cargo.toml files to point to my local repo. I'll try to fix some of the issues I find in this process.

fn is_zero(&self) -> bool {
self.0.iter().all(|&e| e == 0)
for i in 0..$num_limbs {
if self.0[i] != 0 {
return false;
}
}
true
}

#[inline]
Expand Down Expand Up @@ -270,8 +321,11 @@ macro_rules! bigint_impl {

impl Ord for $name {
#[inline]
#[ark_ff_asm::unroll_for_loops]
fn cmp(&self, other: &Self) -> ::core::cmp::Ordering {
for (a, b) in self.0.iter().rev().zip(other.0.iter().rev()) {
for i in 0..$num_limbs {
let a = &self.0[$num_limbs - i - 1];
let b = &other.0[$num_limbs - i - 1];
if a < b {
return core::cmp::Ordering::Less;
} else if a > b {
Expand Down
2 changes: 0 additions & 2 deletions ff/src/fields/arithmetic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,6 @@ macro_rules! impl_field_mul_assign {
{
// Tentatively avoid using assembly for `$limbs == 1`.
if $limbs <= 6 && $limbs > 1 {
assert!($limbs <= 6);
ValarDragon marked this conversation as resolved.
Show resolved Hide resolved
ark_ff_asm::x86_64_asm_mul!($limbs, (self.0).0, (other.0).0);
self.reduce();
return;
Expand Down Expand Up @@ -104,7 +103,6 @@ macro_rules! impl_field_square_in_place {
let _no_carry: bool = !(first_bit_set || all_bits_set);

if $limbs <= 6 && _no_carry {
assert!($limbs <= 6);
ark_ff_asm::x86_64_asm_square!($limbs, (self.0).0);
self.reduce();
return self;
Expand Down
4 changes: 2 additions & 2 deletions ff/src/fields/macros.rs
Original file line number Diff line number Diff line change
Expand Up @@ -233,7 +233,7 @@ macro_rules! impl_Fp {
}

impl<P: $FpParameters> $Fp<P> {
#[inline]
#[inline(always)]
pub(crate) fn is_valid(&self) -> bool {
self.0 < P::MODULUS
}
Expand Down Expand Up @@ -605,7 +605,7 @@ macro_rules! impl_Fp {
#[must_use]
fn neg(self) -> Self {
if !self.is_zero() {
let mut tmp = P::MODULUS.clone();
let mut tmp = P::MODULUS;
tmp.sub_noborrow(&self.0);
$Fp::<P>(tmp, PhantomData)
} else {
Expand Down
1 change: 0 additions & 1 deletion ff/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
#![allow(clippy::op_ref, clippy::suspicious_op_assign_impl)]
#![cfg_attr(not(use_asm), forbid(unsafe_code))]
#![cfg_attr(use_asm, feature(llvm_asm))]
#![cfg_attr(use_asm, deny(unsafe_code))]

#[macro_use]
extern crate ark_std;
Expand Down