-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Add BigNumber Type and syscalls for BPF programs #17082
Conversation
@jackcmay Second commit should address the sanity failure |
@joncinque @jstarry I think you guys have been using big numbers recently? If so, do you see a syscall for these types of things as a good solution? |
Yes, definitely. Had to make some compromises on the amount of precision used in token-lending in order to reduce compute cost of exponentiation, multiplication, and division. This is super useful if the compute cost is reasonable. |
@jackcmay From a PR completeness standpoint, is there anything else I should be adding to this? |
This looks like a great set of syscalls to me. I'll echo @jstarry 's caveat, that as long as the compute usage is reasonable, these will definitely be useful. How were the current numbers decided? |
programs/bpf_loader/Cargo.toml
Outdated
@@ -21,6 +21,9 @@ solana-runtime = { path = "../../runtime", version = "=1.7.0" } | |||
solana-sdk = { path = "../../sdk", version = "=1.7.0" } | |||
solana_rbpf = "=0.2.8" | |||
thiserror = "1.0" | |||
hkdf = "0.11.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: nice to keep these in alphabetical order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 03af751
programs/bpf_loader/src/syscalls.rs
Outdated
@@ -1,5 +1,8 @@ | |||
use crate::{alloc, BpfError}; | |||
use alloc::Alloc; | |||
use blake3::traits::digest::Digest; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using cargo fmt
on these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 03af751
sdk/program/src/bignum.rs
Outdated
BorshSerialize, | ||
BorshDeserialize, | ||
BorshSchema, | ||
// Clone, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what rustfmt does, although I did find one other change with rustfmt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is clone commented out? Do you not want to support the default implementation of clone
? If not, remove this line.
sdk/program/src/bignum.rs
Outdated
|
||
// Syscall interfaces | ||
#[cfg(target_arch = "bpf")] | ||
extern "C" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stick these in their representative functions below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 03af751
{ | ||
use openssl::bn::BigNum; | ||
let braw: &BigNum = unsafe { &*(self.0 as *mut BigNum) }; | ||
write!(f, "{}", braw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does bignum not already implement default?
sdk/program/src/bignum.rs
Outdated
} | ||
|
||
impl AsRef<u64> for BigNumber { | ||
fn as_ref(&self) -> &u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackcmay No, it will always return the u64 of the BigNumber structure.0
sdk/program/src/bignum.rs
Outdated
} | ||
|
||
/// Drop - removes the underlying BigNum | ||
impl Drop for BigNumber { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that if someone hacks thier program they could force a leaked bignum in the runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need clarity here...
} | ||
|
||
#[cfg(test)] | ||
mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to see more comprehensive testing. Also would need testing from within a bpf program, check out the sha program in this or for an example: #16498
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added BPF program testing in 2b1f3e1
@@ -934,901 +1150,1892 @@ impl<'a> SyscallObject<BpfError> for SyscallSha256<'a> { | |||
} | |||
} | |||
|
|||
fn get_sysvar<T: std::fmt::Debug + Sysvar + SysvarId>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why github is breaking the diff up like this but it's making it very hard to read
let rwptr = Box::into_raw(bbox); | ||
let bignum_ptr = rwptr as u64; | ||
*big_number = bignum_ptr; | ||
question_mark!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should consume first so that the call fails before doing any work if the caller doesn't have enough units left
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved consumes to early as possible depending on when size is available to calculate byte costs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 03af751
programs/bpf_loader/src/syscalls.rs
Outdated
); | ||
let bytes: f64 = byte_slice.len() as f64; | ||
let bbox = Box::new(BigNum::from_slice(byte_slice).unwrap()); | ||
let rwptr = Box::into_raw(bbox); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This screems memory leak, is it possible to create a container in program space (above the syscall) and just have the syscall fill in the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe pass back up a byte array and then call from_slice()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May need clarity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The model here looks like it is allocating a bignum type in the syscall and then passing back a raw pointer, which in program space is converted into a bignum type. This means the allocation is occurring in the runtime and relying on the program to trigger the drop. A malicious program could skip the drop resulting in a memory leak in the runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use from_bytes
and to_bytes
to pass a byte array between the two. That does lead to local allocations on both ends which isn't ideal but better then a possible memory leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackcmay So, remember I am a Rust n00b:
- Instead of boxing and raw-pointers, send and receive byte arrays to/from syscalls and in the syscalls instantiate the BigNum's?
- Is there a way to pass in a Vec and
translate
it to type in a syscall? I've had trouble and can't seem to make it work. - The reason for the Vec is that BigNum are ultimately variable length and can get quite large so knowing fixed buffers a priori would be daunting.
sdk/src/process_instruction.rs
Outdated
@@ -188,6 +226,25 @@ impl BpfComputeBudget { | |||
max_cpi_instruction_size: 1280, // IPv6 Min MTU size | |||
cpi_bytes_per_unit: 250, // ~50MB at 200,000 units | |||
sysvar_base_cost: 100, | |||
bignum_new_base_cost: 100, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the sense that these costs are too small. By chance have you done any perf measurements of the big-num calls on native?
The compute unit calculations are different for each of the following classes of operation.
Big number operations. The base cost of 100 was chosen as that is what it cost to log and we really do not know how else to select or derive this number. Big number arithmetic operations. The compute unit calculations are based on this (Ethereum precompile for big arithmetic issue] (ethereum/EIPs#101). The formula is GADDSUBBASE + GARITHWORD * ceil( / 32). The value for GARITHWORD is 6. The base cost of an operation GADDSUBBASE we took by logging the compute units required to perform the same operation in a Solana program using u64 numbers which is 15 or 30 units. Big number modular arithmetic operations. The unit calculation uses the same formula from the Ethereum precompile for big arithmetic but the base cost is the cost of two operations in a Solana program on u64 numbers. For example, mod_mul base cost is the cost of a multiply operation plus the cost of a mod operation. We did not use the [Ethereum Big integer modular exponentiation precompile](EIP-198: Big integer modular exponentiation (ethereum.org)) formula for the mod_exp calculation but this is something we are looking for feedback on. Hash operations. For hash operations we used the same formula from the Ethereum precompile for big arithmetic but the base costs vary based on the number of sub-operations. The blake3 digest base cost is the same as the SHA256 digest base cost. The hash generator has more sub-operatinos then a digest operation. The hash to prime operation has a variable number of loops which is a concern as I do not know the maximum number of loops required to find the next prime number if the digest is not a prime number. This is the most expensive hash operation and we wanted feedback on how to calculate compute units. It would not be acceptable for a hash to prime operation to fail for a 32 byte number. When we were developing our program we moved all the operations in our RSA accumulator membership and non-membership proof verification to sys calls if an instruction resulted in us exceeding the 200_000 compute unit limit. When our program was able to run within the 200_000 compute units we tweaked the cost calculations so that we could still execute our proof verifications without exceeding the unit limit. It may be the case that the current Solana compute unit limits and the cost of executing the operations required for RSA accumulator proof verification make it impossible to execute these types of programs on Solana but I’m hoping that this is not the case. |
The solang compiler https://github.com/hyperledger-labs/solang could possibly use bignum operations if they are efficient enough for math of integer types > 64. However there are some issues with that.
|
@jackcmay - Checked in that addresses review comments I marked with 👍 , commented where I was not sure or did not address yet |
@seanyoung these big number operations are useful for cryptographic protocols such as the RSA accumulator we are using. There is very little conversion between native numbers and big numbers in this protocol. Some RSA accumulator implementations use fixed width integers, u256 and u512, usually for hashing operations but we did not take that approach in this iteration. |
@FrankC01 Does this test pass for you locally:
|
Updating from solana master
@jackcmay When I run either (in solana/programs/bpf) : or
I do not see that error locally. And here is what I captured from running:
|
Hmm, I also don't see that error when running the exact test that CI is running :-( I kicked CI, will try and resolve tomorrow |
@seanyoung is the overflow checks something that can be added later or must it be part of these initial syscalls to meet Solang's requirements? |
@jackcmay It seems like a different error now (in coverage)... other PRs seem to have issues as well. Are any actually getting through? |
@jackcmay The syscalls in this pull request do not provide fixed width integer arithmetic. For this you would need to use something like unit provided by parity. https://crates.io/crates/uint. These syscalls rely on OpenSSL to catch underflow and overflow errors. |
@FrankC01 Are you rebased up to the latest master? |
Before each checkin i do |
The current implementations use only the id and disregard other fields, in particular wallclock. This can lead to bugs where an outdated contact-info shadows or overrides a current one because they compare equal.
* improve insert into map initially * rework towards single code path * rename * update test
* don't log shrink metrics on first call * simplify logic
- upgrade rustc to 1.52.1 and clang to 12.0
@jackcmay @arthurgreef @seanyoung - Folks, I'm closing this PR as I had completely messed up the works. I will open a new Draft PR and reference this for the background Apologies |
@jackcmay - As per your comments... for review...
Problem
BigNum operations are not available in vanilla Solana for BPF programs. For big and prime numbers it is desirable to expose these capabilities to BPF programs.
In addition, hashing to a prime number is valuable as well.
Summary of Changes
Fixes #