Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error handling for threaded verification errors #1748

Merged
merged 7 commits into from
Apr 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 12 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -131,8 +131,18 @@ The main benchmarking tool is called `benchy`. `benchy` has several subcommands
> cargo run --release --bin benchy -- window-post --size 2KiB
> cargo run --release --bin benchy -- window-post-fake --size 2KiB --fake
> cargo run --release --bin benchy -- prodbench
# After a preserved cache is generated, this command tests *only* synthetic proof generation
> cargo run --release --bin benchy -- porep --size 2KiB --synthetic --cache /d1/nemo/tmp/cache-2k --skip-precommit-phase1 --skip-precommit-phase2 --skip-commit-phase1 --skip-commit-phase2
#
# Synthetic PoRep examples (for both 2KiB test and 32GiB production sector sizes)
#
# Preserve a cache using synthetic-porep (2KiB sectors)
> cargo run --release --bin benchy -- porep --size 2KiB --cache /tmp/cache-2k-synthetic --preserve-cache --api-features synthetic-porep
# Preserve a cache using synthetic-porep (32GiB sectors)
> cargo run --release --bin benchy -- porep --size 32GiB --cache /tmp/cache-32g-synthetic --preserve-cache --api-features synthetic-porep
#
# After a preserved cache is generated, this command tests *only* synthetic proof generation (2KiB sectors)
> cargo run --release --bin benchy -- porep --size 2KiB --cache /tmp/cache-2k-synthetic --api-features synthetic-porep --skip-precommit-phase1 --skip-precommit-phase2 --skip-commit-phase1 --skip-commit-phase2
# After a preserved cache is generated, this command tests *only* synthetic proof generation (32GiB sectors)
> cargo run --release --bin benchy -- porep --size 32GiB --cache /tmp/cache-32g-synthetic --api-features synthetic-porep --skip-precommit-phase1 --skip-precommit-phase2 --skip-commit-phase1 --skip-commit-phase2
```

### Window PoSt Bench usages
Expand Down
156 changes: 122 additions & 34 deletions storage-proofs-porep/src/stacked/vanilla/proof.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ use std::io::{BufReader, BufWriter};
use std::marker::PhantomData;
use std::panic::panic_any;
use std::path::{Path, PathBuf};
use std::sync::Mutex;
use std::sync::{Arc, Mutex};

use anyhow::{ensure, Context};
use bincode::deserialize;
Expand All @@ -20,6 +20,8 @@ use merkletree::{
use rayon::prelude::{
IndexedParallelIterator, IntoParallelIterator, ParallelIterator, ParallelSliceMut,
};
#[cfg(any(feature = "cuda", feature = "multicore-sdr", feature = "opencl"))]
use storage_proofs_core::settings::SETTINGS;
use storage_proofs_core::{
cache_key::CacheKey,
data::Data,
Expand All @@ -31,7 +33,6 @@ use storage_proofs_core::{
split_config_and_replica, BinaryMerkleTree, DiskTree, LCTree, MerkleProofTrait,
MerkleTreeTrait,
},
settings::SETTINGS,
util::{default_rows_to_discard, NODE_SIZE},
};
use yastl::Pool;
Expand All @@ -55,6 +56,17 @@ use crate::{

pub const TOTAL_PARENTS: usize = 37;

struct InvalidEncodingProofCoordinate {
cryptonemo marked this conversation as resolved.
Show resolved Hide resolved
failure_detected: bool,
layer: usize,
challenge_index: usize,
}

struct InvalidChallengeCoordinate {
failure_detected: bool,
challenge_index: usize,
}

lazy_static! {
/// Ensure that only one `TreeBuilder` or `ColumnTreeBuilder` uses the GPU at a time.
/// Curently, this is accomplished by only instantiating at most one at a time.
Expand Down Expand Up @@ -140,8 +152,7 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
Challenges::Synth(synth_challenges) => {
// If there are no synthetic vanilla proofs stored on disk yet, generate them.
if pub_inputs.seed.is_none() {
info!("generating synthetic vanilla proofs in a single partition");
assert_eq!(partition_count, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I'm not seeing why this one assert was removed, but I won't let it delay approval.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm, I'm looking at #1750 now.

Copy link
Collaborator Author

@cryptonemo cryptonemo Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I'm not seeing why this one assert was removed, but I won't let it delay approval.

Adding it was an error. The partition count is only 1 for test sectors in porep, and the logging comment implied the incorrect thing (i.e that the single pass of synth porep proof generation implies that the partition count must be 1, which is wrong; it just meant we were not generating synth porep proofs with regard to challenges being partition specific like in other places).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^Fortunately, it was never released with that assert. It came up quickly in overall testing while looking into this issue though.

info!("generating all required synthetic vanilla proofs");

let comm_r = pub_inputs.tau.as_ref().expect("tau is set").comm_r;
// Derive the set of challenges we are proving over.
Expand Down Expand Up @@ -264,6 +275,21 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
.collect()
};

// Error propagation mechanism for scoped parallel verification.
let invalid_encoding_proof = Arc::new(Mutex::new(InvalidEncodingProofCoordinate {
failure_detected: false,
layer: 0,
challenge_index: 0,
}));
let invalid_comm_d = Arc::new(Mutex::new(InvalidChallengeCoordinate {
failure_detected: false,
challenge_index: 0,
}));
let invalid_comm_r = Arc::new(Mutex::new(InvalidChallengeCoordinate {
failure_detected: false,
challenge_index: 0,
}));

THREAD_POOL.scoped(|scope| {
// Stacked commitment specifics
challenges
Expand All @@ -280,10 +306,18 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
.expect("failed to get tree_d")
.gen_proof(challenge)?;

let comm_d_proof_inner = comm_d_proof.clone();
let challenge_inner = challenge;
let comm_d_proof_inner = comm_d_proof.clone();
let invalid_comm_d_inner = Arc::clone(&invalid_comm_d);
scope.execute(move || {
assert!(comm_d_proof_inner.validate(challenge_inner));
if !comm_d_proof_inner.validate(challenge_inner) {
let mut invalid = invalid_comm_d_inner.lock().expect("failed to get lock on invalid_comm_d_inner");
*invalid = InvalidChallengeCoordinate {
failure_detected: true,
challenge_index,
};
error!("Invalid comm_d detected at challenge index {}", challenge_index);
}
});

// Stacked replica column openings
Expand Down Expand Up @@ -329,8 +363,16 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
)?;

let comm_r_last_proof_inner = comm_r_last_proof.clone();
let invalid_comm_r_inner = Arc::clone(&invalid_comm_r);
scope.execute(move || {
debug_assert!(comm_r_last_proof_inner.validate(challenge));
if !comm_r_last_proof_inner.validate(challenge) {
let mut invalid = invalid_comm_r_inner.lock().expect("failed to get lock on invalid_comm_r_inner");
*invalid = InvalidChallengeCoordinate {
failure_detected: true,
challenge_index: challenge,
};
error!("Invalid comm_r detected at challenge index {}", challenge);
}
});

// Labeling Proofs Layer 1..l
Expand Down Expand Up @@ -383,13 +425,19 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
let labeled_node = *rcp.c_x.get_node_at_layer(layer)?;
let replica_id = &pub_inputs.replica_id;
let proof_inner = proof.clone();
let invalid_encoding_proof_inner = Arc::clone(&invalid_encoding_proof);
scope.execute(move || {
assert!(
proof_inner.verify(replica_id, &labeled_node),
"Invalid encoding proof generated at layer {}",
layer,
);
trace!("Valid encoding proof generated at layer {}", layer);
if !proof_inner.verify(replica_id, &labeled_node) {
cryptonemo marked this conversation as resolved.
Show resolved Hide resolved
let mut invalid = invalid_encoding_proof_inner.lock().expect("failed to get lock on invalid_encoding_proof_inner");
*invalid = InvalidEncodingProofCoordinate {
failure_detected: true,
layer,
challenge_index,
};
error!("Invalid encoding proof generated at layer {}, challenge index {}", layer, challenge_index);
} else {
trace!("Valid encoding proof generated at layer {}", layer);
}
});
}

Expand All @@ -402,6 +450,17 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
parents_data_full,
));
}

// Check if a proof was detected as invalid
let invalid_comm_d_coordinate = invalid_comm_d.lock().expect("failed to get lock on invalid_comm_d");
ensure!(!invalid_comm_d_coordinate.failure_detected, "Invalid comm_d detected at challenge_index {}",
invalid_comm_d_coordinate.challenge_index);
let invalid_comm_r_coordinate = invalid_comm_r.lock().expect("failed to get lock on invalid_comm_r");
ensure!(!invalid_comm_r_coordinate.failure_detected, "Invalid comm_r detected at challenge_index {}",
invalid_comm_r_coordinate.challenge_index);
let invalid_encoding_proof_coordinate = invalid_encoding_proof.lock().expect("failed to get lock on invalid_encoding_proof");
ensure!(!invalid_encoding_proof_coordinate.failure_detected, "Invalid encoding proof generated at layer {}, challenge_index {}",
invalid_encoding_proof_coordinate.layer, invalid_encoding_proof_coordinate.challenge_index);
}

Ok(Proof {
Expand Down Expand Up @@ -431,40 +490,69 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
"comm_r must be set prior to generating synthetic challenges",
);

let invalid_synth_porep_proof = Arc::new(Mutex::new(InvalidChallengeCoordinate {
failure_detected: false,
challenge_index: 0,
}));

// Verify synth proofs prior to writing because `ProofScheme`'s verification API is not
// amenable to prover-only verification (i.e. the API uses public values, whereas synthetic
// proofs are known only to the prover).
let pub_params = PublicParams::<Tree>::new(
graph.clone(),
Challenges::Synth(challenges.clone()),
num_layers,
);
let replica_id: Fr = pub_inputs.replica_id.into();
let comm_r: Fr = pub_inputs
.tau
.as_ref()
.map(|tau| tau.comm_r.into())
.expect("unwrapping should not fail");
let synth_challenges = SynthChallengeGenerator::default(graph.size(), &replica_id, &comm_r);
ensure!(
synth_proofs.len() == synth_challenges.num_synth_challenges,
"Mismatched synth porep proofs for the required challenge set"
);

THREAD_POOL.scoped(|scope| {
// Verify synth proofs prior to writing because `ProofScheme`'s verification API is not
// amenable to prover-only verification (i.e. the API uses public values, whereas synthetic
// proofs are known only to the prover).
let pub_params = PublicParams::<Tree>::new(
graph.clone(),
Challenges::Synth(challenges.clone()),
num_layers,
);
let replica_id: Fr = pub_inputs.replica_id.into();
let comm_r: Fr = pub_inputs
.tau
.as_ref()
.map(|tau| tau.comm_r.into())
.expect("unwrapping should not fail");
let synth_challenges =
SynthChallengeGenerator::default(graph.size(), &replica_id, &comm_r);
assert_eq!(synth_proofs.len(), synth_challenges.num_synth_challenges);
for (challenge, proof) in synth_challenges.zip(synth_proofs) {
let proof_inner = proof.clone();
let challenge_inner = challenge;
let pub_params_inner = pub_params.clone();
let pub_inputs_inner = pub_inputs.clone();
let invalid_synth_porep_proof_inner = Arc::clone(&invalid_synth_porep_proof);
scope.execute(move || {
assert!(proof_inner.verify(
if !proof_inner.verify(
&pub_params_inner,
&pub_inputs_inner,
challenge_inner,
graph
));
graph,
) {
let mut invalid = invalid_synth_porep_proof_inner
.lock()
.expect("failed to get lock on invalid_synth_porep_proof_inner");
*invalid = InvalidChallengeCoordinate {
failure_detected: true,
challenge_index: challenge_inner,
};
error!(
"Invalid synth porep proof generated at challenge index {}",
challenge_inner
);
}
});
}
});

let invalid_synth_porep_proof_coordinate = invalid_synth_porep_proof
.lock()
.expect("failed to get lock on invalid_synth_porep_proof");
ensure!(
!invalid_synth_porep_proof_coordinate.failure_detected,
"Invalid synth_porep proof generated at challenge_index {}",
invalid_synth_porep_proof_coordinate.challenge_index
);
info!("writing synth-porep vanilla proofs to file: {:?}", path);
let file = File::create(&path).map(BufWriter::new).with_context(|| {
format!(
Expand Down Expand Up @@ -744,7 +832,7 @@ impl<'a, Tree: 'static + MerkleTreeTrait, G: 'static + Hasher> StackedDrg<'a, Tr
TreeArity: PoseidonArity,
{
use std::cmp::min;
use std::sync::{mpsc::sync_channel as channel, Arc, RwLock};
use std::sync::{mpsc::sync_channel as channel, RwLock};

use fr32::fr_into_bytes;
use generic_array::GenericArray;
Expand Down
2 changes: 1 addition & 1 deletion storage-proofs-update/src/vanilla.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,6 @@ use storage_proofs_core::{
},
parameter_cache::ParameterSetMetadata,
proof::ProofScheme,
settings::SETTINGS,
};
use storage_proofs_porep::stacked::{StackedDrg, TreeRElementData};

Expand Down Expand Up @@ -356,6 +355,7 @@ pub fn prepare_tree_r_data<Tree: 'static + MerkleTreeTrait>(
start: usize,
end: usize,
) -> Result<TreeRElementData<Tree>> {
use storage_proofs_core::settings::SETTINGS;
let tree_data = source
.read_range(start..end)
.expect("failed to read from source");
Expand Down