A verifiable nearest-neighbor search program, powered by SP1.
This project was built for the Aligned Builders Hackathon:
The information requested by the judging criteria is provided in the project link above.
You need the following:
- Rust for everything
- SP1 for zkVM
- Ollama for embeddings (optional)
- Aligned SDK for proof verification
You can create a new wallet and store it directly in a randomly named keystore file with:
PRIV_KEY=$(cast wallet new | grep "Private key" | cut -d ' ' -f 3)
# will prompt for a password, can be left empty
cast wallet import --private-key $PRIV_KEY -k ./secrets/ $RANDOM.json
It will prompt for a password, and then output the keystore under the secrets folder.
You can view the respective address of a keystore file with:
# will prompt for the password
cast wallet address --keystore ./secrets/wallet.json
First, deposit some funds to Aligned Layer (skip this if you have already done so):
./aligned.sh deposit ./path/to/keystore.json
This will print a transaction hash, which can be viewed at https://holesky.etherscan.io/tx/<hash-here>.
You can check your balance with:
./aligned.sh balance ./path/to/keystore.json
Warning
You need to have Ollama running on localhost:11434 (the default) to run the commands in this section.
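Before running the embedder, it can help to confirm that something is listening on the Ollama port. The following is a minimal sketch using only the Rust standard library; the helper name is_reachable is hypothetical and not part of this project, and the check only proves that a TCP port is open, not that the service behind it is Ollama.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` (e.g. "localhost:11434")
/// can be opened within `timeout`. This only checks that something is
/// listening on the port, not that it is actually Ollama.
fn is_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.to_socket_addrs() {
        // Try every resolved address (localhost may resolve to IPv4 and IPv6).
        Ok(mut addrs) => addrs.any(|a| TcpStream::connect_timeout(&a, timeout).is_ok()),
        // The address could not be parsed or resolved at all.
        Err(_) => false,
    }
}

fn main() {
    // Default Ollama address, as stated in the warning above.
    let addr = "localhost:11434";
    if is_reachable(addr, Duration::from_secs(1)) {
        println!("something is listening on {addr}");
    } else {
        println!("nothing is listening on {addr}; start Ollama first");
    }
}
```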
The repository comes with existing embeddings under the data folder, in the files with the .index.json extension. For this project, each data item has the following type:
{
name: string;
description: string;
}
You can create your own embeddings as follows:
cargo run --bin vnns-embedder index -p ./path/to/data.json
# will output ./path/to/data.index.json
To generate a query vector to be used within a proof, use the following command:
cargo run --bin vnns-embedder query -p ./path/to/data.json -t "your text to be converted here"
# will output ./path/to/data.query.json
This saves the vector itself within the JSON file, which the prover reads from disk.
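The output file names above follow one pattern: the final .json extension of the data file is replaced with .index.json or .query.json. As a small illustration (not the embedder's actual code), the same naming can be reproduced with the standard library; derived_path is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Replaces the final extension of `data` with `ext`,
/// e.g. `data.json` + `index.json` -> `data.index.json`.
fn derived_path(data: &Path, ext: &str) -> PathBuf {
    data.with_extension(ext)
}

fn main() {
    let data = Path::new("./path/to/data.json");
    // Where the embedder writes the index and the query vector.
    println!("{}", derived_path(data, "index.json").display());
    println!("{}", derived_path(data, "query.json").display());
}
```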
To build the VNNS program, run the following command:
cd program
cargo prove build --elf-name riscv32im-succinct-vnns-elf
To build the aggregator program:
cd aggregator
cargo prove build --elf-name riscv32im-succinct-aggregator-elf
To run the program without generating a proof:
RUST_LOG=info cargo run --bin vnns-script --release -- --execute --path ./data/foods-small.json
This will execute the program and display the output.
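For intuition about what is being executed, here is a minimal sketch of a brute-force nearest-neighbor search over f32 vectors. It assumes squared Euclidean distance as the similarity measure, which this README does not specify; the function names are illustrative and are not taken from the VNNS program.

```rust
/// Squared Euclidean distance between two equally sized vectors.
/// (Assumed metric; the actual program may use a different one.)
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the index of the sample closest to `query`,
/// or None if `samples` is empty.
fn nearest(samples: &[Vec<f32>], query: &[f32]) -> Option<usize> {
    samples
        .iter()
        .enumerate()
        .map(|(i, s)| (i, squared_distance(s, query)))
        // total_cmp gives a total order on f32, so NaN cannot cause a panic.
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

fn main() {
    let samples = vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![5.0, 5.0]];
    let query = [0.9, 1.2];
    // The second sample (index 1) is the closest one.
    println!("{:?}", nearest(&samples, &query)); // prints Some(1)
}
```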
To generate a core proof for your program:
RUST_LOG=info cargo run --bin vnns-script --release -- --prove --path ./data/foods-small.json
This will generate several proofs (the number depends on the file size and the batch size) and store them in the same directory as the file given in --path. To see which text the result belongs to, copy the Output Commitment from the console and look up the item in the vector index whose hash matches that commitment.
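The lookup by commitment can be pictured as follows. This sketch makes several assumptions that the README does not confirm: the Entry type is hypothetical, the commitment is taken to be a hash over the embedding vector only, and std's DefaultHasher is used purely as a stand-in for whatever hash function the project actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical index entry: the item's name plus its embedding vector.
struct Entry {
    name: String,
    vector: Vec<f32>,
}

/// Stand-in commitment: a 64-bit hash over the vector's little-endian bytes.
/// The real project may use a different (cryptographic) hash and encoding.
fn commitment(vector: &[f32]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for v in vector {
        hasher.write(&v.to_le_bytes());
    }
    hasher.finish()
}

/// Finds the entry whose vector hashes to `target`.
fn find_by_commitment(index: &[Entry], target: u64) -> Option<&Entry> {
    index.iter().find(|e| commitment(&e.vector) == target)
}

fn main() {
    // Made-up entries, for illustration only.
    let index = vec![
        Entry { name: "apple".into(), vector: vec![0.1, 0.2] },
        Entry { name: "bread".into(), vector: vec![0.7, 0.3] },
    ];
    // Pretend this value was copied from the prover's console output.
    let target = commitment(&[0.7, 0.3]);
    let found = find_by_commitment(&index, target).map(|e| e.name.as_str());
    println!("{:?}", found); // prints Some("bread")
}
```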
Tip
You can configure the batch size with the --batch-size <number> argument; the default is 4.
Keep the batch size small, especially when the vectors are large (thousands of elements): every element is an f32, and processing them consumes a lot of resources within the zkVM.
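To get a feel for the trade-off, the arithmetic can be sketched as below. It assumes one proof per batch, which the README implies but does not state outright, and it counts only the raw f32 payload (4 bytes per element), not any zkVM overhead. The function names are illustrative.

```rust
/// Number of batches for `items` vectors, rounding up so that a
/// partially filled last batch still counts.
fn batch_count(items: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch size must be positive");
    (items + batch_size - 1) / batch_size
}

/// Raw bytes of f32 data in one full batch of `dim`-dimensional vectors.
fn batch_bytes(batch_size: usize, dim: usize) -> usize {
    batch_size * dim * std::mem::size_of::<f32>()
}

fn main() {
    // 10 vectors with the default batch size of 4 give 3 batches (4 + 4 + 2).
    println!("batches: {}", batch_count(10, 4)); // prints batches: 3
    // 4 vectors of 1024 f32 elements each: 4 * 1024 * 4 = 16384 bytes.
    println!("bytes per batch: {}", batch_bytes(4, 1024)); // prints bytes per batch: 16384
}
```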
Tip
If the --aggregate option is passed, it will also aggregate the proofs and store the final proof with the extensions .agg.proof and .agg.pub.
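Conceptually, combining batch results is simple: the overall nearest neighbor is the closest of the per-batch winners. The sketch below illustrates that idea in plain Rust. Whether the aggregator program does exactly this is an assumption, and the sketch leaves out proof verification entirely; the type and function names, and the squared Euclidean metric, are illustrative.

```rust
/// Result of one batch: global index of the best sample and its distance.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BatchBest {
    index: usize,
    distance: f32,
}

/// Squared Euclidean distance (assumed metric).
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Best sample within a single batch; `offset` is the global index
/// of the batch's first sample.
fn best_in_batch(batch: &[Vec<f32>], offset: usize, query: &[f32]) -> Option<BatchBest> {
    batch
        .iter()
        .enumerate()
        .map(|(i, s)| BatchBest { index: offset + i, distance: squared_distance(s, query) })
        .min_by(|a, b| a.distance.total_cmp(&b.distance))
}

/// Splits `samples` into batches, finds each batch's winner, and then
/// picks the closest winner. Panics if `batch_size` is zero.
fn best_overall(samples: &[Vec<f32>], query: &[f32], batch_size: usize) -> Option<BatchBest> {
    samples
        .chunks(batch_size)
        .enumerate()
        .filter_map(|(b, chunk)| best_in_batch(chunk, b * batch_size, query))
        .min_by(|a, b| a.distance.total_cmp(&b.distance))
}

fn main() {
    let samples = vec![vec![10.0], vec![3.0], vec![7.0], vec![1.0], vec![8.0]];
    // With batch size 2 the batch winners are indices 1, 3 and 4;
    // index 3 (distance 1.0) is the closest of them.
    println!("{:?}", best_overall(&samples, &[0.0], 2));
}
```

The result does not depend on the batch size, since the minimum over all samples equals the minimum over the per-batch minima.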
Suppose you have generated proofs for some data file ./data.json. You can submit all batch proofs to Aligned Layer with:
./aligned.sh submit ./path/to/keystore.json ./data.json
To send the aggregated proof only, you can use:
./aligned.sh submit-agg ./path/to/keystore.json ./data.json
Note
For each proof, it will ask for your keystore password.
The project is MIT licensed, as per the SP1 project template.