perf: local bus #1293

Merged
merged 102 commits into from
Sep 17, 2024

kevjue
Contributor

@kevjue kevjue commented Aug 9, 2024

Overview

This PR creates a global and a local interaction bus. The global bus is used for all sent/received messages that need to be communicated across shards (e.g. memory slots that are accessed in multiple shards, and syscalls made).

The main purpose of this change is to reduce the amount of data committed in phase 1 of our prover. Currently, phase 1 commits to the main traces of all chips. With the global bus, it only needs to commit to the cross-shard information, which is a small fraction of all the main traces.

Summary of changes

Stark changes

Core prover (prover_with_context and prove.open)

In phase 1, for each shard it will do the following:

  1. Call generate dependencies only for the chips specified in the RiscVAir struct.
  2. Generate a batch trace for the global commitment and then commit to that batch trace.
  3. Have the challenger observe all the global commitments.

In phase 2, for each shard it will do the following:

  1. Call generate dependencies for all the chips.
  2. Generate two batch traces (one for the global commitment and one for the local commitment).
  3. Commit to those two batch traces separately.
  4. Prover.open changes
    a. It accepts two different sets of prover data (traces, commitment, main_data, chip ordering), one for the local chips and one for the global chips.
    b. It merges the traces, chip ordering, and main_data for the quotient computation and the permutation trace.
    c. It calls pcs.open on the two separate commitments and their traces, and then merges those openings into one list.
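
The merge in step 4b can be pictured with a small sketch. Everything here is hypothetical (the `ToyProverData` type, its fields, and the `merge_prover_data` helper are not the PR's actual types), and it assumes the global traces come first, so that each local chip index is shifted by the number of global traces:

```rust
use std::collections::HashMap;

/// Toy stand-in for one set of prover data (hypothetical, not the PR's type).
pub struct ToyProverData {
    /// One flattened trace per chip.
    pub traces: Vec<Vec<u32>>,
    /// Maps a chip name to the index of its trace in `traces`.
    pub chip_ordering: HashMap<String, usize>,
}

/// Merge global and local prover data into one set.
/// Assumption: global traces are placed first, so local indices are offset.
pub fn merge_prover_data(global: &ToyProverData, local: &ToyProverData) -> ToyProverData {
    let offset = global.traces.len();

    // Concatenate the traces: global first, then local.
    let mut traces = global.traces.clone();
    traces.extend(local.traces.iter().cloned());

    // Keep global indices as-is and shift every local index by `offset`.
    let mut chip_ordering = global.chip_ordering.clone();
    for (name, idx) in &local.chip_ordering {
        chip_ordering.insert(name.clone(), offset + idx);
    }

    ToyProverData { traces, chip_ordering }
}
```

With one global trace and two local traces, the merged data holds three traces and the second local chip ends up at index 2.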

Proof struct

Changes to the proof struct include:

  1. Each shard proof now contains both a global and a local commitment, plus flags that specify which commitment each opening value pertains to.
  2. Each opening value now contains both the global and the local cumulative sum.

Verifier

Changes to the verifier include:

  1. Calculate the maximum number of byte lookups for each shard and verify that it fits within the BabyBear field.
  2. Split the opening values into global and local ones, and then run the pcs verify on the global and local commitments.
  3. For the core machine, observe the global commitments of all shard proofs and sample the global challenges (these are set to zero for the recursion machine).
  4. Verify that the local cumulative sum is 0 for each shard and that the global cumulative sums add up to 0 across all shards.
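
The check in step 4 can be illustrated with a minimal sketch. It is a simplification under stated assumptions: the cumulative sums are modelled as plain integers reduced modulo the BabyBear prime, and `ShardSums` and `check_cumulative_sums` are hypothetical names, not the verifier's API.

```rust
/// BabyBear prime modulus: 2^31 - 2^27 + 1.
const P: u64 = 2_013_265_921;

/// Hypothetical per-shard summary holding the two cumulative sums.
#[derive(Clone, Copy, Debug)]
pub struct ShardSums {
    /// Local cumulative sum, as a field element in [0, P).
    pub local: u64,
    /// Global cumulative sum, as a field element in [0, P).
    pub global: u64,
}

/// Returns true when every shard's local sum is zero and the global sums
/// of all shards add up to zero modulo P.
pub fn check_cumulative_sums(shards: &[ShardSums]) -> bool {
    // Each shard must balance its own local bus.
    let all_local_zero = shards.iter().all(|s| s.local % P == 0);

    // The global bus only has to balance across the whole set of shards.
    let global_total = shards
        .iter()
        .fold(0u64, |acc, s| (acc + s.global % P) % P);

    all_local_zero && global_total == 0
}
```

A single shard with a non-zero global sum fails on its own, but passes once another shard contributes the negated value, which is the point of letting the global bus balance across shards rather than within each one.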

Permutation

Changes to the permutation file include:

  1. In trace gen, it will fill in two sets of permutation columns.
  2. In eval, it will verify two different sets of permutation columns using two different sets of challenges and two different cumulative sums.

Folder

The folders now contain two cumulative sums and two sets of permutation challenges (one for global, one for local). For the recursion folders, the global versions are just set to zero.

Builder

Every send/receive interaction must specify which bus it uses.
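
As a rough illustration of what this requirement looks like at a call site, here is a toy builder whose `send` and `receive` take the bus as an explicit argument. All names (`BusScope`, `ToyBuilder`, `Interaction`) are hypothetical and only model the idea; they are not the builder API from this PR.

```rust
/// Which bus an interaction belongs to (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BusScope {
    Local,
    Global,
}

/// Whether the interaction is a send or a receive.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    Send,
    Receive,
}

/// One recorded interaction.
#[derive(Clone, Debug)]
pub struct Interaction {
    pub values: Vec<u32>,
    pub multiplicity: u32,
    pub direction: Direction,
    pub scope: BusScope,
}

/// Toy builder that simply records interactions.
#[derive(Default)]
pub struct ToyBuilder {
    pub interactions: Vec<Interaction>,
}

impl ToyBuilder {
    /// Record a send; the caller has to name the bus.
    pub fn send(&mut self, values: Vec<u32>, multiplicity: u32, scope: BusScope) {
        self.interactions.push(Interaction { values, multiplicity, direction: Direction::Send, scope });
    }

    /// Record a receive; the caller has to name the bus.
    pub fn receive(&mut self, values: Vec<u32>, multiplicity: u32, scope: BusScope) {
        self.interactions.push(Interaction { values, multiplicity, direction: Direction::Receive, scope });
    }

    /// Number of interactions recorded on the given bus.
    pub fn count(&self, scope: BusScope) -> usize {
        self.interactions.iter().filter(|i| i.scope == scope).count()
    }
}
```

Making the scope a required parameter, rather than a default, means no call site can silently end up on the wrong bus.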

Runtime/Executor

For every memory slot accessed in the shard, the executor needs to save the first and last access (in the local_memory_access field), so the memory read/write functions (mr/mw) now update that structure. Note that each precompile invocation also needs to save similar information, since the precompile memory accesses can happen on a different shard.

For every syscall invocation, it will emit a syscall event with the opcode and arguments to populate the syscall table.

The record will have two additional fields (MemoryLocalEvent and SyscallEvent) to populate the local memory tables and syscall tables respectively.
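
The first/last bookkeeping for local_memory_access can be sketched as follows. This is a minimal model under assumptions: the `MemAccess`, `LocalAccess`, and `ShardMemoryLog` types and the `record` method are hypothetical, and a slot's "first access" is represented by the state it had just before the shard first touched it.

```rust
use std::collections::HashMap;

/// Value and timestamp of a memory slot at some point in time (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MemAccess {
    pub value: u32,
    pub timestamp: u32,
}

/// Initial and final state of one slot within a shard (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LocalAccess {
    pub initial: MemAccess,
    pub last: MemAccess,
}

/// Per-shard log keyed by memory address.
#[derive(Default)]
pub struct ShardMemoryLog {
    pub local_memory_access: HashMap<u32, LocalAccess>,
}

impl ShardMemoryLog {
    /// Called from a read or write. `prev` is the slot state before the
    /// access and `current` the state after it.
    pub fn record(&mut self, addr: u32, prev: MemAccess, current: MemAccess) {
        self.local_memory_access
            .entry(addr)
            // Slot already seen in this shard: only the last access moves.
            .and_modify(|e| e.last = current)
            // First touch in this shard: remember where the slot started.
            .or_insert(LocalAccess { initial: prev, last: current });
    }
}
```

However many times a slot is touched, the log keeps one entry per address, which is what keeps the per-shard local memory table small.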

RiscvAir/MachineAir changes:

MachineAir trait

Added a commit_trace function that specifies whether a chip is committed to the global or the local commitment. The default is the local commitment (the recursion proofs have no notion of a global bus). The only chips added to the global commitment are global init/finalize memory, program memory, local init/finalize memory, and the syscall tables.
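
A default trait method is one natural way to express "local unless stated otherwise". The sketch below assumes a hypothetical `CommitScope` return type and a toy `ToyMachineAir` trait; the PR description names the function but not its signature, so treat the shape as illustrative only.

```rust
/// Which commitment a chip's trace goes into (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CommitScope {
    Local,
    Global,
}

/// Toy stand-in for the MachineAir trait.
pub trait ToyMachineAir {
    fn name(&self) -> String;

    /// Defaults to the local commitment; chips override this to opt in
    /// to the global commitment.
    fn commit_trace(&self) -> CommitScope {
        CommitScope::Local
    }
}

/// A chip that keeps the default (local).
pub struct AddChip;

/// A chip that opts in to the global commitment.
pub struct SyscallChip;

impl ToyMachineAir for AddChip {
    fn name(&self) -> String {
        "Add".to_string()
    }
}

impl ToyMachineAir for SyscallChip {
    fn name(&self) -> String {
        "Syscall".to_string()
    }

    fn commit_trace(&self) -> CommitScope {
        CommitScope::Global
    }
}

/// Split chip names into (global, local) according to `commit_trace`.
pub fn split_by_scope(chips: &[Box<dyn ToyMachineAir>]) -> (Vec<String>, Vec<String>) {
    let mut global = Vec::new();
    let mut local = Vec::new();
    for chip in chips {
        match chip.commit_trace() {
            CommitScope::Global => global.push(chip.name()),
            CommitScope::Local => local.push(chip.name()),
        }
    }
    (global, local)
}
```

Because the default is local, existing chips and the recursion machine need no changes; only the handful of cross-shard chips override the method.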

New chips:

  1. Memory local init and global tables - It will send to both the local and global bus the first and last memory access for each shard.
  2. Syscall - It will contain all the syscall invocations in the program execution run.

Generate dependencies

It now labels the set of chips to run generate dependencies on during phase 1.

Core Precompiles:

All the core precompiles contain the following changes:

  1. In its execution functions, each precompile needs to gather all the local memory events in its record.
  2. In its generate_dependencies function, it copies its local memory events to the record’s local memory events.

Miscellaneous changes

  1. Removed the shard column from the byte table and all references to it in the code.
  2. Added a ZeroCommitment trait that is implemented for the PCS structs to return a “dummy” commitment. This is used to set the recursion proof’s global commitment to some value.
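
The idea behind ZeroCommitment can be sketched with a toy trait. The associated type, the `ToyPcs` struct, and the all-zero array used as the "dummy" value are assumptions made for illustration; the actual trait in this PR is tied to the real PCS and config types.

```rust
/// Toy version of a trait that lets a PCS hand out a placeholder commitment.
pub trait ZeroCommitment {
    /// The commitment type of this PCS.
    type Commitment;

    /// Returns a fixed "dummy" commitment that stands in where no real
    /// global commitment exists.
    fn zero_commitment(&self) -> Self::Commitment;
}

/// Hypothetical PCS whose commitments are eight 32-bit words.
pub struct ToyPcs;

impl ZeroCommitment for ToyPcs {
    type Commitment = [u32; 8];

    fn zero_commitment(&self) -> Self::Commitment {
        // Assumption: the placeholder is simply the all-zero digest.
        [0; 8]
    }
}
```

A recursion proof can then fill its global commitment field with `ToyPcs.zero_commitment()` so that the proof struct keeps the same shape for both machines.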

@kevjue kevjue changed the title from "perf: local memory chip" to "perf: local bus" Aug 9, 2024
@kevjue kevjue marked this pull request as ready for review September 9, 2024 16:27
@tamirhemo tamirhemo merged commit 1e3ef0e into tamir/v1.3.0-rc2 Sep 17, 2024
8 of 9 checks passed
@tamirhemo tamirhemo deleted the kevjue/phase_1_improvements branch September 17, 2024 23:04