
Commit 855115d

Adding Type-1 prover docs
1 parent 387586c commit 855115d


10 files changed

+688
-0
lines changed

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
1+
This document provides guidelines on how to run Polygon Zero's Type-1 prover, specifically for proving transactions, with the option to test full blocks of less than 4M gas. It is similar to [`eth-proof`](https://github.com/wborgeaud/eth-proof), but for transaction proofs.
3+
4+
## Quick Start
5+
6+
There are two ways to run this prover. The simplest way to get started is to use the `in-memory` runtime of [Paladin](https://github.com/0xPolygonZero/paladin). This requires very little setup, but it is not suitable for a large-scale test. The other method is to use an [AMQP](https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol) broker such as [RabbitMQ](https://en.wikipedia.org/wiki/RabbitMQ) to distribute the workload over many workers.
14+
15+
!!!info
16+
17+
    You'll need at least 40 GB of physical memory to run the prover.
18+
19+
20+
### Setup
21+
22+
Before running the prover, you'll need to compile the application. This command should do the trick:
24+
25+
```bash
env RUSTFLAGS='-C target-cpu=native' cargo build --release
```
28+
29+
You should end up with two binaries in your `target/release` folder. One is called `worker` and the other is `leader`. Typically, we'll install these somewhere in our `$PATH` for convenience.
32+
33+
Once you have the application available, you'll need to create a block [witness](https://nmohnblatt.github.io/zk-jargon-decoder/definitions/witness.html), which essentially serves as the input for the prover.
36+
37+
Assuming you've deployed the `leader` binary (the commands below invoke it as `paladin-leader`), you should be able to generate a witness like this:
39+
40+
```bash
paladin-leader rpc -u "$RPC_URL" -t 0x2f0faea6778845b02f9faf84e7e911ef12c287ce7deb924c5925f3626c77906e > 0x2f0faea6778845b02f9faf84e7e911ef12c287ce7deb924c5925f3626c77906e.json
```
43+
44+
You'll need access to an Ethereum RPC endpoint in order to run this command. The input argument is a transaction hash; in particular, it is the hash of the _last_ transaction in the block.
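To avoid typing the hash twice, the witness step can be wrapped in a small shell helper. This is only a sketch: the function names `is_tx_hash`, `witness_path` and `generate_witness` are made up for illustration, while the `paladin-leader rpc -u ... -t ...` invocation is the one shown above.

```bash
# Hypothetical helper names; the paladin-leader flags are exactly those used in this guide.

# Succeeds only if $1 looks like a 32-byte hash: "0x" followed by 64 hex digits.
is_tx_hash() {
  printf '%s\n' "$1" | grep -Eq '^0x[0-9a-fA-F]{64}$'
}

# Prints the witness file name used in this guide: <tx hash>.json
witness_path() {
  printf '%s.json\n' "$1"
}

# Writes the witness for the transaction hash in $1, reading the endpoint from $RPC_URL.
generate_witness() {
  is_tx_hash "$1" || { echo "not a transaction hash: $1" >&2; return 1; }
  [ -n "${RPC_URL:-}" ] || { echo "RPC_URL is not set" >&2; return 1; }
  paladin-leader rpc -u "$RPC_URL" -t "$1" > "$(witness_path "$1")"
}
```

With these functions loaded, `generate_witness 0x2f0faea6778845b02f9faf84e7e911ef12c287ce7deb924c5925f3626c77906e` would produce the same file as the command above, and it refuses to run when the hash is malformed or `RPC_URL` is unset.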
47+
48+
Once you've successfully generated a witness, you're ready to start proving with either the `in-memory` runtime or the `amqp` runtime.
50+
51+
### In Memory Proving
52+
53+
The `in-memory` runtime requires no additional setup. You can attempt to generate a proof with a command like this:
55+
56+
```bash
env RUST_MIN_STACK=33554432 \
  ARITHMETIC_CIRCUIT_SIZE="15..28" \
  BYTE_PACKING_CIRCUIT_SIZE="9..28" \
  CPU_CIRCUIT_SIZE="12..28" \
  KECCAK_CIRCUIT_SIZE="14..28" \
  KECCAK_SPONGE_CIRCUIT_SIZE="9..28" \
  LOGIC_CIRCUIT_SIZE="12..28" \
  MEMORY_CIRCUIT_SIZE="17..30" \
  paladin-leader prove \
  --runtime in-memory \
  --num-workers 1 \
  --input-witness 0x2f0faea6778845b02f9faf84e7e911ef12c287ce7deb924c5925f3626c77906e.json
```
70+
71+
The circuit parameters here are meant to be compatible with virtually all Ethereum blocks. This will create a block proof from an input state root of the preceding block. You can adjust the `--num-workers` flag based on the compute resources available. As a rule of thumb, you'd probably want at least 8 cores per worker.
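Since the environment variables are long and must be set identically wherever the prover runs, one option is to keep them in a single file and source it before launching. The file name `circuit-sizes.sh` is an assumption made for this sketch; the values are copied from the command above.

```bash
# circuit-sizes.sh (hypothetical file name); values copied from the prove command above.

# Default stack size, in bytes, for threads spawned by the Rust standard library.
export RUST_MIN_STACK=33554432            # 32 * 1024 * 1024 bytes = 32 MiB

# Circuit size ranges for each STARK table, as given in this guide.
export ARITHMETIC_CIRCUIT_SIZE="15..28"
export BYTE_PACKING_CIRCUIT_SIZE="9..28"
export CPU_CIRCUIT_SIZE="12..28"
export KECCAK_CIRCUIT_SIZE="14..28"
export KECCAK_SPONGE_CIRCUIT_SIZE="9..28"
export LOGIC_CIRCUIT_SIZE="12..28"
export MEMORY_CIRCUIT_SIZE="17..30"
```

After `. ./circuit-sizes.sh`, the `paladin-leader prove ...` part of the command can be run on its own, without the leading `env ...` prefix.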
76+
77+
### AMQP Proving
78+
79+
Proving in a distributed compute environment depends on an AMQP server. We're not going to cover the setup of RabbitMQ, but assuming you have something like that available, you can run a "leader", which distributes proving tasks to a collection of "workers", which actually do the proving work.
84+
85+
In order to run the workers, you'll use a command like:
86+
87+
```bash
88+
env RUST_MIN_STACK=33554432 \
89+
ARITHMETIC_CIRCUIT_SIZE="15..28" \
90+
BYTE_PACKING_CIRCUIT_SIZE="9..28" \
91+
CPU_CIRCUIT_SIZE="12..28" \
92+
KECCAK_CIRCUIT_SIZE="14..28" \
93+
KECCAK_SPONGE_CIRCUIT_SIZE="9..28" \
94+
LOGIC_CIRCUIT_SIZE="12..28" \
95+
MEMORY_CIRCUIT_SIZE="17..30" \
96+
paladin-worker --runtime amqp --amqp-uri=amqp://localhost:5672
97+
```
98+
99+
This will start the worker and have it await tasks. Depending on the size of your machine, you may be able to run several workers on the same operating system. An example [systemd service](./deploy/paladin-worker@.service) is included. Once that service is installed, you could enable and start 16 workers on the same VM like this:
105+
106+
```bash
seq 0 15 | xargs -I xxx systemctl enable paladin-worker@xxx
seq 0 15 | xargs -I xxx systemctl start paladin-worker@xxx
```
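If the number of workers changes often, the unit names can be generated by a small function instead of repeating the `seq` pipeline. The function name `worker_units` is hypothetical; the unit name pattern `paladin-worker@N` is the one used above.

```bash
# Prints the unit names paladin-worker@0 .. paladin-worker@(N-1), one per line.
# worker_units is a made-up helper name for this sketch.
worker_units() {
  count="$1"
  i=0
  while [ "$i" -lt "$count" ]; do
    printf 'paladin-worker@%s\n' "$i"
    i=$((i + 1))
  done
}
```

For example, `worker_units 16 | xargs systemctl enable --now` should enable and start all 16 instances in one call, and `worker_units 16 | xargs systemctl is-active` reports their state.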
110+
111+
Now that you have your pool of Paladin workers, you can start proving with a command like this:
113+
114+
```bash
115+
paladin-leader prove \
116+
--runtime amqp \
117+
--amqp-uri=amqp://localhost:5672 \
118+
--input-witness 0x2f0faea6778845b02f9faf84e7e911ef12c287ce7deb924c5925f3626c77906e.json
119+
```
120+
121+
This command will run the same way as the `in-memory` mode, except that the leader itself isn't doing the work. The separate worker processes are doing the heavy lifting.
124+
125+
126+
## Proving costs
127+
128+
It makes sense to look at the proving costs, as opposed to TPS.
129+
130+
131+
Based on GCP's 3-year commitment price for a t2d-standard-60 machine, where each vCPU has 4 GB of memory:
132+
133+
- $0.012376$ USD / vCPU hour
134+
- $0.001659$ USD / GB hour
135+
136+
We obtain an hourly machine cost of $(60 \times 0.012376) + (240 \times 0.001659) = 1.14072$ USD.
137+
138+
| Block Number                                    | Transactions | Gas        | Proof Time (minutes) | Total Cost (USD) | Cost Per Tx (USD) |
| ----------------------------------------------- | ------------ | ---------- | -------------------- | ---------------- | ----------------- |
| [17106222](https://etherscan.io/block/17106222) | 105          | 10,781,405 | 44.17                | 0.235            | 0.0022            |
| [17095624](https://etherscan.io/block/17095624) | 163          | 12,684,901 | 78.12                | 0.415            | 0.0025            |
| [17735424](https://etherscan.io/block/17735424) | 182          | 16,580,448 | 100.52               | 0.534            | 0.0029            |
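The arithmetic behind the hourly machine cost and the per-transaction column can be reproduced with `awk`. This is a minimal sketch: the function names `hourly_cost` and `cost_per_tx` are made up for illustration, and the inputs are the prices and table values quoted above.

```bash
# Hourly machine cost in USD: vCPUs * price per vCPU hour + GB * price per GB hour.
hourly_cost() {
  awk -v vcpus="$1" -v gb="$2" -v vcpu_price="$3" -v gb_price="$4" \
    'BEGIN { printf "%.5f\n", vcpus * vcpu_price + gb * gb_price }'
}

# Cost per transaction in USD: total cost of the block proof / number of transactions.
cost_per_tx() {
  awk -v total="$1" -v txs="$2" 'BEGIN { printf "%.4f\n", total / txs }'
}

hourly_cost 60 240 0.012376 0.001659   # prints 1.14072
cost_per_tx 0.235 105                  # prints 0.0022 (block 17106222)
```

Dividing the total cost by the transaction count is what makes the last column comparable across blocks of different sizes.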
143+
144+
145+
146+
147+
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
This document provides details of Polygon Zero's Type-1 zkEVM prover, the proving scheme used by the Type-1 zkEVM developed in collaboration with the Toposware team.
2+
3+
By the definition of zkEVM types, Polygon Zero's zkEVM is a Type-1 zkEVM, as it aims to prove, and enable verification of, the EVM's computational integrity.
4+
5+
Since the design of a prover used in any zkEVM is closely related to the type of the zkEVM, this document starts with a brief discussion on the different types of zkEVMs.
6+
7+
8+
## Types of zkEVMs
9+
10+
The emergence of various zkEVMs ignited the debate over how 'equivalent' a given zkEVM is to the Ethereum Virtual Machine (EVM).
11+
12+
Vitalik Buterin has since introduced some calibration to EVM-equivalence in his article, "[The different types of zkEVMs](https://vitalik.eth.limo/general/2022/08/04/zkevm.html)". He distinguishes five types of zkEVMs, a classification that boils down to the inevitable trade-off between Ethereum equivalence and the efficiency of the zero-knowledge proving scheme involved. For brevity, we refer to this proving scheme as the zk-prover or simply, prover.
13+
14+
The types of zkEVMs, as outlined by Vitalik, are as follows:
15+
16+
- **Type-1** zkEVMs strive for full Ethereum-equivalence. These types of zkEVMs do not change anything in the Ethereum stack except adding a zk-prover. They can therefore verify Ethereum and environments that are exactly like Ethereum.
- **Type-2** zkEVMs aim at full EVM-equivalence instead of Ethereum-equivalence. These zkEVMs make some minor changes to the Ethereum stack with the exception of the Application layer. As a result, they are fully compatible with almost all Ethereum apps, and thus offer the same UX as with Ethereum.
- **Type-2.5** zkEVMs strive for EVM-equivalence but make changes to gas costs. These zkEVMs achieve faster proof generation but introduce a few incompatibilities.
- **Type-3** zkEVMs seek to be EVM-equivalent but make a few minor changes to the Application layer. These types of zkEVMs achieve faster proof generation, but are not compatible with all Ethereum apps.
- **Type-4** zkEVMs are high-level-language equivalent zkEVMs. These types of zkEVMs take smart contract code written in Solidity, Vyper or other high-level languages, compile it to a specialized virtual machine, and prove it. Type-4 zkEVMs attain the fastest proof generation time.
21+
22+
The figure below gives a visual summary of the zkEVM types, contrasting compatibility with performance.
23+
24+
![Figure: zkEVM types](../../img/learn/zkevm-types-vitalik.png)
25+
26+
Ultimately, choosing which type of zkEVM to develop involves a trade-off between EVM-equivalence and performance.
27+
28+
The challenge for developers who favor exact Ethereum-equivalence is to devise ingenious designs and clever techniques to implement faster zk-provers. Vitalik mentions one mitigation strategy for improving proof generation times: cleverly engineered and massively parallelized provers.
Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
Polygon Zero's Type-1 zkEVM is designed for efficient STARK proving and verification of Ethereum transactions. It achieves efficiency by restricting the Algebraic Intermediate Representation (AIR) to constraints of degree 3.
2+
3+
The execution trace needed to generate a STARK proof can be viewed as a large matrix, where columns are registers and each row represents a view of the registers at a given time.
4+
5+
From the initial register values in the first row to the final values in the last row, the validity of each internal state transition is enforced through a set of dedicated constraints. Unfortunately, generating the execution trace for a given transaction adds considerable overhead for the prover.
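As a toy illustration of a trace and its transition constraints (a Fibonacci example chosen here for its simplicity, not taken from the zkEVM), consider two registers $a$ and $b$ and a trace $T$ with $n$ rows:

$$
T = \begin{pmatrix} a_0 & b_0 \\ a_1 & b_1 \\ \vdots & \vdots \\ a_{n-1} & b_{n-1} \end{pmatrix},
\qquad
\begin{aligned}
a_{i+1} - b_i &= 0, \\
b_{i+1} - a_i - b_i &= 0,
\end{aligned}
\qquad 0 \le i < n-1 .
$$

With the boundary values $a_0 = 0$ and $b_0 = 1$, the first rows are $(0, 1), (1, 1), (1, 2), (2, 3)$. Both transition constraints relate one row to the next and have degree 1; the zkEVM's constraints follow the same row-to-row pattern, but over far more columns and with degree up to 3.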
6+
7+
A naïve design strategy would be to utilize a single table, which is solely dedicated to the entire EVM execution. Such a table would have thousands of columns, and although it would be a highly sparse matrix, the prover would treat it as fully dense.
8+
9+
10+
### Modular design strategy
11+
12+
Since most of the operations involved in the EVM can be independently executed, the execution trace is split into separate STARK modules, where each is responsible for ensuring integrity of its own computations.
13+
14+
These STARK modules are:
15+
16+
- **Arithmetic module** handles binary operations including ordinary addition, multiplication, subtraction and division, comparison operations such as 'Less than' and 'Greater than', as well as ternary operations like modular operations.
17+
- **Keccak module** is responsible for computing a Keccak permutation.
18+
- **KeccakSponge module** is dedicated to the sponge construction's 'absorbing' and 'squeezing' functions.
19+
- **Logic module** specializes in performing bitwise logic operations such as AND, OR, or XOR.
20+
- **Memory module** is responsible for memory operations like reads and writes.
21+
- **BytePacking module** is used for reading and writing non-empty byte sequences of length at most 32 to memory.
22+
23+
Although these smaller STARK modules are different and each has its own set of constraints, they mostly operate on common input values.
24+
25+
In addition to the constraints of each module, this design requires an additional set of constraints in order to enforce that these common input values are not tampered with when shared amongst the various STARK modules.
26+
27+
For this reason, this design utilizes _Cross-table lookups_ (CTLs), based on a [logUp argument](https://eprint.iacr.org/2022/1530.pdf) designed by Ulrich Haböck, to cheaply add copy-constraints in the overall system.
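In its basic form, and in notation assumed here rather than taken from the zkEVM specification, the logUp argument reduces the claim that every looked-up value $f_0, \dots, f_{n-1}$ appears among the distinct table values $t_0, \dots, t_{N-1}$ to an identity of rational functions, checked at a random challenge $\alpha$:

$$
\sum_{i=0}^{n-1} \frac{1}{\alpha - f_i} \;=\; \sum_{j=0}^{N-1} \frac{m_j}{\alpha - t_j},
$$

where $m_j$ is the number of times $t_j$ is looked up. Each side is a sum of fractions, so it can be accumulated row by row in a running-sum column, which is what makes the copy-constraints cheap compared with a product-based permutation check.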
28+
29+
Polygon Zero's Type-1 zkEVM uses a central component dubbed the **CPU** to orchestrate the entire flow of data that occurs among the STARK modules during execution of EVM transactions. The CPU dispatches instructions and inputs to specific STARK modules, as well as fetches their corresponding outputs.
30+
31+
Note here that “dispatching” and “fetching” mean that the initial and final values of a given operation are copied, by means of the CTLs, to and from the targeted STARK module.
32+
33+
34+
35+
### Prover primitives
36+
37+
This document discusses the cryptographic primitives used to engineer Polygon Zero's Type-1 zkEVM, which is a custom-built zkEVM capable of tracing, proving and verifying the execution of the EVM through all state changes.
38+
39+
The proving and verification process is made possible by zero-knowledge (ZK) technology, in particular a combination of STARK[^1] and SNARK[^2] schemes, used for proving and verification, respectively.
40+
41+
#### STARK for proving
42+
43+
Polygon Zero's Type-1 zkEVM prover implements a STARK proving scheme, a robust cryptographic technique with fast proving time.
44+
45+
Such a scheme has a proving component, called the STARK prover, and a verifying component called the STARK verifier. A proof produced by the STARK prover is referred to as a STARK proof.
46+
47+
The process begins with constructing a detailed record of all the operations performed when transactions are executed. The record, called the `execution trace`, is then passed to a STARK prover, which in turn generates a STARK proof attesting to correct computation of transactions.
48+
49+
Although STARK proofs are relatively large, they are put through a series of recursive SNARK proofs, where each SNARK proof is more compact than the previous one. This way the final transaction proof becomes significantly more succinct than the initial one, and hence the verification process is highly accelerated.
50+
51+
Ultimately, this SNARK proof can stand alone or be combined with the proofs of preceding blocks, resulting in a single zkEVM validity proof that validates the entire blockchain back to genesis.
52+
53+
#### Plonky2 SNARK for verification
54+
55+
Polygon Zero's Type-1 prover implements a SNARK called [Plonky2](https://github.com/0xPolygonZero/plonky2), which is designed for fast recursive proof composition. Although its arithmetization is based on [TurboPLONK](https://docs.zkproof.org/pages/standards/accepted-workshop3/proposal-turbo_plonk.pdf), it replaces the polynomial commitment scheme of [PLONK](https://eprint.iacr.org/2019/953) with a scheme based on [FRI](https://drops.dagstuhl.de/storage/00lipics/lipics-vol107-icalp2018/LIPIcs.ICALP.2018.14/LIPIcs.ICALP.2018.14.pdf). This allows encoding the witness in 64-bit words, represented as field elements of a low-characteristic field.
56+
57+
The field used, denoted by $\mathbb{F}_p$, is called Goldilocks. It is a prime field where the prime is $p = 2^{64} - 2^{32} + 1$.
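The special form of this prime is what makes arithmetic on 64-bit words cheap. As a short worked derivation (standard facts about this prime, not a description of Plonky2's actual code):

$$
p = 2^{64} - 2^{32} + 1 = 18446744069414584321,
\qquad
2^{64} \equiv 2^{32} - 1 \pmod{p},
\qquad
2^{96} \equiv -1 \pmod{p}.
$$

A 128-bit product can therefore be written as $x = x_0 + 2^{64} x_1 + 2^{96} x_2$, with $x_0 < 2^{64}$ and $x_1, x_2 < 2^{32}$, and reduced without any division:

$$
x \equiv x_0 + (2^{32} - 1)\, x_1 - x_2 \pmod{p}.
$$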
58+
59+
Since SNARKs are succinct, a Plonky2 proof is published as the validity proof that attests to the integrity of a number of aggregated STARK proofs. This results in reduced verification costs.
60+
61+
This approach holds the promise of a succinct, verifiable chain state, marking a significant milestone in the quest for blockchain verifiability, scalability, and integrity. It plays a central role in Polygon Zero's Type-1 zkEVM.
62+
63+
64+
65+
### Documentation remarks
66+
67+
The documentation of Polygon Zero's Type-1 zkEVM is still a work in progress; some of the documents are in the GitHub repo.
68+
69+
The STARK modules, which are also referred to as **STARK tables**, have been documented in the GitHub repo [here](https://github.com/0xPolygonZero/plonky2/tree/main/evm/spec/tables). The **CPU component** is documented below, while the **CPU logic** is in the [repo](https://github.com/0xPolygonZero/plonky2/blob/main/evm/spec/cpulogic.tex).
70+
71+
To complete the STARK framework, the cross-table lookups (CTLs) and the **CTL protocol** are covered in this document, and **range-checks** are also discussed below.
72+
73+
Details on **Merkle Patricia tries** and how they are used in Polygon Zero's Type-1 zkEVM can be found [here](https://github.com/0xPolygonZero/plonky2/blob/main/evm/spec/mpts.tex). Included there are outlines of the prover's internal memory, data encoding and hashing, and the prover input format.
74+
75+
76+
77+
[^1]: STARK is short for Scalable Transparent Argument of Knowledge.
78+
[^2]: SNARK is short for Succinct Non-interactive Argument of Knowledge.
