
Commit 43c6b37

commit new page
1 parent 8ebe6c1 commit 43c6b37

8 files changed: +232 additions, −105 deletions

docs/cdk/architecture/type-1-prover/intro-t1-prover.md

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@ The figure below gives a visual summary of the types, contrasting compatibility
 Ultimately, choosing which type of ZK-EVM to develop involves a trade-off between EVM-equivalence and performance.

-The challenge this poses for developers who favor exact Ethereum-equivalence is to devise ingenious designs and clever techniques to implement faster zk-provers. Vitalik mentions one mitigation strategy to improve proof generation times: Cleverly engineered, and massively parallelized provers.
+The challenge this poses for developers who favor exact Ethereum-equivalence is to devise ingenious designs and clever techniques to implement faster zk-provers. Vitalik mentions one mitigation strategy to improve proof generation times: cleverly engineered, massively parallelized provers.

docs/cdk/architecture/type-1-prover/t1-cpu-component.md

Lines changed: 11 additions & 16 deletions
@@ -1,17 +1,16 @@
-The CPU is the central component of Polygon Zero zkEVM. Like any central processing unit, it reads instructions, executes them, and modifies the state (registers and the memory) accordingly.
+The CPU is the central component of the Polygon CDK type-1 prover. Like any central processing unit, it reads instructions, executes them, and modifies the state (registers and the memory) accordingly.

 Other complex instructions, such as Keccak hashing, are delegated to specialized STARK tables.

 This section briefly presents the CPU and its columns. Details of the CPU logic can be found [here](https://github.com/0xPolygonZero/plonky2/blob/main/evm/spec/cpulogic.tex).

-### CPU flow
+## CPU flow

 CPU execution can be decomposed into two distinct phases: CPU cycles and padding.

 The first phase is far bulkier than the second, since padding comes only at the end of the execution.

-#### CPU cycles
+### CPU cycles

 In each row, the CPU reads code at a given program counter (PC) address, executes it, and writes outputs to memory. The code may be kernel code or any context-based code.

@@ -28,23 +27,21 @@ Subsequent contexts are created when executing user code.
 Syscalls, which are specific instructions written in the kernel, may be executed in a non-zero user context. They don't change the context but the code context, which is where the instructions are read from.

-#### Padding
+### Padding

 At the end of any execution, the length of the CPU trace is padded to the next power of two.

 When the program counter reaches the special halting label in the kernel, execution halts, and padding follows.

 Special constraints ensure that every row subsequent to the halt is a padded row, and that execution does not automatically resume. That is, execution cannot resume without further instructions.
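The padding step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the prover's actual code; `pad_trace`, `padding_row`, and `next_power_of_two` are hypothetical names:

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()

def pad_trace(rows: list, padding_row) -> list:
    """Append copies of padding_row until the trace length is a power of two."""
    target = next_power_of_two(len(rows))
    return rows + [padding_row] * (target - len(rows))

trace = ["cycle"] * 6            # a 6-row execution trace
padded = pad_trace(trace, "pad")
assert len(padded) == 8          # padded up to the next power of two
assert padded[6:] == ["pad", "pad"]
```

A trace whose length is already a power of two is left unchanged, since `next_power_of_two` returns the length itself in that case.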
-### CPU columns
+## CPU columns

-This document discusses CPU columns as they relate to all relevant operations being executed, as well as how some of the constraints are checked.
+We now look at the CPU columns as they relate to the operations being executed, as well as how some of the constraints are checked.

 These are the register columns, operation flags, memory columns, and general columns.

-#### Registers
+### Registers

 - $\texttt{context}$: Indicates the current context at any given time. So, $\texttt{context}\ 0$ is for the kernel, while any context specified with a positive integer indicates a user context. A user context is incremented by $1$ at every call.
 - $\texttt{code_context}$: Indicates the context in which the executed code resides.
@@ -55,7 +52,7 @@ These are the register columns, operation flags, memory columns, and general col
 - $\texttt{clock}$: A monotonic counter which starts at $0$ and is incremented by $1$ at each row. It is used to enforce correct ordering of memory accesses.
 - $\texttt{opcode_bits}$: These are 8 boolean columns, indicating the bit decomposition of the opcode being read at the current PC.
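As an illustration of the $\texttt{opcode_bits}$ columns, here is a Python sketch of a one-byte opcode's bit decomposition. The least-significant-first bit ordering and the `opcode_bits` helper are assumptions for illustration, not the prover's actual layout:

```python
def opcode_bits(opcode: int) -> list[int]:
    """Bit decomposition of a one-byte opcode, least-significant bit first."""
    assert 0 <= opcode < 256
    return [(opcode >> i) & 1 for i in range(8)]

bits = opcode_bits(0x60)                 # 0x60 is the EVM PUSH1 opcode
assert bits == [0, 0, 0, 0, 0, 1, 1, 0]
assert sum(b << i for i, b in enumerate(bits)) == 0x60  # round-trips
```

Constraints over boolean bit columns like these stay low-degree, which is why the opcode is decomposed rather than kept as a single byte column.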

-#### Operation flags
+### Operation flags

 Operation flags are boolean flags indicating whether an operation is executed or not.

@@ -85,8 +82,7 @@ $$
 \texttt{eq_iszero * opcode_bits[0]}
 $$

-#### Memory columns
+### Memory columns

 The CPU interacts with the EVM memory via its memory channels.

@@ -101,7 +97,7 @@ A full memory channel is composed of the following:
 The last memory channel is a partial channel. It doesn't have its own $\texttt{value}$ columns but shares them with the first full memory channel. This allows saving eight columns.

-#### General columns
+### General columns

 There are eight ($8$) shared general columns. Depending on the instruction, they are used differently:

@@ -126,5 +122,4 @@ The `popping-only` instruction uses the $\text{Stack}$ columns to check if the S
 While the `pushing-only` instruction uses the $\text{Stack}$ columns to check if the Stack is empty before the instruction.

 $\texttt{stack_len_bounds_aux}$ is used to check that the Stack doesn't overflow in user mode. The last four columns are used to prevent conflicts with other general columns.
 See the $\text{Stack Handling}$ subsection of this [document](https://github.com/0xPolygonZero/plonky2/blob/main/evm/spec/cpulogic.tex) for more details.

docs/cdk/architecture/type-1-prover/t1-ctl-protocol.md

Lines changed: 9 additions & 13 deletions
@@ -4,11 +4,11 @@ For each STARK table, ordinary STARK proof and verification are used to check if
 However, there are input and output values shared among the tables. These values need to be checked for possible alteration while being shared among tables. For this purpose, _cross-table lookups_ (CTLs) are used to verify that the shared values are not tampered with.

-### How CTLs work
+## How CTLs work

-The CLT protocol is based on the [logUP argment](https://eprint.iacr.org/2022/1530.pdf), which works similar to how range-checks work. Range-checks are discussed in a subsequent document to this one.
+The CTL protocol is based on the [logUp argument](https://eprint.iacr.org/2022/1530.pdf), which works similarly to range-checks. Range-checks are discussed in a subsequent document.

-#### Example (CTL)
+### Example (CTL)

 Consider the following scenario as an example. Suppose STARK $S_2$ requires an operation -- say $Op$ -- that is carried out by another STARK $S_1$.
@@ -26,8 +26,7 @@ And thus, this check is tantamount to ensuring that the rows of the $S_1$ table
 ![Figure: CTL permutation check](../../../img/cdk/t1-prover-ctl-perm-check.png)

-### How the CLT proof works
+## How the CTL proof works

 As outlined in the example above, verifying that shared values among STARK tables are not tampered with amounts to proving that the rows of the reduced STARK tables are permutations of each other.
@@ -40,7 +39,7 @@ The proof therefore is achieved in three steps;
 - Checking correct construction and equality of 'running sums'.

-#### Table filtering
+### Table filtering

 Define filters $f^1$ and $f^2$ for STARK tables $S_1$ and $S_2$, respectively, such that
@@ -65,8 +64,7 @@ Next create subtables $S_1'$ and $S_2'$ of STARK tables $S_1$ and $S_2$ , respec
 Filters are limited to (at most) degree 2 combinations of columns.

-#### Computing running sums
+### Computing running sums

 For each $i \in \{1,2\}$, let $\{ c^{i,j} \}$ denote the columns of $S_i'$.

@@ -86,7 +84,7 @@ for $0 < l < n-1$.
 Note that $Z_l^{S_i}$ is computed backwards, i.e., it starts with $Z_{n-1}^{S_i}$ and goes down to $Z_0^{S_i}$ as the final sum.

-#### Checking running sums
+### Checking running sums

 After computing the running sums, check equality of the final sums, $Z_0^{S_1} =?\ Z_0^{S_2}$, and whether the running sums were correctly constructed.

@@ -97,8 +95,7 @@ The above three steps turn the CTL argument into a [LogUp lookup argument](https
 which checks for equality between $S_1'$ and $S_2'$.

-### CTL protocol summary
+## CTL protocol summary

 The cross-table protocol can be summarized as follows.

@@ -113,5 +110,4 @@ On the other side, and for the same STARK table $S$, the verifier:
 - Computes the sum $Z = \sum_j Z_{j}^l$.
 - Checks equality, $Z =?\ Z_0^S$.
 - Checks whether each of the running sums $Z_{j}^l$ and $Z^S$ was correctly constructed.
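As a toy illustration of the three steps (table filtering, backward running sums, equality of final sums), here is a Python sketch. It simplifies heavily: a single shared column per table, a small prime standing in for the proving field, no random combination of columns, and no constraint checking; `alpha`, `P`, and all table data are illustrative, not Plonky2's actual values:

```python
P = 2**31 - 1  # a small prime standing in for the proving field

def inv(x: int) -> int:
    """Field inverse via Fermat's little theorem."""
    return pow(x, P - 2, P)

def running_sums(column, flt, alpha):
    """logUp-style running sums, computed backwards: Z[n-1] holds the
    first term, Z[0] the final (total) sum, as described above."""
    n = len(column)
    z = [0] * n
    acc = 0
    for l in range(n - 1, -1, -1):
        if flt[l]:                       # table filtering: keep rows with f = 1
            acc = (acc + inv((alpha + column[l]) % P)) % P
        z[l] = acc
    return z

alpha = 123456789                            # verifier challenge (illustrative)
s1_col, s1_f = [7, 9, 42, 9], [1, 0, 1, 1]   # S1: rows where Op was computed
s2_col, s2_f = [9, 42, 5, 7], [1, 1, 0, 1]   # S2: rows that look Op up

z1 = running_sums(s1_col, s1_f, alpha)
z2 = running_sums(s2_col, s2_f, alpha)
assert z1[0] == z2[0]  # final sums agree: the shared values match
```

The filtered rows of both tables form the same multiset $\{7, 9, 42\}$, so the sums of inverses agree; tampering with any shared value would (with overwhelming probability over the choice of `alpha`) break the equality.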

docs/cdk/architecture/type-1-prover/t1-design-challenge.md

Lines changed: 20 additions & 12 deletions
@@ -2,27 +2,35 @@ The EVM wasn't designed with zero-knowledge proving and verification in mind, an
 Some of the challenges stem from the way the EVM is implemented. Here are some of the discrepancies that occur when deploying the most common zero-knowledge primitives to the EVM.

-1. **Word size**: The native EVM word size is 256 bits long, whereas the chosen SNARK Plonky2, operates internally over 64-bit field elements.
+## Word size
+
+The native EVM word size is 256 bits, whereas the chosen SNARK, Plonky2, operates internally over 64-bit field elements.

-    Matching these word sizes requires a work-around where word operations are performed in multiples of smaller limbs for proper handling internally.
+Matching these word sizes requires a workaround where word operations are performed over multiple smaller limbs, handled internally.

-    This unfortunately incurs overheads, even for simple operations like the ADD opcode.
+This unfortunately incurs overhead, even for simple operations like the ADD opcode.
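To illustrate the limb workaround, here is a Python sketch of a 256-bit EVM ADD performed over four 64-bit limbs with carry propagation. This is illustrative only; the prover's actual limb size and layout may differ:

```python
MASK64 = (1 << 64) - 1

def to_limbs(x: int) -> list[int]:
    """Split a 256-bit word into four 64-bit limbs, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(4)]

def from_limbs(limbs: list[int]) -> int:
    return sum(l << (64 * i) for i, l in enumerate(limbs))

def add256(a: int, b: int) -> int:
    """EVM ADD semantics: 256-bit addition modulo 2**256, limb by limb."""
    out, carry = [], 0
    for la, lb in zip(to_limbs(a), to_limbs(b)):
        s = la + lb + carry
        out.append(s & MASK64)   # keep the low 64 bits of this limb
        carry = s >> 64          # propagate the carry to the next limb
    return from_limbs(out)       # the final carry is dropped (mod 2**256)

assert add256(123, 456) == 579
assert add256(2**256 - 1, 1) == 0  # wraps around, as EVM ADD does
```

Even this single opcode turns into four limb additions plus carry handling, which is the overhead the text refers to.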
-2. **Supported fields:** Selecting a field for the most efficient proving scheme can become complicated.
+## Supported fields
+
+Selecting a field for the most efficient proving scheme can become complicated.

 Ethereum transactions are signed over the [secp256k1 curve](https://secg.org/sec2-v2.pdf), which involves a specific prime field $\mathbb{F}_p$, where $p = 2^{256} - 2^{32} - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1$. The EVM also supports precompiles for [BN254 curve](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-197.md) operations, where the computations are carried out in an entirely different field arithmetic.

 This adds a major overhead when it comes to proving modular arithmetic, as there is a need to deal with modular reductions in the field of the proving system.

 Such incongruous modular arithmetic is not uncommon. Recursive proving schemes like [Halo](https://electriccoin.co/wp-content/uploads/2019/09/Halo.pdf) resorted to utilizing a cycle of two elliptic curves, where proving and verification are instantiated in two different field arithmetics.

 Other curves, such as the pairing-friendly [BLS12-381](https://eips.ethereum.org/EIPS/eip-2537), popularly used in recursive proving systems, are yet to be supported by the EVM in the form of precompiled contracts.

-3. **Hash functions**: The EVM uses [Keccak](https://keccak.team/keccak_specs_summary.html) as its native hash function both for state representation and arbitrary hashing requests, through the `Keccak256` opcode.
+## Hash functions
+
+The EVM uses [Keccak](https://keccak.team/keccak_specs_summary.html) as its native hash function, both for state representation and for arbitrary hashing requests through the `Keccak256` opcode.

 While Keccak is fairly efficient on a CPU, Plonky2 constrains polynomials of degree 3, so Keccak operations must be expressed as constraints of degree 3. This results in an extremely heavy Algebraic Intermediate Representation (AIR) compared to recent [STARK-friendly](https://eprint.iacr.org/2020/948.pdf) hash functions tailored specifically for zero-knowledge proving systems.

 Although the EVM supports precompiles of hash functions such as SHA2-256, RIPEMD-160, and Blake2f, they are all quite heavy for a ZK proving system.

-4. **State representation**: Ethereum uses Merkle Patricia Tries with RLP encoding. Both of these are not zero-knowledge-friendly primitives, and incur huge overheads on transaction processing within a ZK-EVM context.
+## State representation
+
+Ethereum uses Merkle Patricia Tries with RLP encoding. Neither is a zero-knowledge-friendly primitive, and both incur huge overheads on transaction processing within a ZK-EVM context.

docs/cdk/architecture/type-1-prover/testing-and-proving-costs.md

Lines changed: 6 additions & 7 deletions
@@ -1,14 +1,14 @@
-### Testing Polygon type-1 zkEVM
+### Testing the prover

-Find a parser and test runner for testing compatible and common Ethereum full node tests against Polygon type-1 zkEVM [here](https://github.com/0xPolygonZero/evm-tests).
+A parser and test runner for running the common Ethereum full-node tests against the Polygon CDK type-1 prover is available [here](https://github.com/0xPolygonZero/evm-tests).

-Polygon type-1 zkEVM passes all relevant and official [Ethereum tests](https://github.com/ethereum/tests/).
+The prover passes all relevant official [Ethereum tests](https://github.com/ethereum/tests/).

 ### Proving costs

-Instead of presenting gas costs, we focus on the cost of proving EVM transactions with Polygon type-1 prover.
+Instead of presenting gas costs, we focus on the cost of proving EVM transactions with the Polygon CDK type-1 prover.

-Since Polygon type-1 zkEVM is more like a 'CPU' for the EVM, it makes sense to look at proving costs per VM instance used, as opposed to TPS or other benchmarks.
+Since the prover acts more like a 'CPU' for the EVM, it makes sense to look at proving costs per VM instance used, as opposed to TPS or other benchmarks.

 The table below gives prices for specific GCP instances, taken from [here](https://cloud.google.com/compute/all-pricing), accessed on 29 January 2024.

@@ -19,8 +19,7 @@ Take the example of a t2d-standard-60 GCP instance, where each vCPU has 4GB memo
 - 0.00346 USD / vCPU hour
 - 0.000463 USD / GB hour

-We obtain the following hourly cost, $(60 \times 0.00346) + (240 \times 0.000463) = 0.31872$ USD per hour.
-The total cost per block is given by: $\texttt{(Proving time per hr)} \times 0.31872$ USD.
+We obtain the hourly cost $(60 \times 0.00346) + (240 \times 0.000463) = 0.31872$ USD. The total cost per block is then $\texttt{(proving time in hours)} \times 0.31872$ USD.
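The cost arithmetic above can be checked directly in a few lines of Python, using the GCP prices as quoted in the text; `cost_per_block` is an illustrative helper, not part of any tooling:

```python
VCPUS = 60                           # t2d-standard-60: 60 vCPUs
GB = VCPUS * 4                       # 4 GB memory per vCPU = 240 GB
VCPU_HR, GB_HR = 0.00346, 0.000463   # USD per vCPU-hour / per GB-hour

hourly = VCPUS * VCPU_HR + GB * GB_HR
assert round(hourly, 5) == 0.31872   # matches the figure in the text

def cost_per_block(proving_hours: float) -> float:
    """Total proving cost per block in USD for this instance type."""
    return proving_hours * hourly
```

So a block that takes, say, half an hour to prove on this instance would cost roughly `cost_per_block(0.5)` ≈ 0.16 USD.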
 The table below displays proving costs per transaction per hour.
