Skip to content

Commit 65bcd2e

Browse files
first chunk of vm docs
1 parent 0778063 commit 65bcd2e

28 files changed

+1412
-14
lines changed
61.9 KB
Loading

docs/miden/vm/intro/index.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
# Introduction
2+
Miden VM is a zero-knowledge virtual machine written in Rust. For any program executed on Miden VM, a STARK-based proof of execution is automatically generated. This proof can then be used by anyone to verify that the program was executed correctly without the need for re-executing the program or even knowing the contents of the program.
3+
4+
## Status and features
5+
Miden VM is currently on release v0.7. In this release, most of the core features of the VM have been stabilized, and most of the STARK proof generation has been implemented. While we expect to keep making changes to the VM internals, the external interfaces should remain relatively stable, and we will do our best to minimize the amount of breaking changes going forward.
6+
7+
At this point, Miden VM is good enough for experimentation, and even for real-world applications, but it is not yet ready for production use. The codebase has not been audited and contains known and unknown bugs and security flaws.
8+
9+
### Feature highlights
10+
Miden VM is a fully-featured virtual machine. Despite being optimized for zero-knowledge proof generation, it provides all the features one would expect from a regular VM. To highlight a few:
11+
12+
* **Flow control.** Miden VM is Turing-complete and supports familiar flow control structures such as conditional statements and counter/condition-controlled loops. There are no restrictions on the maximum number of loop iterations or the depth of control flow logic.
13+
* **Procedures.** Miden assembly programs can be broken into subroutines called *procedures*. This improves code modularity and helps reduce the size of Miden VM programs.
14+
* **Execution contexts.** Miden VM program execution can span multiple isolated contexts, each with its own dedicated memory space. The contexts are separated into the *root context* and *user contexts*. The root context can be accessed from user contexts via customizable kernel calls.
15+
* **Memory.** Miden VM supports read-write random-access memory. Procedures can reserve portions of global memory for easier management of local variables.
16+
* **u32 operations.** Miden VM supports native operations with 32-bit unsigned integers. This includes basic arithmetic, comparison, and bitwise operations.
17+
* **Cryptographic operations.** Miden assembly provides built-in instructions for computing hashes and verifying Merkle paths. These instructions use Rescue Prime Optimized hash function (which is the native hash function of the VM).
18+
* **External libraries.** Miden VM supports compiling programs against pre-defined libraries. The VM ships with one such library: Miden `stdlib` which adds support for such things as 64-bit unsigned integers. Developers can build other similar libraries to extend the VM's functionality in ways which fit their use cases.
19+
* **Nondeterminism**. Unlike traditional virtual machines, Miden VM supports nondeterministic programming. This means a prover may do additional work outside of the VM and then provide execution *hints* to the VM. These hints can be used to dramatically speed up certain types of computations, as well as to supply secret inputs to the VM.
20+
* **Customizable hosts.** Miden VM can be instantiated with user-defined hosts. These hosts are used to supply external data to the VM during execution/proof generation (via nondeterministic inputs) and can connect the VM to arbitrary data sources (e.g., a database or RPC calls).
21+
22+
### Planned features
23+
In the coming months we plan to finalize the design of the VM and implement support for the following features:
24+
25+
* **Recursive proofs.** Miden VM will soon be able to verify a proof of its own execution. This will enable infinitely recursive proofs, an extremely useful tool for real-world applications.
26+
* **Better debugging.** Miden VM will provide a better debugging experience including the ability to place breakpoints, better source mapping, and more complete program analysis info.
27+
* **Faulty execution.** Miden VM will support generating proofs for programs with faulty execution (a notoriously complex task in ZK context). That is, it will be possible to prove that execution of some program resulted in an error.
28+
29+
## Structure of this document
30+
This document is meant to provide an in-depth description of Miden VM. It is organized as follows:
31+
32+
* In the introduction, we provide a high-level overview of Miden VM and describe how to run simple programs.
33+
* In the user documentation section, we provide developer-focused documentation useful to those who want to develop on Miden VM or build compilers from higher-level languages to Miden assembly (the native language of Miden VM).
34+
* In the design section, we provide in-depth descriptions of the VM's internals, including all AIR constraints for the proving system. We also provide the rationale for settling on specific design choices.
35+
* Finally, in the background material section, we provide references to materials which could be useful for learning more about STARKs - the proving system behind Miden VM.
36+
37+
## License
38+
Licensed under the [MIT license](http://opensource.org/licenses/MIT).

docs/miden/vm/intro/overview.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
## Miden VM overview
2+
Miden VM is a stack machine. The base data type of the MV is a field element in a 64-bit [prime field](https://en.wikipedia.org/wiki/Finite_field) defined by modulus $p = 2^{64} - 2^{32} + 1$. This means that all values that the VM operates with are field elements in this field (i.e., values between $0$ and $2^{64} - 2^{32}$, both inclusive).
3+
4+
Miden VM consists of four high-level components as illustrated below.
5+
6+
![](../../../img/miden/vm/intro/vm_components.png)
7+
8+
These components are:
9+
* **Stack** which is a push-down stack where each item is a field element. Most assembly instructions operate with values located on the stack. The stack can grow up to $2^{32}$ items deep, however, only the top 16 items are directly accessible.
10+
* **Memory** which is a linear random-access read-write memory. The memory is word-addressable, meaning, four elements are located at each address, and we can read and write elements to/from memory in batches of four. Memory addresses can be in the range $[0, 2^{32})$.
11+
* **Chiplets** which are specialized circuits for accelerating certain types of computations. These include Rescue Prime Optimized (RPO) hash function, 32-bit binary operations, and 16-bit range checks.
12+
* **Host** which is a way for the prover to communicate with the VM during runtime. This includes responding to the VM's requests for non-deterministic inputs and handling messages sent by the VM (e.g., for debugging purposes). The requests for non-deterministic inputs are handled by the host's *advice provider*.
13+
14+
Miden VM comes with a default implementation of the host interface (with an in-memory advice provider). However, the users are able to provide their own implementations which can connect the VM to arbitrary data sources (e.g., a database or RPC calls) and define custom logic for handling events emitted by the VM.
15+
16+
## Writing programs
17+
Our goal is to make Miden VM an easy compilation target for high-level languages such as Rust, Move, Sway, and others. We believe it is important to let people write programs in the languages of their choice. However, compilers to help with this have not been developed yet. Thus, for now, the primary way to write programs for Miden VM is to use [Miden assembly](../user_docs/assembly/main.md).
18+
19+
While writing programs in assembly is far from ideal, Miden assembly does make this task a little bit easier by supporting high-level flow control structures and named procedures.
20+
21+
## Inputs and outputs
22+
External inputs can be provided to Miden VM in two ways:
23+
24+
1. Public inputs can be supplied to the VM by initializing the stack with desired values before a program starts executing. Any number of stack items can be initialized in this way, but providing a large number of public inputs will increase the cost for the verifier.
25+
2. Secret (or nondeterministic) inputs can be supplied to the VM via the [*advice provider*](#nondeterministic-inputs). There is no limit on how much data the advice provider can hold.
26+
27+
After a program finishes executing, the elements remaining on the stack become the outputs of the program. Since these outputs will be public inputs for the verifier, having a large stack at the end of execution will increase cost to the verifier. Therefore, it's best to drop unneeded output values. We've provided the [`truncate_stack`](../user_docs/stdlib/sys.md) utility function in the standard library for this purpose.
28+
29+
The number of public inputs and outputs of a program can be reduced by making use of the advice stack and Merkle trees. Just 4 elements are sufficient to represent a root of a Merkle tree, which can be expanded into an arbitrary number of values.
30+
31+
For example, if we wanted to provide a thousand public input values to the VM, we could put these values into a Merkle tree, initialize the stack with the root of this tree, initialize the advice provider with the tree itself, and then retrieve values from the tree during program execution using `mtree_get` instruction (described [here](../user_docs/assembly/cryptographic_operations.md#hashing-and-merkle-trees)).
32+
33+
### Stack depth restrictions
34+
For reasons explained [here](../design/stack/main.md), the VM imposes the restriction that the stack depth cannot be smaller than $16$. This has the following effects:
35+
36+
- When initializing a program with fewer than $16$ inputs, the VM will pad the stack with zeros to ensure the depth is $16$ at the beginning of execution.
37+
- If an operation would result in the stack depth dropping below $16$, the VM will insert a zero at the deep end of the stack to make sure the depth stays at $16$.
38+
39+
### Nondeterministic inputs
40+
The *advice provider* component is responsible for supplying nondeterministic inputs to the VM. These inputs only need to be known to the prover (i.e., they do not need to be shared with the verifier).
41+
42+
The advice provider consists of three components:
43+
* **Advice stack** which is a one-dimensional array of field elements. Being a stack, the VM can either push new elements onto the advice stack, or pop the elements from its top.
44+
* **Advice map** which is a key-value map where keys are words and values are vectors of field elements. The VM can copy values from the advice map onto the advice stack as well as insert new values into the advice map (e.g., from a region of memory).
45+
* **Merkle store** which contain structured data reducible to Merkle paths. Some examples of such structures are: Merkle tree, Sparse Merkle Tree, and a collection of Merkle paths. The VM can request Merkle paths from the Merkle store, as well as mutate it by updating or merging nodes contained in the store.
46+
47+
The prover initializes the advice provider prior to executing a program, and from that point on the advice provider is manipulated solely by executing operations on the VM.

docs/miden/vm/intro/performance.md

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
# Performance
2+
The benchmarks below should be viewed only as a rough guide for expected future performance. The reasons for this are twofold:
3+
1. Not all constraints have been implemented yet, and we expect that there will be some slowdown once constraint evaluation is completed.
4+
2. Many optimizations have not been applied yet, and we expect that there will be some speedup once we dedicate some time to performance optimizations.
5+
6+
Overall, we don't expect the benchmarks to change significantly, but there will definitely be some deviation from the below numbers in the future.
7+
8+
A few general notes on performance:
9+
10+
* Execution time is dominated by proof generation time. In fact, the time needed to run the program is usually under 1% of the time needed to generate the proof.
11+
* Proof verification time is really fast. In most cases it is under 1 ms, but sometimes gets as high as 2 ms or 3 ms.
12+
* Proof generation process is dynamically adjustable. In general, there is a trade-off between execution time, proof size, and security level (i.e. for a given security level, we can reduce proof size by increasing execution time, up to a point).
13+
* Both proof generation and proof verification times are greatly influenced by the hash function used in the STARK protocol. In the benchmarks below, we use BLAKE3, which is a really fast hash function.
14+
15+
## Single-core prover performance
16+
When executed on a single CPU core, the current version of Miden VM operates at around 20 - 25 KHz. In the benchmarks below, the VM executes a Fibonacci calculator program on Apple M1 Pro CPU in a single thread. The generated proofs have a target security level of 96 bits.
17+
18+
| VM cycles | Execution time | Proving time | RAM consumed | Proof size |
19+
| :-------------: | :------------: | :----------: | :-----------: | :--------: |
20+
| 2<sup>10</sup> | 1 ms | 60 ms | 20 MB | 46 KB |
21+
| 2<sup>12</sup> | 2 ms | 180 ms | 52 MB | 56 KB |
22+
| 2<sup>14</sup> | 8 ms | 680 ms | 240 MB | 65 KB |
23+
| 2<sup>16</sup> | 28 ms | 2.7 sec | 950 MB | 75 KB |
24+
| 2<sup>18</sup> | 81 ms | 11.4 sec | 3.7 GB | 87 KB |
25+
| 2<sup>20</sup> | 310 ms | 47.5 sec | 14 GB | 100 KB |
26+
27+
As can be seen from the above, proving time roughly doubles with every doubling in the number of cycles, but proof size grows much slower.
28+
29+
We can also generate proofs at a higher security level. The cost of doing so is roughly doubling of proving time and roughly 40% increase in proof size. In the benchmarks below, the same Fibonacci calculator program was executed on Apple M1 Pro CPU at 128-bit target security level:
30+
31+
| VM cycles | Execution time | Proving time | RAM consumed | Proof size |
32+
| :-------------: | :------------: | :----------: | :-----------: | :--------: |
33+
| 2<sup>10</sup> | 1 ms | 120 ms | 30 MB | 61 KB |
34+
| 2<sup>12</sup> | 2 ms | 460 ms | 106 MB | 77 KB |
35+
| 2<sup>14</sup> | 8 ms | 1.4 sec | 500 MB | 90 KB |
36+
| 2<sup>16</sup> | 27 ms | 4.9 sec | 2.0 GB | 103 KB |
37+
| 2<sup>18</sup> | 81 ms | 20.1 sec | 8.0 GB | 121 KB |
38+
| 2<sup>20</sup> | 310 ms | 90.3 sec | 20.0 GB | 138 KB |
39+
40+
## Multi-core prover performance
41+
STARK proof generation is massively parallelizable. Thus, by taking advantage of multiple CPU cores we can dramatically reduce proof generation time. For example, when executed on an 8-core CPU (Apple M1 Pro), the current version of Miden VM operates at around 100 KHz. And when executed on a 64-core CPU (Amazon Graviton 3), the VM operates at around 250 KHz.
42+
43+
In the benchmarks below, the VM executes the same Fibonacci calculator program for 2<sup>20</sup> cycles at 96-bit target security level:
44+
45+
| Machine | Execution time | Proving time | Execution % | Implied Frequency |
46+
| ------------------------------ | :------------: | :----------: | :---------: | :---------------: |
47+
| Apple M1 Pro (16 threads) | 310 ms | 7.0 sec | 4.2% | 140 KHz |
48+
| Apple M2 Max (16 threads) | 280 ms | 5.8 sec | 4.5% | 170 KHz |
49+
| AMD Ryzen 9 5950X (16 threads) | 270 ms | 10.0 sec | 2.6% | 100 KHz |
50+
| Amazon Graviton 3 (64 threads) | 330 ms | 3.6 sec | 8.5% | 265 KHz |
51+
52+
### Recursive proofs
53+
Proofs in the above benchmarks are generated using BLAKE3 hash function. While this hash function is very fast, it is not very efficient to execute in Miden VM. Thus, proofs generated using BLAKE3 are not well-suited for recursive proof verification. To support efficient recursive proofs, we need to use an arithmetization-friendly hash function. Miden VM natively supports Rescue Prime Optimized (RPO), which is one such hash function. One of the downsides of arithmetization-friendly hash functions is that they are considerably slower than regular hash functions.
54+
55+
In the benchmarks below we execute the same Fibonacci calculator program for 2<sup>20</sup> cycles at 96-bit target security level using RPO hash function instead of BLAKE3:
56+
57+
| Machine | Execution time | Proving time | Proving time (HW) |
58+
| ------------------------------ | :------------: | :----------: | :---------------: |
59+
| Apple M1 Pro (16 threads) | 310 ms | 94.3 sec | 42.0 sec |
60+
| Apple M2 Max (16 threads) | 280 ms | 75.1 sec | 20.9 sec |
61+
| AMD Ryzen 9 5950X (16 threads) | 270 ms | 59.3 sec | |
62+
| Amazon Graviton 3 (64 threads) | 330 ms | 21.7 sec | 14.9 sec |
63+
64+
In the above, proof generation on some platforms can be hardware-accelerated. Specifically:
65+
66+
* On Apple M1/M2 platforms the built-in GPU is used for a part of proof generation process.
67+
* On the Graviton platform, SVE vector extension is used to accelerate RPO computations.

0 commit comments

Comments
 (0)