Author: Justin Thaler Source: a16z Translation: Shan Ouba, Golden Finance
Zero-knowledge virtual machines (zkVMs) aim to "democratize SNARKs," allowing even people without SNARK expertise to prove that they ran a program correctly on a specific input (or witness). Their core advantage is the developer experience, but current zkVMs still face enormous challenges in security and performance. If zkVMs are to deliver on their promise, designers must overcome these obstacles. This article lays out the stages zkVM development is likely to pass through; completing the whole process will take years - don't believe anyone who says it can be done quickly.
Challenges
In terms of security, zkVMs are highly complex software projects that are still full of vulnerabilities.
In terms of performance, proving the correct execution of a program can be hundreds of thousands of times slower than running it natively, making most applications infeasible for real-world deployment.
Despite this, many voices in the blockchain industry promote zkVMs as ready for immediate deployment, and some projects are already paying heavy computational costs to generate zero-knowledge proofs of on-chain activity. But because zkVMs still contain many vulnerabilities, this is in practice just an expensive way to make a system look as if it is protected by SNARKs, when in reality it either relies on permissioned access control or, worse, is exposed to attack.
The reality is that we are still years away from a truly secure and efficient zkVM. This article proposes a series of concrete, phased goals to help us track the real progress of zkVMs, cut through the hype, and direct the community's attention to genuine technical breakthroughs.
Security Development Stages
Background
SNARK-based zkVMs typically consist of two core components:
1. Polynomial Interactive Oracle Proof (PIOP): An interactive proof framework for proving statements about polynomials (or constraints derived from them).
2. Polynomial Commitment Scheme (PCS): Ensures that the prover cannot forge polynomial evaluation results without being detected.
A zkVM encodes valid execution traces as a constraint system - roughly, constraints enforcing that the virtual machine's registers and memory are used correctly - and then uses a SNARK to prove that these constraints are satisfied.
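To make this composition concrete, here is a minimal structural sketch in Rust (all trait, type, and field names are hypothetical and for illustration only; this is not the API of any particular zkVM): a PIOP describes the polynomials derived from an execution trace and the checks the verifier performs on them, a PCS lets the prover commit to those polynomials and open them at chosen points, and the zkVM SNARK is the combination of the two.

```rust
// Hypothetical sketch of how a zkVM composes its components.
// None of these traits correspond to any specific library's API.

/// A polynomial commitment scheme: commit to a polynomial, then
/// prove evaluations of it without revealing the whole polynomial.
trait PolynomialCommitmentScheme {
    type Commitment;
    type EvalProof;
    fn commit(&self, poly: &[u64]) -> Self::Commitment;
    fn open(&self, poly: &[u64], point: u64) -> (u64, Self::EvalProof);
    fn verify(&self, com: &Self::Commitment, point: u64, value: u64, proof: &Self::EvalProof) -> bool;
}

/// A polynomial IOP: an interactive protocol whose prover messages are
/// polynomials (sent as commitments) and whose verifier messages are challenges.
trait Piop {
    /// Encode a VM execution trace as polynomials that satisfy the
    /// constraint system capturing the VM's semantics.
    fn trace_polynomials(&self, trace: &ExecutionTrace) -> Vec<Vec<u64>>;
    /// The evaluation points the verifier queries, given its challenge.
    fn verifier_queries(&self, challenge: u64) -> Vec<u64>;
}

/// One row per VM step: program counter, registers, memory operations.
struct ExecutionTrace {
    rows: Vec<[u64; 32]>,
}

/// A zkVM SNARK is (roughly) a PIOP whose polynomials are sent via a PCS,
/// made non-interactive with Fiat-Shamir.
struct ZkVm<P: Piop, C: PolynomialCommitmentScheme> {
    piop: P,
    pcs: C,
}
```

A real zkVM would additionally apply the Fiat-Shamir transform to make this protocol non-interactive, which is exactly the step discussed under "Fiat-Shamir Security" below.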
In such a complex system, the only way to ensure that a zkVM is vulnerability-free is formal verification. The following are the different stages of zkVM security: the first stage focuses on protocol correctness, while the second and third stages focus on implementation correctness.
Security Phase 1: Correct Protocol
• A formally verified proof of the soundness of the PIOP (a simplified statement of what soundness means is given below);
• A formally verified proof that the PCS is binding under the relevant cryptographic assumptions or idealized models;
• If Fiat-Shamir is used, a formally verified proof that the succinct argument obtained by combining the PIOP and the PCS is secure in the random oracle model (augmented with other cryptographic assumptions as needed);
• A formally verified proof that the constraint system applied by the PIOP is equivalent to the semantics of the VM;
• A comprehensive "gluing" of all of the above pieces into a single, formally verified proof that the resulting SNARK is secure for verifying any program specified by VM bytecode. If the protocol is intended to be zero-knowledge, this property must also be formally verified, ensuring that no sensitive information about the witness is leaked.
If the zkVM uses recursion, then every PIOP, commitment scheme, and constraint system involved in the recursion must also be verified; otherwise this sub-phase cannot be considered complete.
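For readers who want a concrete anchor for "soundness": the core property Phase 1 asks to be formally verified is that no efficient prover can produce an accepting proof of a false statement, except with negligible probability. A simplified (standard, non-knowledge) form of this statement is:

\[
\forall\ \text{efficient}\ \mathcal{P}^{*}:\quad
\Pr\Big[\mathsf{Verify}(\mathsf{bytecode},\, x,\, \pi) = 1 \ \wedge\ \nexists\, w:\ \mathsf{VM}(\mathsf{bytecode},\, x,\, w) = \mathsf{accept}\Big] \le \mathsf{negl}(\lambda),
\]

where \(x\) is the public input, \(w\) is the witness, \(\pi\) is the proof produced by \(\mathcal{P}^{*}\), and \(\lambda\) is the security parameter. (In practice zkVMs target the stronger notion of knowledge soundness, and the formal statement is correspondingly more involved.)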
Security Phase 2: Correct Verifier Implementation
This phase requires formal verification of the actual implementation of the zkVM verifier (in Rust, Solidity, etc.) to ensure that it conforms to the protocol verified in Phase 1. Completing this phase means the zkVM implementation matches its theoretical design, rather than being merely a secure protocol on paper or an inefficient specification written in a language like Lean.
There are two main reasons to focus on the verifier rather than the prover at this stage. First, ensuring that the verifier is correct goes most of the way toward ensuring the soundness of the zkVM proof system (i.e., that the verifier cannot be deceived into accepting a false statement). Second, the zkVM verifier implementation is more than an order of magnitude simpler than the prover implementation, so its correctness is easier to ensure in the near term.
Security Phase 3: Correct Prover Implementation
This phase requires formal verification that the actual implementation of the zkVM prover correctly generates proofs for the proof system verified in Phases 1 and 2. The goal here is completeness: no system using the zkVM should get stuck because the prover is unable to prove a true statement. If the zkVM is meant to be zero-knowledge, there must also be a formal proof that generated proofs leak no information about the witness.
Expected Timeline
Phase 1 progress: We can expect some progress next year (for example, ZKLib is one such effort). But no zkVM will fully meet the requirements of Phase 1 for at least two years.
Phases 2 and 3: These can advance in parallel with aspects of Phase 1. For example, some teams have shown that a Plonk verifier implementation matches the protocol in the paper (even though the paper's protocol itself may not be fully verified). Still, I don't expect any zkVM to reach Phase 3 in less than four years - possibly longer.
Key Notes: Fiat-Shamir Security vs. Verified Bytecode
A major complication is that the security of the Fiat-Shamir transform remains an open research question. All three security phases treat Fiat-Shamir and random oracles as absolutely secure, but in reality the whole paradigm can harbor vulnerabilities, stemming from the gap between the idealized random-oracle model and the actual hash functions used.
In the worst case, a system that has reached Security Phase 2 could later be found to be completely insecure because of Fiat-Shamir-related issues. This deserves close attention and continued research; we may need to modify the Fiat-Shamir transform itself to better guard against such vulnerabilities.
Systems that do not use recursion are theoretically more secure, because some known attacks involve circuits similar to those used in recursive proofs. But this risk remains a fundamental unsolved problem.
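To make the concern concrete, below is a minimal sketch of the Fiat-Shamir transform (using SHA-256 via the Rust `sha2` crate purely for illustration; real systems use carefully designed transcripts and hash choices, and this is not any specific zkVM's code). Each verifier challenge is derived by hashing the transcript so far; the security analysis treats that hash as a random oracle, and any exploitable structure in the real hash function or the transcript encoding can undermine a protocol that was sound when run interactively.

```rust
// Minimal sketch of the Fiat-Shamir transform (illustrative only).
// Cargo.toml would need: sha2 = "0.10"
use sha2::{Digest, Sha256};

/// Derive the verifier's next challenge by hashing everything the
/// prover has sent so far, instead of asking a live verifier for
/// fresh randomness.
fn fiat_shamir_challenge(transcript: &[u8]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    hasher.update(transcript);
    hasher.finalize().into()
}

fn main() {
    // The prover appends each commitment to the transcript, then
    // derives the next challenge deterministically from it.
    let mut transcript = Vec::new();
    transcript.extend_from_slice(b"commitment to trace polynomials");
    let challenge_1 = fiat_shamir_challenge(&transcript);

    transcript.extend_from_slice(&challenge_1);
    transcript.extend_from_slice(b"commitment to quotient polynomial");
    let challenge_2 = fiat_shamir_challenge(&transcript);

    // Security rests on modeling SHA-256 as a random oracle here;
    // that idealization is exactly the gap discussed above.
    println!("challenge 1: {:x?}", &challenge_1[..4]);
    println!("challenge 2: {:x?}", &challenge_2[..4]);
}
```

A classic implementation pitfall ("weak Fiat-Shamir") is omitting parts of the statement or earlier messages from the transcript, a mistake that has broken deployed systems even when the underlying interactive protocol was fine.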
Another caveat is that even if a zkVM proves that a program (specified by bytecode) was executed correctly, the proof is of limited value if the bytecode itself is flawed. The practical usefulness of zkVMs therefore depends heavily on generating formally verified bytecode - an enormous challenge that is beyond the scope of this article.
About Quantum-Resistant Security
Quantum computers will not pose a serious threat for at least five years (and possibly much longer), while software vulnerabilities are a matter of life and death today. Thus, the priority should be achieving the security and performance goals proposed in this article. If non-quantum-resistant SNARKs can meet these goals sooner, we should use them first, and switch only once quantum-resistant SNARKs catch up or there are concrete signs that cryptographically relevant quantum computers are imminent.
Specific Security Levels
100-bit classical security should be the bare minimum for any SNARK protecting valuable assets (yet some systems still fail to meet even this low bar). And even 100 bits should not be considered acceptable: standard cryptographic practice calls for 128-bit security or more. If SNARKs are truly performant, we should not be compromising security to buy performance.
Performance Phase
Current Situation
Currently, the computational overhead of the zkVM prover is about 1 million times that of native execution. In other words, if a native execution of a program takes X CPU cycles, then generating a proof of correct execution takes about X × 1,000,000 CPU cycles. This was true a year ago, and it is still true today (despite some misunderstandings).
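To make the factor concrete (a back-of-the-envelope illustration assuming a single core at roughly 3 GHz, not a measured benchmark):

\[
3\times 10^{9}\ \text{cycles (1 second of native execution)} \times 10^{6} \;=\; 3\times 10^{15}\ \text{prover cycles} \;\approx\; 10^{6}\ \text{seconds} \;\approx\; 11.5\ \text{days on one core}.
\]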
Consider some popular claims circulating in the industry today:
1. “The cost of generating proofs for the entire Ethereum mainnet is less than $1 million per year.”
2. “We have achieved almost real-time proof generation for Ethereum blocks, with only a few dozen GPUs.”
3. “Our latest zkVM is 1000x faster than its predecessor.”
Without context, however, these claims can be misleading:
• Being 1000x faster than an older zkVM can still mean being very slow in absolute terms; it says more about how bad things were than about how good they are now.
• The amount of computation on Ethereum mainnet may grow tenfold in the future, which would leave current zkVM performance far behind demand.
• The so-called "almost real-time" proof generation is still too slow for many blockchain applications (for example, Optimism's block time is 2 seconds, which is much faster than Ethereum's 12 seconds).
• "Dozens of GPUs running 24/7 for a long time" does not provide sufficient liveness guarantees.
• These proof generation times are usually for proof sizes over 1MB, which is too large for many applications.
• "Less than $1 million per year" is unremarkable simply because an Ethereum full node performs only about $25 worth of computation per year.
For use cases outside of blockchain, this computational overhead is clearly too high. No amount of parallel computing or engineering optimization can make up for such a huge computational overhead.
The basic goal we should set is prover overhead of no more than 100,000x native execution. Even that is only a first step; truly large-scale mainstream applications may require reducing the overhead to 10,000x native execution or less.
Performance Measurement
SNARK performance has three main components:
1. Intrinsic efficiency of the underlying proof system.
2. Application-specific optimizations (e.g., precompilation).
3. Engineering and hardware acceleration (e.g., GPU, FPGA, or multi-core CPU).
While (2) and (3) are critical for real-world deployments, they apply to any proof system and thus do not necessarily reflect improvements in fundamental overhead. For example, adding GPU acceleration and precompiles to a zkEVM can easily yield a 50x speedup over a CPU-only implementation - potentially making an inherently less efficient system appear superior to one that has not been similarly optimized.
Therefore, this paper focuses on measuring the basic performance of SNARKs in the absence of specialized hardware and precompilation. This differs from current benchmarking approaches, which often combine all three factors into a single “overall number”. This is like judging a diamond by how long it has been polished, rather than by its inherent clarity.
Our goal is to isolate the inherent overhead of general-purpose proof systems, lower the barrier to entry for under-explored techniques, and help the community cut through the noise so it can focus on real progress in proof system design.
Performance Phases
Here are the five performance milestones I propose. First, prover overhead on CPUs must come down dramatically; only then can we pursue further reductions via specialized hardware. Memory usage must also improve.
In all phases, developers should not have to tweak their code for zkVM performance. Developer experience is a core benefit of zkVM. Sacrificing DevEx to meet performance benchmarks defeats the purpose of benchmarking and the original purpose of zkVM.
These metrics focus primarily on prover cost. However, if the verifier cost is allowed to grow without bound (i.e., there is no upper bound on proof size or verification time), then any of the prover metrics can be easily met. Therefore, to meet the requirements of the following phases, both maximum proof size and maximum verification time must be bounded.
Phase 1 Requirement: “Reasonable Non-Trivial Verification Cost”
• Proof Size: Must be smaller than the witness size.
• Verification Time: Verifying a proof must be no slower than native execution of the program (i.e., no slower than performing the computation directly).
These are minimal succinctness requirements: they ensure that the proof size and verification time are no worse than simply sending the witness to the verifier and having it check the computation directly.
Phase 2 and above
• Maximum proof size: 256 KB.
• Maximum verification time: 16 milliseconds.
These caps are intentionally loose to accommodate novel fast proof techniques, even if they may incur higher verification costs. At the same time, these caps exclude proofs that are so expensive that few projects would want to use them on the blockchain.
Speed Phase 1
Single-threaded prover overhead must be no more than 100,000x native execution (measured across many applications, not just Ethereum block proving), and must not rely on precompiles.
To put it concretely, assuming a RISC-V processor on a modern laptop running at ~3 billion cycles/second, reaching Phase 1 means that the laptop can generate proofs at 30,000 RISC-V cycles/second (single-threaded).
Verifier cost must meet the "reasonable non-trivial verification cost" criterion defined previously.
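Spelled out under the same illustrative assumption (a rough estimate, not a benchmark), the Phase 1 target still implies long proving times for sizable computations:

\[
\frac{3\times 10^{9}\ \text{cycles/s}}{10^{5}} = 3\times 10^{4}\ \text{proved cycles/s},
\qquad
\frac{3\times 10^{9}\ \text{cycles}}{3\times 10^{4}\ \text{cycles/s}} = 10^{5}\ \text{s} \approx 28\ \text{hours},
\]

i.e., even at the Phase 1 target, proving one second of native execution takes on the order of a day on a single thread.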
Speed Phase 2
Single-threaded proofs must be no more than 10,000 times slower than native execution.
Alternatively, since some promising SNARK approaches (particularly binary-field SNARKs) are limited by current CPUs and GPUs, this phase could instead be met with FPGAs (or even ASICs), measured as follows:
1. Count the number of RISC-V cores that an FPGA can emulate at native speed.
2. Count the number of FPGAs needed to emulate and prove RISC-V execution (in near real time).
3. If the number in (2) is at most 10,000 times the number in (1), then Phase 2 is satisfied (this criterion is restated as a formula below).
• Proof size: 256 KB maximum.
• Verification time: 16 ms maximum on a standard CPU.
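Restating the FPGA criterion above as a formula: if a single FPGA can emulate \(k\) RISC-V cores at native speed, and \(N\) FPGAs are needed to both emulate and prove that execution in near real time, then Speed Phase 2 requires

\[
\frac{N}{k} \;\le\; 10{,}000 .
\]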
Speed Phase 3

Building on Speed Phase 2, achieve prover overhead of less than 1,000x native execution (again across a variety of applications), and do so only with automatically synthesized and formally verified precompiles. Essentially, each program's instruction set is customized on the fly to speed up proof generation, while preserving ease of use and formal verification. (See the next section for why precompiles are a double-edged sword and why hand-written precompiles are not a sustainable approach.)

Memory Phase 1

Achieve Speed Phase 1 using less than 2 GB of memory, while also providing zero knowledge. This phase is critical for mobile devices and browsers, and it opens the door to a wide range of client-side zkVM use cases - for example, smartphone proofs for location privacy or identity credentials. If proof generation requires more than 1-2 GB of memory, most mobile devices will not be able to run it.
Two important notes:
1. Even for large-scale computations (native execution requiring trillions of CPU cycles), the proof system must maintain a 2 GB memory limit, otherwise its applicability will be limited.
2. If the prover is allowed to be extremely slow, staying within a 2 GB memory limit is easy. Therefore, for Memory Phase 1 to be meaningful, Speed Phase 1 must be achieved within the 2 GB memory limit.
Memory Phase 2
Speed Phase 1 is achieved with less than 200 MB of memory (a 10x improvement over Memory Phase 1).
Why go down to 200 MB? Consider a non-blockchain scenario: when you visit an HTTPS website, certificates for authentication and encryption are downloaded. If websites instead sent zk proofs involving those certificates, large sites might need to generate millions of proofs per second. If each proof requires 2 GB of memory, the aggregate memory requirement reaches the petabyte level, which is clearly infeasible. Further reducing memory usage is therefore crucial for non-blockchain applications.
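The arithmetic behind the "petabyte level" figure (an illustrative estimate, assuming one million proofs being generated concurrently at any moment):

\[
10^{6}\ \text{concurrent proofs} \times 2\ \text{GB per proof} \;=\; 2\times 10^{6}\ \text{GB} \;=\; 2\ \text{PB of memory}.
\]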
Precompiles: The last mile, or a crutch?
A precompile is a SNARK constraint system optimized for a specific function (such as a hash function or an elliptic-curve signature scheme). In Ethereum, precompiles can reduce the overhead of Merkle hashing and signature verification, but over-reliance on precompiles does not actually improve the core efficiency of the SNARK itself.
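As a rough sketch of what a precompile looks like from the zkVM's perspective (the trait and type names below are hypothetical and do not correspond to any real zkVM's API), the idea is that a fixed function is proven via its own hand-optimized constraint system rather than being executed as thousands of ordinary VM instructions:

```rust
// Hypothetical sketch of a zkVM precompile interface (not a real API).

/// A constraint system specialized to a single fixed function,
/// e.g. a SHA-256 compression or an elliptic-curve signature check.
trait Precompile {
    /// Prove this function directly with custom constraints,
    /// instead of executing it as thousands of VM instructions.
    fn prove(&self, input: &[u8]) -> PrecompileProof;
    /// Rough number of VM cycles this call would have cost without
    /// the precompile -- the source of the claimed speedups.
    fn replaced_cycles(&self, input_len: usize) -> u64;
}

struct PrecompileProof {
    bytes: Vec<u8>,
}

/// A hypothetical hand-written SHA-256 precompile.
struct Sha256Precompile;

impl Precompile for Sha256Precompile {
    fn prove(&self, input: &[u8]) -> PrecompileProof {
        // A real implementation would build and prove a custom constraint
        // system here; this placeholder just records the input length.
        PrecompileProof { bytes: input.len().to_le_bytes().to_vec() }
    }
    fn replaced_cycles(&self, input_len: usize) -> u64 {
        // Very rough: a few thousand RISC-V cycles per 64-byte block.
        ((input_len as u64 / 64) + 1) * 3_000
    }
}
```

The speedup comes entirely from replacing many generic VM cycles with a specialized circuit for one function, which is also why the benefit is limited to workloads dominated by such functions.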
Problems with Precompiles
1. Still too slow: Even with hash and signature precompiles, zkVMs suffer from the core proof system's inefficiency, both inside and outside blockchain contexts.
2. Security vulnerabilities: Hand-written precompiles that are not formally verified are almost certain to contain vulnerabilities, which can lead to catastrophic security failures.
3. Poor developer experience: Currently, many zkVMs require developers to hand-write constraint systems - a workflow reminiscent of programming in the 1960s - which severely hurts the developer experience.
4. Misleading benchmarks: If benchmarks depend on precompiles optimized for specific functions, they push people toward optimizing hand-crafted constraint systems rather than improving the SNARK design itself.
5. I/O overhead and no RAM access: While precompiles can improve performance for crypto-heavy tasks, they may not provide meaningful speedups for more diverse workloads, because they incur significant overhead in passing inputs and outputs and they cannot access RAM.
Even in the blockchain context, as soon as you go beyond a single L1 like Ethereum (say, you want to build a series of cross-chain bridges), you face different hash functions and signature schemes. Continually writing new precompiles to cover them is neither scalable nor safe; it poses a huge security risk.
I do believe precompiles will remain critical in the long run, but only once they can be automatically synthesized and formally verified. That way we can keep the developer-experience advantages of zkVMs while avoiding catastrophic security risks. This view is reflected in Speed Phase 3.
Expected Timeline
I expect a handful of zkVMs to reach Speed Phase 1 and Memory Phase 1 later this year. I think we will also reach Speed Phase 2 within the next two years, though it is not clear whether we can get there without new research ideas.
I expect the remaining phases (Speed Phase 3 and Memory Phase 2) to take several years to reach.
While this article lists the security and performance phases of zkVMs separately, the two are not entirely independent. As vulnerabilities in zkVMs continue to be discovered, I expect that fixing some of them will inevitably cause significant performance regressions. Therefore, until a zkVM reaches Security Phase 2, its performance results should be treated as tentative.
zkVMs have great potential to make zero-knowledge proofs truly ubiquitous, but they are still in their early days - fraught with security challenges and severe performance bottlenecks. Hype and marketing make it difficult to measure real progress. By clearly defining security and performance milestones, I hope to provide a roadmap that cuts through the fog. We will get there eventually, but it will take time and sustained research and engineering effort.