diff --git a/docs/vm/README.md b/docs/vm/README.md index 2201a92f71..97c0db4e74 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -156,11 +156,147 @@ TODO: Explain primitive operations and how they affect the registers ## Hints -TODO: +So far we have been thinking about the VM mostly abstracted from the prover and verifier it's meant to feed its results to. The last main feature we need to talk about, however, requires keeping this proving/verifying logic in mind. +As a reminder, the whole point of the Cairo VM is to output a trace/memory file so that a `prover` can then create a cryptographic proof that the execution of the program was done correctly. A `verifier` can then take that proof and verify it in much less time than it would have taken to re-execute the entire program. + +In this model, the one actually using the VM to run a cairo program is *always the prover*. The verifier does not use the VM in any way, as that would defeat the entire purpose of validity proofs; they just get the program being run and the proof generated by the prover and run some cryptographic algorithm to check it. + +While the verifier does not execute the code, they do *check it*. As an example, if a cairo program computes a fibonacci number like this: + +``` +func main() { + // Call fib(1, 1, 10). + let result: felt = fib(1, 1, 10); +} +``` + +the verifier won't *run* this, but they will reject any incorrect execution of the call to `fib`. The correct value for `result` in this case is `144` (it's the 10th fibonacci number); any attempt by the prover to convince the verifier that `result` is not `144` will fail, because the call to the `fib` function is *being proven* and thus *seen* by the verifier. + +A `Hint` is a piece of code that is not proven, and therefore not seen by the verifier. If `fib` above were a hint, then the prover could convince the verifier that `result` is $144$, $0$, $1000$ or any other number. + +In cairo 0, hints are code written in `Python` and are surrounded by curly brackets. Here's an example from the `alloc` function, provided by the Cairo common library + +``` +func alloc() -> (ptr: felt*) { + %{ memory[ap] = segments.add() %} + ap += 1; + return (ptr=cast([ap - 1], felt*)); +} +``` + +The first line of the function, + +``` +%{ memory[ap] = segments.add() %} +``` + +is a hint called `ADD_SEGMENT`. All it does is create a new memory segment, then write its base to the current value of `ap`. This is python code that is being run in the context of the VM's execution; thus `memory` refers to the VM's current memory and `segments.add()` is just a function provided by the VM to allocate a new segment. + +At this point you might be wondering: why run code that's not being proven? Isn't the whole point of Cairo to prove correct execution? There are (at least) two reasons for hints to exist. + +### Nothing to prove + +For some operations there's simply nothing to prove, as they are just convenient things one wants to do during execution. The `ADD_SEGMENT` hint shown above is a good example of that. When proving execution, the program's memory is presented as one relocated continuous segment, it does not matter at all which segment a cell was in, or when that segment was added. The verifier doesn't care. + +Because of this, there's no reason to make `ADD_SEGMENT` a part of the cairo language and have an instruction for it. + +### Optimization + +Certain operations can be very expensive, in the sense that they might involve a huge amount of instructions or memory usage, and therefore contribute heavily to the proving time. For certain calculations, there are two ways to convince the verifier that it was done correctly: + +- Write the entire calculation in Cairo/Cairo Assembly. This makes it show up in the trace and therefore get proven. +- *Present the result of the calculation to the verifier through a hint*, then show said result indeed satisfies the relevant condition that makes it the actual result. + +To make this less abstract, let's show two examples. + +#### Square root + +Let's say the calculation in question is to compute the square root of a number `x`. The two ways to do it then become: + +- Write the usual square root algorithm in Cairo to compute `sqrt(x)`. +- Write a hint that computes `sqrt(x)`, then immediately after calling the hint show __in Cairo__ that `(sqrt(x))^2 = x`. + +The second approach is exactly what the `sqrt` function in the Cairo common library does: + +``` +// Returns the floor value of the square root of the given value. +// Assumptions: 0 <= value < 2**250. +func sqrt{range_check_ptr}(value) -> felt { + alloc_locals; + local root: felt; + + %{ + from starkware.python.math_utils import isqrt + value = ids.value % PRIME + assert value < 2 ** 250, f"value={value} is outside of the range [0, 2**250)." + assert 2 ** 250 < PRIME + ids.root = isqrt(value) + %} + + assert_nn_le(root, 2 ** 125 - 1); + tempvar root_plus_one = root + 1; + assert_in_range(value, root * root, root_plus_one * root_plus_one); + + return root; +} +``` + +If you read it carefully, you'll see that the hint in this function computes the square root in python, then this line + +``` +assert_in_range(value, root * root, root_plus_one * root_plus_one); +``` + +asserts __in Cairo__ that `(sqrt(x))^2 = x`. + +This is done this way because it is much cheaper, in terms of the generated trace (and thus proving time), to square a number than compute its square root. + +Notice that the last assert is absolutely mandatory to make this safe. If you forget to write it, the square root calculation does not get proven, and anyone could convince the verifier that the result of `sqrt(x)` is any number they like. + +#### Linear search turned into an O(1) lookup + +This example is taken from the [Cairo documentation](https://docs.cairo-lang.org/0.12.0/hello_cairo/program_input.html). + +Given a list of `(key, value)` pairs, if we want to write a `get_value_by_key` function that returns the value associated to a given key, there are two ways to do it: + +- Write a linear search in Cairo, iterating over each key until you find the requested one. +- Do that exact same linear search *inside a hint*, find the result, then show that the result's key is the one requested. + +Again, the second approach makes the resulting trace and proving much faster, because it's just a lookup; there's no linear search. Notice this only applies to proving, the VM has to execute the hint, so there's still a linear search when executing to generate the trace. In fact, the second approach is more expensive for the VM than the first one. It has to do both a linear search and a lookup. This is a tradeoff in favor of proving time. + +Also note that, as in the square root example, when writing this logic you need to remember to show the hint's result is the correct one in Cairo. If you don't, your code is not being proven. + +### Non-determinism + +The Cairo paper and documentation refers to this second approach to calculating things through hints as *non-determinism*. The reason for this is that sometimes there is more than one result that satisfies a certain condition. This means that cairo execution becomes non deterministic; a hint could output multiple values, and in principle there is no way to know which one it's going to be. Running the same code multiple times could give different results. + +The square root is an easy example of this. The condition `(sqrt(x))^2 = x` is not unique, there are two solutions to it. Without the hint, this is non-deterministic, `x` could have multiple values; the hint resolves that by choosing a specific value when being run. + +### Common Library and Hints + +As explained above, using hints in your code is highly unsafe. Forgetting to add a check after calling them can make your code vulnerable to any sorts of attacks, as your program will not prove what you think it proves. + +Because of this, most hints in Cairo 0 are wrapped around or used by functions in the Cairo common library that do the checks for you, thus making them safe to use. Ideally, Cairo developers should not be using hints on their own; only transparently through Cairo library functions they call. + +### Whitelisted Hints + +In Cairo, a hint could be any Python code you like. In the context of it as just another language someone might want to use, this is fine. In the context of Cairo as a programming language used to write smart contracts deployed on a blockchain, it's not. Users could deploy contracts with hints that simply do + +```python +while true: + pass +``` + +and grind the network down to a halt, as nodes get stuck executing an infinite loop when calling the contract. + +To address this, the starknet network maintains a list of *whitelisted* hints, which are the only ones that can be used in starknet contracts. These are the ones implemented in this VM. ## Builtins -TODO: +A builtin is a low level optimization integrated into the core loop of the VM that allows otherwise expensive computation to be performed more efficiently. Builtins have two ways to operate: via validation rules and via auto-deduction rules. + +- Validation rules are applied to every element that is inserted into a builtin's segment. For example, if I want to verify an ecdsa signature, I can insert it into the ecdsa builtin's segment and let a validation rule take care of verifying the signature. +- Auto-deduction rules take over during instruction execution, when we can't compute the value of an operand who's address belongs to a builtin segment, we can use that builtin's auto-deduction rule to calculate the value of the operand. For example, If I want to calculate the pedersen hash of two values, I can write the values into the pedersen builtin's segment and then ask for the next memory cell, without builtins, this instruction would have failed, as there is no value stored in that cell, but now we can use auto-deduction rules to calculate the hash and fill in that memory cell. ## Program parsing