proofread graph compiler tutorial (#364)
joelberkeley authored May 27, 2023
1 parent 9059d83 commit 5d31e9c
Showing 1 changed file with 15 additions and 12 deletions.
27 changes: 15 additions & 12 deletions tutorials/GraphCompiler.md
@@ -15,13 +15,13 @@ limitations under the License.
-->
# The Graph Compiler

_Note: We're not compiler experts, so this tutorial is more about spidr itself than a guide to compiler design._
_Note: We're not compiler experts; this tutorial is more about spidr itself than a guide to compiler design._

## Efficient reuse of tensors, and working with `Ref`
## Efficiently reusing tensors, and working with `Ref`

spidr explicitly caches tensors so they can be efficiently be reused. Our understanding is that the technique we have used to achieve this is called observable sharing. In this section we'll discuss what our implementation means for spidr's tensor API.
spidr explicitly caches tensors so they can be efficiently reused. We achieved this with _observable sharing_. In this section we discuss what our implementation means for spidr's tensor API.

Caching ensures that the computation you write will be the computation sent to the graph compiler. Unfortunately this comes with downsides. Firstly, there is extra boilerplate. Most tensor operations accept `Tensor shape dtype` and output `Ref (Tensor shape dtype)`, so computations must handle the `Ref` effect. For example, what might be `abs (max x)` in another library is, for example, `abs !(max x)` in spidr. One notable exception to this is for infix operators where, to avoid unreadable algebra, we have defined infix operators to accept `Ref (Tensor shape dtype)` values. This means you will need to wrap a bare `Tensor shape dtype` in `pure` to pass it to an infix operator. For example,
Caching ensures that the computation you write will be the computation sent to the graph compiler. Unfortunately this comes with downsides. First, there is extra boilerplate. Most tensor operations accept `Tensor shape dtype` and output `Ref (Tensor shape dtype)`, so when you compose operations, you'll need to handle the `Ref` effect. For example, what might be `abs (max x y)` in another library can be `abs !(max !x !y)` in spidr. One notable exception to this is infix operators, which accept `Ref (Tensor shape dtype)` values. This is to avoid unreadable algebra: you won't need to write `!(!x * !y) + !z`. However, it does mean you will need to wrap any `Tensor shape dtype` values in `pure` to pass them to an infix operator. Let's see an example:
<!-- idris
import Literal
import Tensor
@@ -30,23 +30,26 @@ import Tensor
f : Tensor shape F64 -> Tensor shape F64 -> Ref $ Tensor shape F64
f x y = (abs x + pure y) * pure x
```
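As a usage sketch (hypothetical name `useF` and example values, not from the tutorial), we can unwrap each argument from its `Ref` computation with `!` before handing it to `f`:
```idris
useF : Ref $ Tensor [3] F64
useF = f !(tensor [1.0, 2.0, 3.0]) !(tensor [4.0, 5.0, 6.0])
```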
In this example, `pure` produces a `Ref a` from an `a`, as does `abs` (the elementwise absolute value function). Addition `(+)` and multiplication `(*)` produce _and accept_ `Ref` so there is no need to wrap the output of `abs x + pure y` in `pure` before passing it to `(*)`. A rule of thumb is that you only need `pure` if both of these are true
* you're passing a value to an infix operator
* the value is either a function argument or is on the left hand side of `x <- expression` Secondly, care is needed when reusing expressions to make sure you're not recomputation sections of the graph. For example, in
Here, `pure` produces a `Ref (Tensor shape F64)` from a `Tensor shape F64`, as does `abs` (the element-wise absolute value function). Addition `(+)` and multiplication `(*)` produce _and accept_ `Ref (Tensor shape F64)`, so there is no need to wrap the output of `abs x + pure y` in `pure` before passing it to `(*)`. A rule of thumb is that you only need `pure` if both of these are true:

* you're passing a tensor to an infix operator
* the tensor is either a function argument or is on the left hand side of a monadic bind `x <- expression`

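To see the rule of thumb in action, here is a minimal sketch (hypothetical name `g`, values assumed) reusing `f` from above: `f x y` already produces a `Ref`, so it needs no `pure`, whereas the bound tensor `x` does before it meets the infix `+`:
```idris
g : Ref $ Tensor [3] F64
g = do x <- tensor [1.0, 2.0, 3.0]  -- x is a bare Tensor, bound from a Ref
       y <- tensor [4.0, 5.0, 6.0]
       f x y + pure x               -- f x y is already a Ref; only the bound x needs pure
```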
Second, care is needed when reusing expressions to make sure you don't recompute sections of the graph. For example, in
```idris
whoops : Ref $ Tensor [3] S32
whoops = let y = tensor [1, 2, 3]
             z = y + y
          in z * z
```
`z` will be calculated twice, and `y` allocated four times (unless the graph compiler chooses to optimize that out). Instead, we can reuse `z` and `y` with
`z` will be calculated twice, and `y` allocated four times (unless the graph compiler chooses to optimize that out). Instead, we can reuse `y` and `z` with
```idris
ok : Ref $ Tensor [3] S32
ok = do y <- tensor [1, 2, 3]
        z <- (pure y) + (pure y)
        (pure z) * (pure z)
        z <- pure y + pure y
        pure z * pure z
```
Here, `y` and `z` will only be calculated once. This can happen more subtley when reusing values from another scope. For example, in
Here, `y` and `z` will only be calculated once. This problem can occur more subtly when reusing values from another scope. For example, in
```idris
expensive : Ref $ Tensor [] F64
expensive = reduce @{Sum} [0] !(fill {shape = [100000]} 1.0)
@@ -74,4 +74,4 @@ okf e = max !(xf e) !(yf e)
res : Ref $ Tensor [] F64
res = okf !expensive
```
Note we must pass the `Tensor [] F64`, rather than a `Ref (Tensor [] F64)`, if the tensor is to be reused.
Note we must pass the `Tensor [] F64` to `xf`, `yf` and `okf`, rather than a `Ref (Tensor [] F64)`, if the tensor is to be reused.
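As a hedged sketch of this pattern (hypothetical helper `double`, not part of the tutorial), evaluating `expensive` once and passing the bare tensor means its graph is built just once, however often the helper uses its argument:
```idris
double : Tensor [] F64 -> Ref $ Tensor [] F64
double x = pure x + pure x       -- reuses the tensor it was given

resShared : Ref $ Tensor [] F64
resShared = do e <- expensive    -- evaluate the expensive computation once
               double e          -- pass the bare Tensor, not the Ref
```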
