Add RzIL chapter
wargio committed Dec 4, 2024
1 parent e1e226b commit 0ac0873
Showing 3 changed files with 153 additions and 1 deletion.
1 change: 1 addition & 0 deletions _quarto.yml
@@ -78,6 +78,7 @@ book:
chapters:
- src/disassembling/intro.md
- src/disassembling/adding_metadata.md
- src/disassembling/rzil.md
- src/disassembling/esil.md

- part: "Analysis"
2 changes: 1 addition & 1 deletion src/analysis/emulation.md
@@ -8,7 +8,7 @@ to solve even in the most basic way without at least a partial emulation.
Thus, many professional reverse engineering tools use code emulation while
performing an analysis of binary code, and Rizin is no different here.

-For partial emulation (or imprecise full emulation) Rizin uses its own RzIL intermediate language, designed
+For partial emulation (or imprecise full emulation) Rizin uses its own [RzIL intermediate language](../disassembling/rzil.md), designed
to replace current [ESIL](../disassembling/esil.md).

Rizin supports this kind of partial emulation for all platforms that
151 changes: 151 additions & 0 deletions src/disassembling/rzil.md
@@ -0,0 +1,151 @@
# RzIL

RzIL is the new intermediate language in Rizin, primarily intended for representing the semantics of machine code. It is designed as a clone of BAP's [Core Theory](http://binaryanalysisplatform.github.io/bap/api/master/bap-core-theory/Bap_core_theory/), with minor deviations where necessary.

More details about the implementation can be found [here](https://github.com/rizinorg/rizin/blob/dev/doc/rzil.md).

## IL statements

The IL statements for each [opcode](https://github.com/rizinorg/rizin/blob/dev/librz/include/rz_il/rz_il_opcodes.h) can be printed as a string of s-expressions with the `aoi` command (JSON output is available via `aoj`).

The example below uses `aoip` (the *prettified* output) on two PPC instructions (`stwu` and `mflr`).

```bash
[0x10000488]> pd 2
| 0x10000488 stwu r1, -0x10(r1)
| 0x1000048c mflr r0
[0x10000488]> aoi?
Usage: aoi[p] # Print the RzIL of next N instructions
| aoi [<n_instructions>] # Print the RzIL of next N instructions
| aoip [<n_instructions>] # Pretty print the RzIL of next N instructions
[0x10000488]> aoip 2
0x10000488
(seq
  (storew 0
    (+
      (var r1)
      (let v
        (bv 16 0xfff0)
        (ite
          (msb
            (var v))
          (cast 32
            (msb
              (var v))
            (var v))
          (cast 32
            false
            (var v)))))
    (cast 32
      false
      (var r1)))
  (set r1
    (+
      (var r1)
      (let v
        (bv 16 0xfff0)
        (ite
          (msb
            (var v))
          (cast 32
            (msb
              (var v))
            (var v))
          (cast 32
            false
            (var v)))))))
0x1000048c
(set r0
  (cast 32
    false
    (var lr)))
```
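
The `(let v (bv 16 0xfff0) (ite (msb (var v)) ...))` pattern in the `stwu` output reads as a 16-to-32-bit sign extension of the displacement `-0x10`. The Python sketch below reproduces that arithmetic outside of Rizin. The helper names `sign_extend` and `stwu_effective_address` and the sample `r1` value are assumptions made for illustration; they are not part of RzIL or the Rizin API.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen `value` from `from_bits` to `to_bits`, filling with its MSB.

    Mirrors (ite (msb v) (cast N (msb v) v) (cast N false v)):
    if the top bit is set, the new high bits are filled with ones,
    otherwise with zeros.
    """
    value &= (1 << from_bits) - 1
    msb = (value >> (from_bits - 1)) & 1
    if msb:
        fill = ((1 << (to_bits - from_bits)) - 1) << from_bits
        return value | fill
    return value


def stwu_effective_address(r1: int) -> int:
    """Compute r1 + sign_extend(0xfff0), truncated to 32 bits."""
    offset = sign_extend(0xFFF0, 16, 32)
    return (r1 + offset) & 0xFFFFFFFF


# Hypothetical stack pointer value, chosen only for the demonstration.
print(hex(sign_extend(0xFFF0, 16, 32)))          # 0xfffffff0
print(hex(stwu_effective_address(0x7FFF0000)))   # 0x7ffefff0
```

Adding `0xfffffff0` modulo 2^32 is the same as subtracting `0x10`, which matches the `-0x10(r1)` operand shown in the disassembly.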

The same output as `aoi` can be obtained via `rz-asm`, as in the following example.

```bash
$ rz-asm -de -a ppc 7c0802a6
mflr r0
$ rz-asm -Ie -a ppc 7c0802a6
(set r0 (cast 32 false (var lr)))
```

## Emulation

Rizin enables instruction emulation by leveraging RzIL. This process can record interactions between instructions and VM components like registers and memory. The emulation is controlled via the `aez` commands.

```bash
[0x00000000]> aez?
Usage: aez<isv?> # RzIL Emulation
| aezi # Initialize the RzIL Virtual Machine at the current offset
| aezs [<n_times>] # Step N instructions within the RzIL Virtual Machine
| aezse[j] [<n_times>] # Step N instructions within the RzIL VM and output VM changes (read & write)
| aezsu <address> # Step until PC equals given address
| aezsue <address> # Step until PC equals given address and output VM changes (read & write)
| aezv[jqt] [<var_name> [<number>]] # Print or modify the current status of the RzIL Virtual Machine
```
Supported architectures can be listed via the `La` command. If the entry for an architecture has an `I` among its flags, as in the example below, then it supports RzIL.
```bash
_dAeI 32 64 ppc BSD Capstone PowerPC disassembler
```
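
When checking many architectures, the test for the `I` flag can be scripted. The sketch below assumes that the flags are the first whitespace-separated field of each `La` line, as in the single example above; the function name `supports_rzil` and the second sample line are hypothetical.

```python
def supports_rzil(la_line: str) -> bool:
    """Return True if the flags field of an `La` line contains 'I'.

    Assumption: the flags are the first whitespace-separated field,
    as in the `_dAeI` example from this chapter.
    """
    fields = la_line.split()
    return bool(fields) and "I" in fields[0]


ppc_line = "_dAeI 32 64 ppc BSD Capstone PowerPC disassembler"
# Hypothetical entry without the 'I' flag, for contrast only.
other_line = "_dAe_ 32 example BSD Made-up plugin without RzIL"

print(supports_rzil(ppc_line))    # True
print(supports_rzil(other_line))  # False
```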
The following example emulates some instructions in a PowerPC binary.
Here, `r9` contains the base address used to calculate the pointer to the string (stored in `r3`) that is passed to `reloc.printf`.
```bash
[0x1000049c]> pd 3
| 0x1000049c lis r9, 0x1000
| 0x100004a0 addi r3, r9, 0x640
| 0x100004a4 bl reloc.printf
```
First we need to initialize the RzIL Virtual Machine at the current offset using `aezi`.
```bash
[0x1000049c]> aezi?
Usage: aezi # Initialize the RzIL Virtual Machine at the current offset
[0x1000049c]> aezi
```
Then we execute 2 instructions via `aezs` (quiet) or use `aezse` to see the actual changes within the RzIL VM.
```bash
[0x1000049c]> aezse?
Usage: aezse[j] [<n_times>] # Step N instructions within the RzIL VM and output VM changes (read & write)
[0x1000049c]> aezse 2 # execute 2 instructions
pc_write(old: 0x1000049c, new: 0x100004a0)
var_write(name: r9, old: 0x0, new: 0x10000000)
pc_write(old: 0x100004a0, new: 0x100004a4)
var_write(name: r3, old: 0x0, new: 0x10000640)
```
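
The two `var_write` events can be checked by hand: `lis` places its 16-bit immediate in the upper half of the register, and `addi` adds a sign-extended 16-bit immediate to the base. The Python sketch below redoes that arithmetic for a 32-bit target. The function names `lis` and `addi` are illustrative helpers, not Rizin APIs.

```python
MASK32 = 0xFFFFFFFF


def lis(imm16: int) -> int:
    """Load immediate shifted: the immediate goes into the upper 16 bits."""
    return (imm16 << 16) & MASK32


def addi(base: int, simm16: int) -> int:
    """Add a sign-extended 16-bit immediate to `base`, modulo 2^32."""
    if simm16 & 0x8000:
        simm16 -= 0x10000
    return (base + simm16) & MASK32


r9 = lis(0x1000)        # lis  r9, 0x1000
r3 = addi(r9, 0x640)    # addi r3, r9, 0x640

print(hex(r9))  # 0x10000000
print(hex(r3))  # 0x10000640
```

Both results agree with the `new:` values reported by `aezse` for `r9` and `r3`.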
It's possible to see (or modify) the values of the registers in the RzIL VM via `aezv`.
```bash
[0x1000049c]> # We can also print the content of the RzIL VM via 'aezv'
[0x1000049c]> aezv?
Usage: aezv[jqt] [<var_name> [<number>]] # Print or modify the current status of the RzIL Virtual Machine
[0x1000049c]> aezv r3
r3: 0x10000640
[0x1000049c]> aezv r9
r9: 0x10000000
```
Now that we know that the string is situated at `0x10000640`, we can print it.
```bash
[0x1000049c]> # hexdump the content of address 0x10000640 with a buffer size of 0x20 bytes.
[0x1000049c]> px @ 0x10000640 @! 0x20
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x10000640 5369 6d70 6c65 2050 5043 2070 726f 6772 Simple PPC progr
0x10000650 616d 2e00 0000 0000 ffff ffff ffff ffff am..............
[0x1000049c]>
[0x1000049c]> # Decode and print the utf-8 string at address 0x10000640
[0x1000049c]> ps @ 0x10000640
Simple PPC program.
[0x1000049c]>
```
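
As a cross-check of the `px` and `ps` output, the hexdump bytes can be decoded by hand. The sketch below copies the two hex rows from the dump above and reads them as a NUL-terminated UTF-8 string; the variable names are illustrative.

```python
# Hex columns copied from the two `px` rows starting at 0x10000640.
dump_rows = [
    "5369 6d70 6c65 2050 5043 2070 726f 6772",
    "616d 2e00 0000 0000 ffff ffff ffff ffff",
]

raw = bytes.fromhex("".join(dump_rows).replace(" ", ""))

# A C string ends at the first NUL byte; everything after it is ignored.
text = raw.split(b"\x00", 1)[0].decode("utf-8")

print(text)  # Simple PPC program.
```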
