docs(DAT): introduce basic DAT file documentation
* docs(DAT): Introduce basic DAT file documentation
* docs(DAT) Small correction to code span definition
* docs(DAT) Add ZenGin Scripts subsection and rename DAT page
* docs(DAT) Use collapsible blocks and mark unused opcodes
* docs(DAT) Use content tabs for InstructionData
* docs(DAT) Link issue related to improper termination of strings
* docs(DAT) Use content tabs for SymbolData as well
* docs(DAT) Use content tabs for SymbolName
PolyMeilex authored Apr 12, 2024
1 parent 08ce19c commit 16267f5
Showing 3 changed files with 310 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/SUMMARY.md
@@ -27,6 +27,9 @@
* [Datatypes](engine/datatypes.md)
* Formats
* [ZenGin Animations](engine/formats/animation.md)
* ZenGin Scripts
* [Script Binaries](engine/formats/script_binaries.md)
* [Daedalus Bytecode](engine/formats/bytecode.md)
* [ZenGin Font](engine/formats/font.md)
* [ZenGin Texture](engine/formats/texture.md)
* [ZenGin Virtual File System](engine/formats/vdf.md)
137 changes: 137 additions & 0 deletions docs/engine/formats/bytecode.md
@@ -0,0 +1,137 @@
# ZenGin Bytecode

Bytecode is read from the [`DatFile.bytecode`](./script_binaries.md#format-description) field.

```c title="Bytecode Buffer"
struct Bytecode {
Instruction instructions[/* Read until no bytes are left in the buffer */];
};
```

## Instruction

```c title=""
struct Instruction {
Opcode opcode;
InstructionData data;
};
```

### Opcode

??? "enum Opcode: uint { ... }"

Unused variants are commented out.
```c title=""
enum Opcode: uint {
zPAR_OP_PLUS = 0,
zPAR_OP_MINUS = 1,
zPAR_OP_MUL = 2,
zPAR_OP_DIV = 3,
zPAR_OP_MOD = 4,
zPAR_OP_OR = 5,
zPAR_OP_AND = 6,
zPAR_OP_LOWER = 7,
zPAR_OP_HIGHER = 8,
zPAR_OP_IS = 9,
zPAR_OP_LOG_OR = 11,
zPAR_OP_LOG_AND = 12,
zPAR_OP_SHIFTL = 13,
zPAR_OP_SHIFTR = 14,
zPAR_OP_LOWER_EQ = 15,
zPAR_OP_EQUAL = 16,
zPAR_OP_NOTEQUAL = 17,
zPAR_OP_HIGHER_EQ = 18,
zPAR_OP_ISPLUS = 19,
zPAR_OP_ISMINUS = 20,
zPAR_OP_ISMUL = 21,
zPAR_OP_ISDIV = 22,
// zPAR_OP_UNARY = 30,
zPAR_OP_UN_PLUS = 30,
zPAR_OP_UN_MINUS = 31,
zPAR_OP_UN_NOT = 32,
zPAR_OP_UN_NEG = 33,
// zPAR_OP_MAX = 33,
// zPAR_TOK_BRACKETON = 40,
// zPAR_TOK_BRACKETOFF = 41,
// zPAR_TOK_SEMIKOLON = 42,
// zPAR_TOK_KOMMA = 43,
// zPAR_TOK_SCHWEIF = 44,
zPAR_TOK_NONE = 45,
// zPAR_TOK_FLOAT = 51,
// zPAR_TOK_VAR = 52,
// zPAR_TOK_OPERATOR = 53,
zPAR_TOK_RET = 60,
zPAR_TOK_CALL = 61,
zPAR_TOK_CALLEXTERN = 62,
// zPAR_TOK_POPINT = 63,
zPAR_TOK_PUSHINT = 64,
zPAR_TOK_PUSHVAR = 65,
// zPAR_TOK_PUSHSTR = 66,
zPAR_TOK_PUSHINST = 67,
// zPAR_TOK_PUSHINDEX = 68,
// zPAR_TOK_POPVAR = 69,
zPAR_TOK_ASSIGNSTR = 70,
zPAR_TOK_ASSIGNSTRP = 71,
zPAR_TOK_ASSIGNFUNC = 72,
zPAR_TOK_ASSIGNFLOAT = 73,
zPAR_TOK_ASSIGNINST = 74,
zPAR_TOK_JUMP = 75,
zPAR_TOK_JUMPF = 76,
zPAR_TOK_SETINSTANCE = 80,
// zPAR_TOK_SKIP = 90,
// zPAR_TOK_LABEL = 91,
// zPAR_TOK_FUNC = 92,
// zPAR_TOK_FUNCEND = 93,
// zPAR_TOK_CLASS = 94,
// zPAR_TOK_CLASSEND = 95,
// zPAR_TOK_INSTANCE = 96,
// zPAR_TOK_INSTANCEEND = 97,
// zPAR_TOK_NEWSTRING = 98,
zPAR_TOK_FLAGARRAY = zPAR_TOK_VAR + 128 // 52 + 128 = 180
};
```

### Instruction Data

The layout of `data` depends on `opcode`. For most opcodes it is empty (zero-sized).

=== "CALL, JUMPF, JUMP"

```c title=""
struct InstructionData {
uint address;
};
```

=== "PUSHINT"

```c title=""
struct InstructionData {
int immediate;
};
```

=== "CALLEXTERN, PUSHVAR, PUSHINST, SETINSTANCE"

```c title=""
struct InstructionData {
uint symbol;
};
```

=== "PUSHVAR + FLAGARRAY"

```c title=""
struct InstructionData {
uint symbol;
byte index;
};
```

=== "any other"

```c title=""
struct InstructionData {};
```
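
The tabs above can be condensed into a single lookup. The following is a minimal sketch, not part of the engine: the name `instruction_data_size` and the `OP_*` constants are hypothetical, and it assumes that `uint` and `int` occupy 4 bytes and `byte` occupies 1 byte, as the Datatype Reference is expected to define them. The opcode values are copied from the `Opcode` enum.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values copied from the Opcode enum (names shortened, hypothetical). */
enum {
    OP_CALL          = 61,
    OP_CALLEXTERN    = 62,
    OP_PUSHINT       = 64,
    OP_PUSHVAR       = 65,
    OP_PUSHINST      = 67,
    OP_JUMP          = 75,
    OP_JUMPF         = 76,
    OP_SETINSTANCE   = 80,
    OP_FLAGARRAY     = 52 + 128,                  /* zPAR_TOK_VAR + 128 = 180 */
    OP_PUSH_ARRAYVAR = OP_PUSHVAR + OP_FLAGARRAY  /* 65 + 180 = 245 */
};

/* Returns the size in bytes of the InstructionData following `opcode`.
 * Assumption: uint/int are 4 bytes wide and byte is 1 byte wide. */
size_t instruction_data_size(uint32_t opcode)
{
    switch (opcode) {
    case OP_CALL:
    case OP_JUMP:
    case OP_JUMPF:        /* uint address */
    case OP_PUSHINT:      /* int immediate */
    case OP_CALLEXTERN:
    case OP_PUSHVAR:
    case OP_PUSHINST:
    case OP_SETINSTANCE:  /* uint symbol */
        return 4;
    case OP_PUSH_ARRAYVAR: /* uint symbol + byte index */
        return 5;
    default:               /* every other opcode carries no data */
        return 0;
    }
}
```

A reader walking the bytecode buffer could call this after reading each opcode to know how many bytes to consume before the next instruction starts.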

170 changes: 170 additions & 0 deletions docs/engine/formats/script_binaries.md
@@ -0,0 +1,170 @@
# ZenGin Script Binaries

!!! abstract inline end "Quick Infos"
**Type:** Script Format<br/>
**Format Name:** DAT<br/>
**File Extension:** `.DAT`<br/>
**Class Name:** `zCParser`<br/>
**Encoding:** [Binary](../encodings/binary.md)<br/>

*ZenGin* DAT files contain the symbols and [bytecode](./bytecode.md) used by the engine's VM.
They are produced by compiling Daedalus scripts.

## Format Description

Compiled scripts are stored in a [binary](../encodings/binary.md) file which contains the following data. Also refer to the
[Datatype Reference](../datatypes.md) for general information about commonly used datatypes.

```c title="DAT Structure"
struct DatFile {
byte version;

uint symbolCount;
uint symbolIds[/* symbolCount */]; // (1)
zCPar_Symbol symbolTable[/* symbolCount */];

uint bytecodeSize;
byte bytecode[/* bytecodeSize */]; // (2)
};
```

1. Symbol IDs sorted lexicographically
2. Read [bytecode](./bytecode.md) page for more info about this data.
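
As an illustration of the layout above, here is a minimal sketch that reads the fixed-size prefix of a DAT buffer and computes where `symbolIds` and `symbolTable` begin. The names `DatHeader`, `dat_read_header` and `read_u32_le` are hypothetical, and the sketch assumes that `uint` is a 4-byte little-endian integer and that the fields are stored without padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit unsigned integer stored least-significant byte first.
 * Assumption: the binary encoding is little-endian. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

typedef struct {
    uint8_t  version;
    uint32_t symbol_count;
    size_t   symbol_ids_offset;   /* byte offset of symbolIds */
    size_t   symbol_table_offset; /* byte offset of symbolTable */
} DatHeader;

/* Returns 0 on success, -1 if the buffer is too short to hold the
 * version, symbolCount and the whole symbolIds array. */
int dat_read_header(const uint8_t *buf, size_t len, DatHeader *out)
{
    if (len < 5)
        return -1;
    out->version = buf[0];
    out->symbol_count = read_u32_le(buf + 1);
    out->symbol_ids_offset = 5;
    /* Each symbol ID is one uint (4 bytes). */
    if ((len - 5) / 4 < out->symbol_count)
        return -1;
    out->symbol_table_offset = 5 + 4 * (size_t)out->symbol_count;
    return 0;
}
```

The symbol table itself cannot be skipped with a fixed stride, because each `zCPar_Symbol` has a variable size. It has to be parsed entry by entry to reach `bytecodeSize`.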

### Symbol

```c title=""
struct zCPar_Symbol {
byte isNamed; // (1)
SymbolName name;

SymbolParameters params;
SymbolCodeSpan span;
SymbolData data;
int parent; // (2)
};
```

1. Whether this symbol has a name
2. Parent type, `-1` means `None`

#### Symbol Name

=== "If isNamed == 1"
```c title=""
struct SymbolName {
string name;
};
```
=== "If isNamed == 0"
```c title=""
struct SymbolName {};
```

#### Symbol Parameters

```c title=""
struct SymbolParameters {
int offset; // (1)
uint count: 12; // (2)
zPAR_TYPE type: 4;
zPAR_FLAG flags: 6;
uint space: 1;
uint reserved: 9;
};

enum zPAR_TYPE: uint {
VOID = 0,
FLOAT = 1,
INT = 2,
STRING = 3,
CLASS = 4,
FUNC = 5,
PROTOTYPE = 6,
INSTANCE = 7,
};

// Parameter bitflags
enum zPAR_FLAG: uint {
CONST = 1,
RETURN = 2,
CLASSVAR = 4,
EXTERNAL = 8,
MERGED = 16
};
```

1. Depending on `type`, this is either an offset (ClassVar), a size (Class) or a return type (Func)
2. Number of sub-items this symbol has
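
The bit-fields following `offset` fit into one 32-bit word (12 + 4 + 6 + 1 + 9 bits). A minimal sketch of unpacking them with shifts and masks is shown here. It assumes the fields are allocated starting from the least significant bit, which C leaves implementation-defined, so treat the bit positions as an assumption to verify against real files. The names `SymbolBits` and `unpack_symbol_bits` are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t count; /* 12 bits */
    uint32_t type;  /* 4 bits, a zPAR_TYPE value */
    uint32_t flags; /* 6 bits, a combination of zPAR_FLAG bits */
    uint32_t space; /* 1 bit */
} SymbolBits;

/* Splits the packed word that follows `offset` into its fields.
 * Assumption: fields are packed starting at the least significant bit. */
SymbolBits unpack_symbol_bits(uint32_t word)
{
    SymbolBits b;
    b.count = word & 0xFFFu;         /* bits 0..11  */
    b.type  = (word >> 12) & 0xFu;   /* bits 12..15 */
    b.flags = (word >> 16) & 0x3Fu;  /* bits 16..21 */
    b.space = (word >> 22) & 0x1u;   /* bit 22      */
    return b;                        /* bits 23..31 are reserved */
}
```

Under that assumption, the word `0x00012003` decodes to `count = 3`, `type = INT` and `flags = CONST`.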

#### Symbol Code Span
This is debug information describing the span of the Daedalus source script that the symbol was compiled from.

```c title=""
struct SymbolCodeSpan {
uint fileIndex: 19;
uint reserved: 13;
uint lineStart: 19;
uint reserved: 13;
uint lineCount: 19;
uint reserved: 13;
uint charStart: 24; // (1)
uint reserved: 8;
uint charCount: 24; // (2)
uint reserved: 8;
};
```

1. Byte offset in the source file at which the span starts
2. Number of bytes in the span starting at `charStart`
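
Each span field shares a 32-bit word with reserved bits, so a reader has to mask those bits away. The sketch below assumes the five words have already been read as `uint` values and that each meaningful field occupies the low bits of its word, which depends on the compiler's bit-field layout and is an assumption here. `CodeSpan` and `unpack_code_span` are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint32_t file_index;
    uint32_t line_start;
    uint32_t line_count;
    uint32_t char_start;
    uint32_t char_count;
} CodeSpan;

/* Masks the reserved high bits off the five words of a SymbolCodeSpan.
 * Assumption: each field sits in the low bits of its 32-bit word. */
CodeSpan unpack_code_span(const uint32_t words[5])
{
    CodeSpan s;
    s.file_index = words[0] & 0x7FFFFu;  /* 19 bits */
    s.line_start = words[1] & 0x7FFFFu;  /* 19 bits */
    s.line_count = words[2] & 0x7FFFFu;  /* 19 bits */
    s.char_start = words[3] & 0xFFFFFFu; /* 24 bits */
    s.char_count = words[4] & 0xFFFFFFu; /* 24 bits */
    return s;
}
```

Masking matters because the reserved bits are not guaranteed to be zero in every file.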

#### Symbol Data

The layout depends on `params.flags`: if the `CLASSVAR` flag is set, this data is always zero-sized.
=== "If `CLASSVAR` flag is not set"
The layout also depends on the symbol's `params.type`.

=== "FLOAT"
```c title=""
struct SymbolData {
float value[/* params.count */];
};
```
=== "INT"
```c title=""
struct SymbolData {
int value[/* params.count */];
};
```
=== "STRING"
```c title=""
struct SymbolData {
string value[/* params.count */];
};
```

!!! warning
Some mods don't terminate these strings correctly. See this [code](https://github.com/GothicKit/ZenKit/commit/0e7e507de92e8da4ec28513e6be56e4043329990)
and the [issue related to it](https://github.com/Try/OpenGothic/issues/483).
=== "CLASS"
```c title=""
struct SymbolData {
int classOffset;
};
```
=== "INSTANCE, FUNC, PROTOTYPE"
```c title=""
struct SymbolData {
int address;
};
```
=== "VOID"
```c title=""
struct SymbolData {};
```

=== "If `CLASSVAR` flag is set"
```c title=""
struct SymbolData {};
```
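
The decision tree above can be summarised in one function. This is a minimal sketch under stated assumptions: `symbol_data_size` and the constant names are hypothetical, `float` and `int` are taken to be 4 bytes each, and strings are reported as variable-length because their size depends on the `string` encoding described in the Datatype Reference.

```c
#include <stdint.h>

/* Values copied from zPAR_TYPE and zPAR_FLAG (names are hypothetical). */
enum { T_VOID = 0, T_FLOAT = 1, T_INT = 2, T_STRING = 3,
       T_CLASS = 4, T_FUNC = 5, T_PROTOTYPE = 6, T_INSTANCE = 7 };
enum { F_CLASSVAR = 4 };

/* Returns the size in bytes of SymbolData, or -1 when the size cannot be
 * known without parsing (a non-empty string array).
 * Assumption: float and int are 4 bytes wide. */
long symbol_data_size(uint32_t type, uint32_t flags, uint32_t count)
{
    if (flags & F_CLASSVAR)
        return 0; /* class variables store no data in the file */

    switch (type) {
    case T_FLOAT:
    case T_INT:
        return 4L * (long)count; /* one value per element */
    case T_STRING:
        return count == 0 ? 0 : -1; /* variable-length strings */
    case T_CLASS:     /* int classOffset */
    case T_FUNC:
    case T_PROTOTYPE:
    case T_INSTANCE:  /* int address */
        return 4;
    default:          /* VOID */
        return 0;
    }
}
```

A parser would still need a dedicated string reader for the `STRING` case, ideally one that tolerates the improperly terminated strings mentioned in the warning.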
