A simple compiled language written for the Diana-II 6-bit computer. This language was written to aid development by providing all basic instructions not natively supported by the architecture.
The following documentation is intended for programmers who are already familiar with other assembly-like languages.
Acknowledgments: The following documentation is strongly inspired by the Solaris x86 Assembly Language Reference Manual.
From crates.io: (Recommended)
cargo install dianac
From source:
git clone https://github.com/5-pebbles/dianac.git
cd dianac
cargo install --path .
Basic help:
~ ❯ dianac --help
An emulator, compiler, and interpreter for the Diana Compiled Language
Usage: dianac <COMMAND>
Commands:
repl Start the interactive emulation REPL
compile Compile a static binary (6-bit bytes are padded with zeros)
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
If there is anything that can be improved, please let me know:
- Issue: GitHub.
- Email: [email protected].
The Diana-II is a 6-bit minimal instruction set computer designed around using NOR as a universal logic gate.
- byte size: 6 bits.
- endianness: little-endian.
- address size: 12 bits (two 6-bit operands, first is higher order).
- unique instructions: 6.
Binary | Instruction | Description |
---|---|---|
00 | NOR [val] [val] | Performs a negated OR on the first operand. |
01 | PC [val] [val] | Sets the program counter to the address [val, val]. |
10 | LOAD [val] [val] | Loads data from the address [val, val] into C. |
11 | STORE [val] [val] | Stores the value in C at the address [val, val]. |
Layout:
Each instruction is 6 bits in the format [XX][YY][ZZ]:
- X: 2-bit instruction identifier.
- Y: 2-bit first operand identifier.
- Z: 2-bit second operand identifier.
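The layout above can be sketched as a pair of pack and unpack helpers. This is a minimal Python illustration, not part of dianac: the names `encode`, `decode`, and the `IMM` key are hypothetical, and the 2-bit operand identifiers are the ones listed in the register table later in this document.

```python
# Illustrative only: helper names are hypothetical, the bit layout is from the tables.
OPCODES = {"NOR": 0b00, "PC": 0b01, "LOAD": 0b10, "STORE": 0b11}
# "IMM" stands for operand identifier 11 (read the next word as a value).
OPERANDS = {"A": 0b00, "B": 0b01, "C": 0b10, "IMM": 0b11}


def encode(instruction, first, second):
    """Pack [XX][YY][ZZ] into a single 6-bit word."""
    return (OPCODES[instruction] << 4) | (OPERANDS[first] << 2) | OPERANDS[second]


def decode(word):
    """Split a 6-bit word into (instruction, first operand, second operand) fields."""
    return (word >> 4) & 0b11, (word >> 2) & 0b11, word & 0b11


print(format(encode("NOR", "A", "B"), "06b"))
print(decode(0b101111))
```

For example, `encode("LOAD", "IMM", "IMM")` yields `0b101111`: the LOAD identifier followed by two "read next word" operands, so the two words after it would hold the address.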
The first operand of NOR can't be an immediate, which frees the four encodings 0011ZZ for additional instructions:
Binary | Instruction | Description |
---|---|---|
001100 | NOP | No operation; used for padding. |
001101 | --- | Reserved for future use. |
001110 | --- | Reserved for future use. |
001111 | HLT | Halts the CPU until the next interrupt. |
Note
Instructions and operands are uppercase because my 6-bit character encoding does not support lowercase...
Binary | Name | Description |
---|---|---|
00 | A | General purpose register. |
01 | B | General purpose register. |
10 | C | General purpose register. |
11 | - | Read next instruction as a value. |
There are a total of 4096 unique addresses, each containing 6 bits.
Address | Description |
---|---|
0x000..=0xEFF | General purpose RAM. |
0xF00..=0xF3D | Reserved for future use. |
0xF3E..=0xF3F | Program Counter (PC) (ROM). |
0xF80..=0xFBF | Left rotate lookup table (ROM). |
0xFC0..=0xFFF | Right rotate lookup table (ROM). |
A program consists of one or more files containing statements. A statement consists of tokens separated by whitespace and terminated by a newline character.
A comment can reside on its own line or be appended to a statement. The comment consists of an octothorpe (#) followed by the text of the comment and a terminating newline character.
A label can be placed before the beginning of a statement. During compilation, the label is assigned the address of the following statement and can be used as a keyword operand.
A label consists of the LAB keyword followed by an identifier. Labels are global in scope and appear in the file's symbol table.
There are 6 classes of tokens:
- Identifiers
- Keywords
- Registers
- Numerical constants
- Character constants
- Operators
An identifier is an arbitrarily long sequence of letters, underscores, and digits. The first character must be a letter or an underscore. Uppercase and lowercase characters are equivalent.
Keywords such as instruction mnemonics and directives are reserved and cannot be used as identifiers. For a list of keywords see the Keyword Tables.
The Diana-II architecture provides three registers [A, B, C]; these are reserved and cannot be used as identifiers. Uppercase and lowercase characters are equivalent.
Numbers in the Diana-II architecture are unsigned 6-bit integers. These can be expressed in several bases:
- Decimal. Decimal integers consist of one or more decimal digits (0–9).
- Binary. Binary integers begin with “0b” or “0B” followed by zero or more binary digits (0, 1).
- Hexadecimal. Hexadecimal integers begin with “0x” or “0X” followed by one or more hexadecimal digits (0–9, A–F). Hexadecimal digits can be either uppercase or lowercase.
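The three notations above can be sketched as a small parser. This is an illustrative Python sketch, not the compiler's actual lexer: the name `parse_number` is hypothetical, and two behaviours are assumptions rather than documented facts: a bare `0b` prefix with no digits is read as 0, and values above 63 are rejected instead of wrapped.

```python
MAX_VALUE = 0b111111  # largest unsigned 6-bit integer


def parse_number(token):
    """Parse a decimal, binary (0b/0B), or hexadecimal (0x/0X) constant."""
    lowered = token.lower()
    if lowered.startswith("0b"):
        digits, base, alphabet = lowered[2:], 2, "01"
        if digits == "":
            return 0  # assumption: "zero or more binary digits" means 0b is 0
    elif lowered.startswith("0x"):
        digits, base, alphabet = lowered[2:], 16, "0123456789abcdef"
    else:
        digits, base, alphabet = lowered, 10, "0123456789"
    if digits == "" or any(ch not in alphabet for ch in digits):
        raise ValueError(f"invalid numeric constant: {token!r}")
    value = int(digits, base)
    if value > MAX_VALUE:
        # assumption: out-of-range constants are an error, not wrapped
        raise ValueError(f"constant does not fit in 6 bits: {token!r}")
    return value


print(parse_number("42"), parse_number("0b101010"), parse_number("0X2a"))
```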
A character constant consists of a supported character enclosed in single quotes ('). A character will be converted to its numeric representation based on the table of supported characters below:
Hex | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | = | - | + | * | / | ^ |
1x | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
2x | Q | R | S | T | U | V | W | X | Y | Z | SPACE | . | , | ' | " | ` |
3x | # | ! | & | ? | ; | : | $ | % | \| | > | < | [ | ] | ( | ) | \ |
If a lowercase character is used, it will be converted to its uppercase representation.
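The character table can be expressed as a 64-character string whose index is the encoded value. The following Python sketch is illustrative only; the names `CHARSET`, `encode_char`, `encode_text`, and `decode_text` are hypothetical and do not come from dianac.

```python
# Index in this string == 6-bit value from the table (row = high hex digit).
CHARSET = (
    "0123456789=-+*/^"   # 0x00..0x0F
    "ABCDEFGHIJKLMNOP"   # 0x10..0x1F
    "QRSTUVWXYZ .,'\"`"  # 0x20..0x2F
    "#!&?;:$%|><[]()\\"  # 0x30..0x3F
)


def encode_char(character):
    """Return the 6-bit value of one character; lowercase is folded to uppercase."""
    if len(character) != 1:
        raise ValueError("expected exactly one character")
    index = CHARSET.find(character.upper())
    if index < 0:
        raise ValueError(f"unsupported character: {character!r}")
    return index


def encode_text(text):
    return [encode_char(character) for character in text]


def decode_text(values):
    return "".join(CHARSET[value] for value in values)


print(encode_text("Hi!"))
```

Decoding always produces uppercase text, because the lowercase form is not representable in the 6-bit encoding.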
The compiler supports the following operators for use in expressions. Operators have no assigned precedence. Expressions can be grouped in parentheses () to establish precedence.
Operator | Description |
---|---|
! | Logical NOT |
& | Logical AND |
\| | Logical OR |
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
>> | Rotate right |
<< | Rotate left |
All operators except Logical NOT require two values and parentheses ():
- (5 + 9 + 3) = 17
- !0b111110 = 0b000001
- (2 + (2 * 5)) = 12
- (2 + 2 * 5) = 20
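The examples above are consistent with strict left-to-right evaluation inside each pair of parentheses, which the following Python sketch reproduces. It is not the compiler's implementation: the name `evaluate` is hypothetical, nested lists stand in for parentheses, the rotate operators are left out, and masking every result to 6 bits is an assumption about overflow behaviour.

```python
import operator

MASK = 0b111111  # assumption: results wrap to 6 bits

BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
    "&": operator.and_,
    "|": operator.or_,
}


def evaluate(expr):
    """Evaluate an int, ["!", operand], or [value, op, value, op, value, ...]."""
    if isinstance(expr, int):
        return expr & MASK
    if expr[0] == "!":
        return ~evaluate(expr[1]) & MASK
    result = evaluate(expr[0])
    # No precedence: fold the operators in the order they appear.
    for index in range(1, len(expr), 2):
        result = BINARY[expr[index]](result, evaluate(expr[index + 1])) & MASK
    return result


print(evaluate([2, "+", [2, "*", 5]]))  # grouped:   2 + (2 * 5)
print(evaluate([2, "+", 2, "*", 5]))    # ungrouped: (2 + 2) * 5
```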
Keywords represent an instruction, set of instructions, or a directive. Operands are entities operated upon by the keyword. Addresses are the locations in memory of specified data.
A keyword can have zero to three operands separated by whitespace characters. For instructions with a source and a destination, this language uses Intel's notation: destination (left-hand) then source (right-hand).
There are 5 types of operands:
- Immediate. A 6-bit constant expression that evaluates to an inline value.
- Register. One of the three 6-bit general-purpose registers provided by the Diana-II architecture.
- Either. An immediate or a register operand.
- Address. A single 12-bit identifier or a pair of whitespace-separated 6-bit either operands.
- Conditional. A pair of square brackets [ ] containing a pair of 6-bit operands separated by whitespace and one of the following comparison operators:
  - == Equal
  - != Not equal
  - > Greater
  - >= Greater or equal
  - < Less
  - <= Less or equal
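To make the conditional operand concrete, here is a small Python sketch that evaluates one against a snapshot of register values. It only models the meaning of the comparison, not the code the compiler generates. The name `evaluate_conditional` is hypothetical, and the infix token order `[left operator right]` is an assumption.

```python
import operator

COMPARISONS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate_conditional(text, registers):
    """Evaluate e.g. "[A >= 0x10]" given registers such as {"A": 20, "B": 0, "C": 0}."""
    stripped = text.strip()
    if not (stripped.startswith("[") and stripped.endswith("]")):
        raise ValueError("a conditional must be wrapped in square brackets")
    left, symbol, right = stripped[1:-1].split()  # assumption: infix order

    def value(operand):
        name = operand.upper()  # register names are case-insensitive
        if name in registers:
            return registers[name]
        return int(operand, 0) & 0b111111  # unsigned 6-bit immediate

    return COMPARISONS[symbol](value(left), value(right))


print(evaluate_conditional("[A >= 0x10]", {"A": 20, "B": 20, "C": 5}))
```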
The Diana-II architecture uses 12-bit addressing. Labels can be split into two 6-bit immediate values by appending a colon followed by a 1 or 0. If a keyword requires an address it can be provided as two 6-bit values or a single 12-bit identifier:
LOD MAIN = LOD MAIN:0 MAIN:1.
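The split can be illustrated with a toy symbol table. In this Python sketch the names `SYMBOLS`, `resolve_half`, and `expand_address` are hypothetical, and the address `0x0C5` for MAIN is an arbitrary example. It also assumes that `:0` selects the high-order half and `:1` the low-order half, which follows from the first address operand being the higher order one but is not stated explicitly.

```python
# Hypothetical symbol table: MAIN was assigned the 12-bit address 0x0C5.
SYMBOLS = {"MAIN": 0x0C5}


def resolve_half(operand, symbols):
    """Resolve "LABEL:0" or "LABEL:1" to a 6-bit immediate."""
    label, _, half = operand.partition(":")
    address = symbols[label.upper()]  # identifiers are case-insensitive
    if half == "0":
        return (address >> 6) & 0b111111  # assumption: :0 is the high-order half
    if half == "1":
        return address & 0b111111         # assumption: :1 is the low-order half
    raise ValueError(f"expected ':0' or ':1' in {operand!r}")


def expand_address(label, symbols):
    """Expand a bare 12-bit label into its two 6-bit operands."""
    return [resolve_half(f"{label}:0", symbols), resolve_half(f"{label}:1", symbols)]


print(expand_address("MAIN", SYMBOLS))
```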
Any side effects are listed in the notes of a keyword; read each carefully. If a keyword clobbers an unrelated register, it will select the first available in reverse alphabetical order, e.g.
- XNOR C 0x27 will clobber B
- XNOR A 0x27 will clobber C
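The selection rule above can be written as a short search over C, B, A. This Python sketch is illustrative; the name `clobbered_register` is hypothetical, and it assumes that "available" means "not used as an operand of the keyword".

```python
def clobbered_register(operands):
    """Return the first register in reverse alphabetical order not used as an operand."""
    used = {operand.upper() for operand in operands}
    for register in ("C", "B", "A"):
        if register not in used:
            return register
    return None  # every register is already an operand


print(clobbered_register(["C", "0x27"]))
print(clobbered_register(["A", "0x27"]))
```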
Operands will be displayed in square brackets [ ] using the following shorthand:
- [reg] = register
- [imm] = immediate
- [eth] = either
- [add] = address
- [con] = conditional
Keyword | Description | Notes |
---|---|---|
NOT [reg] | bitwise logical NOT | - |
AND [reg] [eth] | bitwise logical AND | The second register is flipped; its value can be restored with a NOT operation. If an immediate value is used, it is flipped at compile time. |
NAND [reg] [eth] | bitwise logical NAND | The second register is flipped; its value can be restored with a NOT operation. If an immediate value is used, it is flipped at compile time. |
OR [reg] [eth] | bitwise logical OR | - |
NOR [reg] [eth] | bitwise logical NOR | - |
XOR [reg] [eth] | bitwise logical XOR | An extra register will be clobbered; this is true even if an immediate value is used. |
NXOR [reg] [eth] | bitwise logical NXOR | An extra register will be clobbered; this is true even if an immediate value is used. |
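Since NOR is the only native logic instruction, each keyword in this table has to reduce to NOR operations. The Python sketch below shows standard identities that achieve this on 6-bit values. It is not necessarily the sequence the compiler emits, and the function names are hypothetical. The AND identity needs both inputs inverted first, which is consistent with the note that the second register ends up flipped.

```python
MASK = 0b111111


def nor(a, b):
    """The one native logic operation, restricted to 6 bits."""
    return ~(a | b) & MASK


def not_(a):
    return nor(a, a)


def or_(a, b):
    return not_(nor(a, b))


def and_(a, b):
    return nor(not_(a), not_(b))  # De Morgan: a AND b == NOR(NOT a, NOT b)


def nand(a, b):
    return not_(and_(a, b))


def xor(a, b):
    return nor(and_(a, b), nor(a, b))  # false where both bits agree


def nxor(a, b):
    return not_(xor(a, b))


print(format(xor(0b110011, 0b101010), "06b"))
```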
The following keywords simply load the corresponding address from the left and right rotate lookup tables.
Keyword | Description | Notes |
---|---|---|
ROL [eth] | rotate left storing the value in C | - |
ROR [eth] | rotate right storing the value in C | - |
SHL [eth] | shift left storing the value in C | - |
SHR [eth] | shift right storing the value in C | - |
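A lookup-table rotate can be sketched as follows. This Python example rests on assumptions that the text does not confirm: that a rotate moves the value by one bit, and that each 64-entry table is indexed by the value being rotated, so the table base plus the value gives the address to load. The names are hypothetical, and the shift keywords are not modelled.

```python
MASK = 0b111111
LEFT_TABLE_BASE = 0xF80   # start of the left rotate lookup table
RIGHT_TABLE_BASE = 0xFC0  # start of the right rotate lookup table


def rotate_left(value):
    """Rotate a 6-bit value left by one bit (assumed rotate distance)."""
    return ((value << 1) | (value >> 5)) & MASK


def rotate_right(value):
    """Rotate a 6-bit value right by one bit (assumed rotate distance)."""
    return ((value >> 1) | (value << 5)) & MASK


def build_table(rotate):
    """Contents of a 64-entry ROM table, one entry per possible input value."""
    return [rotate(value) for value in range(64)]


def rol_address(value):
    """Address whose contents would be the left rotation of `value` (assumed indexing)."""
    return LEFT_TABLE_BASE + value


print(format(rotate_left(0b100001), "06b"), hex(rol_address(0b100001)))
```

Under these assumptions a rotate costs a single LOAD, with the table's high-order address operand fixed and the value supplying the low-order operand.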
Keyword | Description | Notes |
---|---|---|
ADD [reg] [eth] | add | All registers will be clobbered; this is true even if an immediate value is used. |
SUB [reg] [eth] | subtract | All registers will be clobbered; this is true even if an immediate value is used. |
Keyword | Description | Notes |
---|---|---|
SET [imm] | compiles to raw value [imm] | - |
MOV [reg] [eth] | copy from second operand to first | - |
LOD [add] | load data from [add] into C | - |
STO [add] | stores data in C at [add] | - |
Keyword | Description | Notes |
---|---|---|
PC [add] | set program counter to [add] | - |
LAB [idn] | define a label pointing to the next statement | - |
LIH [con] [add] | conditional jump if true | All registers will be clobbered. LIH stands for "logic is hard". |
Keyword | Description | Notes |
---|---|---|
NOP | No operation; used for padding | - |
HLT | halts the CPU until the next interrupt | - |