-
Notifications
You must be signed in to change notification settings - Fork 1
/
aasm_manual
393 lines (318 loc) · 12.8 KB
/
aasm_manual
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
AASM Manual
===========
(This is more of a set of notes than a true manual.)
Beta- release!
Required: {binary file, token specifier "mnemonics", source file}
All the source is in a single file. To compile, simply use your
favourite C compiler.
e.g. "gcc -O2 -o aasm aasm.c"
Command line format
~~~~~~~~~~~~~~~~~~~
aasm [-s[d,v][{l, p}], <file>, -l <file>, -h <file>, -e <file>] <sourcefile>
-s dumps the symbol table to the specified file.
the default is to dump in alphabetical order
the -sd option dumps symbols in order of Definition
the -sv option sorts symbols into ascending Value
adding an 'l' will dump the Local labels too
adding a 'p' will dump the literal Pools too
-l dumps the list output to the specified file.
the -ls option will list the symbol table too
the -lk option will list the symbol table and insert a "KMD" identifier
-h dumps ASCII hexadecimal to the specified file.
-e dumps ELF to the specified file.
omitting the filename (or substituting '-') directs to stdout.
(Further options will be added later.)
Output information
~~~~~~~~~~~~~~~~~~
Each pass attempts to define and refine label values. On each pass
information is echoed indicating "Label changes":
"defined" is the number of labels which were defined for the
first time on that pass.
"value changed" is the number of labels whose definition has
been altered on that pass; this is normally due to the object
code changing size as offsets are adjusted.
"read while undefined" indicates references to labels which
have not yet had any definition.
Iteration continues until all these values are zero - at which time a
final pass generates code - or the assembler becomes fed up, which
implies that the source code is not sensible. In this case labels
still undefined are reported.
Errors are reported as they are determined and, if a pass contains
errors, assembly will be aborted at the end of the pass.
Source file input format
~~~~~~~~~~~~~~~~~~~~~~~~
Standard assembly language: optional label followed by a mnemonic -
with addressing mode(s) as required - followed by an optional comment.
Any field may be omitted where sensible.
No restrictions on mnemonic/label formatting (1st column, etc.)
Mnemonics (including directives) are case insensitive.
Labels are case sensitive.
Lines are terminated by <LF> or <EOF>.
A line is truncated to a limit of 255 characters. (Alterable in source file.)
Directives supported are:
INCLUDE <filename> Include file as part of source
GET <filename> Synonym for "INCLUDE"
IMPORT <filename> Include binary file in output
EQU <expr> Equate
EXPORT <label>, <label>, ... Marks labels for export (for expansion)
If <label> = "all", all labels are exported
ARM Assemble as 32-bit ARM code (default)
(Automatically aligns to word boundary)
THUMB Assemble as 16-bit Thumb code
ARCHITECTURE Set instruction set architecture (default "ALL")
{v3, v3M, v4, v4xM, v4T, v4TxM, v5, v5xM, v5T, v5TxM,
v5TE, v5TExP, all, any} (last 2 are synonyms)
ARCH Synonym for ARCHITECTURE
ORIGIN <addr> Origin (defaults to 00000000 at start)
ORG Synonym for ORIGIN
ENTRY Mark code entry point
ALIGN [<expr>[, <fill>]] Increment origin to next multiple of <expr>
Will default to 4-byte alignment
Note: any label will be at the first filled address
if filling the gap, but the aligned address if not.
DEFS <size>[, <fill>] Define space - space uninitialised if <fill>
byte omitted
DEFB <expr>, <expr>, ... Define byte
DEFB "string", ... may mix with above - possible delimiters " ' ` /
Accepts C-style escape codes
{ \0 \" \' \? \\ \a \b \f \n \r \t \v }
DEFH <expr>, <expr>, ... Define 16-bit word (also strings as DEFB)
DEFW <expr>, <expr>, ... Define 32-bit word (also strings as DEFB)
DCB Synonym for DEFB
DCW Synonym for DEFH
DCD Synonym for DEFW
LITERAL Dump any outstanding literal constants
Owing to the automatic alignment and
user invisibility of literals it is
inadvisable to label a LITERAL directive.
LITERALS Synonym for LITERAL
POOL Synonym for LITERAL
LTORG Synonym for LITERAL
RN <reg> Alias a register name.
Note: must precede alias use in assembly
CN <creg> Alias a coprocessor register name.
Note: must precede alias use in assembly
RECORD [<expr>] Start a data record [See section below]
STRUCTURE [<expr>] Synonym for RECORD
STRUCT [<expr>] Synonym for RECORD
ALIAS Identify a zero size data element
BYTE [<expr>] Identify one byte data element(s)
HALFWORD [<expr>] Identify two byte data element(s)
HALF [<expr>] Identify two byte data element(s)
WORD [<expr>] Identify four byte data element(s)
DOUBLE [<expr>] Identify eight byte data element(s)
DOUBLEWORD [<expr>] Identify eight byte data element(s)
REC_ALIGN [<expr>] Align data field modulo <expr> (default 4)
STRUCT_ALIGN [<expr>] Synonym for REC_ALIGN
Note: a value used in (e.g.) DEFB is truncated to the appropriate
least significant bits -without- a warning message being generated.
Label rules
~~~~~~~~~~~
Labels are strings of alphanumerics, beginning with an alphabetic
character. '_' is regarded as alphabetic. They are case sensitive.
A label can be any length, but only the first 31 characters are
stored. (This limit can be changed in the source file.) Longer
labels are still likely to be distinguished by their hashes.
Local labels may be defined as decimal numbers. The same label may be
defined repeatedly, if desired. A local label is referenced using a
'%' prefix, thus:
999 beq %b111
The reference may be specified as "%b##" (back) which searches for the
nearest matching preceding definition in the assembly process, "%f##"
(forward) which searches for the nearest matching succeeding
definition or "%##" which searches backwards then forwards if not
found. Both directions of search begin with the label on the current
line, if any.
Expression rules
~~~~~~~~~~~~~~~~
An operand may be one of:
. current location
<label> label
%<number> local label
<number> decimal number
:<number> binary number ????
0b<number> binary number
@<number> octal number
$<number> hexadecimal number
&<number> hexadecimal number
0x<number> hexadecimal number
Values are 32 bit, two's complement.
Any number may contain an arbitrary number of '_' characters. These
are intended to be available as field separators (e.g. in long binary
numbers) and are ignored by the assembler.
A monadic operator may be one of:
Symbols Operation
+ Plus
- Negative (2's complement)
~ NOT (1's complement)
| log (i.e. the bit number of the MS '1' or -1 for 0)
Only one monadic operator is allowed per element.
A diadic operator may be one of:
Symbols Operation Precedence
<< LSL SHL Left shift 7 (High)
>> LSR SHR Right shift 7
AND Logical AND 6
| OR Logical OR 5
^ EOR XOR Logical XOR 5
* Multiply 4
/ DIV DIV 4
\ MOD MOD 4
+ Add 3
- Subtract 3
= EQ is equal to 2
<> NE != is not equal to 2
> HI is higher than 2
>= HS is higher or same 2
< LO is lower than 2
<= LS is lower or same 2
GT is greater than 2
GE is greater or equal 2
LT is less than 2
LE is less or equal 2 (Low)
Parentheses may be used to force priority.
Note the symbolic comparison operators (">", etc.) treat operands as unsigned.
Comparison operators return FFFFFFFF (-1) for true and 0 for false.
Conditional assembly
~~~~~~~~~~~~~~~~~~~~
Condition directives are as follows:
IF <expr> Begin conditional assembly; following code
produced if <expr> evaluates to non-zero.
ELSE Toggle effect of preceding IF clause.
(In practice `ELSE' may be used repeatedly;
this is not recommended.)
ENDIF End conditional assembly following IF/ELSE.
FI Synonym for ENDIF.
The expression used for the IF must be evaluated correctly on the
first pass, thus any variables used must be defined before it is
encountered.
Conditions may be nested up to a maximun depth (currently 10 levels).
** BUG at present **
These directives are not case sensitive.
Mnemonics
~~~~~~~~~
Standard ARM mnemonics and syntax are supported, with the following additions:
NOP ARM no-operation (MOV R0, R0)
UNDEF Undefined instruction (?C000010)
UNDEFINED as above
ADR Rd, addr Produce an ADD/SUB Rd, PC, ## so that Rd becomes addr
Exactly one instruction is generated.
ADRL Rd, addr Produce a set of operations which sets up Rd
by adding or subtracting from the PC. From
one to four instructions may be generated.
ADRn Rd, addr Produce a set of operations which sets Rd to
addr, using exactly n instructions
LDR/STR Rd, addr PC-relative addressing mode
LDR Rd, =## MOV/MVN Rd, ## if possible, otherwise plant a
PC-relative load to a literal constant. The
literals will be dumped at the end of the file,
or earlier with a LITERAL directive.
LDRH will generate half word literals.
Literals will be aliased to earlier constants
of the same value, where possible.
Literal pools are aligned automatically, as
deemed necessary by the assembler, and
terminate on word alignment.
ADDX<cc><s> Rd, Rn, #nn A number of `extended' immediate operations
which are synthesized from real operations but
allow arbitrary sized immediate fields. From
one to four instructions may be generated;
some substitution (e.g. SUB for ADD) will be
attempted if it shortens the sequence.
Operations supported are {MOVX, ADDX, ADCX,
SUBX, SBCX, RSBX, RSCX, ANDX, BICX, ORRX,
EORX}.
Conditional operation and flag setting is
optional; NOTE that an attempt to set the C or
V FLAG will result in an UNDEFINED state as
predicting where an overflow may occur is not
possible and the flags can only be set on the
last instruction in the sequence.
These sequences may provide a convenient means
of performing some operations but should be
used with caution.
An advantage of this mechanism over "LDR =" is
that the data is fetched from the code stream
rather than as data; thus the constants are
more easily integrated with instruction
caches, etc.
Notes for Thumb:
NOP is MOV R8, R8 because MOV R0, R0 would affect the flags.
Only ADR (and ADR1) are currently supported.
Data records
~~~~~~~~~~~~
These are primarily intended as an easy way to set up data offsets
without planting code. The syntax is as follows:
RECORD
element1 WORD
element2 WORD
element3 BYTE 20
element4 HALFWORD
This will set the following labels (decimal):
element1 0
element2 4
element3 8
element4 28
There is no limit to the number of records. `RECORD' may be followed
by an expression which is used as the base or, as here, sets the origin
to zero by default. Any element may be followed by a multiplier (as
`BYTE' is, above) or default to a single element.
Data sizes of 0, 1, 2, 4 and 8 bytes are supported. "ALIAS" is a
means of defining a zero length element. Note this needs to *precede*
the declaration which reserves the space.
`REC_ALIGN' allows realignment onto word (or other) boundaries as
desired.
The free-format input style allows definitions to be indented
differently, for clarity, if desired.
Register names
~~~~~~~~~~~~~~
Register names are case insensitive. The following register names are
predefined:
Name Name Name Value
r0 a1 0
r1 a2 1
r2 a3 2
r3 a4 3
r4 v1 4
r5 v2 5
r6 v3 6
r7 v4 7
r8 v5 8
r9 v6 sb 9
r10 v7 sl 10
r11 fp 11
r12 ip 12
r13 sp 13
r14 lr 14
r15 pc 15
Other aliases (also case insensitive) may be created with the "RN"
directive, e.g.
acc rn r0
Coprocessor names are treated similarly. Predefined aliases are "C?"
and "CR?" where '?' is a decimal number 0-15. The relevant directive
is "CN".
Status register names are not aliasable.
Coprocessor names
~~~~~~~~~~~~~~~~~
Predefined coprocessor names are "P?" or "CP?" where '?' is a decimal
number 0-15. They are case insensitive. Aliases may be defined with
the "CP" directive.
Non-literal translation
~~~~~~~~~~~~~~~~~~~~~~~
Some instructions will be modified, if possible, if they cannot be
assembled as specified. For example: "ADD R1, R2, #-1" becomes
"SUB R1, R2, #1". There are two such sets of pairs:
ADD/SUB and CMP/CMN Operand negated
AND/BIC, ADC/SBC and MOV/MVN Operand inverted
As mentioned above, instructions of the form "LDR Rd, =###" will be
translated to MOV or MVN if the immediate field will fit in the
instruction.
Caveats
~~~~~~~
If using literal pools (i.e. the LDR R0, =&12345678 syntax) it is
advisable to finish the code file with a "LITERAL" directive.
Although this will be placed automatically this will ensure that any
label(s) -apparently- marking the end of the code block do not exclude
the remaining literal pool.
If setting the flags with the extended immediate operation sequences
(e.g. "ADDXS") note that the carry and (for arithmetic operations)
overflow flags are -undefined-.