Booting
While not an absolute necessity, a bootloader such as UBoot is nice to have, mostly so that we can load binaries to test over serial or Ethernet instead of having to remove the SD card, write files, and re-insert the card every time we update the code. It may also be possible to add one of these transfers as a target in our Makefile for added convenience.
FreeBSD developer Oleksandr Tymoshenko (blog) has been working on porting UBoot to the Pi. The current binary in use is here. The instructions to install it are as follows:
- Download Raspbian
- Write the image to the SD card: dd if=<raspbian-image> of=<sd-card-dev> bs=2M
- Extract the UBoot .tar.gz into the boot partition
- Connect serial device/monitor and boot the Pi
To load Raspbian:
- fatload mmc 0:1 0x00200000 kernel.img (loads the Raspbian kernel)
- bootz (boots zImage binaries)
To [possibly] load Xinu:
fatload mmc 0:1 0x00000000 xinu.bin
go 0
Currently, the xinu.bin executable will boot directly from the SD card. Unfortunately, something goes wrong before the serial port is initialized, so nothing is printed so far. The problem could be as simple as changing the base addresses of various hardware registers in the source code or configuration files.
Fellow GitHub-er David Welch has been developing on the Pi "bare metal"-style; his notes are available on his project page.
Brain Dump, beware of possible inaccuracies and sudden changes
The ARM effectively boots from 0x8000; kernel.img is a raw binary that's loaded at this address.
XINU currently (1/22/2013) wants to run starting at 0x10000. From there, it branches to the address stored at 0x10020, which is 0x10044, the Reset_Handler routine (this is confirmed via arm-...-objdump -D on xinu.elf). UBoot also booted by jumping to this area (Starting application at 0x00010044 ...), so I think the Pi does, in fact, use this address as the reset vector.
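The hop through 0x10020 matches the usual ARM vector-table idiom, where the slot at the load address holds ldr pc, [pc, #0x18] and the handler address sits in a literal 0x20 bytes further on. That start.S uses exactly this instruction is an assumption here (the objdump output is not reproduced on this page), and the helper name ldr_pc_source is made up for illustration. A minimal C sketch of the address arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* In ARM state, reading the PC yields the address of the current
 * instruction plus 8, so "ldr pc, [pc, #imm]" fetches its branch
 * target from instr_addr + 8 + imm. */
static uint32_t ldr_pc_source(uint32_t instr_addr, uint32_t insn)
{
    /* Only the "ldr pc, [pc, #+imm12]" form (0xE59FFxxx) is handled. */
    assert((insn & 0xFFFFF000u) == 0xE59FF000u);
    return instr_addr + 8u + (insn & 0xFFFu);
}
```

With the assumed encoding 0xE59FF018 placed at 0x10000, ldr_pc_source(0x10000, 0xE59FF018) gives 0x10020, the address that the notes above say holds the pointer to Reset_Handler.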
Sadly, we can't just .org 0x8000 then b 0x10020 in start.S because we're already at 0x10000 by the time we get to assembling it. Changing compile/platforms/arm-qemu/ld.script allows us to set the initial address to 0x8000. The behavior of .org is a bit odd in that it doesn't seem to set the current code address to an absolute value; rather, it seems to add to it (in GNU as, the .org offset is relative to the start of the section, which would explain this). Using this, we can insert a small bit of code to keep the start.S code where it normally lives at 0x10000 and still boot at 0x8000:
/* the GPU starts the CPU at 0x8000 */
b Reset_Handler
.org 0x8000
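One way to check that the stub really ended up at the front of kernel.img is to work out what the first instruction word should be and compare it with a hexdump. The sketch below encodes an unconditional ARM branch; encode_arm_b is a hypothetical helper name, and the target 0x10044 is simply the Reset_Handler address quoted earlier, which may move as the code changes.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an unconditional ARM "b target" located at address from.
 * The low 24 bits hold a signed word offset relative to from + 8. */
static uint32_t encode_arm_b(uint32_t from, uint32_t target)
{
    int32_t offset = (int32_t)(target - (from + 8u));
    assert((offset & 3) == 0);                           /* word aligned */
    assert(offset >= -(1 << 25) && offset < (1 << 25));  /* reachable range */
    return 0xEA000000u | (((uint32_t)offset >> 2) & 0x00FFFFFFu);
}
```

encode_arm_b(0x8000, 0x10044) yields 0xEA00200F, so kernel.img would be expected to begin with the bytes 0F 20 00 EA (little-endian) if Reset_Handler still sits at 0x10044.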
Currently the Pi does boot our Xinu code from the SD card, but it is not able to get far enough to send anything over the UART to tell us that it booted. We know it does at least start executing because the following ARM code will light up the "ACT" ("OK" on some cases) LED:
//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
/* turn on an LED */
ldr r0, GPFSEL1 //allocate registers poorly
ldr r1, GPSET0
ldr r2, GPCLR0
ldr r3, MASK0
ldr r4, MASK1
ldr r5, MASK2
ldr r7, LOOPCT
ldr r6, [r0, #0] //GPFSEL1 &= ~(7<<18)
and r6, r6, r3
str r6, [r0, #0]
ldr r6, [r0, #0] //GPFSEL1 |= (1<<18)
orr r6, r6, r4
str r6, [r0, #0]
loop:
ldr r6, [r2, #0] //GPCLR0 |= (1<<16)
orr r6, r6, r5
str r6, [r2, #0]
ldr r6, [r1, #0] //GPSET0 |= (1<<16)
orr r6, r6, r5
str r6, [r1, #0]
subs r7, r7, #1 //loop until r7 is zero
bne loop
b done
.align 2
GPFSEL1:
.word 0x20200004
GPSET0:
.word 0x2020001C
GPCLR0:
.word 0x20200028
MASK0:
.word 0xFFE3FFFF //~(7<<18)
MASK1:
.word 0x00040000 //(1<<18)
MASK2:
.word 0x00010000 //(1<<16)
LOOPCT:
.word 0x00FFFFFF //big
done:
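The magic numbers in MASK0 and MASK1 come from the layout of the GPIO function select registers: each GPFSELn register covers ten pins with a 3-bit field per pin, so GPIO16 lives in GPFSEL1 at bit 18. A small C sketch of that arithmetic follows; the helper names are hypothetical and the code only computes values, it does not touch hardware.

```c
#include <assert.h>
#include <stdint.h>

#define GPIO_BASE 0x20200000u   /* physical GPIO base; GPFSEL1 above is base + 4 */

/* Address of the GPFSELn register that holds the field for a pin. */
static uint32_t gpfsel_addr(unsigned pin)
{
    assert(pin < 54);           /* the BCM2835 has GPIO 0..53 */
    return GPIO_BASE + 4u * (pin / 10u);
}

/* Bit position of the pin's 3-bit function field inside that register. */
static unsigned gpfsel_shift(unsigned pin)
{
    return 3u * (pin % 10u);
}

/* New register value with the pin's field replaced by func (0..7). */
static uint32_t gpfsel_update(uint32_t old, unsigned pin, unsigned func)
{
    unsigned shift = gpfsel_shift(pin);
    assert(func < 8);
    return (old & ~(7u << shift)) | ((uint32_t)func << shift);
}
```

For pin 16 with function 1 (output) this reproduces the constants used in the assembly: the clear mask ~(7<<18) is 0xFFE3FFFF and the set mask is 0x00040000.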
Important register base locations with (bus addresses) and datasheet page numbers:
Peripheral address space: 0x20000000 - 0x20FFFFFF (0x7E000000 - 0x7EFFFFFF) (pg 6)
DMA engines use bus addresses (pg 6)
GPIO base 0x20200000 (0x7E200000) see pg 90
Interrupt 0x2000B000 (0x7E00B000) see pg 112
0x2000B200 IRQ basic pending
0x2000B204 IRQ pending 1
0x2000B208 IRQ pending 2
0x2000B20C FIQ control
0x2000B210 Enable IRQs 1
0x2000B214 Enable IRQs 2
0x2000B218 Enable Basic IRQs
0x2000B21C Disable IRQs 1
0x2000B220 Disable IRQs 2
0x2000B224 Disable Basic IRQs
PL011 (16650 variant) UART 0x20201000 (0x7E201000) see pg 175
0x20201000 DR - Data Register
0x20201004 RSRECR
0x20201018 FR - Flag register
0x20201020 ILPR - not in use
0x20201024 IBRD - Integer Baud rate divisor
0x20201028 FBRD - Fractional Baud rate divisor
0x2020102C LCRH - Line Control register
0x20201030 CR - Control register
0x20201034 IFLS - Interrupt FIFO Level Select Register
0x20201038 IMSC - Interrupt Mask Set Clear Register
0x2020103C RIS - Raw Interrupt Status Register
0x20201040 MIS - Masked Interrupt Status Register
0x20201044 ICR - Interrupt Clear Register
0x20201048 DMACR - DMA Control Register
0x20201080 ITCR - Test Control register
0x20201084 ITIP - Integration test input reg
0x20201088 ITOP - Integration test output reg
0x2020108C TDR - Test Data reg
SP804 variant Timer 0x2000B000 (0x7E00B000) see pg 196
0x2000B400 Load
0x2000B404 Value (Read Only)
0x2000B408 Control
0x2000B40C IRQ Clear/Ack (Write Only)
0x2000B410 RAW IRQ (Read Only)
0x2000B414 Masked IRQ (Read Only)
0x2000B418 Reload
0x2000B41C Pre-Divider (Not in real 804!)
0x2000B420 Free running counter (Not in real 804!)
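As a sketch of how the Enable IRQs 1 and Enable IRQs 2 registers listed above might be used: the two registers together cover 64 shared interrupt sources, 32 per register, and writing a 1 bit enables the matching source. The struct and function names below are hypothetical, and the code only computes the register address and mask rather than writing to hardware.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_ENABLE_1 0x2000B210u   /* sources 0..31  */
#define IRQ_ENABLE_2 0x2000B214u   /* sources 32..63 */

struct irq_slot {
    uint32_t reg;    /* physical address of the enable register */
    uint32_t mask;   /* bit to write to enable the source */
};

/* Map a shared interrupt number (0..63) to its enable register and bit. */
static struct irq_slot irq_enable_slot(unsigned irq)
{
    struct irq_slot s;
    assert(irq < 64);
    s.reg  = (irq < 32) ? IRQ_ENABLE_1 : IRQ_ENABLE_2;
    s.mask = 1u << (irq % 32u);
    return s;
}
```

For example, source 57 (listed as the PL011 UART interrupt in the datasheet's interrupt table, which is worth double-checking against the errata) maps to Enable IRQs 2 with mask 0x02000000.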
Also take a look at the rather long errata page about the datasheet.
This is an example UART initialization routine and simple test (send a '?') for the mini UART (stripped down pseudo-clone of a 16550):
//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
/* initialize uart (poorly) at 115200 baud */
//AUX_ENABLES = 1
ldr r0, AUX_ENABLES
ldr r1, ONE
str r1, [r0, #0]
//AUX_MU_IER_REG = 0
ldr r0, AUX_MU_IER_REG
ldr r1, ZERO
str r1, [r0, #0]
//AUX_MU_CNTL_REG = 0
ldr r0, AUX_MU_CNTL_REG
ldr r1, ZERO
str r1, [r0, #0]
//AUX_MU_LCR_REG = 3
ldr r0, AUX_MU_LCR_REG
ldr r1, THREE
str r1, [r0, #0]
//AUX_MU_MCR_REG = 0
ldr r0, AUX_MU_MCR_REG
ldr r1, ZERO
str r1, [r0, #0]
//AUX_MU_IER_REG = 0
ldr r0, AUX_MU_IER_REG
ldr r1, ZERO
str r1, [r0, #0]
//AUX_MU_IIR_REG = 0xC6
ldr r0, AUX_MU_IIR_REG
ldr r1, CSIX
str r1, [r0, #0]
//AUX_MU_BAUD_REG = 270
ldr r0, AUX_MU_BAUD_REG
ldr r1, TWOSEVENTY
str r1, [r0, #0]
//GPFSEL1 &= ~(7<<12)
ldr r0, GPFSEL1
ldr r1, [r0, #0]
ldr r2, GPFSEL1_MASK1
and r1, r1, r2
str r1, [r0, #0]
//GPFSEL1 |= (2<<12)
ldr r0, GPFSEL1
ldr r1, [r0, #0]
ldr r2, GPFSEL1_MASK2
orr r1, r1, r2
str r1, [r0, #0]
//GPPUD = 0
ldr r0, GPPUD
ldr r1, ZERO
str r1, [r0, #0]
//for(ra=0;ra<150;ra++) dummy(ra);
ldr r0, ONEFIFTY
wait1: bl dummy
subs r0, r0, #1
bne wait1
//GPPUDCLK0 = (1<<14)
ldr r0, GPPUDCLK0
ldr r1, GPPUDCLK0_MASK1
str r1, [r0, #0]
//for(ra=0;ra<150;ra++) dummy(ra);
ldr r0, ONEFIFTY
wait2: bl dummy
subs r0, r0, #1
bne wait2
//GPPUDCLK0 = 0
ldr r0, GPPUDCLK0
ldr r1, ZERO
str r1, [r0, #0]
//AUX_MU_CNTL_REG = 2
ldr r0, AUX_MU_CNTL_REG
ldr r1, TWO
str r1, [r0, #0]
//AUX_MU_IO_REG = '?'
ldr r0, AUX_MU_IO_REG
ldr r1, TESTCHAR
str r1, [r0, #0]
b done
dummy:
bx lr
.align 2
GPFSEL1:
.word 0x20200004
GPSET0:
.word 0x2020001C
GPCLR0:
.word 0x20200028
GPPUD:
.word 0x20200094
ONEFIFTY:
.word 0x00000096 //150
GPPUDCLK0:
.word 0x20200098
AUX_ENABLES:
.word 0x20215004
AUX_MU_IO_REG:
.word 0x20215040
AUX_MU_IER_REG:
.word 0x20215044
AUX_MU_IIR_REG:
.word 0x20215048
AUX_MU_LCR_REG:
.word 0x2021504C
AUX_MU_MCR_REG:
.word 0x20215050
AUX_MU_LSR_REG:
.word 0x20215054
AUX_MU_MSR_REG:
.word 0x20215058
AUX_MU_SCRATCH:
.word 0x2021505C
AUX_MU_CNTL_REG:
.word 0x20215060
AUX_MU_STAT_REG:
.word 0x20215064
AUX_MU_BAUD_REG:
.word 0x20215068
ONE:
.word 0x00000001
ZERO:
.word 0x00000000
THREE:
.word 0x00000003
CSIX:
.word 0x000000C6
TWOSEVENTY:
.word 0x0000010E //270
GPFSEL1_MASK1:
.word 0xFFFF8FFF //~(7<<12)
GPFSEL1_MASK2:
.word 0x00002000 //(2<<12)
GPPUDCLK0_MASK1:
.word 0x00004000 //(1<<14)
TWO:
.word 0x00000002
AUXMULSR_MASK1:
.word 0x00000020
TESTCHAR:
.word 0x0000003F //'?'
done:
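The value 270 written to AUX_MU_BAUD_REG follows from the mini UART baud formula in the datasheet, baudrate = system_clock / (8 * (baud_reg + 1)). The sketch below assumes the default 250 MHz core clock; if the firmware is configured for a different core frequency the divisor changes accordingly. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Divisor register value for a requested baud rate (rounds down). */
static uint32_t muart_baud_reg(uint32_t sys_clock_hz, uint32_t baud)
{
    assert(baud != 0 && sys_clock_hz >= 8u * baud);
    return sys_clock_hz / (8u * baud) - 1u;
}

/* Baud rate actually produced by a given divisor register value. */
static uint32_t muart_actual_baud(uint32_t sys_clock_hz, uint32_t baud_reg)
{
    return sys_clock_hz / (8u * (baud_reg + 1u));
}
```

With a 250 MHz clock, muart_baud_reg(250000000, 115200) gives 270, and the resulting line rate is about 115313 baud, roughly 0.1% above the nominal 115200.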
The next step is to make sure that Xinu is Doing the Right Thing and initializing the UART and GPIO lines properly. It's worth noting that David Welch says in a README that accompanies the UART initialization code:
The documentation for the chip has some glaring errors. The IER and IIR register descriptions are screwy. The one that killed me was the word length. The document says that the single bit controls 7 bits per word vs 8 bits per word. And that bit 1 and some above do not do anything, they might on real 16550's but not here. Well that is wrong. If bits 1:0 are 00 you get 7 bits if bits 1:0 are 01 you get 7 bits. You need bit 1 set to get 8 bits.
Boot process
- GPU starts up, mounts FAT32 partition on SD card
- GPU reads, runs bootcode.bin and start.elf from the SD card
- GPU reads kernel.img from the SD card, writes it to 0x00008000 in RAM
- GPU starts the CPU executing at 0x00008000
- loader/platforms/arm-qemu/start.S: sets up the system a bit and does a b _startup to run the C code
- system/platforms/arm-qemu/stubs.c: _startup() calls setup_pins() and nulluser()
- system/initialize.c contains nulluser()
Using the Mini-UART, it's possible to hack the kputc and kgetc routines to allow for simple serial debugging. The code for this is:
//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define uint32_t unsigned int
#define int32_t int
//for I/O devices when using the physical address space (0x20000000 - 0x20FFFFFF)
#define IO_PHY_ADDR(x) (((x)&0x00FFFFFF) | 0x20000000)
//for I/O devices when using the bus address space (0x7E000000 - 0x7EFFFFFF)
#define IO_BUS_ADDR(x) (((x)&0x00FFFFFF) | 0x7E000000)
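//for using I/O devices just like a global variable
//(volatile so the compiler does not cache or drop register accesses, e.g. in the polling loops)
#define __IO(x) (*((volatile uint32_t*)(IO_PHY_ADDR(x))))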
//mini uart registers
#define AUX_ENABLES __IO(0x20215004)
#define AUX_MU_IO_REG __IO(0x20215040)
#define AUX_MU_IER_REG __IO(0x20215044)
#define AUX_MU_IIR_REG __IO(0x20215048)
#define AUX_MU_LCR_REG __IO(0x2021504C)
#define AUX_MU_MCR_REG __IO(0x20215050)
#define AUX_MU_LSR_REG __IO(0x20215054)
#define AUX_MU_MSR_REG __IO(0x20215058)
#define AUX_MU_SCRATCH __IO(0x2021505C)
#define AUX_MU_CNTL_REG __IO(0x20215060)
#define AUX_MU_STAT_REG __IO(0x20215064)
#define AUX_MU_BAUD_REG __IO(0x20215068)
//GPIO registers
#define GPFSEL1 __IO(0x20200004)
#define GPSET0 __IO(0x2020001C)
#define GPCLR0 __IO(0x20200028)
//GPIO pull-up/down control registers
#define GPPUD __IO(0x20200094)
#define GPPUDCLK0 __IO(0x20200098)
//initialize the mini UART to 115200 baud
void muart_init(void){
volatile int i; //volatile so the empty delay loops below are not optimized away
//initialize UART
AUX_ENABLES = 1;
AUX_MU_IER_REG = 0; //No interrupts (ERBFI, ETBEI, ELSI, EDSSI)
AUX_MU_CNTL_REG = 0; //pg16
AUX_MU_LCR_REG = 3; //not using the divisor, no break, no stick parity, even parity, no parity bit, 8 bits, 1 stop bit
AUX_MU_MCR_REG = 0; //diagnostics off, OUT2 off, OUT1 off, RTS off, DTR off
AUX_MU_IER_REG = 0; //disable interrupts again?
AUX_MU_IIR_REG = 0xC6;
AUX_MU_BAUD_REG = 270; //115200 baud
//initialize GPIO to let the UART TX through
GPFSEL1 &= ~(7<<12);
GPFSEL1 |= (2<<12);
//initialize GPIO to let the UART RX through
GPFSEL1 &= ~(7<<15);
GPFSEL1 |= (2<<15);
//disable the pull-up/down on GPIO14 (write GPPUD, wait, clock it in via GPPUDCLK0, wait, release)
GPPUD = 0;
for(i=0; i<150; i++){}
GPPUDCLK0 = (1<<14);
for(i=0; i<150; i++){}
GPPUDCLK0 = 0;
//final UART initialization
AUX_MU_CNTL_REG = 3; //3 for TX/RX, 2 for just TX
}
void muart_putc(char c){
//wait for transmission to finish
while((AUX_MU_LSR_REG & 0x20) == 0){}
//send char
AUX_MU_IO_REG = c;
}
char muart_getc(void){
char result;
//wait for a character to come in
while((AUX_MU_LSR_REG & 0x01) == 0){}
//get char
result = (AUX_MU_IO_REG & 0xFF);
return result;
}
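One way the kputc/kgetc hack mentioned above might look is a pair of thin wrappers that forward to the mini UART routines. The sketch below uses host-side stand-ins for muart_putc() and muart_getc() so that it can be compiled and tried off-target; the wrapper names and their simplified signatures are hypothetical, and the real kputc/kgetc in the tree may take other arguments (for example a device pointer).

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-ins: on the Pi these would be the routines above. */
static char        tx_log[64];     /* characters "sent" so far */
static size_t      tx_len;
static const char *rx_feed = "";   /* characters waiting to be "received" */

static void muart_putc(char c)
{
    assert(tx_len < sizeof tx_log);
    tx_log[tx_len++] = c;
}

static char muart_getc(void)
{
    return *rx_feed ? *rx_feed++ : '\0';
}

/* Simplified wrappers: forward one character in each direction. */
static int kputc_sketch(int c)
{
    muart_putc((char)c);
    return c;
}

static int kgetc_sketch(void)
{
    return (unsigned char)muart_getc();
}
```

Keeping the wrappers this thin means the polling behavior stays entirely inside muart_putc() and muart_getc(), so switching to the real UART later should only require changing what the wrappers call.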
By replacing the guts of kputc and kgetc with calls to these functions, and converting all getc and putc calls to kgetc and kputc calls, the test routines can be run. This is the status of the test routines:
- c - clockTest - fail (says "clkticks: 0" and "clktime: 0" then dies)
- a - ksimpleterminal - success (call to test function is commented out, but we make it to "All user processes have completed." just fine)
- b - semtest - success?
- g - timesliceThreadTest - success?
- i - irTest (LED test?) - fail (nothing happens, will need to rewrite LED test)
- j - uartTest - fail (nothing happens, will need to get the real UART working)
- k - interruptTest - fail (says "Interrupt mask: 10010" and "Disabled. Interrupt mask saved: 10010, actual: 10" and "lpc_vic->vect_ctrls[irq]: 0x2c" and "GPIO0 interrupts: 0" then dies repeating the last line every few seconds)
- m - findMemory - success?
- l - turn on LED - not implemented
- o - turn off LED - not implemented
- L - turn on back LED - not applicable
- O - turn off back LED - not applicable
- w - kprintf - success (will need to regression test this when real serial is implemented)
- t - timerTest - fail (will likely need to change timer register addresses, possibly code as well)
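Since timerTest will likely need new register addresses, one possible starting point is a struct overlay of the ARM timer block whose field offsets mirror the timer register list earlier on this page. This is only a sketch: the struct, field, and function names are hypothetical, and nothing here has been run against the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARM_TIMER_BASE 0x2000B400u   /* address of the Load register */

struct arm_timer_regs {
    volatile uint32_t load;        /* 0x00 */
    volatile uint32_t value;       /* 0x04, read only */
    volatile uint32_t control;     /* 0x08 */
    volatile uint32_t irq_clear;   /* 0x0C, write only */
    volatile uint32_t raw_irq;     /* 0x10, read only */
    volatile uint32_t masked_irq;  /* 0x14, read only */
    volatile uint32_t reload;      /* 0x18 */
    volatile uint32_t prediv;      /* 0x1C, not in a real SP804 */
    volatile uint32_t free_run;    /* 0x20, not in a real SP804 */
};

/* Physical address of the register at a given offset in the block. */
static uint32_t arm_timer_reg_addr(size_t offset)
{
    assert(offset < sizeof(struct arm_timer_regs) && offset % 4 == 0);
    return ARM_TIMER_BASE + (uint32_t)offset;
}
```

On the Pi the block would be accessed through a pointer such as (struct arm_timer_regs *)ARM_TIMER_BASE; on a host machine the offsetof checks are a cheap way to confirm that the layout matches the register list.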