rememberthe8bit edited this page Feb 2, 2013 · 22 revisions

While not an absolute necessity, a bootloader such as UBoot is nice to have, mostly so that we can load test binaries over serial or Ethernet instead of having to remove the SD card, write files, and re-insert the card every time we update the code. It may also be possible to add one of these transfers as a target in our Makefile for added convenience.

FreeBSD developer Oleksandr Tymoshenko (blog) has been working on porting UBoot to the Pi. The current binary in use is here. The instructions to install it are as follows:

  • Download Raspbian
  • dd if=<raspbian-image> of=<sd-card-dev> bs=2M
  • Extract the UBoot .tar.gz into the boot partition
  • Connect serial device/monitor and boot the Pi

To load Raspbian:

  • fatload mmc 0:1 0x00200000 kernel.img (loads the Raspbian kernel)
  • bootz (boots zImage binaries)

To [possibly] load Xinu:

  • fatload mmc 0:1 0x00000000 xinu.bin
  • go 0

Currently, the xinu.bin executable will boot directly from the SD card. Unfortunately, something goes wrong before the serial port is initialized so nothing is printed so far. The problem could be as simple as changing the base addresses of various hardware registers in the source code or configuration files.

Fellow GitHub-er David Welch has been developing on the Pi "bare metal"-style; his notes are available on his project page.


Brain Dump, beware of possible inaccuracies and sudden changes

The ARM effectively boots from 0x8000; kernel.img is a raw binary that is loaded there.

XINU currently (1/22/2013) wants to run starting at 0x10000. From there, it branches to the address stored at 0x10020, which is 0x10044, the Reset_Handler routine (confirmed via arm-...-objdump -D on xinu.elf). UBoot also booted by jumping to this area (Starting application at 0x00010044 ...), so I think the Pi does, in fact, use this address as the reset vector.

Sadly, we can't just .org 0x8000 then b 0x10020 in start.S because we're already at 0x10000 by the time we get to assembling it. Changing compile/platforms/arm-qemu/ld.script allows us to set the initial address to 0x8000. The behavior of .org is a bit odd in that it doesn't seem to set the current code address to an absolute value; rather, it seems to advance it relative to the start of the section. Using this, we can insert a small bit of code to keep the start.S code where it normally lives at 0x10000 and still boot at 0x8000:

/* the GPU starts the CPU at 0x8000 */
                b  Reset_Handler
.org 0x8000

Currently the Pi does boot our Xinu code from the SD card, but it is not able to get far enough to send anything over the UART to tell us that it booted. We know it does at least start executing because the following ARM code will light up the "ACT" ("OK" on some cases) LED:

//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

                /* turn on an LED */
                ldr   r0, GPFSEL1         //allocate registers poorly
                ldr   r1, GPSET0
                ldr   r2, GPCLR0
                ldr   r3, MASK0
                ldr   r4, MASK1
                ldr   r5, MASK2
                ldr   r7, LOOPCT

                ldr   r6, [r0, #0]        //GPFSEL1 &= ~(7<<18)                                                         
                and   r6, r6, r3
                str   r6, [r0, #0]

                ldr   r6, [r0, #0]        //GPFSEL1 |= (1<<18)                                                          
                orr   r6, r6, r4
                str   r6, [r0, #0]

loop:   
                ldr   r6, [r2, #0]        //GPCLR0 |= (1<<16)                                                           
                orr   r6, r6, r5
                str   r6, [r2, #0]

                ldr   r6, [r1, #0]        //GPSET0 |= (1<<16)
                orr   r6, r6, r5
                str   r6, [r1, #0]

                subs  r7, r7, #1          //loop until r7 is zero                                                       
                bne loop
                b done

.align  2                                                                                                               
GPFSEL1:   
        .word   0x20200004   
GPSET0:   
        .word   0x2020001C   
GPCLR0:   
        .word   0x20200028   
MASK0:   
        .word   0xFFE3FFFF //~(7<<18)                                                                                   
MASK1:   
        .word   0x00040000 //(1<<18)                                                                                    
MASK2:   
        .word   0x00010000 //(1<<16)                                                                                    
LOOPCT:   
        .word   0x00FFFFFF //big                                                                                        
done:
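
The masks in the listing above follow from how the GPFSEL registers are laid out: each one holds ten 3-bit function fields, so GPIO16 lands in GPFSEL1 at bit 18. A minimal C sketch of that arithmetic is below; the helper names (gpfsel_addr, gpfsel_shift, gpfsel_update) are made up for illustration and are not part of Xinu or David Welch's code.

```c
#include <assert.h>
#include <stdint.h>

/* Physical GPIO base address, taken from the register list on this page. */
#define GPIO_BASE 0x20200000u

/* Address of the GPFSELn register that controls `pin` (ten pins per register). */
static uint32_t gpfsel_addr(unsigned pin)
{
    assert(pin < 54u);                  /* the BCM2835 has GPIO0..GPIO53 */
    return GPIO_BASE + 4u * (pin / 10u);
}

/* Bit position of the 3-bit function field for `pin` inside that register. */
static unsigned gpfsel_shift(unsigned pin)
{
    return 3u * (pin % 10u);
}

/* Return `old` with the function field for `pin` replaced by `func`
 * (0 = input, 1 = output, 2 = ALT5, which is what the mini UART uses). */
static uint32_t gpfsel_update(uint32_t old, unsigned pin, unsigned func)
{
    unsigned shift = gpfsel_shift(pin);

    old &= ~(7u << shift);                   /* clear the field, like MASK0 */
    old |= (uint32_t)(func & 7u) << shift;   /* set the function, like MASK1 */
    return old;
}
```

For example, gpfsel_update(old, 16, 1) reproduces the two read-modify-write steps above (clear with ~(7<<18), then set 1<<18), and gpfsel_shift(14) gives the 12 used for the mini UART TX pin.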

Important register base locations with (bus addresses) and datasheet page numbers:

Peripheral address space: 0x20000000 - 0x20FFFFFF (0x7E000000 - 0x7EFFFFFF) (pg 6)
DMA engines use bus addresses (pg 6)
GPIO base 0x20200000 (0x7E200000) see pg 90
Interrupt 0x2000B000 (0x7E00B000) see pg 112
        0x2000B200 IRQ basic pending
        0x2000B204 IRQ pending 1
        0x2000B208 IRQ pending 2
        0x2000B20C FIQ control
        0x2000B210 Enable IRQs 1
        0x2000B214 Enable IRQs 2
        0x2000B218 Enable Basic IRQs
        0x2000B21C Disable IRQs 1
        0x2000B220 Disable IRQs 2
        0x2000B224 Disable Basic IRQs
PL011 (16650 variant) UART 0x20201000 (0x7E201000) see pg 175
        0x20201000 DR - Data Register
        0x20201004 RSRECR
        0x20201018 FR - Flag register
        0x20201020 ILPR - not in use
        0x20201024 IBRD - Integer Baud rate divisor
        0x20201028 FBRD - Fractional Baud rate divisor
        0x2020102C LCRH - Line Control register
        0x20201030 CR - Control register
        0x20201034 IFLS - Interrupt FIFO Level Select Register
        0x20201038 IMSC - Interrupt Mask Set Clear Register
        0x2020103C RIS - Raw Interrupt Status Register
        0x20201040 MIS - Masked Interrupt Status Register
        0x20201044 ICR - Interrupt Clear Register
        0x20201048 DMACR - DMA Control Register
        0x20201080 ITCR - Test Control register
        0x20201084 ITIP - Integration test input reg
        0x20201088 ITOP - Integration test output reg
        0x2020108C TDR - Test Data reg
SP804 variant Timer 0x2000B000 (0x7E00B000) see pg 196
        0x2000B400 Load
        0x2000B404 Value (Read Only)
        0x2000B408 Control
        0x2000B40C IRQ Clear/Ack (Write Only)
        0x2000B410 RAW IRQ (Read Only)
        0x2000B414 Masked IRQ (Read Only)
        0x2000B418 Reload
        0x2000B41C Pre-Divider (Not in real 804!)
        0x2000B420 Free running counter (Not in real 804!)
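
One way to keep offsets like these straight in C is to overlay a struct on the register block. The sketch below does this for the PL011 only; the struct name, the reserved padding fields, and the PL011_BASE/PL011 macros are assumptions inferred from the gaps in the list above, not something taken from the datasheet or from Xinu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PL011 register block laid out so that each field sits at the offset
 * listed above. The reserved arrays only fill the unlisted gaps. */
struct pl011_regs {
    uint32_t DR;             /* 0x00 data register */
    uint32_t RSRECR;         /* 0x04 */
    uint32_t reserved0[4];   /* 0x08 - 0x14 */
    uint32_t FR;             /* 0x18 flag register */
    uint32_t reserved1;      /* 0x1C */
    uint32_t ILPR;           /* 0x20 not in use */
    uint32_t IBRD;           /* 0x24 integer baud rate divisor */
    uint32_t FBRD;           /* 0x28 fractional baud rate divisor */
    uint32_t LCRH;           /* 0x2C line control */
    uint32_t CR;             /* 0x30 control */
    uint32_t IFLS;           /* 0x34 */
    uint32_t IMSC;           /* 0x38 */
    uint32_t RIS;            /* 0x3C */
    uint32_t MIS;            /* 0x40 */
    uint32_t ICR;            /* 0x44 */
    uint32_t DMACR;          /* 0x48 */
    uint32_t reserved2[13];  /* 0x4C - 0x7C */
    uint32_t ITCR;           /* 0x80 */
    uint32_t ITIP;           /* 0x84 */
    uint32_t ITOP;           /* 0x88 */
    uint32_t TDR;            /* 0x8C */
};

/* Physical base address from the list above; only meaningful on the Pi. */
#define PL011_BASE 0x20201000u
#define PL011      ((volatile struct pl011_regs *)PL011_BASE)

/* Absolute address of a register, given its offset within the block. */
static uint32_t pl011_reg_addr(size_t offset)
{
    assert(offset < sizeof(struct pl011_regs));
    return PL011_BASE + (uint32_t)offset;
}
```

On the Pi, PL011->FR would then read the flag register at 0x20201018; on a host machine only the offsets can be checked, for instance with offsetof(struct pl011_regs, FR).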

Also take a look at the rather long errata page about the datasheet.

This is an example UART initialization routine and simple test (send a '?') for the mini UART (stripped down pseudo-clone of a 16550):

//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
		/* initialize uart (poorly) at 115200 baud */
		//AUX_ENABLES = 1
		ldr   r0, AUX_ENABLES
		ldr   r1, ONE
		str   r1, [r0, #0]
		//AUX_MU_IER_REG = 0
		ldr   r0, AUX_MU_IER_REG
		ldr   r1, ZERO
		str   r1, [r0, #0]
		//AUX_MU_CNTL_REG = 0
		ldr   r0, AUX_MU_CNTL_REG
		ldr   r1, ZERO
		str   r1, [r0, #0]
		//AUX_MU_LCR_REG = 3
		ldr   r0, AUX_MU_LCR_REG
		ldr   r1, THREE
		str   r1, [r0, #0]
		//AUX_MU_MCR_REG = 0
		ldr   r0, AUX_MU_MCR_REG
		ldr   r1, ZERO
		str   r1, [r0, #0]
		//AUX_MU_IER_REG = 0
		ldr   r0, AUX_MU_IER_REG
		ldr   r1, ZERO
		str   r1, [r0, #0]
		//AUX_MU_IIR_REG = 0xC6
		ldr   r0, AUX_MU_IIR_REG
		ldr   r1, CSIX
		str   r1, [r0, #0]
		//AUX_MU_BAUD_REG = 270
		ldr   r0, AUX_MU_BAUD_REG
		ldr   r1, TWOSEVENTY
		str   r1, [r0, #0]

		//GPFSEL1 &= ~(7<<12)
		ldr   r0, GPFSEL1
		ldr   r1, [r0, #0]
		ldr   r2, GPFSEL1_MASK1
		and   r1, r1, r2
		str   r1, [r0, #0]
		//GPFSEL1 |= (2<<12)
		ldr   r0, GPFSEL1
		ldr   r1, [r0, #0]
		ldr   r2, GPFSEL1_MASK2
		orr   r1, r1, r2
		str   r1, [r0, #0]

		//GPPUD = 0
		ldr   r0, GPPUD
		ldr   r1, ZERO
		str   r1, [r0, #0]
		//for(ra=0;ra<150;ra++) dummy(ra);
		ldr   r0, ONEFIFTY
wait1:		bl    dummy
		subs  r0, r0, #1
		bne   wait1
		//GPPUDCLK0 = (1<<14)
		ldr   r0, GPPUDCLK0
		ldr   r1, GPPUDCLK0_MASK1
		str   r1, [r0, #0]
		//for(ra=0;ra<150;ra++) dummy(ra);
		ldr   r0, ONEFIFTY
wait2:		bl    dummy
		subs  r0, r0, #1
		bne   wait2
		//GPPUDCLK0 = 0
		ldr   r0, GPPUDCLK0
		ldr   r1, ZERO
		str   r1, [r0, #0]

		//AUX_MU_CNTL_REG = 2
		ldr   r0, AUX_MU_CNTL_REG
		ldr   r1, TWO
		str   r1, [r0, #0]

		//AUX_MU_IO_REG = '?'
		ldr   r0, AUX_MU_IO_REG
		ldr   r1, TESTCHAR
		str   r1, [r0, #0]

		b done

dummy:
		bx lr

.align  2
GPFSEL1:
        .word   0x20200004
GPSET0:
        .word   0x2020001C
GPCLR0:
        .word   0x20200028
GPPUD:
        .word   0x20200094
ONEFIFTY:
	.word	0x00000096 //150
GPPUDCLK0:
        .word   0x20200098

AUX_ENABLES:
        .word   0x20215004
AUX_MU_IO_REG:
        .word   0x20215040
AUX_MU_IER_REG:
        .word   0x20215044
AUX_MU_IIR_REG:
        .word   0x20215048
AUX_MU_LCR_REG:
        .word   0x2021504C
AUX_MU_MCR_REG:
        .word   0x20215050
AUX_MU_LSR_REG:
        .word   0x20215054
AUX_MU_MSR_REG:
        .word   0x20215058
AUX_MU_SCRATCH:
        .word   0x2021505C
AUX_MU_CNTL_REG:
        .word   0x20215060
AUX_MU_STAT_REG:
        .word   0x20215064
AUX_MU_BAUD_REG:
        .word   0x20215068

ONE:
	.word	0x00000001
ZERO:
	.word	0x00000000
THREE:
	.word	0x00000003
CSIX:
	.word	0x000000C6
TWOSEVENTY:
	.word	0x0000010E //270
GPFSEL1_MASK1:
	.word	0xFFFF8FFF //~(7<<12)
GPFSEL1_MASK2:
	.word	0x00002000 //(2<<12)
GPPUDCLK0_MASK1:
	.word	0x00004000 //(1<<14)
TWO:
	.word	0x00000002
AUXMULSR_MASK1:
	.word	0x00000020
TESTCHAR:
	.word	0x0000003F //'?'

done:
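
The 270 written to AUX_MU_BAUD_REG comes from the mini UART baud rate formula in the datasheet, baud = system_clock / (8 * (reg + 1)). A small sketch of that arithmetic follows, assuming the 250 MHz system clock that the Pi normally runs at; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed system (core) clock; adjust if the firmware is configured otherwise. */
#define SYSTEM_CLOCK_HZ 250000000u

/* Register value for a requested baud rate, rounded down. */
static uint32_t muart_baud_reg(uint32_t baud)
{
    assert(baud != 0u && baud <= SYSTEM_CLOCK_HZ / 8u);
    return SYSTEM_CLOCK_HZ / (8u * baud) - 1u;
}

/* Baud rate actually produced by a given register value (truncated to an integer). */
static uint32_t muart_actual_baud(uint32_t reg)
{
    return SYSTEM_CLOCK_HZ / (8u * (reg + 1u));
}
```

Under these assumptions muart_baud_reg(115200) gives 270, and the rate actually produced is roughly 115313 baud, an error of about 0.1%.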

The next step is to make sure that Xinu is Doing the Right Thing and initializing the UART and GPIO lines properly. It's worth noting that David Welch says in a README that accompanies the UART initialization code:

The documentation for the chip has some glaring errors. The IER and IIR register descriptions are screwy. The one that killed me was the word length. The document says that the single bit controls 7 bits per word vs 8 bits per word. And that bit 1 and some above do not do anything, they might on real 16550's but not here. Well that is wrong. If bits 1:0 are 00 you get 7 bits if bits 1:0 are 01 you get 7 bits. You need bit 1 set to get 8 bits.
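
Going by that note, the practical rule is that bit 1 of AUX_MU_LCR_REG selects the word length, which is why the initialization code writes 3. A tiny sketch of that reading is below; the function name is made up, and the behavior is taken from the README quote rather than verified against hardware here.

```c
#include <assert.h>

/* Word length implied by an AUX_MU_LCR_REG value, following the README's
 * observation (not the datasheet): bit 1 must be set to get 8 data bits. */
static int muart_data_bits(unsigned lcr)
{
    assert(lcr <= 0xFFu);   /* assumption: only the low byte is of interest */
    return (lcr & 0x2u) ? 8 : 7;
}
```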

Boot process

  1. GPU starts up, mounts FAT32 partition on SD card
  2. GPU reads, runs bootcode.bin and start.elf from the SD card
  3. GPU reads kernel.img from the SD card, writes it to 0x00008000 in RAM
  4. GPU starts the CPU executing at 0x00008000
  5. loader/platforms/arm-qemu/start.S: sets up the system a bit and does a b _startup to run the C code
  6. system/platforms/arm-qemu/stubs.c: _startup() calls setup_pins() and nulluser()
  7. system/initialize.c contains nulluser()

Using the Mini-UART, it's possible to hack the kputc and kgetc routines to allow for simple serial debugging. The code for this is:

//based on code by David Welch ([email protected]), original license:
// Copyright (c) 2012 David Welch [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

#define uint32_t unsigned int
#define int32_t int

//for I/O devices when using the physical address space (0x20000000 - 0x20FFFFFF)
#define IO_PHY_ADDR(x) (((x)&0x00FFFFFF) | 0x20000000)
//for I/O devices when using the bus address space (0x7E000000 - 0x7EFFFFFF)
#define IO_BUS_ADDR(x) (((x)&0x00FFFFFF) | 0x7E000000)

//for using I/O devices just like a global variable
//(volatile so the compiler never caches or drops a register access)
#define __IO(x) (*((volatile uint32_t*)(IO_PHY_ADDR(x))))

//mini uart registers
#define AUX_ENABLES     __IO(0x20215004)
#define AUX_MU_IO_REG   __IO(0x20215040)
#define AUX_MU_IER_REG  __IO(0x20215044)
#define AUX_MU_IIR_REG  __IO(0x20215048)
#define AUX_MU_LCR_REG  __IO(0x2021504C)
#define AUX_MU_MCR_REG  __IO(0x20215050)
#define AUX_MU_LSR_REG  __IO(0x20215054)
#define AUX_MU_MSR_REG  __IO(0x20215058)
#define AUX_MU_SCRATCH  __IO(0x2021505C)
#define AUX_MU_CNTL_REG __IO(0x20215060)
#define AUX_MU_STAT_REG __IO(0x20215064)
#define AUX_MU_BAUD_REG __IO(0x20215068)

//GPIO registers
#define GPFSEL1 __IO(0x20200004)
#define GPSET0  __IO(0x2020001C)
#define GPCLR0  __IO(0x20200028)

//GPIO pull-up/down control registers
#define GPPUD     __IO(0x20200094)
#define GPPUDCLK0 __IO(0x20200098)

//initialize the mini UART to 115200 baud
void muart_init(void){
  volatile int i; //volatile so the empty delay loops below are not optimized away

  //initialize UART
  AUX_ENABLES = 1;
  AUX_MU_IER_REG = 0;    //No interrupts (ERBFI, ETBEI, ELSI, EDSSI)
  AUX_MU_CNTL_REG = 0;   //pg16
  AUX_MU_LCR_REG = 3;    //not using the divisor, no break, no stick parity, even parity, no parity bit, 8 bits, 1 stop bit
  AUX_MU_MCR_REG = 0;    //diagnostics off, OUT2 off, OUT1 off, RTS off, DTR off
  AUX_MU_IER_REG = 0;    //disable interrupts again?
  AUX_MU_IIR_REG = 0xC6; 
  AUX_MU_BAUD_REG = 270; //115200 baud

  //initialize GPIO to let the UART TX through
  GPFSEL1 &= ~(7<<12);
  GPFSEL1 |= (2<<12);

  //initialize GPIO to let the UART RX through
  GPFSEL1 &= ~(7<<15);
  GPFSEL1 |= (2<<15);

  //disable the pull-up/down on GPIO14: write GPPUD, wait, clock it in via GPPUDCLK0, wait, release
  GPPUD = 0;
  for(i=0; i<150; i++){}
  GPPUDCLK0 = (1<<14);
  for(i=0; i<150; i++){}
  GPPUDCLK0 = 0;

  //final UART initialization
  AUX_MU_CNTL_REG = 3; //3 for TX/RX, 2 for just TX
}

void muart_putc(char c){
  //wait until the transmitter can accept another byte (LSR bit 5)
  while((AUX_MU_LSR_REG & 0x20) == 0){}
  //send char
  AUX_MU_IO_REG = c;
}

char muart_getc(void){
  char result;

  //wait for a character to come in
  while((AUX_MU_LSR_REG & 0x01) == 0){}
  //get char
  result = (AUX_MU_IO_REG & 0xFF);

  return result;
}
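
As a usage sketch, a string helper can be layered on muart_putc. The muart_puts name and the newline handling below are assumptions for illustration, and muart_putc is replaced here with a host-side stand-in that records bytes in a buffer so the logic can be tried off the Pi.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-in for muart_putc(): it appends to a buffer instead of
 * touching AUX_MU_IO_REG. On the Pi the real muart_putc() would be used. */
static char   sim_out[64];
static size_t sim_len;

static void muart_putc(char c)
{
    if (sim_len < sizeof sim_out - 1) {
        sim_out[sim_len++] = c;
        sim_out[sim_len] = '\0';
    }
}

/* Hypothetical helper: send a NUL-terminated string, expanding "\n" to
 * "\r\n" so that a serial terminal also returns to the first column. */
static void muart_puts(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == '\n')
            muart_putc('\r');
        muart_putc(*s);
    }
}
```

Calling muart_puts("ok\n") with the stand-in leaves the four bytes 'o', 'k', '\r', '\n' in sim_out.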

By replacing the guts of kputc and kgetc with calls to these functions, and converting all getc and putc calls to kgetc and kputc calls, the test routines can be run. This is the status of the test routines:

  • c - clockTest - fail (says clkticks: 0 and clktime: 0 then dies)
  • a - ksimpleterminal - success (call to test function is commented out, but we make it to All user processes have completed. just fine)
  • b - semtest - success?
  • g - timesliceThreadTest - success?
  • i - irTest (LED test?) - fail (nothing happens, will need to rewrite LED test)
  • j - uartTest - fail (nothing happens, will need to get the real UART working)
  • k - interruptTest - fail (says Interrupt mask: 10010 and Disabled. Interrupt mask saved: 10010, actual: 10 and lpc_vic->vect_ctrls[irq]: 0x2c and GPIO0 interrupts: 0 then dies repeating the last line every few seconds)
  • m - findMemory - success?
  • l - turn on LED - not implemented
  • o - turn off LED - not implemented
  • L - turn on back LED - not applicable
  • O - turn off back LED - not applicable
  • w - kprintf - success (will need to regression test this when real serial is implemented)
  • t - timerTest - fail (will likely need to change timer register addresses, possibly code as well)