User space memory management

In this section we’ll work on user-space memory management, and add an allocator so user-space programs can use heap memory.

Once we can allocate objects on the heap in user programs, we can make a nicer interface to the threading syscalls by using a closure and Box.

User process memory management

http://gee.cs.oswego.edu/dl/html/malloc.html

Linux has two ways to allocate memory to processes: the brk() and mmap() syscalls. Both of these provide “heap” space that persists across function calls. From the user code perspective this memory is typically allocated using calls to malloc() or new.

The brk syscall moves the “break” (the program break, hence the slightly odd name) between the heap and the stack: the heap starts at an address above the program code and grows upwards; the stack starts at a high address and grows downwards. The break between them is an unmapped guard page. Moving that guard page up therefore makes more space for the heap, at the expense of shrinking space for the stack.

The mmap() syscall maps a region of memory (page aligned), which can be released (unmapped) back to the operating system.

For now we’ll reuse the linked list allocator which is used in the kernel, and give it a fixed-size heap when the user program starts. We can use a trick similar to the one we used for the stack to create a large heap without actually using much memory: map every page in the range read-only to a single frame, and make only the first page writable. Our page fault handler will then allocate a frame the first time each read-only page is written to, and when the program exits the frames will be returned.

To find a range of memory to use, we can use the page_table_address python code defined in the previous section on Memory management. The stack pages are currently (5,0,0,0,0) to (5,0,1,0,0) i.e. addresses 0x28000000000 to 0x28000200000, 2 MiB in total. We can choose a different range e.g. (5,0,3,0,0) to (5,0,23,0,0), addresses 0x28000600000 to 0x28002e00000, a total of 0x2800000 bytes or 40 MiB.
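
The arithmetic behind those addresses can be checked with a short sketch. This is a Rust counterpart written for illustration, not the Python helper from the previous section; the function name is reused only for familiarity. It assumes 4-level paging with 9-bit indices and a 12-bit page offset.

```rust
/// Build a virtual address from four page-table indices and a byte offset.
/// Each index selects one of 512 entries (9 bits); the offset is 12 bits.
pub fn page_table_address(l4: u64, l3: u64, l2: u64, l1: u64, offset: u64) -> u64 {
    assert!(l4 < 512 && l3 < 512 && l2 < 512 && l1 < 512 && offset < 4096);
    let addr = (l4 << 39) | (l3 << 30) | (l2 << 21) | (l1 << 12) | offset;
    // x86_64 requires canonical addresses: bits 48..63 must copy bit 47.
    if addr & (1 << 47) != 0 {
        addr | 0xFFFF_0000_0000_0000
    } else {
        addr
    }
}
```

Each step of the third index moves the address by 2 MiB, so the 20 steps from index 3 to index 23 give the 40 MiB heap range.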

Allocating pages

pub fn create_user_ondemand_pages(
    level_4_physaddr: u64,
    start_addr: VirtAddr,
    size: u64)
    -> Result<(), MapToError<Size4KiB>> {

    let memory_info = unsafe {MEMORY_INFO.as_mut().unwrap()};
    let frame_allocator = &mut memory_info.frame_allocator;

    let l4_table: &mut PageTable = unsafe {
                &mut *(memory_info.physical_memory_offset
                       + level_4_physaddr).as_mut_ptr()};

    let mut mapper = unsafe {
        OffsetPageTable::new(l4_table,
                             memory_info.physical_memory_offset)};

    let page_range = {
        let end_addr = start_addr + size - 1u64;
        let start_page = Page::containing_address(start_addr);
        let end_page = Page::containing_address(end_addr);
        Page::range_inclusive(start_page, end_page)
    };

    // Only allocating one frame
    let frame = frame_allocator
        .allocate_frame()
        .ok_or(MapToError::FrameAllocationFailed)?;

    for page in page_range {
        unsafe {
            mapper.map_to_with_table_flags(page,
                                           frame,
                                           // Page not writable
                                           PageTableFlags::PRESENT |
                                           PageTableFlags::USER_ACCESSIBLE,
                                           // Parent table flags include writable
                                           PageTableFlags::PRESENT |
                                           PageTableFlags::WRITABLE |
                                           PageTableFlags::USER_ACCESSIBLE,
                                           frame_allocator)?.flush()
        };
    }

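    // Make one page writable, so this 'owns' the frame
    unsafe {
        // update_flags returns a Result holding a MapperFlush; flush it so
        // the change takes effect. A failed update is ignored, as before.
        if let Ok(flush) = mapper.update_flags(page_range.start,
                                               PageTableFlags::PRESENT |
                                               PageTableFlags::WRITABLE |
                                               PageTableFlags::USER_ACCESSIBLE) {
            flush.flush();
        }
    }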

    Ok(())
}
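
To see why this uses so few frames, here is a toy model of the scheme that runs on a host machine. It is only a sketch: the OnDemandHeap type and its methods are hypothetical names invented for this illustration, and the "fault" is a plain function call rather than a real page fault.

```rust
/// Toy page-table entry: which frame the page maps to and whether it is writable.
#[derive(Clone, Copy)]
struct Entry {
    frame: usize,
    writable: bool,
}

/// Toy address space: a heap of pages that receive private frames on demand.
pub struct OnDemandHeap {
    entries: Vec<Entry>,
    frames_in_use: usize,
}

impl OnDemandHeap {
    /// Map every page read-only to frame 0, then make the first page
    /// writable so that it "owns" the frame.
    pub fn new(num_pages: usize) -> Self {
        assert!(num_pages > 0);
        let mut entries = vec![Entry { frame: 0, writable: false }; num_pages];
        entries[0].writable = true;
        OnDemandHeap { entries, frames_in_use: 1 }
    }

    /// Simulate a user write. Returns true if the write "page faulted".
    pub fn write(&mut self, page: usize) -> bool {
        if self.entries[page].writable {
            return false;
        }
        // Fault path: give this page its own frame and make it writable.
        self.entries[page] = Entry { frame: self.frames_in_use, writable: true };
        self.frames_in_use += 1;
        true
    }

    /// Frame number currently backing `page`.
    pub fn frame_of(&self, page: usize) -> usize {
        self.entries[page].frame
    }

    /// Number of frames allocated so far.
    pub fn frames_in_use(&self) -> usize {
        self.frames_in_use
    }
}
```

A 40 MiB heap is 10240 pages of 4 KiB, but in this model a program that only touches two of them holds just two frames.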

Project structure

We now need a global allocator for the user program. We could use the one in the kernel library, because the user program is currently a bin which is linked against it. A better solution is probably to rearrange the Cargo project so that user programs aren’t linked against the kernel library code. User programs can then have their own global allocator.

The way this is done with Cargo is to create a Workspace. We want this to contain:

  • The kernel
  • The user-space standard library (split from hello.rs)
  • A user program (hello.rs)

This is probably also a good time to give this thing a name to replace blog_os. I’ve settled on EuraliOS from “Euralia” in A.A.Milne’s Once On A Time, so clearly the kernel is king “Merriwig”.

Using different configurations for packages within a Cargo workspace is a bit complicated. Cargo uses Cargo.toml, and if run in the root workspace directory reads Cargo.toml files in each package. There is also the .cargo/config.toml but only the file in the directory that Cargo is run from will be used. The Cargo per-package-target feature allows packages in a workspace to have different targets, and there is some discussion here of that method, but I couldn’t get that to work for this use case. All we really need is to be able to set the code & data address in the ELF file, but the --image-base argument to the rust-lld linker doesn’t seem to do that.

The simplest (and only) way to configure the kernel and user programs that I have found so far is to use a custom build script. Starting in the root (workspace) directory, set the linker in the target json file (now renamed x86_64-euralios.json):

{
    "llvm-target": "x86_64-unknown-none",
    "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
    "arch": "x86_64",
    "target-endian": "little",
    "target-pointer-width": "64",
    "target-c-int-width": "32",
    "os": "none",
    "executables": true,
    "linker-flavor": "ld",
    "linker": "ld",
    "panic-strategy": "abort",
    "disable-redzone": true,
    "features": "-mmx,-sse,+soft-float"
}

Note that the linker-flavor and linker settings are now “ld”. The root Cargo.toml file is simple, just listing the two packages (the kernel and user program “hello”):

[workspace]

members = [
    "kernel",
    "hello"
]

The .cargo/config.toml file sets the target as before (note renamed file), but now also sets a flag setting relocation-model to static, which we previously set in a command-line argument to rustc in the makefile.

[unstable]
build-std-features = ["compiler-builtins-mem"]
build-std = ["core", "compiler_builtins", "alloc"]

[build]
target = "x86_64-euralios.json"  # changed file

[target.'cfg(target_os = "none")']
runner = "bootimage runner"
rustflags = ["-C", "relocation-model=static"] # new

The makefile can be simplified to:

user/% : FORCE
	cargo build --release --bin $*
	mkdir -p user
	cp target/x86_64-euralios/release/$* user/

FORCE:

.PHONY: run
run : user/hello
	cargo run --bin kernel

which always runs cargo to rebuild user programs, so cargo looks after dependencies.

In the hello subdirectory we have a user program. The Cargo.toml file is quite standard:

[package]
name = "hello"
version = "0.1.0"
edition = "2021"

Now to pass flags to the linker we can use a Cargo build script. Cargo runs this script before building the package, so that people can compile C code, perform code generation etc., and parses its output for linker settings. We just want to add a couple of linker flags, so we can put this in build.rs:

fn main() {
    println!("cargo:rustc-link-arg=-Ttext-segment=5000000");
    println!("cargo:rustc-link-arg=-Trodata-segment=5100000");
}

That script passes two arguments to the (ld) linker, setting the start addresses of the code (text) and read-only data segments.
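
If the addresses are kept as named constants, the lines that the build script prints can be generated rather than typed by hand. The sketch below assumes that GNU ld reads the values after -Ttext-segment= and -Trodata-segment= as hexadecimal without a 0x prefix, which I believe is the case but have not verified for every ld version. The helper name link_arg_lines is hypothetical.

```rust
/// Format the two `cargo:rustc-link-arg` lines for the given segment
/// start addresses. The addresses are written as bare hexadecimal.
pub fn link_arg_lines(text_start: u64, rodata_start: u64) -> Vec<String> {
    vec![
        format!("cargo:rustc-link-arg=-Ttext-segment={:x}", text_start),
        format!("cargo:rustc-link-arg=-Trodata-segment={:x}", rodata_start),
    ]
}
```

Under that assumption, calling link_arg_lines(0x5000000, 0x5100000) and printing each line from main() reproduces the build script above.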

The structure of the workspace is now:

- Cargo.toml
- makefile
- x86_64-euralios.json  <- modified from x86_64-blog_os.json
- .cargo/
    - config.toml
- hello/
    - Cargo.toml
    - src/
        - main.rs     <- Was hello.rs
- kernel/
    - Cargo.toml
    - src/
        - allocator.rs
        - gdt.rs
        - interrupts.rs
        - lib.rs
        - main.rs
        - memory.rs
        - process.rs
        - serial.rs
        - syscalls.rs
        - vga_buffer.rs

User program allocator

In the user program hello we can now add a linked list allocator to manage the memory heap. It won’t be able to add memory beyond the original range given to it, or give frames back to the kernel, but at least frames will only be used as they are needed: If a user program doesn’t use much memory then it won’t use many frames.

In hello/Cargo.toml add the dependency:

[dependencies]
linked_list_allocator = "0.9.0"

then in hello/src/main.rs create the static allocator:

use linked_list_allocator::LockedHeap;

#[global_allocator]
static ALLOCATOR: LockedHeap = LockedHeap::empty();

where the global_allocator attribute registers the allocator to be used by default by containers like Box and Vec in this program.

We need to initialise the allocator before using it, in the same way as the kernel heap allocator. To do that we need to know the heap location and size. We could hard-wire these values (USER_HEAP_START and USER_HEAP_SIZE in process.rs) in the user code, but it might be more fun to pass this information from the kernel, and this way we don’t need to remember to change it in two (or more) places.
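
The “create empty, initialise later” pattern can be illustrated with a small stand-in that runs on a host machine. This is not linked_list_allocator: BumpHeap is a hypothetical bump allocator that only hands out addresses, written to show why allocations fail until the heap range has been supplied.

```rust
/// Minimal bump allocator, used only to illustrate the "empty, then init" pattern.
pub struct BumpHeap {
    next: usize,
    end: usize,
}

impl BumpHeap {
    /// An allocator with no memory: allocations fail until `init` is called.
    pub const fn empty() -> Self {
        BumpHeap { next: 0, end: 0 }
    }

    /// Hand the allocator the address range `[start, start + size)`.
    pub fn init(&mut self, start: usize, size: usize) {
        self.next = start;
        self.end = start + size;
    }

    /// Return the address of `size` bytes aligned to `align` (a power of two),
    /// or None if the heap is exhausted or was never initialised.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let start = (self.next + align - 1) & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.end {
            return None;
        }
        self.next = end;
        Some(start)
    }
}
```

In the real user program the equivalent step would be a call to the init method of the locked heap with the start address and size received from the kernel, in the same way as the kernel heap is initialised.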

At the start of the _start() function we can read some registers, which we can choose fairly arbitrarily to be RAX and RCX (RBX is reserved by LLVM for its own use):

pub unsafe extern "sysv64" fn _start() -> ! {
  let heap_start: usize;
  let heap_size: usize;
  asm!("",
       lateout("rax") heap_start,
       lateout("rcx") heap_size,
       options(pure, nomem, nostack)
  );
  println!("Heap start {:#016X}, size: {} bytes ({} Mb)",
           heap_start, heap_size, heap_size / (1024 * 1024));
  ...
}

To pass the information to the user code we can just modify the thread Context, because those values will be popped off the kernel stack and into the registers when the thread runs. In process.rs the new_user_thread() function already sets the segment selectors cs and ss, instruction and stack pointers rip and rsp. We can just add the heap start and size:

...
context.rsp = new_thread.user_stack_end as usize;
context.rax = USER_HEAP_START as usize; // new
context.rcx = USER_HEAP_SIZE as usize; // new
...

As an experiment we can try out the heap allocation and the page fault handling. In the page_fault_handler() in interrupts.rs add a message in the section which handles writes to read-only pages:

if error_code == (PageFaultErrorCode::PROTECTION_VIOLATION |
                  PageFaultErrorCode::CAUSED_BY_WRITE |
                  PageFaultErrorCode::USER_MODE) {
    println!("READ-ONLY ACCESS!"); // New (temporary!)
    if let Err(msg) = memory::allocate_missing_ondemand_frame(accessed_virtaddr) {
        println!("Page fault error: {}", msg);
        hlt_loop();
    }
}
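
The comparison in that handler is an exact match on three bits of the x86_64 page fault error code. The sketch below spells the bits out as plain constants so it can run without the x86_64 crate; the constant names mirror the PageFaultErrorCode flags, and the helper function name is hypothetical.

```rust
/// Bits of the x86_64 page fault error code that the handler checks.
pub const PROTECTION_VIOLATION: u64 = 1 << 0; // the page was present
pub const CAUSED_BY_WRITE: u64 = 1 << 1; // the access was a write
pub const USER_MODE: u64 = 1 << 2; // the access came from user mode (ring 3)

/// True only for a user-mode write to a page that is present but read-only.
/// Any other bit being set (or one of these being clear) gives false.
pub fn is_user_write_to_readonly(error_code: u64) -> bool {
    error_code == (PROTECTION_VIOLATION | CAUSED_BY_WRITE | USER_MODE)
}
```

A write to an unmapped page has the first bit clear, so it does not match and is not treated as an on-demand allocation.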

Thread closures

The Rust standard library includes a thread::spawn function which works like this:

use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
}

The spawn function in the standard library has the signature

fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,
    T: Send + 'static,

It takes ownership of a closure, puts it in a Box, and passes it to Thread::new(). We can use the same method to make a nice wrapper around the syscalls::thread_spawn() function made in section 4.

In a new file euralios_std/src/thread.rs we’ll write launch() which will take a boxed function Box<dyn FnOnce()> and pass it to thread_spawn(). We’ll return an error code if something goes wrong:

fn launch(p: Box<dyn FnOnce()>) -> Result<(), u64>
{
    // Call thread_spawn
    extern "C" fn thread_start(main: usize) {
        // This function called by thread_spawn
    }
}

The thread_spawn function has the signature

pub fn thread_spawn(func: extern "C" fn(usize) -> (), param: usize)
                    -> Result<u64, u64> {

so we can pass it thread_start as the func argument, and use the parameter to pass the address of the boxed function.

To convert Box<dyn FnOnce()> to a “thin” pointer address, we have to wrap the boxed function in another box, and get the raw pointer:

let p = Box::into_raw(Box::new(p));

The reason is that Box<dyn FnOnce()> is a fat pointer: it holds a pointer to the allocated memory that stores the closure data (if any), and a pointer to a vtable which contains the function to call and also a function to drop the contents. A thin pointer can’t store all this information, so here Box::new(p) moves the fat pointer into heap-allocated memory, which can be pointed to by a thin pointer.

The call to thread_spawn can now be:

syscalls::thread_spawn(thread_start,
                       p as *mut () as usize)

which converts the box raw pointer to a thin pointer and then to an address which can be passed as the argument to the thread_start function.

Inside thread_start we can convert the thin pointer back into a Box and call the function:

unsafe {Box::from_raw(main as *mut Box<dyn FnOnce()>)()};
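
The round trip from boxed closure to address and back can be tried on a host machine with the standard library. This is a minimal sketch; the helper names to_address and call_address are hypothetical and exist only for this illustration.

```rust
/// Turn a boxed closure into a plain address by boxing it a second time.
/// The outer box is leaked here and must be reclaimed by `call_address`.
pub fn to_address(f: Box<dyn FnOnce()>) -> usize {
    Box::into_raw(Box::new(f)) as *mut () as usize
}

/// Rebuild the outer box from the address and call the closure once.
///
/// # Safety
/// `addr` must come from `to_address` and must not be used again afterwards.
pub unsafe fn call_address(addr: usize) {
    let outer = unsafe { Box::from_raw(addr as *mut Box<dyn FnOnce()>) };
    // Calling the box consumes it, so the closure and both allocations
    // are freed when the call returns.
    outer();
}
```

On a 64-bit target the inner Box<dyn FnOnce()> is two machine words wide, while the raw pointer to it is a single word, which is why the second box is needed before casting to usize.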

The spawn function is then quite simple: It takes a closure, puts it into a Box, and passes it to launch:

pub fn spawn<F>(f: F) -> Result<(), u64>
where
    F: FnOnce() -> (),
    F: Send + 'static,
{
    launch(Box::new(f))
}

The constraints on the function mean that any variables captured by the closure must either be moved or have static lifetime. That is because the launched thread may outlive the caller.
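
Putting the pieces together, the whole path from spawn to the closure running can be sketched on a host machine. The assumption to note is that thread_spawn here is a stand-in built on std::thread, not the EuraliOS syscall: it always succeeds and returns a made-up thread id of 0. The error branch in launch, which reclaims the closure if the thread could not be started, is my addition and is not taken from the original code.

```rust
/// Stand-in for the `thread_spawn` syscall wrapper, built on std threads.
/// Hypothetical: it always succeeds and reports thread id 0.
pub fn thread_spawn(func: extern "C" fn(usize) -> (), param: usize) -> Result<u64, u64> {
    std::thread::spawn(move || func(param));
    Ok(0)
}

fn launch(p: Box<dyn FnOnce()>) -> Result<(), u64> {
    // Box the fat pointer again so that a thin pointer can refer to it.
    let p = Box::into_raw(Box::new(p));

    extern "C" fn thread_start(main: usize) {
        // Rebuild the outer box and run the closure; calling it consumes it.
        unsafe { Box::from_raw(main as *mut Box<dyn FnOnce()>)() };
    }

    match thread_spawn(thread_start, p as *mut () as usize) {
        Ok(_) => Ok(()),
        Err(code) => {
            // The thread never started, so reclaim the closure to avoid a leak.
            drop(unsafe { Box::from_raw(p) });
            Err(code)
        }
    }
}

pub fn spawn<F>(f: F) -> Result<(), u64>
where
    F: FnOnce() -> (),
    F: Send + 'static,
{
    launch(Box::new(f))
}
```

The Send bound matters here: the closure crosses to the new thread as a bare usize, so the compiler cannot check the transfer itself and the bound on spawn is the only place where it is enforced.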

We can now run threads using syntax like the std library:

use euralios_std::thread;

...
let value: usize = 42;

thread::spawn(move || {
    println!("Hello from thread: {}", value);
});

Taking out the move results in a compile-time error like:

  | thread::spawn(|| {
  |               ^^ may outlive borrowed value `value`
  |       println!("Hello from thread: {}", value);
  |                                         ----- `value` is borrowed here
note: function requires argument type to outlive `'static`

along with a helpful note to use the move keyword so that the closure takes ownership of any referenced variables. We now have a reasonably nice and safe interface for launching threads using closures.

In the next section we’ll start working on inter-process/thread communication, allowing threads to send messages to each other.