A Rusty Stack Jump
26 February 2025
In my quest to learn how to build an async runtime in Rust, I have to learn about CPU context switching. To switch from one async task to another, the runtime has to perform a context switch. This means saving the CPU registers that the System V ABI marks as callee-saved, then loading the registers with the saved state of the next task, including its stack pointer.
In this article, I will show you what I have learned about jumping onto a new stack on an x86_64 CPU.
Setting the stage
Why do we need to swap the stack of async tasks in a runtime with stackful coroutines?
Async tasks, by nature, are paused and resumed. Every time a task is paused so that another one can run, we have to save the context of the running task and load the context of the upcoming one.
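To make "context" a little more concrete, here is a minimal sketch of what a fuller saved context could look like on x86_64. The names `FullContext` and `rbp_offset` are hypothetical and are not part of this article's code, which only stores `rsp`. The field list is the set of registers the System V ABI treats as callee-saved.

```rust
use std::mem::size_of;

/// Hypothetical context holding the registers the System V x86_64 ABI
/// treats as callee-saved. The example in this article keeps only `rsp`.
#[derive(Debug, Default)]
#[repr(C)]
struct FullContext {
    rsp: u64,
    r15: u64,
    r14: u64,
    r13: u64,
    r12: u64,
    rbx: u64,
    rbp: u64,
}

/// Byte offset of the `rbp` field from the start of the struct.
/// With #[repr(C)] the fields are laid out in declaration order, so
/// assembly could address each register as [base + offset].
fn rbp_offset() -> usize {
    let ctx = FullContext::default();
    let base = &ctx as *const FullContext as usize;
    let field = &ctx.rbp as *const u64 as usize;
    field - base
}

fn main() {
    println!("context size: {} bytes", size_of::<FullContext>());
    println!("rbp offset: {:#x}", rbp_offset());
}
```

Because every field is a `u64` and the struct is `#[repr(C)]`, the offsets are predictable multiples of 8, which is what lets inline assembly read a field with an expression like `[{0} + 0x00]`.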
Jumping into the new stack
Here is the code in its entirety. I'd recommend running it on the Rust Playground. I have left comments throughout the code so you can get the general idea, and I will go over each line in detail in the next section.
Note that you have to stop the process manually, because hello never returns.
use core::arch::asm;
// a stack size of 48 bytes, so it's easy to print the stack before we switch contexts
const SSIZE: isize = 48;
// a struct that represents our CPU state
//
// This struct stores the stack pointer
#[derive(Debug, Default)]
#[repr(C)]
struct ThreadContext {
rsp: u64,
}
// Returning ! means the function never returns:
// it either panics or runs forever
fn hello() -> ! {
println!("I LOVE WAKING UP ON A NEW STACK!");
loop {}
}
// new is a pointer to a ThreadContext
unsafe fn gt_switch(new: *const ThreadContext) {
// inline assembly
asm!(
"mov rsp, [{0} + 0x00]", // load the value that `new` points to (the saved rsp) into the rsp register
"ret", // ret pops the return address from our custom stack; in our example, the address of hello
in(reg) new,
);
}
fn main() {
// initialize
let mut ctx = ThreadContext::default();
// initialize the stack
// e.g. suppose the buffer starts at address 0x10
let mut stack = vec![0_u8; SSIZE as usize];
unsafe {
// we get the bottom of the stack (its highest address)
// remember that the stack grows downward, from high memory addresses to low memory addresses
// e.g. 0x40, because 0x10 + 0x30 = 0x40 and SSIZE = 48 = 0x30
// NOTE: offset() is applied in units of the size of the type that the pointer points to
// in our case, stack is a pointer to u8 (a byte), so offset(SSIZE) == offset(48 bytes) == offset(0x30)
let stack_bottom = stack.as_mut_ptr().offset(SSIZE);
// we align the bottom of the stack down to a 16-byte boundary
// the System V ABI expects this, and some CPU instructions (SSE and other SIMD instructions) rely on it
// The technicality: 15 is 0b1111, so (stack_bottom AND !15) zeroes out the lowest 4 bits
//
// we also cast the result back to a pointer to a byte (8 bits, or u8)
let sb_aligned = (stack_bottom as usize & !15) as *mut u8;
// Here, we write the address of the hello function as 64 bits (8 bytes)
// Remember that 16 bytes = 0x10 in hex
// So we go DOWN 0x10 memory addresses, i.e. from 0x40 to 0x30
// NOTE: we go 16 bytes down (0x10) even though the hello function pointer is ONLY 8 bytes
// This is because the System V ABI requires the stack pointer to be 16-byte aligned
std::ptr::write(sb_aligned.offset(-16) as *mut u64, hello as u64);
// we write the stack pointer into the rsp inside context
ctx.rsp = sb_aligned.offset(-16) as u64;
for i in 0..SSIZE {
println!("mem: {}, val: {}",
sb_aligned.offset(-i as isize) as usize,
*sb_aligned.offset(-i as isize))
};
// we call the switch function
// it loads our saved stack pointer into the CPU's rsp register
// and `ret` then pops the address of hello off the new stack and jumps to it
gt_switch(&mut ctx);
}
}
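The address arithmetic in main can be checked without touching the real stack pointer. The sketch below is a safe model, not the code above: `align_down_16`, `simulated_ret` and the fake target address are made-up names and values for illustration, and an index into a `Vec<u8>` stands in for a memory address.

```rust
/// Round `addr` down to a multiple of 16 by clearing its four lowest bits.
fn align_down_16(addr: usize) -> usize {
    addr & !15
}

/// Model of what `ret` does: read the 8-byte value at `rsp`, then move
/// `rsp` up by 8. `mem` stands in for memory, `rsp` is an index into it.
fn simulated_ret(mem: &[u8], rsp: &mut usize) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&mem[*rsp..*rsp + 8]);
    *rsp += 8;
    u64::from_le_bytes(bytes) // x86_64 is little-endian
}

fn main() {
    // Mask check on a made-up address.
    assert_eq!(align_down_16(0x1035), 0x1030);

    // 64 bytes of pretend memory; indices play the role of addresses.
    let mut mem = vec![0u8; 64];
    let stack_bottom = 53; // deliberately unaligned
    let aligned = align_down_16(stack_bottom); // 48
    let mut rsp = aligned - 16; // 32

    // Stand-in for `hello as u64`.
    let fake_target: u64 = 0xDEAD_BEEF;
    mem[rsp..rsp + 8].copy_from_slice(&fake_target.to_le_bytes());

    let jumped_to = simulated_ret(&mem, &mut rsp);
    println!("ret would jump to {:#x}, rsp is now {}", jumped_to, rsp);
}
```

Running it prints `ret would jump to 0xdeadbeef, rsp is now 40`. After the simulated `ret`, rsp sits 8 bytes below the 16-byte boundary, which is the same state a function sees right after a real `call` has pushed a return address.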