
Commit 6411a74

Implement kernel stack isolation for U-mode tasks
User mode tasks require kernel stack isolation to prevent malicious or corrupted user stack pointers from compromising kernel memory during interrupt handling. Without this protection, a user task could set its stack pointer to an invalid or controlled address, causing the ISR to write trap frames to arbitrary memory locations.

This commit implements stack isolation using the mscratch register as a discriminator between machine mode and user mode execution contexts. The ISR entry performs a blind swap with mscratch: for machine mode tasks (mscratch=0), the swap is immediately undone to restore the kernel stack pointer. For user mode tasks (mscratch=kernel_stack), the swap provides the kernel stack while preserving the user stack pointer in mscratch.

Each user mode task is allocated a dedicated 512-byte kernel stack to ensure complete isolation between tasks and prevent stack overflow attacks. The task control block is extended to track per-task kernel stack allocations. A global pointer references the current task's kernel stack and is updated during each context switch. The ISR loads this pointer to access the appropriate per-task kernel stack through mscratch, replacing the previous approach of using a single global kernel stack shared by all user mode tasks.

The interrupt frame structure is extended to include dedicated storage for the stack pointer. Task initialization zeroes the entire frame and correctly sets the initial stack pointer to support the new restoration path. For user mode tasks, the initial ISR frame is constructed on the kernel stack rather than the user stack, ensuring the frame is protected from user manipulation. Enumeration constants replace magic number usage for improved code clarity and consistency.

The ISR implementation now includes separate entry and restoration paths for each privilege mode. The M-mode path maintains mscratch=0 throughout execution. The U-mode path saves the user stack pointer from mscratch immediately after frame allocation and restores mscratch to the current task's kernel stack address before returning to user mode, enabling the next trap to use the correct per-task kernel stack.

Task initialization was updated to configure mscratch appropriately during the first dispatch. The dispatcher checks the current privilege level and sets mscratch to zero for machine mode tasks or to the per-task kernel stack base for user mode tasks. The main scheduler initialization ensures the first task's kernel stack pointer is set before entering the scheduling loop.

The user mode output system call was modified to bypass the asynchronous logger queue and implement task-level synchronization. Direct output ensures strict FIFO ordering for test output clarity, while preventing task preemption during character transmission avoids interleaving when multiple user tasks print concurrently. This ensures each string is output atomically with respect to other tasks.

A test helper function was added to support stack pointer manipulation during validation. Following the Linux kernel's context switching pattern, this provides precise control over stack operations without compiler interference. The validation harness uses this to verify syscall stability under corrupted stack pointer conditions.

Documentation updates include the calling convention guide's stack layout section, which now distinguishes between machine mode and user mode task stack organization with detailed diagrams of the dual-stack design. The context switching guide's task initialization section reflects the updated function signature for building initial interrupt frames with per-task kernel stack parameters.

Testing validates that system calls succeed even when invoked with a malicious stack pointer (0xDEADBEEF), confirming the ISR correctly uses the per-task kernel stack from mscratch rather than the user-controlled stack pointer.
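The blind swap described in the commit message can be modeled on a host machine. The following is only a sketch under stated assumptions: `trap_regs_t` and `isr_select_stack` are hypothetical names, the real entry path lives in assembly that is not shown on this page, and the swap is assumed to be a single instruction such as `csrrw sp, mscratch, sp`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the two registers used at trap entry. */
typedef struct {
    uint32_t sp;       /* stack pointer at the moment of the trap */
    uint32_t mscratch; /* 0 for M-mode tasks, kernel stack for U-mode tasks */
} trap_regs_t;

/* Models the ISR entry decision. Returns 1 if the trap came from a U-mode
 * task (sp now holds the kernel stack and mscratch holds the user sp), or 0
 * for an M-mode task (both registers are back to their pre-trap values).
 */
static int isr_select_stack(trap_regs_t *r)
{
    /* Blind swap, standing in for `csrrw sp, mscratch, sp` (assumption). */
    uint32_t old_sp = r->sp;
    r->sp = r->mscratch;
    r->mscratch = old_sp;

    if (r->sp == 0) {
        /* M-mode: undo the swap so sp is the task stack and mscratch is 0. */
        r->sp = r->mscratch;
        r->mscratch = 0;
        return 0;
    }
    return 1;
}
```

With a user stack pointer of 0xDEADBEEF and a non-zero mscratch, the model ends up with the kernel stack in `sp` and the untrusted value parked in `mscratch`, which is the property the validation task relies on.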
1 parent 8c60804 commit 6411a74

11 files changed, with 489 additions and 108 deletions


Documentation/hal-calling-convention.md

Lines changed: 48 additions & 7 deletions
@@ -109,14 +109,14 @@ void hal_context_restore(jmp_buf env, int32_t val); /* Restore context + process
 The ISR in `boot.c` performs a complete context save of all registers:
 
 ```
-Stack Frame Layout (144 bytes, 33 words × 4 bytes, offsets from sp):
+Stack Frame Layout (144 bytes, 36 words × 4 bytes, offsets from sp):
 0: ra, 4: gp, 8: tp, 12: t0, 16: t1, 20: t2
 24: s0, 28: s1, 32: a0, 36: a1, 40: a2, 44: a3
 48: a4, 52: a5, 56: a6, 60: a7, 64: s2, 68: s3
 72: s4, 76: s5, 80: s6, 84: s7, 88: s8, 92: s9
 96: s10, 100:s11, 104:t3, 108: t4, 112: t5, 116: t6
-120: mcause, 124: mepc, 128: mstatus
-132-143: padding (12 bytes for 16-byte alignment)
+120: mcause, 124: mepc, 128: mstatus, 132: sp (for restore)
+136-143: padding (8 bytes for 16-byte alignment)
 ```
 
 Why full context save in ISR?
@@ -127,12 +127,14 @@ Why full context save in ISR?
 
 ### ISR Stack Requirements
 
-Each task stack must reserve space for the ISR frame:
+Each task requires space for the ISR frame:
 ```c
-#define ISR_STACK_FRAME_SIZE 144 /* 33 words × 4 bytes, 16-byte aligned */
+#define ISR_STACK_FRAME_SIZE 144 /* 36 words × 4 bytes, 16-byte aligned */
 ```
 
-This "red zone" is reserved at the top of every task stack to guarantee ISR safety.
+**M-mode tasks**: This "red zone" is reserved at the top of the task stack to guarantee ISR safety.
+
+**U-mode tasks**: The ISR frame is allocated on the per-task kernel stack (512 bytes), not on the user stack. This provides stack isolation and prevents user tasks from corrupting kernel trap handling state.
 
 ## Function Calling in Linmo
 
@@ -181,7 +183,9 @@ void task_function(void) {
 
 ### Stack Layout
 
-Each task has its own stack with this layout:
+#### Machine Mode Tasks
+
+Each M-mode task has its own stack with this layout:
 
 ```
 High Address
@@ -197,6 +201,43 @@ High Address
 Low Address
 ```
 
+#### User Mode Tasks (Per-Task Kernel Stack)
+
+U-mode tasks maintain separate user and kernel stacks for isolation:
+
+**User Stack** (application execution):
+```
+High Address
++------------------+ <- user_stack_base + user_stack_size
+|                  |
+|   User Stack     | <- Grows downward
+|   (Dynamic)      | <- Task executes here in U-mode
+|                  |
++------------------+ <- user_stack_base
+Low Address
+```
+
+**Kernel Stack** (trap handling):
+```
+High Address
++------------------+ <- kernel_stack_base + kernel_stack_size (512 bytes)
+|    ISR Frame     | <- 144 bytes for trap context
+|   (144 bytes)    | <- Traps switch to this stack
++------------------+
+|   Trap Handler   | <- Kernel code execution during traps
+|   Stack Space    |
++------------------+ <- kernel_stack_base
+Low Address
+```
+
+When a U-mode task enters a trap (syscall, interrupt, exception):
+1. Hardware swaps SP with `mscratch` (containing kernel stack top)
+2. ISR saves full context to kernel stack
+3. Trap handler executes on kernel stack
+4. Return path restores user SP and switches back
+
+This dual-stack design prevents user tasks from corrupting kernel state and provides strong isolation between privilege levels.
+
 ### Stack Alignment
 - 16-byte alignment: Required by RISC-V ABI for stack pointer
 - 4-byte alignment: Minimum for all memory accesses on RV32I
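
The frame layout documented in this file can be cross-checked with a few lines of C. This is a minimal sketch, not the project's header: `FRAME_EPC` and `FRAME_SP` appear in the context-switch diff, but their numeric values here are inferred from the byte offsets listed above (124 and 132), and the other enumerator names, `FRAME_WORDS_USED`, and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISR_STACK_FRAME_SIZE 144 /* bytes, from the documentation above */

/* Word indices into the ISR frame; values inferred from the byte offsets in
 * the layout table (offset / 4). Only a few registers are listed here.
 */
enum {
    FRAME_RA = 0,          /* offset 0 */
    FRAME_A0 = 8,          /* offset 32 */
    FRAME_MCAUSE = 30,     /* offset 120 */
    FRAME_EPC = 31,        /* offset 124 */
    FRAME_MSTATUS = 32,    /* offset 128 */
    FRAME_SP = 33,         /* offset 132, new in this commit */
    FRAME_WORDS_USED = 34, /* hypothetical: first word of padding */
};

/* The frame must keep sp 16-byte aligned and hold every saved word. */
_Static_assert(ISR_STACK_FRAME_SIZE % 16 == 0, "frame breaks sp alignment");
_Static_assert(FRAME_WORDS_USED * 4 <= ISR_STACK_FRAME_SIZE, "frame too small");

/* Byte offset of a frame slot, as used in `offset(sp)` addressing. */
static size_t frame_byte_offset(int index)
{
    return (size_t) index * sizeof(uint32_t);
}

/* Bytes left unused at the end of the frame for alignment. */
static size_t frame_padding_bytes(void)
{
    return ISR_STACK_FRAME_SIZE - frame_byte_offset(FRAME_WORDS_USED);
}
```

Under these assumptions the padding comes out to 8 bytes, matching the "136-143: padding" line in the updated layout.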

Documentation/hal-riscv-context-switch.md

Lines changed: 29 additions & 5 deletions
@@ -123,14 +123,26 @@ a complete interrupt service routine frame:
 ```c
 void *hal_build_initial_frame(void *stack_top,
                               void (*task_entry)(void),
-                              int user_mode)
+                              int user_mode,
+                              void *kernel_stack,
+                              size_t kernel_stack_size)
 {
-    /* Place frame in stack with initial reserve below for proper startup */
-    uint32_t *frame = (uint32_t *) ((uint8_t *) stack_top - 256 -
-                                    ISR_STACK_FRAME_SIZE);
+    /* For U-mode tasks, build frame on kernel stack for stack isolation.
+     * For M-mode tasks, build frame on user stack as before.
+     */
+    uint32_t *frame;
+    if (user_mode && kernel_stack) {
+        /* U-mode: Place frame on per-task kernel stack */
+        void *kstack_top = (uint8_t *) kernel_stack + kernel_stack_size;
+        frame = (uint32_t *) ((uint8_t *) kstack_top - ISR_STACK_FRAME_SIZE);
+    } else {
+        /* M-mode: Place frame on user stack with reserve below */
+        frame = (uint32_t *) ((uint8_t *) stack_top - 256 -
+                              ISR_STACK_FRAME_SIZE);
+    }
 
     /* Initialize all general purpose registers to zero */
-    for (int i = 0; i < 32; i++)
+    for (int i = 0; i < 36; i++)
         frame[i] = 0;
 
     /* Compute thread pointer: aligned to 64 bytes from _end */
@@ -152,6 +164,18 @@ void *hal_build_initial_frame(void *stack_top,
     /* Set entry point */
     frame[FRAME_EPC] = (uint32_t) task_entry;
 
+    /* SP value for when ISR returns (stored in frame[33]).
+     * For U-mode: Set to user stack top.
+     * For M-mode: Set to frame + ISR_STACK_FRAME_SIZE.
+     */
+    if (user_mode && kernel_stack) {
+        /* U-mode: frame[33] should contain user SP */
+        frame[FRAME_SP] = (uint32_t) ((uint8_t *) stack_top - 256);
+    } else {
+        /* M-mode: frame[33] contains kernel SP after frame deallocation */
+        frame[FRAME_SP] = (uint32_t) ((uint8_t *) frame + ISR_STACK_FRAME_SIZE);
+    }
+
     return frame; /* Return frame base as initial stack pointer */
 }
 ```
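
The address arithmetic in `hal_build_initial_frame` can be checked on a host with plain integers. The sketch below mirrors only the two branches shown in the diff; `frame_plan_t`, `plan_initial_frame`, and `INITIAL_STACK_RESERVE` (a name for the literal 256) are hypothetical, and the addresses used with it are made-up examples rather than values from the project.

```c
#include <assert.h>
#include <stdint.h>

#define ISR_STACK_FRAME_SIZE 144u
#define INITIAL_STACK_RESERVE 256u /* hypothetical name for the literal 256 */

/* Result of the placement decision, expressed as 32-bit addresses. */
typedef struct {
    uint32_t frame;    /* address of the initial ISR frame */
    uint32_t saved_sp; /* value that would be stored in frame[FRAME_SP] */
} frame_plan_t;

/* Mirrors the branch structure of the diff: U-mode frames go at the top of
 * the per-task kernel stack, M-mode frames go below a reserve on the task
 * stack.
 */
static frame_plan_t plan_initial_frame(uint32_t stack_top, int user_mode,
                                       uint32_t kernel_stack,
                                       uint32_t kernel_stack_size)
{
    frame_plan_t p;
    if (user_mode && kernel_stack) {
        /* U-mode: frame on the kernel stack, saved sp on the user stack. */
        p.frame = kernel_stack + kernel_stack_size - ISR_STACK_FRAME_SIZE;
        p.saved_sp = stack_top - INITIAL_STACK_RESERVE;
    } else {
        /* M-mode: frame on the task stack, saved sp just above the frame. */
        p.frame = stack_top - INITIAL_STACK_RESERVE - ISR_STACK_FRAME_SIZE;
        p.saved_sp = p.frame + ISR_STACK_FRAME_SIZE;
    }
    return p;
}
```

For a 512-byte kernel stack the U-mode frame occupies the top 144 bytes, leaving 368 bytes below it for trap handler execution, which is the split drawn in the kernel stack diagram of the calling convention guide.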

app/umode.c

Lines changed: 41 additions & 12 deletions
@@ -1,43 +1,72 @@
 #include <linmo.h>
 
-/* U-mode Validation Task
+/* Architecture-specific helper for SP manipulation testing.
+ * Implemented in arch/riscv/entry.c as a naked function.
+ */
+extern uint32_t __switch_sp(uint32_t new_sp);
+
+/* U-mode validation: syscall stability and privilege isolation.
  *
- * Integrates two tests into a single task flow to ensure sequential execution:
- * 1. Phase 1: Mechanism Check - Verify syscalls work.
- * 2. Phase 2: Security Check - Verify privileged instructions trigger a trap.
+ * Phase 1: Verify syscalls work under various SP conditions (normal,
+ * malicious). Phase 2: Verify privileged instructions trap.
  */
 void umode_validation_task(void)
 {
-    /* --- Phase 1: Mechanism Check (Syscalls) --- */
-    umode_printf("[umode] Phase 1: Testing Syscall Mechanism\n");
+    /* --- Phase 1: Kernel Stack Isolation Test --- */
+    umode_printf("[umode] Phase 1: Testing Kernel Stack Isolation\n");
+    umode_printf("\n");
 
-    /* Test 1: sys_tid() - Simplest read-only syscall. */
+    /* Test 1a: Baseline - Syscall with normal SP */
+    umode_printf("[umode] Test 1a: sys_tid() with normal SP\n");
     int my_tid = sys_tid();
     if (my_tid > 0) {
         umode_printf("[umode] PASS: sys_tid() returned %d\n", my_tid);
     } else {
         umode_printf("[umode] FAIL: sys_tid() failed (ret=%d)\n", my_tid);
     }
+    umode_printf("\n");
+
+    /* Test 1b: Verify ISR uses mscratch, not malicious user SP */
+    umode_printf("[umode] Test 1b: sys_tid() with malicious SP\n");
+
+    uint32_t saved_sp = __switch_sp(0xDEADBEEF);
+    int my_tid_bad_sp = sys_tid();
+    __switch_sp(saved_sp);
+
+    if (my_tid_bad_sp > 0) {
+        umode_printf(
+            "[umode] PASS: sys_tid() succeeded, ISR correctly used kernel "
+            "stack\n");
+    } else {
+        umode_printf(
+            "[umode] FAIL: Syscall failed with malicious SP (ret=%d)\n",
+            my_tid_bad_sp);
+    }
+    umode_printf("\n");
 
-    /* Test 2: sys_uptime() - Verify value transmission is correct. */
+    /* Test 1c: Verify syscall functionality is still intact */
+    umode_printf("[umode] Test 1c: sys_uptime() with normal SP\n");
     int uptime = sys_uptime();
     if (uptime >= 0) {
         umode_printf("[umode] PASS: sys_uptime() returned %d\n", uptime);
     } else {
         umode_printf("[umode] FAIL: sys_uptime() failed (ret=%d)\n", uptime);
     }
+    umode_printf("\n");
 
-    /* Note: Skipping sys_tadd for now, as kernel user pointer checks might
-     * block function pointers in the .text segment, avoiding distraction.
-     */
+    umode_printf(
+        "[umode] Phase 1 Complete: Kernel stack isolation validated\n");
+    umode_printf("\n");
 
     /* --- Phase 2: Security Check (Privileged Access) --- */
     umode_printf("[umode] ========================================\n");
+    umode_printf("\n");
     umode_printf("[umode] Phase 2: Testing Security Isolation\n");
+    umode_printf("\n");
     umode_printf(
         "[umode] Action: Attempting to read 'mstatus' CSR from U-mode.\n");
     umode_printf("[umode] Expect: Kernel Panic with 'Illegal instruction'.\n");
-    umode_printf("[umode] ========================================\n");
+    umode_printf("\n");
 
     /* CRITICAL: Delay before suicide to ensure logs are flushed from
      * buffer to UART.
