# PMP: Memory Protection

## Overview

Linmo operates entirely in Machine mode by default, with all tasks sharing the same physical address space.
A misbehaving task can corrupt kernel data structures or interfere with other tasks, compromising system stability.

Physical Memory Protection (PMP) provides hardware-enforced access control at the physical address level.
Unlike an MMU, PMP requires no page tables or TLB management, making it suitable for resource-constrained RISC-V systems.
PMP enforces read, write, and execute permissions for up to 16 configurable memory regions.

The design draws inspiration from the F9 microkernel, adopting a three-layer abstraction:
- **Memory Pools** define static physical regions at boot time, derived from linker symbols.
- **Flexpages** represent dynamically protected memory ranges with associated permissions.
- **Memory Spaces** group flexpages into per-task protection domains.

## Architecture

### Memory Abstraction Layers

```mermaid
graph TD
    classDef hw fill:#424242,stroke:#000,color:#fff,stroke-width:2px
    classDef static fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef dynamic fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef container fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    classDef task fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px

    subgraph L0 ["Hardware"]
        PMP[PMP Registers]:::hw
    end

    subgraph L1 ["Memory Pools"]
        MP["Static Regions<br/>(.text, .data, .bss)"]:::static
    end

    subgraph L2 ["Flexpages"]
        FP["fpage_t<br/>base / size / rwx"]:::dynamic
    end

    subgraph L3 ["Memory Spaces"]
        AS["memspace_t<br/>per-task domain"]:::container
    end

    subgraph L4 ["Task"]
        TCB[TCB]:::task
    end

    TCB -->|owns| AS
    AS -->|contains| FP
    MP -->|initializes| FP
    AS -->|configures| PMP
```

The core structures:

```c
typedef struct fpage {
    struct fpage *as_next;  /* Next in address space list */
    struct fpage *map_next; /* Next in mapping chain */
    struct fpage *pmp_next; /* Next in PMP queue */
    uint32_t base;          /* Physical base address */
    uint32_t size;          /* Region size */
    uint32_t rwx;           /* R/W/X permission bits */
    uint32_t pmp_id;        /* PMP region index */
    uint32_t flags;         /* Status flags */
    uint32_t priority;      /* Eviction priority */
    int used;               /* Usage counter */
} fpage_t;
```

```c
typedef struct memspace {
    uint32_t as_id;          /* Memory space identifier */
    struct fpage *first;     /* Head of flexpage list */
    struct fpage *pmp_first; /* Head of PMP-loaded list */
    struct fpage *pmp_stack; /* Stack regions */
    uint32_t shared;         /* Shared flag */
} memspace_t;
```
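
As a sketch of how these pieces compose (the field subset and helper names below are illustrative, not Linmo's actual API): a task stack becomes a flexpage linked into the task's memory space, and later lookups walk that list by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Abbreviated copies of the structures above, just enough to make
 * this sketch self-contained. */
typedef struct fpage {
    struct fpage *as_next; /* next flexpage in the memory space */
    uint32_t base, size, rwx;
    uint32_t pmp_id; /* PMP_INVALID_REGION when not loaded */
} fpage_t;

typedef struct memspace {
    uint32_t as_id;
    struct fpage *first;
} memspace_t;

#define PMP_INVALID_REGION 0xFFu

/* Prepend a flexpage to the memory space's list; it starts unloaded. */
static void memspace_insert(memspace_t *ms, fpage_t *fp)
{
    fp->pmp_id = PMP_INVALID_REGION;
    fp->as_next = ms->first;
    ms->first = fp;
}

/* Find the flexpage covering a physical address, if any. */
static fpage_t *memspace_find(memspace_t *ms, uint32_t addr)
{
    for (fpage_t *fp = ms->first; fp; fp = fp->as_next)
        if (addr >= fp->base && addr - fp->base < fp->size)
            return fp;
    return NULL;
}
```

The same by-address search is what the fault handler performs when deciding whether a fault is recoverable.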

### TOR Mode and Paired Entries

TOR (Top Of Range) mode defines region *i* as `[pmpaddr[i-1], pmpaddr[i])`.
This works well for contiguous kernel regions where boundaries naturally chain together.

For dynamically allocated user regions at arbitrary addresses, Linmo uses paired entries:

```
┌─────────────────────────────────────────┐
│ Entry N:   base_addr (disabled)         │
│ Entry N+1: top_addr  (TOR, R|W)         │
│                                         │
│ Region N+1 = [base_addr, top_addr)      │
└─────────────────────────────────────────┘
```

The first entry sets the lower bound with permissions disabled.
The second entry defines the upper bound with TOR mode and the desired permissions.
This consumes two hardware slots per user region but allows non-contiguous regions at arbitrary addresses.
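
The paired-entry encoding can be sketched as follows. The register layout (R/W/X in bits 0-2, the addressing mode in bits 3-4, `pmpaddr` holding the physical address shifted right by 2) comes from the RISC-V privileged specification; the helper name and struct are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* pmpcfg byte layout per the RISC-V privileged spec:
 * bit0 R, bit1 W, bit2 X, bits3-4 A (0 = OFF, 1 = TOR), bit7 L. */
#define PMPCFG_R     (1u << 0)
#define PMPCFG_W     (1u << 1)
#define PMPCFG_A_TOR (1u << 3)

typedef struct {
    uint32_t addr[2]; /* pmpaddr values (physical address >> 2) */
    uint8_t cfg[2];   /* matching pmpcfg bytes */
} pmp_pair_t;

/* Encode [base, top) as a TOR pair: entry N anchors the lower bound
 * with everything off, entry N+1 holds the upper bound with TOR
 * addressing and the requested permissions. */
static pmp_pair_t pmp_encode_pair(uint32_t base, uint32_t top, uint8_t perm)
{
    pmp_pair_t p;
    p.addr[0] = base >> 2; /* pmpaddr registers drop the low 2 bits */
    p.cfg[0] = 0;          /* OFF: serves only as the lower bound */
    p.addr[1] = top >> 2;
    p.cfg[1] = (uint8_t)(PMPCFG_A_TOR | perm);
    return p;
}
```

For a 512-byte stack region at `0x80007000` with R/W permissions, the second cfg byte comes out as `0x0B` (TOR | W | R).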

### Kernel and User Regions

Kernel regions protect the `.text`, `.data`, and `.bss` sections:

```c
static const mempool_t kernel_mempools[] = {
    DECLARE_MEMPOOL("kernel_text",
                    &_stext, &_etext,
                    PMPCFG_PERM_RX,
                    PMP_PRIORITY_KERNEL),
    DECLARE_MEMPOOL("kernel_data",
                    &_sdata, &_edata,
                    PMPCFG_PERM_RW,
                    PMP_PRIORITY_KERNEL),
    DECLARE_MEMPOOL("kernel_bss",
                    &_sbss, &_ebss,
                    PMPCFG_PERM_RW,
                    PMP_PRIORITY_KERNEL),
};
```

The kernel heap and stack are intentionally excluded: PMP does not constrain M-mode, and the kernel heap and stack are only ever accessed from M-mode.
This reserves regions 0-2 for the kernel, leaving regions 3 and above available for dynamic user regions with correct TOR address ordering.

Kernel regions use a hybrid lock strategy:

| Lock Type | Location                | Effect                  |
|-----------|-------------------------|-------------------------|
| Software  | `regions[i].locked = 1` | Allocator skips slot    |
| Hardware  | `PMPCFG_L` NOT set      | M-mode access preserved |

Setting the hardware lock bit (`PMPCFG_L`) would enforce the region's permissions in M-mode as well, and the entry could not be reconfigured until reset.
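
The distinction is visible at the bit level: the hardware lock is bit 7 of the pmpcfg byte (per the RISC-V privileged spec). A minimal sketch of a kernel `.text` entry as described above, with the lock bit deliberately clear (the helper name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define PMPCFG_R     (1u << 0)
#define PMPCFG_X     (1u << 2)
#define PMPCFG_A_TOR (1u << 3)
#define PMPCFG_L     (1u << 7) /* hardware lock: also binds M-mode */

/* Kernel text entry as configured at boot: TOR addressing plus R/X,
 * with L clear so M-mode access is preserved. The software lock
 * (regions[i].locked) keeps the allocator away from the slot instead. */
static uint8_t kernel_text_cfg(void)
{
    return (uint8_t)(PMPCFG_A_TOR | PMPCFG_R | PMPCFG_X);
}
```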

User regions protect task stacks and are dynamically loaded during context switches.
When PMP slots are exhausted, user regions can be evicted and reloaded on demand.

## Memory Isolation

### Context Switching

Context switching reconfigures PMP in two phases:

```mermaid
flowchart LR
    subgraph Eviction
        E1[Iterate pmp_first] --> E2[Disable region in hardware]
        E2 --> E3["Set pmp_id = INVALID"]
    end
    subgraph Loading
        L1[Reset pmp_first = NULL] --> L2{Already loaded?}
        L2 -->|Yes| L3[Add to tracking list]
        L2 -->|No| L4[Find free slot]
        L4 --> L5[Load to hardware]
        L5 --> L3
    end
    Eviction --> Loading
```

**Eviction phase** iterates the outgoing task's `pmp_first` linked list.
Each flexpage is disabled in hardware, and its `pmp_id` is set to `PMP_INVALID_REGION` (`0xFF`) to mark it as unloaded.

**Loading phase** rebuilds `pmp_first` from scratch.
This prevents circular references: if `pmp_first` were not cleared, reloading a flexpage could create a self-loop in the linked list.
For each flexpage in the incoming task's memory space:
- **Already loaded** (shared regions): add directly to the tracking list
- **Not loaded**: find a free slot via `find_free_region_slot()` and load it

If all slots are occupied, the remaining regions load on demand through the fault handler (lazy loading).
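
The two phases can be sketched against a mock slot table (the structures are abbreviated and the hardware writes replaced by an occupancy array; function names mirror the description above but are illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PMP_INVALID_REGION 0xFFu
#define PMP_SLOTS 16

/* Abbreviated flexpage: only the fields the switch path touches. */
typedef struct fpage {
    struct fpage *as_next;  /* all pages in the memory space */
    struct fpage *pmp_next; /* pages currently tracked as loaded */
    uint32_t pmp_id;
} fpage_t;

typedef struct memspace {
    struct fpage *first;
    struct fpage *pmp_first;
} memspace_t;

/* Mock hardware occupancy; slots 0-2 are the fixed kernel regions. */
static uint8_t slot_used[PMP_SLOTS] = {1, 1, 1};

static int find_free_region_slot(void)
{
    for (int i = 0; i < PMP_SLOTS; i++)
        if (!slot_used[i])
            return i;
    return -1;
}

/* Phase 1: evict the outgoing task's loaded regions. */
static void pmp_evict(memspace_t *ms)
{
    for (fpage_t *fp = ms->pmp_first; fp; fp = fp->pmp_next) {
        slot_used[fp->pmp_id] = 0; /* disable in "hardware" */
        fp->pmp_id = PMP_INVALID_REGION;
    }
    ms->pmp_first = NULL;
}

/* Phase 2: rebuild pmp_first for the incoming task, loading what fits.
 * Pages that do not fit stay unloaded and fault in lazily. */
static void pmp_load(memspace_t *ms)
{
    ms->pmp_first = NULL; /* rebuild from scratch: no stale self-loops */
    for (fpage_t *fp = ms->first; fp; fp = fp->as_next) {
        if (fp->pmp_id == PMP_INVALID_REGION) {
            int slot = find_free_region_slot();
            if (slot < 0)
                continue; /* left for the lazy-loading fault path */
            fp->pmp_id = (uint32_t)slot;
            slot_used[slot] = 1;
        }
        fp->pmp_next = ms->pmp_first; /* track as loaded */
        ms->pmp_first = fp;
    }
}
```

Note how `pmp_load` resets `pmp_first` before relinking, which is exactly the self-loop guard described above.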

### Per-Task Kernel Stack

U-mode trap handling requires a kernel stack on which to save context.
If multiple U-mode tasks shared a single kernel stack, Task A's context frame would be overwritten when Task B traps, because the ISR writes to the same position on the shared stack.

Linmo allocates a dedicated 512-byte kernel stack for each U-mode task:

```c
typedef struct tcb {
    /* ... */
    void *kernel_stack;       /* Base address of kernel stack (NULL for M-mode) */
    size_t kernel_stack_size; /* Size of kernel stack in bytes (0 for M-mode) */
} tcb_t;
```

M-mode tasks do not require a separate kernel stack: they use the task stack directly, without a privilege transition.

During a context switch, the scheduler saves the incoming task's kernel stack top to a global variable.
The ISR restore path loads this value into `mscratch`, so the next U-mode trap uses the correct per-task kernel stack.
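
The scheduler side of that handoff amounts to publishing the stack top (RISC-V stacks grow downward, so the top is base plus size). A minimal sketch with an illustrative global and helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative global that the ISR restore path reads into mscratch. */
static uintptr_t next_kernel_stack_top;

/* On switch-in, publish the incoming task's kernel stack top.
 * M-mode tasks carry kernel_stack == NULL and publish 0, meaning
 * "no separate kernel stack". */
static void publish_kernel_stack(void *kernel_stack, size_t size)
{
    next_kernel_stack_top =
        kernel_stack ? (uintptr_t)kernel_stack + size : 0;
}
```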

### Fault Handling and Task Termination

PMP access faults occur when a U-mode task attempts to access memory outside its loaded regions.
The trap handler routes these faults to the PMP fault handler, which attempts recovery or terminates the task.

The fault handler first searches the task's memory space for a flexpage containing the faulting address.
If one is found and it is not currently loaded in hardware, the handler loads the region and returns to the faulting instruction.
This enables lazy loading: regions not loaded during a context switch are loaded on first access.

If no matching flexpage exists, the access is unauthorized (e.g., kernel memory or another task's stack).
If the flexpage is already loaded but the access still faulted, recovery is impossible.
In either case, the handler marks the task as `TASK_ZOMBIE` and returns a termination code.

```mermaid
flowchart TD
    A[Find flexpage for fault_addr] --> B{Flexpage found?}
    B -->|No| F[Unauthorized access]
    B -->|Yes| C{Already loaded in hardware?}
    C -->|No| D[Load to hardware]
    D --> E[Return RECOVERED]
    C -->|Yes| F
    F --> G[Mark TASK_ZOMBIE]
    G --> H[Return TERMINATE]
```

The trap handler interprets the return value:

| Return Code           | Action                                        |
|-----------------------|-----------------------------------------------|
| `PMP_FAULT_RECOVERED` | Resume execution at the faulting instruction  |
| `PMP_FAULT_TERMINATE` | Print a diagnostic, invoke the dispatcher     |
| `PMP_FAULT_UNHANDLED` | Fall through to the default exception handler |
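
The decision flow above can be condensed into a sketch (abbreviated structures; the hardware load and zombie marking are stubbed with comments, and the function name is illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PMP_INVALID_REGION 0xFFu

typedef enum {
    PMP_FAULT_RECOVERED,
    PMP_FAULT_TERMINATE,
    PMP_FAULT_UNHANDLED,
} pmp_fault_t;

typedef struct fpage {
    struct fpage *as_next;
    uint32_t base, size, pmp_id;
} fpage_t;

typedef struct memspace {
    struct fpage *first;
} memspace_t;

/* Decision logic only: search the memory space for the faulting
 * address, lazily load an unloaded flexpage, otherwise terminate. */
static pmp_fault_t pmp_handle_fault(memspace_t *ms, uint32_t fault_addr)
{
    for (fpage_t *fp = ms->first; fp; fp = fp->as_next) {
        if (fault_addr >= fp->base && fault_addr - fp->base < fp->size) {
            if (fp->pmp_id != PMP_INVALID_REGION)
                return PMP_FAULT_TERMINATE; /* loaded yet faulted */
            fp->pmp_id = 3; /* stand-in for find-slot + load-to-hardware */
            return PMP_FAULT_RECOVERED; /* lazy load, retry instruction */
        }
    }
    /* No flexpage covers the address: unauthorized access.
     * The caller marks the task TASK_ZOMBIE. */
    return PMP_FAULT_TERMINATE;
}
```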

Terminated tasks are not destroyed immediately.
The dispatcher calls a cleanup routine before selecting the next runnable task.
This routine iterates over zombie tasks, evicts their PMP regions, frees their memory spaces and stacks, and removes them from the task list.
Deferring cleanup to the dispatcher avoids modifying task structures from within interrupt context.

## Best Practices

### Hardware Limitations

PMP provides 16 hardware slots shared between kernel and user regions.
Kernel regions occupy slots 0-2 and cannot be evicted.
Each user region requires two slots (paired entries for TOR mode).

| Resource                    | Limit              |
|-----------------------------|--------------------|
| Total PMP slots             | 16                 |
| Kernel slots                | 3 (fixed at boot)  |
| Slots per user region       | 2 (paired entries) |
| Max concurrent user regions | ~6                 |
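
The "~6" follows directly from the slot budget — 16 slots minus 3 kernel slots leaves 13, and at 2 slots per TOR pair that is 6 regions with one slot spare:

```c
#include <assert.h>

/* Slot budget from the table above; names are illustrative. */
enum {
    PMP_TOTAL_SLOTS = 16,
    PMP_KERNEL_SLOTS = 3,
    SLOTS_PER_USER_REGION = 2, /* TOR paired entries */
    MAX_USER_REGIONS =
        (PMP_TOTAL_SLOTS - PMP_KERNEL_SLOTS) / SLOTS_PER_USER_REGION,
};
```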

Systems with many U-mode tasks should minimize per-task region requirements.
Tasks exceeding the available slots rely on lazy loading, which incurs fault-handler overhead on first access.

### Task Creation Guidelines

U-mode tasks receive automatic PMP protection.
The kernel allocates a memory space and registers the task stack as a protected flexpage:

```c
/* M-mode task: no isolation, full memory access */
mo_task_spawn(task_func, stack_size);

/* U-mode task: PMP protected, memory space auto-created */
mo_task_spawn_user(task_func, stack_size);
```

Choose the appropriate privilege level:
- **M-mode**: trusted kernel tasks, drivers requiring full memory access
- **U-mode**: application tasks, untrusted or potentially buggy code

### Common Pitfalls

1. Assuming PMP protects the kernel

   PMP only restricts Supervisor and User modes.
   Machine mode has unrestricted access regardless of PMP configuration.
   This is intentional: the kernel must access all memory to manage protection.

   ```c
   /* This code in M-mode bypasses PMP entirely */
   void kernel_func(void) {
       volatile uint32_t *user_stack = (uint32_t *)0x80007000;
       *user_stack = 0; /* No fault: M-mode ignores PMP */
   }
   ```

   PMP protects user tasks from each other but does not protect the kernel from itself.

2. Exhausting PMP slots

   With only ~6 user regions available, spawning many U-mode tasks causes PMP slot exhaustion.
   Subsequent tasks rely entirely on lazy loading, degrading performance.

3. Mixing M-mode and U-mode incorrectly

   M-mode tasks spawned with `mo_task_spawn()` do not receive memory spaces.
   PMP-related functions check for a NULL memory space and return early, so calling them on M-mode tasks has no effect.

## References

- [RISC-V Privileged Architecture](https://riscv.github.io/riscv-isa-manual/snapshot/privileged/)
- [Memory Protection for Embedded RISC-V Systems](https://nva.sikt.no/registration/0198eb345173-b2a7ef5c-8e7e-4b98-bd3e-ff9c469ce36d)