From 40ee3fbd7ab93a4a7103295629ade998af53458e Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Tue, 12 Aug 2025 11:29:14 -0700 Subject: [PATCH 01/15] add proposal doc --- docs/ExtensibleMaps.md | 77 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100644 docs/ExtensibleMaps.md diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md new file mode 100644 index 0000000000..0c9b79f7e0 --- /dev/null +++ b/docs/ExtensibleMaps.md @@ -0,0 +1,77 @@ +Extensible maps are program type specific maps that will be implemented by the extension that is implementing the program type (program info provider). This document contains the proposal for implementing support for extensible / program type specific maps in eBPF-for-Windows. The below sections describe all the scenarios / areas that will need to be updated or tested for this new map type. + +### Map Id partitioning +Global maps get an ID for their map types from a global namespace. There are two possible options for how we can allocate IDs for map types for extensible maps. + +**Option 1: Global Map IDs** +- The map type IDs are allocated from a global namespace. This will be disjoint from the namespace for global maps. Global maps will use IDs from 1 to 4095. Extensible maps will use IDs 4096 onwards. +- Each program info provider that implements a extensible map will need to register / reserve the MAP ID / enum in the eBPF repo by creating a PR. + +**Option 2: Per-program map ID space** +- Just like program type specific helper functions, each program type can define map type IDs, and these can be overlapping. +- This has a problem though -- Existing user mode APIs to create a map can then not be used, as program type for the map cannot be disambiguated by just the map type. 
This will require a new eBPF map create API that will also take the corresponding program type (GUID or ID) + +**Proposal** +Use option 1 as it allows keeping the user mode API for map creation same as on Linux, only adding a one-time step for extension developers to reserve the map ID in the global namespace (by creating a PR in eBPF repo). + +### NMR interface for extensions +The NMR interface for program info provider will extended (non-breaking) and extensions will provide below information: + - List of extensible map types it is supporting + - APIs for + - Map creation / map deletion + - Map lookup, update, delete. + +### eBPF Store +- Program info providers will now include the map types they are going to support when updating eBPF store. This should include the map type string, and the map type ID. +- eBPF store APIs will be updated to populate this information also in the registry. +- ebpfapi when loading will read the extensible map type information and create a in-memory map for `map-type : program type`. +- This will be used when explicitly creating map from user mode. + +### Verfication +- For offline verification, eBPF store will be used to provide map information to the verfier. +- For online verification (JIT / interpret), map type to program type conversion will be done using eBPF store data. Then ebpfsvc will query ebpfcore to get program information which will also provide map information. + + +#### Map lifecycle +Even though the extensible map will be created by and reside in the extension, ebpfcore will also create a corresponding map entry, as it does for the global maps. The difference being, in case of extensible maps, the map CRUD APIs will be supplied by the extension, and map entry in ebpfcore will contain these function pointers provided by the extension. + +Map lifetime will also be maintained by eBPFcore, and it will invoke extension's map delete API when the map needs to be finally deleted. 
+Similarly, map pinning will also be handled by eBPFcore as that impacts map lifetime. + +Another thing to note is that once an extensible map is created, the corresponding extension cannot be allowed to unload / restart, as that will delete the map and its entries. This will be a limitation / restriction for the extension that is implementing extensible maps, and may impact their servicing flow. + +### Map creation +Assuming option 1 for `Map ID partitioning`, below is the expected flow for map creation. + +#### Explicit map creation +- App uses the existing map create APIs, and internally ebpfapi tries to find the corresponding program type from the eBPF store. +- Once it finds the program info provider, it makes a (new) ioctl call to create the extensible map, and also pass the program type. +- eBPFcore will first attach (NMR) to this provider, and check if the actual provider supports this map type. If yes, proceed to create map in the extension. + +Implicit map creation flow will also be similar. ebpf runtime will have similar flow for map creation, automatic map pinning, and map reuse. + +### Usermode CRUD APIs +Assuming option 1 for `Map ID partitioning`, all existing APIs should be applicable for extensible maps too. + +### Map helper functions +The existing map helper functions implemented by ebpfcore will be used by the BPF programs for extensible maps also. For extensible maps, ebpfcore will redirect the calls to the extension. + +### Exposing RCU semantics to extensions +For extensions to implement maps, they will need RCU support, and eBPF needs to expose RCU / epoch logic to extensions. + +There are two options for this: + +**Export RCU as lib** +- This will simplify logic in eBPFCore +- Extensions will have their own RCU "runtime" +- This approach will require recompilation and release from extension if there is a bug in RCU lib. + +**Export RCU APIs via NMR interface** +- Probably adds more complexity to ebpfcore. 
+- Does not require new release from extensions wheenver there is a bugfix in RCU logic. + +**Proposal** +Proposal here is to export RCU as lib. + +## Perf Consideration +Since map APIs for extensible maps will have logner path length, we should measure perf for extensible map operations. From be36e2b4c8b37f5af14696934de613b0061bd6a1 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Tue, 12 Aug 2025 11:40:55 -0700 Subject: [PATCH 02/15] update doc --- docs/ExtensibleMaps.md | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 0c9b79f7e0..5e237e068e 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -1,6 +1,8 @@ +# Introduction + Extensible maps are program type specific maps that will be implemented by the extension that is implementing the program type (program info provider). This document contains the proposal for implementing support for extensible / program type specific maps in eBPF-for-Windows. The below sections describe all the scenarios / areas that will need to be updated or tested for this new map type. -### Map Id partitioning +## Map Id partitioning Global maps get an ID for their map types from a global namespace. There are two possible options for how we can allocate IDs for map types for extensible maps. **Option 1: Global Map IDs** @@ -14,25 +16,24 @@ Global maps get an ID for their map types from a global namespace. There are two **Proposal** Use option 1 as it allows keeping the user mode API for map creation same as on Linux, only adding a one-time step for extension developers to reserve the map ID in the global namespace (by creating a PR in eBPF repo). 
-### NMR interface for extensions +## NMR interface for extensions The NMR interface for program info provider will extended (non-breaking) and extensions will provide below information: - List of extensible map types it is supporting - APIs for - Map creation / map deletion - Map lookup, update, delete. -### eBPF Store +## eBPF Store - Program info providers will now include the map types they are going to support when updating eBPF store. This should include the map type string, and the map type ID. - eBPF store APIs will be updated to populate this information also in the registry. - ebpfapi when loading will read the extensible map type information and create a in-memory map for `map-type : program type`. - This will be used when explicitly creating map from user mode. -### Verfication +## Verfication - For offline verification, eBPF store will be used to provide map information to the verfier. - For online verification (JIT / interpret), map type to program type conversion will be done using eBPF store data. Then ebpfsvc will query ebpfcore to get program information which will also provide map information. - -#### Map lifecycle +## Map lifecycle Even though the extensible map will be created by and reside in the extension, ebpfcore will also create a corresponding map entry, as it does for the global maps. The difference being, in case of extensible maps, the map CRUD APIs will be supplied by the extension, and map entry in ebpfcore will contain these function pointers provided by the extension. Map lifetime will also be maintained by eBPFcore, and it will invoke extension's map delete API when the map needs to be finally deleted. @@ -40,23 +41,25 @@ Similarly, map pinning will also be handled by eBPFcore as that impacts map life Another thing to note is that once an extensible map is created, the corresponding extension cannot be allowed to unload / restart, as that will delete the map and its entries. 
This will be a limitation / restriction for the extension that is implementing extensible maps, and may impact their servicing flow. -### Map creation +## Map creation Assuming option 1 for `Map ID partitioning`, below is the expected flow for map creation. -#### Explicit map creation +### Explicit map creation - App uses the existing map create APIs, and internally ebpfapi tries to find the corresponding program type from the eBPF store. - Once it finds the program info provider, it makes a (new) ioctl call to create the extensible map, and also pass the program type. - eBPFcore will first attach (NMR) to this provider, and check if the actual provider supports this map type. If yes, proceed to create map in the extension. Implicit map creation flow will also be similar. ebpf runtime will have similar flow for map creation, automatic map pinning, and map reuse. +## Map CRUD APIs + ### Usermode CRUD APIs Assuming option 1 for `Map ID partitioning`, all existing APIs should be applicable for extensible maps too. ### Map helper functions The existing map helper functions implemented by ebpfcore will be used by the BPF programs for extensible maps also. For extensible maps, ebpfcore will redirect the calls to the extension. -### Exposing RCU semantics to extensions +## Exposing RCU semantics to extensions For extensions to implement maps, they will need RCU support, and eBPF needs to expose RCU / epoch logic to extensions. 
There are two options for this: From ef30b58808e198a1c5fa06acbb8a6e66424a11f1 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Tue, 12 Aug 2025 11:41:37 -0700 Subject: [PATCH 03/15] update doc --- docs/ExtensibleMaps.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 5e237e068e..383d8fe362 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -39,7 +39,7 @@ Even though the extensible map will be created by and reside in the extension, e Map lifetime will also be maintained by eBPFcore, and it will invoke extension's map delete API when the map needs to be finally deleted. Similarly, map pinning will also be handled by eBPFcore as that impacts map lifetime. -Another thing to note is that once an extensible map is created, the corresponding extension cannot be allowed to unload / restart, as that will delete the map and its entries. This will be a limitation / restriction for the extension that is implementing extensible maps, and may impact their servicing flow. +Another thing to note is that once an extensible map is created, the corresponding extension **cannot be allowed to unload / restart**, as that will delete the map and its entries. This will be a limitation / restriction for the extension that is implementing extensible maps, and may impact their servicing flow. ## Map creation Assuming option 1 for `Map ID partitioning`, below is the expected flow for map creation. 
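Option 1 of the map ID partitioning assumed above reserves IDs 1 to 4095 for global maps and IDs from 4096 onward for extensible maps. A minimal sketch of that range check follows. The constant names, the `map_type_class_t` enum, and `classify_map_type()` are hypothetical and do not come from the eBPF-for-Windows sources, and treating ID 0 as invalid is an assumption based on the stated range starting at 1.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical constants mirroring the ranges stated in the proposal.
#define GLOBAL_MAP_TYPE_MIN 1u
#define GLOBAL_MAP_TYPE_MAX 4095u
#define EXTENSIBLE_MAP_TYPE_MIN 4096u

// The two ranges must be adjacent and disjoint.
static_assert(GLOBAL_MAP_TYPE_MAX + 1u == EXTENSIBLE_MAP_TYPE_MIN, "map type ranges must not overlap");

// Hypothetical classification of a map type ID.
typedef enum {
    MAP_TYPE_CLASS_INVALID,    // ID 0: outside both ranges (assumption).
    MAP_TYPE_CLASS_GLOBAL,     // Implemented by ebpfcore.
    MAP_TYPE_CLASS_EXTENSIBLE, // Implemented by an extension.
} map_type_class_t;

// Decide which side owns a map type purely from its numeric ID.
static map_type_class_t
classify_map_type(uint32_t map_type)
{
    if (map_type >= EXTENSIBLE_MAP_TYPE_MIN) {
        return MAP_TYPE_CLASS_EXTENSIBLE;
    }
    if (map_type >= GLOBAL_MAP_TYPE_MIN) {
        return MAP_TYPE_CLASS_GLOBAL;
    }
    return MAP_TYPE_CLASS_INVALID;
}
```

Because the owner is derived from the ID alone, the existing map create API does not need an extra program type parameter, which is the stated reason for preferring option 1.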
From 3653a590199dbb18a862c244129ef8d31ca6ac03 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Sun, 17 Aug 2025 23:54:24 -0700 Subject: [PATCH 04/15] cr comments --- docs/ExtensibleMaps.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 383d8fe362..72435c6a3c 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -30,8 +30,7 @@ The NMR interface for program info provider will extended (non-breaking) and ext - This will be used when explicitly creating map from user mode. ## Verfication -- For offline verification, eBPF store will be used to provide map information to the verfier. -- For online verification (JIT / interpret), map type to program type conversion will be done using eBPF store data. Then ebpfsvc will query ebpfcore to get program information which will also provide map information. +- No impact on verfication (online or offline), as the verifier only cares about the actual map definitions. ## Map lifecycle Even though the extensible map will be created by and reside in the extension, ebpfcore will also create a corresponding map entry, as it does for the global maps. The difference being, in case of extensible maps, the map CRUD APIs will be supplied by the extension, and map entry in ebpfcore will contain these function pointers provided by the extension. 
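The map lifecycle text above says that ebpfcore keeps its own map entry holding the CRUD function pointers supplied by the extension. A minimal sketch of that shape follows, under stated assumptions: every type and function name here (`ext_map_ops_t`, `core_map_entry_t`, `core_map_lookup()`, the `toy_*` single-slot map) is hypothetical and only illustrates the indirection, not the actual ebpfcore or NMR interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical CRUD callbacks that an extension would supply.
typedef struct {
    int (*lookup)(void* ext_map, const void* key, void* value);
    int (*update)(void* ext_map, const void* key, const void* value);
    int (*delete_element)(void* ext_map, const void* key);
} ext_map_ops_t;

// Hypothetical core-side entry: the core owns the entry and its lifetime,
// while the data lives in the extension behind an opaque pointer.
typedef struct {
    uint32_t map_type;
    void* ext_map;
    const ext_map_ops_t* ops;
} core_map_entry_t;

// Core-side wrappers that redirect each operation to the extension.
static int
core_map_lookup(const core_map_entry_t* entry, const void* key, void* value)
{
    assert(entry != NULL);
    return (entry->ops && entry->ops->lookup) ? entry->ops->lookup(entry->ext_map, key, value) : -1;
}

static int
core_map_update(const core_map_entry_t* entry, const void* key, const void* value)
{
    assert(entry != NULL);
    return (entry->ops && entry->ops->update) ? entry->ops->update(entry->ext_map, key, value) : -1;
}

static int
core_map_delete(const core_map_entry_t* entry, const void* key)
{
    assert(entry != NULL);
    return (entry->ops && entry->ops->delete_element) ? entry->ops->delete_element(entry->ext_map, key) : -1;
}

// Toy "extension" map holding a single uint32_t key/value pair.
typedef struct {
    int used;
    uint32_t key;
    uint32_t value;
} toy_map_t;

static int
toy_lookup(void* ext_map, const void* key, void* value)
{
    toy_map_t* map = (toy_map_t*)ext_map;
    if (!map->used || map->key != *(const uint32_t*)key) {
        return -1; // Key not present.
    }
    *(uint32_t*)value = map->value;
    return 0;
}

static int
toy_update(void* ext_map, const void* key, const void* value)
{
    toy_map_t* map = (toy_map_t*)ext_map;
    map->used = 1;
    map->key = *(const uint32_t*)key;
    map->value = *(const uint32_t*)value;
    return 0;
}

static int
toy_delete(void* ext_map, const void* key)
{
    toy_map_t* map = (toy_map_t*)ext_map;
    if (!map->used || map->key != *(const uint32_t*)key) {
        return -1; // Key not present.
    }
    map->used = 0;
    return 0;
}

static const ext_map_ops_t toy_ops = {toy_lookup, toy_update, toy_delete};
```

In this sketch the core never touches the stored data directly, so unloading the extension while an entry still points at `toy_ops` would leave dangling function pointers. That is one way to read the restriction that an extension with active maps cannot unload or restart.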
From b198dd55665ae9b3bc1f959a9dc5d4ffc0908515 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Mon, 8 Dec 2025 20:00:06 -0800 Subject: [PATCH 05/15] update design doc --- docs/ExtensibleMaps.md | 218 +++++++++++++++++++++++++++-------------- 1 file changed, 144 insertions(+), 74 deletions(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 72435c6a3c..afd40eb336 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -1,79 +1,149 @@ # Introduction -Extensible maps are program type specific maps that will be implemented by the extension that is implementing the program type (program info provider). This document contains the proposal for implementing support for extensible / program type specific maps in eBPF-for-Windows. The below sections describe all the scenarios / areas that will need to be updated or tested for this new map type. - -## Map Id partitioning -Global maps get an ID for their map types from a global namespace. There are two possible options for how we can allocate IDs for map types for extensible maps. - -**Option 1: Global Map IDs** -- The map type IDs are allocated from a global namespace. This will be disjoint from the namespace for global maps. Global maps will use IDs from 1 to 4095. Extensible maps will use IDs 4096 onwards. -- Each program info provider that implements a extensible map will need to register / reserve the MAP ID / enum in the eBPF repo by creating a PR. - -**Option 2: Per-program map ID space** -- Just like program type specific helper functions, each program type can define map type IDs, and these can be overlapping. -- This has a problem though -- Existing user mode APIs to create a map can then not be used, as program type for the map cannot be disambiguated by just the map type. 
This will require a new eBPF map create API that will also take the corresponding program type (GUID or ID) - -**Proposal** -Use option 1 as it allows keeping the user mode API for map creation same as on Linux, only adding a one-time step for extension developers to reserve the map ID in the global namespace (by creating a PR in eBPF repo). - -## NMR interface for extensions -The NMR interface for program info provider will extended (non-breaking) and extensions will provide below information: - - List of extensible map types it is supporting - - APIs for - - Map creation / map deletion - - Map lookup, update, delete. - -## eBPF Store -- Program info providers will now include the map types they are going to support when updating eBPF store. This should include the map type string, and the map type ID. -- eBPF store APIs will be updated to populate this information also in the registry. -- ebpfapi when loading will read the extensible map type information and create a in-memory map for `map-type : program type`. -- This will be used when explicitly creating map from user mode. - -## Verfication -- No impact on verfication (online or offline), as the verifier only cares about the actual map definitions. - -## Map lifecycle -Even though the extensible map will be created by and reside in the extension, ebpfcore will also create a corresponding map entry, as it does for the global maps. The difference being, in case of extensible maps, the map CRUD APIs will be supplied by the extension, and map entry in ebpfcore will contain these function pointers provided by the extension. - -Map lifetime will also be maintained by eBPFcore, and it will invoke extension's map delete API when the map needs to be finally deleted. -Similarly, map pinning will also be handled by eBPFcore as that impacts map lifetime. 
- -Another thing to note is that once an extensible map is created, the corresponding extension **cannot be allowed to unload / restart**, as that will delete the map and its entries. This will be a limitation / restriction for the extension that is implementing extensible maps, and may impact their servicing flow. - -## Map creation -Assuming option 1 for `Map ID partitioning`, below is the expected flow for map creation. - -### Explicit map creation -- App uses the existing map create APIs, and internally ebpfapi tries to find the corresponding program type from the eBPF store. -- Once it finds the program info provider, it makes a (new) ioctl call to create the extensible map, and also pass the program type. -- eBPFcore will first attach (NMR) to this provider, and check if the actual provider supports this map type. If yes, proceed to create map in the extension. - -Implicit map creation flow will also be similar. ebpf runtime will have similar flow for map creation, automatic map pinning, and map reuse. +Extensible maps are custom map types that can be implemented by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new +NMR (Network Module Registrar) provider interface. This document contains the design for adding support for extensible +maps in eBPF-for-Windows. Extensible maps enable extensions to register and manage their own map types beyond those +provided by the core eBPF runtime. + +## NMR Interface for Extensions +A new NMR interface is added and extensions register as map info providers using NMR, similar to existing program and +hook providers. 
+ +**New NMR Provider Interface**: `EBPF_MAP_INFO_EXTENSION_IID` + +**Provider Registration Data**: +```c +typedef struct _ebpf_map_provider_data { + ebpf_extension_header_t header; + uint32_t map_type; // Single map type per provider + ebpf_map_provider_dispatch_table_t* dispatch_table; +} ebpf_map_provider_data_t; +``` + +**Provider Dispatch Table**: +```c +typedef struct _ebpf_map_provider_dispatch_table { + ebpf_extension_header_t header; + _Notnull_ ebpf_map_create_t create_map_function; + _Notnull_ ebpf_map_delete_t delete_map_function; + _Notnull_ ebpf_map_associate_program_type_t associate_program_function; + ebpf_map_find_element_t find_element_function; + ebpf_map_update_element_t update_element_function; + ebpf_map_delete_element_t delete_element_function; + ebpf_map_get_next_key_and_value_t get_next_key_and_value_function; +} ebpf_map_provider_dispatch_table_t; +``` + +**Client Services** (provided by eBPF core): +```c +typedef struct _ebpf_map_client_dispatch_table { + ebpf_extension_header_t header; + epoch_allocate_with_tag_t epoch_allocate_with_tag; + epoch_allocate_cache_aligned_with_tag_t epoch_allocate_cache_aligned_with_tag; + epoch_free_t epoch_free; + epoch_free_cache_aligned_t epoch_free_cache_aligned; +} ebpf_map_client_dispatch_table_t; +``` + +## Map Type Enum Partitioning +Currently map type IDs are allocated from a global namespace. With extensible maps, global map type ID space is +partitioned into 2 disjoint sets: for global maps (implemented in eBPFCore) and for the extensible maps. +Global maps will use IDs from 1 to 4095. Extensible maps will use IDs 4096 onwards. + +- **Global Maps (1-4095)**: Reserved for core eBPF runtime map types (hash, array, etc.) +- **Extensible Maps (4096+)**: Available for extension-implemented custom map types + +Extensions *can* reserve unique map type IDs by submitting PRs to update the enum in the eBPF repository. 
+ +## Map Discovery and Creation + +**Dynamic Provider Discovery**: Uses NMR's built-in discovery mechanism +- No central registry required - providers are discovered on-demand during map creation +- When a map is created with an extensible type (>= 4096), eBPF core: + 1. Creates an `ebpf_extensible_map_t` structure with NMR client characteristics + 2. Calls `NmrRegisterClient()` to find a provider for the specific map type + 3. On successful provider attachment, delegates map creation to the provider + 4. Returns map handle to user application + +**Map Creation Flow**: +``` +User calls bpf_create_map(BPF_MAP_TYPE_CUSTOM, ...) + ↓ +ebpfapi validates parameters + ↓ +ebpfcore checks if map_type >= 4096 + ↓ +ebpfcore creates extensible map with NMR client + ↓ +NMR finds and attaches to provider + ↓ +Provider creates actual map instance + ↓ +Map handle returned to user +``` + +## Verification +- No impact on verification (online or offline), as the verifier only cares about the actual map definitions. + +## Map Lifecycle +**Provider Binding**: eBPF core maintains map lifecycle and coordinates with extensions for map creation, deletion, +and othr map operations. +- eBPF core creates a corresponding map entry for each extensible map. +- Map CRUD operations are delegated to the extension via dispatch table function pointers. +- Map lifetime managed by eBPF core, including proper cleanup coordination. +- Map pinning handled by eBPF core as it impacts map lifetime. + +**Note**: Extensions with active maps cannot unload / restart until the map is deleted. ## Map CRUD APIs -### Usermode CRUD APIs -Assuming option 1 for `Map ID partitioning`, all existing APIs should be applicable for extensible maps too. - -### Map helper functions -The existing map helper functions implemented by ebpfcore will be used by the BPF programs for extensible maps also. For extensible maps, ebpfcore will redirect the calls to the extension. 
- -## Exposing RCU semantics to extensions -For extensions to implement maps, they will need RCU support, and eBPF needs to expose RCU / epoch logic to extensions. - -There are two options for this: - -**Export RCU as lib** -- This will simplify logic in eBPFCore -- Extensions will have their own RCU "runtime" -- This approach will require recompilation and release from extension if there is a bug in RCU lib. - -**Export RCU APIs via NMR interface** -- Probably adds more complexity to ebpfcore. -- Does not require new release from extensions wheenver there is a bugfix in RCU logic. - -**Proposal** -Proposal here is to export RCU as lib. - -## Perf Consideration -Since map APIs for extensible maps will have logner path length, we should measure perf for extensible map operations. +### User-mode APIs +**Transparent Compatibility**: All existing libbpf APIs work unchanged with extensible maps +- `bpf_create_map()` / `bpf_map_create()` - Creates extensible maps when type >= 4096 +- `bpf_map_lookup_elem()` - Lookup operations routed to provider +- `bpf_map_update_elem()` - Update operations routed to provider +- `bpf_map_delete_elem()` - Delete operations routed to provider +- `bpf_map_get_next_key()` - Iteration routed to provider + +### eBPF Helper Functions +**Transparent Helper Routing**: Map helpers automatically work with extensible maps +- `bpf_map_lookup_elem()` helper detects extensible maps and routes to provider +- `bpf_map_update_elem()` helper routes update operations to provider +- `bpf_map_delete_elem()` helper routes delete operations to provider + +### Batch Operations +**Advanced Features** (implemented in test scenarios): +- `update_batch()` - Batch update operations +- `delete_batch()` - Batch delete operations +- `lookup_and_delete_batch()` - Atomic lookup and delete operations + +## Memory Management and RCU Semantics + +Extensions require RCU support to implement a performant map. 
For this, eBPFCore exports epoch-based memory allocation +APIs to the extensions via NMR client dispatch table. +**Implemented Approach**: Epoch-based APIs exposed via NMR client dispatch table + +**Available Epoch APIs**: +```c +// Memory allocation with epoch tracking +epoch_allocate_with_tag_t epoch_allocate_with_tag; +epoch_allocate_cache_aligned_with_tag_t epoch_allocate_cache_aligned_with_tag; + +// Safe memory deallocation +epoch_free_t epoch_free; +epoch_free_cache_aligned_t epoch_free_cache_aligned; +``` + +**Benefits of this approach**: +- **Centralized RCU Management**: Single epoch system across core and extensions. +- **Automatic Updates**: Extensions benefit from eBPF core RCU improvements without recompilation. + +**Usage Example**: +```c +// In extension map implementation +void* map_entry = client_dispatch->epoch_allocate_with_tag( + sizeof(my_map_entry_t), EBPF_POOL_TAG_EXTENSION); + +// Safe deallocation +client_dispatch->epoch_free(map_entry); +``` From 324487298d649ef54b82fed7d5991db6039fb7c4 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Thu, 11 Dec 2025 18:31:55 -0800 Subject: [PATCH 06/15] proposal doc --- docs/ExtensibleMaps.md | 112 ++++++++++++++++++----------------------- 1 file changed, 48 insertions(+), 64 deletions(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index afd40eb336..5e071530af 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -1,13 +1,32 @@ # Introduction -Extensible maps are custom map types that can be implemented by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new -NMR (Network Module Registrar) provider interface. This document contains the design for adding support for extensible -maps in eBPF-for-Windows. Extensible maps enable extensions to register and manage their own map types beyond those -provided by the core eBPF runtime. +This document contains proposal for adding support for extensible / custom maps. 
Custom maps are map types that are +program type specific, and are implemented / coordinated by the extensions. + +Extensible / custom maps are program type specific map types that can be implemented / coordinated by eBPF extensions +(e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR (Network Module Registrar) provider interface. This document contains +the design for adding support for custom maps in eBPF-for-Windows. Custom maps enable extensions to register and +manage their own map types beyond those provided by the core eBPF runtime. + +Custom maps will be based on one of the underlying / base map types already implemented in eBPFCore. When extensions +declare a new custom map type, they also declare the base map type on which the custom map should be based on. +For example, BPF_MAP_TYPE_XSKMAP can be based on the existing BPF_MAP_TYPE_HASH_MAP. +With this approach, eBPFCore can implement the custom map, and extensions will provide hooks / callbacks for map +creation, and CRUD operations on the map. + +This approach has a benefit that extensions do not need to re-implement a performant, RCU-aware data structure and +can leverage the implementation that is present in eBPFCore. + +Currently, eBPF-for-Windows will only allow base map type BPF_MAP_TYPE_HASH_MAP. This support can be extended to other +base map types in future based on the requirements. + +Note: If there is a need for an extension to implement a map type that cannot be based on any existing map type, we can +extend this interface for extensions to optionally provide their whole implementation, instead of relying on a base map +type in eBPF-for-Windows. ## NMR Interface for Extensions -A new NMR interface is added and extensions register as map info providers using NMR, similar to existing program and -hook providers. +To implement custom maps, a new NMR interface will be added and extensions will register as map info providers using +NMR, similar to existing program and hook providers. 
**New NMR Provider Interface**: `EBPF_MAP_INFO_EXTENSION_IID` @@ -15,7 +34,8 @@ hook providers. ```c typedef struct _ebpf_map_provider_data { ebpf_extension_header_t header; - uint32_t map_type; // Single map type per provider + uint32_t custom_map_type; // Single map type per provider + uint32_t base_map_type; ebpf_map_provider_dispatch_table_t* dispatch_table; } ebpf_map_provider_data_t; ``` @@ -24,26 +44,30 @@ typedef struct _ebpf_map_provider_data { ```c typedef struct _ebpf_map_provider_dispatch_table { ebpf_extension_header_t header; - _Notnull_ ebpf_map_create_t create_map_function; - _Notnull_ ebpf_map_delete_t delete_map_function; - _Notnull_ ebpf_map_associate_program_type_t associate_program_function; - ebpf_map_find_element_t find_element_function; - ebpf_map_update_element_t update_element_function; - ebpf_map_delete_element_t delete_element_function; - ebpf_map_get_next_key_and_value_t get_next_key_and_value_function; + ebpf_map_create_notify_t create_map_notify; + ebpf_map_delete_notify_t delete_map_notify; + ebpf_map_associate_program_type_t associate_program; + ebpf_map_find_element_notify_t find_element_notify; + ebpf_map_update_element_notify_t update_element_notify; + ebpf_map_delete_element_notify_t delete_element_notify; + ebpf_map_get_next_key_and_value_notify_t get_next_key_and_value_notify; } ebpf_map_provider_dispatch_table_t; ``` +An extension (provider) needs to provide the above dispatch table. eBPF runtime will invoke the above functions in the +following scenarios: +1. **Map Creation** -- eBPF runtime will invoke create_map_notify to validate the key and value sizes, and optionally get actual value size. +2. **Map Deletion** -- eBPF runtime will invoke delete_map_notify to notify the extension that the map is being deleted. +3. **CRUD operation** -- For each CRUD operation, eBPF runtime will invoke the corresponding dispatch function to notify extension. 
In case of update functions, extension can optionally provide a different value to be stored in the map. + **Client Services** (provided by eBPF core): ```c typedef struct _ebpf_map_client_dispatch_table { ebpf_extension_header_t header; - epoch_allocate_with_tag_t epoch_allocate_with_tag; - epoch_allocate_cache_aligned_with_tag_t epoch_allocate_cache_aligned_with_tag; - epoch_free_t epoch_free; - epoch_free_cache_aligned_t epoch_free_cache_aligned; + ebpf_map_find_element_t find_element_function; } ebpf_map_client_dispatch_table_t; ``` +eBPF runtime will expose a *find_element_function* dispatch function that extension can use to query a map value, given the key. ## Map Type Enum Partitioning Currently map type IDs are allocated from a global namespace. With extensible maps, global map type ID space is @@ -77,7 +101,7 @@ ebpfcore creates extensible map with NMR client ↓ NMR finds and attaches to provider ↓ -Provider creates actual map instance +ebpfcore creates actual map instance ↓ Map handle returned to user ``` @@ -89,7 +113,7 @@ Map handle returned to user **Provider Binding**: eBPF core maintains map lifecycle and coordinates with extensions for map creation, deletion, and othr map operations. - eBPF core creates a corresponding map entry for each extensible map. -- Map CRUD operations are delegated to the extension via dispatch table function pointers. +- For Map CRUD operations, corresponding dispatch functions provided by the extension will be invoked. - Map lifetime managed by eBPF core, including proper cleanup coordination. - Map pinning handled by eBPF core as it impacts map lifetime. @@ -98,52 +122,12 @@ and othr map operations. 
## Map CRUD APIs ### User-mode APIs -**Transparent Compatibility**: All existing libbpf APIs work unchanged with extensible maps +All existing libbpf APIs work unchanged with extensible maps - `bpf_create_map()` / `bpf_map_create()` - Creates extensible maps when type >= 4096 -- `bpf_map_lookup_elem()` - Lookup operations routed to provider -- `bpf_map_update_elem()` - Update operations routed to provider -- `bpf_map_delete_elem()` - Delete operations routed to provider -- `bpf_map_get_next_key()` - Iteration routed to provider +- Map operations - Operations are routed to extensible maps, whicn in turn invokes provider callbacks. ### eBPF Helper Functions -**Transparent Helper Routing**: Map helpers automatically work with extensible maps -- `bpf_map_lookup_elem()` helper detects extensible maps and routes to provider -- `bpf_map_update_elem()` helper routes update operations to provider -- `bpf_map_delete_elem()` helper routes delete operations to provider - -### Batch Operations -**Advanced Features** (implemented in test scenarios): -- `update_batch()` - Batch update operations -- `delete_batch()` - Batch delete operations -- `lookup_and_delete_batch()` - Atomic lookup and delete operations +As with user mode APIs, helper functions also automatically work with extensible maps in a similar manner. ## Memory Management and RCU Semantics - -Extensions require RCU support to implement a performant map. For this, eBPFCore exports epoch-based memory allocation -APIs to the extensions via NMR client dispatch table. 
-**Implemented Approach**: Epoch-based APIs exposed via NMR client dispatch table - -**Available Epoch APIs**: -```c -// Memory allocation with epoch tracking -epoch_allocate_with_tag_t epoch_allocate_with_tag; -epoch_allocate_cache_aligned_with_tag_t epoch_allocate_cache_aligned_with_tag; - -// Safe memory deallocation -epoch_free_t epoch_free; -epoch_free_cache_aligned_t epoch_free_cache_aligned; -``` - -**Benefits of this approach**: -- **Centralized RCU Management**: Single epoch system across core and extensions. -- **Automatic Updates**: Extensions benefit from eBPF core RCU improvements without recompilation. - -**Usage Example**: -```c -// In extension map implementation -void* map_entry = client_dispatch->epoch_allocate_with_tag( - sizeof(my_map_entry_t), EBPF_POOL_TAG_EXTENSION); - -// Safe deallocation -client_dispatch->epoch_free(map_entry); -``` +Since the actual map is implemented in eBPFCore, it will automatically use Epoch-based APIs for memory allocation. From 905838cab809bbd9c364c305cc24d115e76593c4 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Fri, 12 Dec 2025 12:46:31 -0800 Subject: [PATCH 07/15] CR comments --- docs/ExtensibleMaps.md | 85 +++++++++++++++++++++++------------------- 1 file changed, 46 insertions(+), 39 deletions(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 5e071530af..9935630b01 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -1,21 +1,18 @@ # Introduction -This document contains proposal for adding support for extensible / custom maps. Custom maps are map types that are -program type specific, and are implemented / coordinated by the extensions. - -Extensible / custom maps are program type specific map types that can be implemented / coordinated by eBPF extensions -(e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR (Network Module Registrar) provider interface. This document contains -the design for adding support for custom maps in eBPF-for-Windows. 
Custom maps enable extensions to register and -manage their own map types beyond those provided by the core eBPF runtime. +This document contains proposal for adding support for custom maps. Custom maps are program type specific or global +map types that can be implemented / coordinated by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR +(Network Module Registrar) provider interface. This document contains the design for adding support for custom maps +in eBPF-for-Windows. Custom maps enable extensions to register and manage their own map types beyond those provided +by the core eBPF runtime. Custom maps will be based on one of the underlying / base map types already implemented in eBPFCore. When extensions -declare a new custom map type, they also declare the base map type on which the custom map should be based on. +declare a new custom map type, they also declare the base map type on which the custom map should be based. For example, BPF_MAP_TYPE_XSKMAP can be based on the existing BPF_MAP_TYPE_HASH_MAP. With this approach, eBPFCore can implement the custom map, and extensions will provide hooks / callbacks for map creation, and CRUD operations on the map. -This approach has a benefit that extensions do not need to re-implement a performant, RCU-aware data structure and -can leverage the implementation that is present in eBPFCore. +This approach has a benefit that extensions do not need to re-implement a performant, RCU (Read-Copy-Update) aware data structure and can leverage the implementation that is present in eBPFCore. Currently, eBPF-for-Windows will only allow base map type BPF_MAP_TYPE_HASH_MAP. This support can be extended to other base map types in future based on the requirements. @@ -26,7 +23,8 @@ type in eBPF-for-Windows. ## NMR Interface for Extensions To implement custom maps, a new NMR interface will be added and extensions will register as map info providers using -NMR, similar to existing program and hook providers. 
+NMR, similar to existing program and hook providers. An extension implementing more than one custom maps needs to +register as a map provider once for each map type it supports. **New NMR Provider Interface**: `EBPF_MAP_INFO_EXTENSION_IID` @@ -34,7 +32,7 @@ NMR, similar to existing program and hook providers. ```c typedef struct _ebpf_map_provider_data { ebpf_extension_header_t header; - uint32_t custom_map_type; // Single map type per provider + uint32_t custom_map_type; // Single map type per provider. uint32_t base_map_type; ebpf_map_provider_dispatch_table_t* dispatch_table; } ebpf_map_provider_data_t; @@ -44,21 +42,27 @@ typedef struct _ebpf_map_provider_data { ```c typedef struct _ebpf_map_provider_dispatch_table { ebpf_extension_header_t header; - ebpf_map_create_notify_t create_map_notify; - ebpf_map_delete_notify_t delete_map_notify; - ebpf_map_associate_program_type_t associate_program; - ebpf_map_find_element_notify_t find_element_notify; - ebpf_map_update_element_notify_t update_element_notify; - ebpf_map_delete_element_notify_t delete_element_notify; - ebpf_map_get_next_key_and_value_notify_t get_next_key_and_value_notify; + _Notnull_ ebpf_process_map_create_t process_map_create; + _Notnull_ ebpf_process_map_delete_t process_map_delete; + _Notnull_ ebpf_process_map_associate_program_type_t process_map_associate_program; + ebpf_process_map_find_element_t process_map_find_element; + ebpf_process_map_update_element_t process_map_update_element; + ebpf_process_map_delete_element_t process_map_delete_element; + ebpf_process_map_get_next_key_and_value_t process_map_get_next_key_and_value; } ebpf_map_provider_dispatch_table_t; ``` -An extension (provider) needs to provide the above dispatch table. eBPF runtime will invoke the above functions in the +An extension (provider) needs to implement the above dispatch table. eBPF runtime will invoke the above functions in the following scenarios: -1. 
**Map Creation** -- eBPF runtime will invoke create_map_notify to validate the key and value sizes, and optionally get actual value size. -2. **Map Deletion** -- eBPF runtime will invoke delete_map_notify to notify the extension that the map is being deleted. -3. **CRUD operation** -- For each CRUD operation, eBPF runtime will invoke the corresponding dispatch function to notify extension. In case of update functions, extension can optionally provide a different value to be stored in the map. +1. **Map Creation** -- eBPF runtime will invoke `process_map_create` to validate the key and value sizes, and optionally +get actual value size. +2. **Map Deletion** -- eBPF runtime will invoke process_map_delete to notify the extension that the map is being deleted. +3. **CRUD operations** -- For each CRUD operation, eBPF runtime will invoke the corresponding dispatch function to notify extension. In case of update functions, extension can optionally provide a different value to be stored in the map. + +In the above dispatch table, `process_map_create`, `process_map_delete` and `process_map_associate_program` are required +to be non-NULL. If the extension intends to update the actual value size during map creation, other fields in the table +also need to be non-NULL, otherwise eBPFCore will fail the map creation. If extension does not intend to update the +actual value size during map creation, these fields can be optionally NULL. **Client Services** (provided by eBPF core): ```c @@ -67,26 +71,29 @@ typedef struct _ebpf_map_client_dispatch_table { ebpf_map_find_element_t find_element_function; } ebpf_map_client_dispatch_table_t; ``` -eBPF runtime will expose a *find_element_function* dispatch function that extension can use to query a map value, given the key. +eBPF runtime will expose a *find_element_function* dispatch function that extension can use to query a map +value, given the key. 
## Map Type Enum Partitioning -Currently map type IDs are allocated from a global namespace. With extensible maps, global map type ID space is -partitioned into 2 disjoint sets: for global maps (implemented in eBPFCore) and for the extensible maps. -Global maps will use IDs from 1 to 4095. Extensible maps will use IDs 4096 onwards. +Currently map type IDs are allocated from a global namespace. With custom maps, global map type ID space is +partitioned into 2 disjoint sets: for global maps (implemented in eBPFCore) and for the custom maps. +Global maps will use IDs from 1 to 4095. Custom maps will use IDs 4096 onwards. - **Global Maps (1-4095)**: Reserved for core eBPF runtime map types (hash, array, etc.) -- **Extensible Maps (4096+)**: Available for extension-implemented custom map types +- **Custom Maps (4096+)**: Available for extension-implemented custom map types -Extensions *can* reserve unique map type IDs by submitting PRs to update the enum in the eBPF repository. +Note: Extensions **should** register the map types for the custom maps by creating a pull request to eBPF-for-Windows +repository and updating `ebpf_map_type_t` enum in ebpf_structs.h. This helps in any map type collision with another +extension. ## Map Discovery and Creation **Dynamic Provider Discovery**: Uses NMR's built-in discovery mechanism - No central registry required - providers are discovered on-demand during map creation -- When a map is created with an extensible type (>= 4096), eBPF core: - 1. Creates an `ebpf_extensible_map_t` structure with NMR client characteristics +- When a map is created with an custom type (>= 4096), eBPF core: + 1. Creates an `ebpf_custom_map_t` structure with NMR client characteristics 2. Calls `NmrRegisterClient()` to find a provider for the specific map type - 3. On successful provider attachment, delegates map creation to the provider + 3. On successful provider attachment, creates the custom map with the provided base map type. 4. 
Returns map handle to user application **Map Creation Flow**: @@ -97,9 +104,9 @@ ebpfapi validates parameters ↓ ebpfcore checks if map_type >= 4096 ↓ -ebpfcore creates extensible map with NMR client +ebpfcore registers new map NMR client for the map instance ↓ -NMR finds and attaches to provider +NMR finds and attaches to provider implementing the custom map type ↓ ebpfcore creates actual map instance ↓ @@ -112,7 +119,7 @@ Map handle returned to user ## Map Lifecycle **Provider Binding**: eBPF core maintains map lifecycle and coordinates with extensions for map creation, deletion, and othr map operations. -- eBPF core creates a corresponding map entry for each extensible map. +- eBPF core creates a corresponding map entry for each custom map. - For Map CRUD operations, corresponding dispatch functions provided by the extension will be invoked. - Map lifetime managed by eBPF core, including proper cleanup coordination. - Map pinning handled by eBPF core as it impacts map lifetime. @@ -122,12 +129,12 @@ and othr map operations. ## Map CRUD APIs ### User-mode APIs -All existing libbpf APIs work unchanged with extensible maps -- `bpf_create_map()` / `bpf_map_create()` - Creates extensible maps when type >= 4096 -- Map operations - Operations are routed to extensible maps, whicn in turn invokes provider callbacks. +All existing libbpf APIs work unchanged with custom maps +- `bpf_create_map()` / `bpf_map_create()` - Creates custom maps when type >= 4096 +- Map operations - Operations are routed to custom maps, whicn in turn invokes provider callbacks. ### eBPF Helper Functions -As with user mode APIs, helper functions also automatically work with extensible maps in a similar manner. +As with user mode APIs, helper functions also automatically work with custom maps in a similar manner. ## Memory Management and RCU Semantics Since the actual map is implemented in eBPFCore, it will automatically use Epoch-based APIs for memory allocation. 
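The non-NULL rule that this revision adds for the provider dispatch table can be sketched as a small check. This is a minimal sketch under stated assumptions, not the eBPFCore implementation: the `sketch_` names, the single placeholder callback type, and the `provider_changes_value_size` flag are hypothetical and only illustrate the rule as worded in the doc. The real table also carries an `ebpf_extension_header_t header`, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder: the real callback signatures are not given in the doc. */
typedef int (*sketch_callback_t)(void);

/* Field names mirror the dispatch table in this revision; the header field is omitted. */
typedef struct _sketch_map_provider_dispatch_table {
    sketch_callback_t process_map_create;
    sketch_callback_t process_map_delete;
    sketch_callback_t process_map_associate_program;
    sketch_callback_t process_map_find_element;
    sketch_callback_t process_map_update_element;
    sketch_callback_t process_map_delete_element;
    sketch_callback_t process_map_get_next_key_and_value;
} sketch_map_provider_dispatch_table_t;

/* Trivial callback used only to populate the table in examples. */
static int
sketch_noop(void)
{
    return 0;
}

/*
 * Returns true when the table satisfies the rule described in the doc:
 * - create, delete and associate-program callbacks are always required;
 * - the remaining callbacks are required only when the provider changes
 *   the actual value size during map creation (assumed to be known to the
 *   caller as a boolean).
 */
static bool
sketch_validate_dispatch_table(
    const sketch_map_provider_dispatch_table_t* table, bool provider_changes_value_size)
{
    assert(table != NULL);

    if (table->process_map_create == NULL || table->process_map_delete == NULL ||
        table->process_map_associate_program == NULL) {
        return false;
    }

    if (provider_changes_value_size) {
        if (table->process_map_find_element == NULL || table->process_map_update_element == NULL ||
            table->process_map_delete_element == NULL ||
            table->process_map_get_next_key_and_value == NULL) {
            return false;
        }
    }

    return true;
}
```

A provider that only supplies the three required callbacks would pass this check with `provider_changes_value_size` set to false and fail it when set to true, which matches the doc's statement that eBPFCore fails map creation in that case.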
From 220ed07e8d3ee97f56865eff333f7a012d70f053 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Fri, 12 Dec 2025 12:49:27 -0800 Subject: [PATCH 08/15] CR comments --- docs/ExtensibleMaps.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ExtensibleMaps.md b/docs/ExtensibleMaps.md index 9935630b01..a8f315c8c7 100644 --- a/docs/ExtensibleMaps.md +++ b/docs/ExtensibleMaps.md @@ -90,7 +90,7 @@ extension. **Dynamic Provider Discovery**: Uses NMR's built-in discovery mechanism - No central registry required - providers are discovered on-demand during map creation -- When a map is created with an custom type (>= 4096), eBPF core: +- When a map is created with a custom type (>= 4096), eBPF core: 1. Creates an `ebpf_custom_map_t` structure with NMR client characteristics 2. Calls `NmrRegisterClient()` to find a provider for the specific map type 3. On successful provider attachment, creates the custom map with the provided base map type. From ec3688fca5b3db016b6d4b5a9ebcc389b9875072 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Thu, 8 Jan 2026 13:13:40 -0800 Subject: [PATCH 09/15] update file name --- docs/{ExtensibleMaps.md => CustomMaps.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/{ExtensibleMaps.md => CustomMaps.md} (100%) diff --git a/docs/ExtensibleMaps.md b/docs/CustomMaps.md similarity index 100% rename from docs/ExtensibleMaps.md rename to docs/CustomMaps.md From b18e954a6d28b06feee5e8ed99a70c08b80f50e5 Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Wed, 18 Mar 2026 14:23:03 -0700 Subject: [PATCH 10/15] address cr comments, update doc --- docs/CustomMaps.md | 267 ++++++++++++++++++++++++++++++++++----------- 1 file changed, 201 insertions(+), 66 deletions(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index a8f315c8c7..3d508d9adc 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -1,114 +1,235 @@ -# Introduction +# Custom Maps Design -This document contains proposal for adding support for 
custom maps. Custom maps are program type specific or global -map types that can be implemented / coordinated by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR -(Network Module Registrar) provider interface. This document contains the design for adding support for custom maps -in eBPF-for-Windows. Custom maps enable extensions to register and manage their own map types beyond those provided -by the core eBPF runtime. +This document describes the design for custom maps in eBPF-for-Windows. Custom maps are program type specific or global +map types that can be implemented by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR +(Network Module Registrar) provider interface. Custom maps enable extensions to register and manage their own map types +beyond those provided by the core eBPF runtime. -Custom maps will be based on one of the underlying / base map types already implemented in eBPFCore. When extensions +Custom maps are based on one of the underlying / base map types already implemented in eBPFCore. When extensions declare a new custom map type, they also declare the base map type on which the custom map should be based. -For example, BPF_MAP_TYPE_XSKMAP can be based on the existing BPF_MAP_TYPE_HASH_MAP. -With this approach, eBPFCore can implement the custom map, and extensions will provide hooks / callbacks for map -creation, and CRUD operations on the map. +For example, BPF_MAP_TYPE_XSKMAP can be based on the existing BPF_MAP_TYPE_HASH. +With this approach, eBPFCore implements the base data structure for the custom map, while extensions provide +callbacks for map creation, deletion, and CRUD operations. The base map implementation in eBPFCore is used for +storing entries; extensions do not implement their own storage. -This approach has a benefit that extensions do not need to re-implement a performant, RCU (Read-Copy-Update) aware data structure and can leverage the implementation that is present in eBPFCore. 
+This approach has a benefit that extensions do not need to re-implement a performant, RCU (Read-Copy-Update) aware +data structure and can leverage the implementation that is present in eBPFCore. -Currently, eBPF-for-Windows will only allow base map type BPF_MAP_TYPE_HASH_MAP. This support can be extended to other -base map types in future based on the requirements. +Currently, eBPF-for-Windows only allows base map type BPF_MAP_TYPE_HASH. This support can be extended to other +base map types in the future based on requirements. Note: If there is a need for an extension to implement a map type that cannot be based on any existing map type, we can extend this interface for extensions to optionally provide their whole implementation, instead of relying on a base map -type in eBPF-for-Windows. +type in eBPF-for-Windows. To avoid breaking changes for fully custom maps, the base map type can be set to +unspecified, and additional callback functions can be added since the struct is versioned. ## NMR Interface for Extensions -To implement custom maps, a new NMR interface will be added and extensions will register as map info providers using -NMR, similar to existing program and hook providers. An extension implementing more than one custom maps needs to -register as a map provider once for each map type it supports. + +To implement custom maps, a new NMR interface is used. Extensions register as map info providers using +NMR, similar to existing program and hook providers. An extension implementing more than one custom map type must +register a separate map provider for each map type it supports (i.e., one NMR provider registration per map type). **New NMR Provider Interface**: `EBPF_MAP_INFO_EXTENSION_IID` -**Provider Registration Data**: +### Provider Registration Data + ```c typedef struct _ebpf_map_provider_data { ebpf_extension_header_t header; - uint32_t custom_map_type; // Single map type per provider. 
- uint32_t base_map_type; - ebpf_map_provider_dispatch_table_t* dispatch_table; + uint32_t map_type; ///< Custom map type implemented by the provider. + uint32_t base_map_type; ///< Base map type used to implement the custom map. + ebpf_base_map_provider_properties_t* base_properties; ///< Base map provider properties. + ebpf_base_map_provider_dispatch_table_t* base_provider_table; ///< Pointer to base map provider dispatch table. } ebpf_map_provider_data_t; ``` -**Provider Dispatch Table**: +### Provider Properties + +```c +typedef struct _ebpf_base_map_provider_properties { + ebpf_extension_header_t header; + bool updates_original_value; // Whether the provider updates the original value during map operations, + // which controls whether BPF programs can perform map CRUD operations. +} ebpf_base_map_provider_properties_t; +``` + +When `updates_original_value` is set to true, the extension transforms values during CRUD operations (e.g., converting +a user-mode handle to a kernel pointer). In this case, BPF programs are **not** allowed to perform map CRUD operations +directly, because a BPF program receives a pointer to the value for in-place reads/writes, whereas the stored value +is the transformed version (e.g., a kernel pointer) that should not be directly modified. 
+ +### Provider Dispatch Table + ```c typedef struct _ebpf_map_provider_dispatch_table { ebpf_extension_header_t header; _Notnull_ ebpf_process_map_create_t process_map_create; _Notnull_ ebpf_process_map_delete_t process_map_delete; - _Notnull_ ebpf_process_map_associate_program_type_t process_map_associate_program; + _Notnull_ ebpf_map_associate_program_type_t associate_program_function; ebpf_process_map_find_element_t process_map_find_element; - ebpf_process_map_update_element_t process_map_update_element; + ebpf_process_map_add_element_t process_map_add_element; ebpf_process_map_delete_element_t process_map_delete_element; - ebpf_process_map_get_next_key_and_value_t process_map_get_next_key_and_value; -} ebpf_map_provider_dispatch_table_t; +} ebpf_base_map_provider_dispatch_table_t; ``` -An extension (provider) needs to implement the above dispatch table. eBPF runtime will invoke the above functions in the +An extension (provider) needs to implement the above dispatch table. eBPF runtime invokes these functions in the following scenarios: -1. **Map Creation** -- eBPF runtime will invoke `process_map_create` to validate the key and value sizes, and optionally -get actual value size. -2. **Map Deletion** -- eBPF runtime will invoke process_map_delete to notify the extension that the map is being deleted. -3. **CRUD operations** -- For each CRUD operation, eBPF runtime will invoke the corresponding dispatch function to notify extension. In case of update functions, extension can optionally provide a different value to be stored in the map. -In the above dispatch table, `process_map_create`, `process_map_delete` and `process_map_associate_program` are required -to be non-NULL. If the extension intends to update the actual value size during map creation, other fields in the table -also need to be non-NULL, otherwise eBPFCore will fail the map creation. 
If extension does not intend to update the -actual value size during map creation, these fields can be optionally NULL. +1. **Map Creation** -- eBPF runtime invokes `process_map_create` to validate the key and value sizes, allocate a + provider-defined per-map context, and optionally return a different `actual_value_size`. When `process_map_create` + is invoked, the extension allocates a map context and returns a pointer to it (called `map_context`) back to the + eBPF runtime. Subsequent callbacks for this map receive this `map_context` as an input parameter. + +2. **Map Deletion** -- eBPF runtime invokes the `process_map_delete` callback to notify the extension that the map + is being deleted. The extension should free its per-map context. + +3. **Associate Program** -- eBPF runtime invokes `associate_program_function` before a custom map is associated with + a program. The extension can validate whether the map type is compatible with the given program type. + +4. **CRUD operations** -- For each CRUD operation, eBPF runtime invokes the corresponding dispatch function. + Extensions can optionally transform the value being stored or retrieved (see callback invocation semantics below). + +In the above dispatch table, `process_map_create`, `process_map_delete`, and `associate_program_function` are required +to be non-NULL. If the extension sets `updates_original_value` to true, the CRUD callback fields +(`process_map_find_element`, `process_map_add_element`, `process_map_delete_element`) must also be non-NULL, otherwise +eBPFCore will fail the map creation. If `updates_original_value` is false, these CRUD fields can be optionally NULL. + +**Callback Invocation Semantics:** + +- `process_map_find_element`: Called *after* reading from the base map. If the provider sets `updates_original_value` + to true, the extension can transform the retrieved value (e.g., kernel pointer → user-visible value) via `out_value` + before returning to the caller. 
If `updates_original_value` is false, `out_value` will be NULL. + +- `process_map_add_element`: Called *before* writing to the base map. If the provider sets `updates_original_value` + to true, the extension can transform the user-provided value (e.g., user fd → kernel pointer) via `out_value`, + which eBPFCore then stores in the base map. If `updates_original_value` is false, `out_value` will be NULL. + +- `process_map_delete_element`: Called *before* the entry is deleted from the base map. This allows the extension + to perform cleanup (e.g., releasing kernel resources). The `flags` parameter indicates the context: + `EBPF_MAP_OPERATION_UPDATE` if the delete is part of a replace operation, + `EBPF_MAP_OPERATION_MAP_CLEANUP` if the map itself is being destroyed, and + `EBPF_MAP_OPERATION_HELPER` if invoked from a BPF program. When `EBPF_MAP_OPERATION_UPDATE` or + `EBPF_MAP_OPERATION_MAP_CLEANUP` is set, the provider must not fail the deletion. + +**Example: Object Map insert flow** + +For a custom map that stores kernel objects (similar to how XSKMAP might work), the insert operation works as follows: +1. User calls `bpf_map_update_elem()` with a user-mode handle (e.g., 4-byte fd) as the value. +2. eBPFCore invokes `process_map_add_element` with the handle in `in_value`. +3. The extension validates the handle and converts it to a kernel pointer, writing it into `out_value`. +4. eBPFCore stores the kernel pointer (from `out_value`) in the underlying hash map. +5. On lookup, eBPFCore retrieves the kernel pointer and invokes `process_map_find_element`, which converts it + back to a user-visible value in `out_value`. + +This transformation pattern allows extensions to store kernel objects while exposing user-mode handles to applications. 
+ +If the extension returns an `actual_value_size` different from the user-specified `value_size` during +`process_map_create`, CRUD callbacks are required to translate between the user-facing value format and the internal +storage format. For example: +- User declares a map with `value_size=4` (to store socket fds). +- Extension's `process_map_create` returns `actual_value_size=8` (to store kernel pointers). +- On insert: user passes 4-byte fd → `process_map_add_element` converts to 8-byte pointer → stored in base map. +- On lookup: 8-byte pointer retrieved → `process_map_find_element` converts to 4-byte fd → returned to user. + +Without these callbacks, eBPFCore cannot perform the size/format translation. + +### Client Data (provided by eBPF core) + +When eBPFCore attaches to the NMR provider, it provides the following client data to the extension: + +```c +typedef struct _ebpf_map_client_data { + ebpf_extension_header_t header; + uint64_t map_context_offset; ///< Offset within the map structure where the + ///< provider context data is stored. + ebpf_base_map_client_dispatch_table_t* base_client_table; ///< Pointer to base map client dispatch table. +} ebpf_map_client_data_t; +``` + +`map_context_offset` is provided by eBPFCore to the extension so it can retrieve its extension-specific map context +when a custom map is used in a helper function. This value is constant for all bindings from eBPFCore to the extension +for all custom map types and instances. A `MAP_CONTEXT()` macro is provided in `ebpf_extension.h` for extensions to +conveniently retrieve their map context. Extensions should validate that the returned map context is not NULL. 
+ +### Client Dispatch Table -**Client Services** (provided by eBPF core): ```c typedef struct _ebpf_map_client_dispatch_table { ebpf_extension_header_t header; ebpf_map_find_element_t find_element_function; -} ebpf_map_client_dispatch_table_t; + ebpf_epoch_enter_t epoch_enter; + ebpf_epoch_exit_t epoch_exit; + ebpf_epoch_allocate_with_tag_t epoch_allocate_with_tag; + ebpf_epoch_allocate_cache_aligned_with_tag_t epoch_allocate_cache_aligned_with_tag; + ebpf_epoch_free_t epoch_free; + ebpf_epoch_free_cache_aligned_t epoch_free_cache_aligned; +} ebpf_base_map_client_dispatch_table_t; ``` -eBPF runtime will expose a *find_element_function* dispatch function that extension can use to query a map -value, given the key. -## Map Type Enum Partitioning -Currently map type IDs are allocated from a global namespace. With custom maps, global map type ID space is -partitioned into 2 disjoint sets: for global maps (implemented in eBPFCore) and for the custom maps. -Global maps will use IDs from 1 to 4095. Custom maps will use IDs 4096 onwards. +The client dispatch table provides: +- `find_element_function` -- Used by the extension to query a map value given a key. +- **Epoch-based memory management APIs** -- `epoch_enter`, `epoch_exit`, `epoch_allocate_with_tag`, + `epoch_allocate_cache_aligned_with_tag`, `epoch_free`, and `epoch_free_cache_aligned`. + +Provider dispatch function invocations and BPF helper function callbacks are already epoch-protected, so the epoch +memory APIs can be called directly in those contexts. If the provider uses these APIs outside those contexts, it must +call `epoch_enter` / `epoch_exit` to bracket the calls. Similarly, `find_element_function` must only be invoked +within an epoch-protected region. + +## Map Type Registration -- **Global Maps (1-4095)**: Reserved for core eBPF runtime map types (hash, array, etc.) 
-- **Custom Maps (4096+)**: Available for extension-implemented custom map types +Custom map types come from the same map type numbering space as the built-in maps. Extensions are **required** to +register the custom map types by creating a pull request to the eBPF-for-Windows repository and updating the +`ebpf_map_type_t` enum in `ebpf_structs.h`. This avoids any map type collision with another extension or eBPF core. -Note: Extensions **should** register the map types for the custom maps by creating a pull request to eBPF-for-Windows -repository and updating `ebpf_map_type_t` enum in ebpf_structs.h. This helps in any map type collision with another -extension. +Map creation will fail if the map type is not registered in the `ebpf_map_type_t` enum. + +**Runtime Behavior:** + +- If no provider is registered for a custom map type, NMR client registration will fail to find a provider and + map creation will return an error. + +- If multiple extensions register as providers for the same custom map type ID, the behavior is undefined — NMR + will offer all matching provider interfaces and eBPFCore will attach to the first one. This is why registering + map types via PR is mandatory: it prevents collisions through code review rather than runtime enforcement. ## Map Discovery and Creation -**Dynamic Provider Discovery**: Uses NMR's built-in discovery mechanism -- No central registry required - providers are discovered on-demand during map creation -- When a map is created with a custom type (>= 4096), eBPF core: - 1. Creates an `ebpf_custom_map_t` structure with NMR client characteristics - 2. Calls `NmrRegisterClient()` to find a provider for the specific map type - 3. On successful provider attachment, creates the custom map with the provided base map type. - 4. Returns map handle to user application +**Dynamic Provider Discovery**: Uses NMR's built-in discovery mechanism. +- Providers are discovered on-demand during map creation. 
+- When a custom map is created, eBPF core: + 1. Identifies the map type as a custom map type (registered in the `ebpf_map_type_t` enum). + 2. Creates an NMR client registration for the map instance. + 3. Calls `NmrRegisterClient()` to find a provider for the specific map type. + 4. On successful provider attachment, invokes `process_map_create` and creates the custom map with the provided + base map type. + 5. Returns map handle to user application. + +This also applies when maps are implicitly created (i.e., defined within an eBPF program file). The map creation +flow is the same regardless of whether the map is created explicitly via API or implicitly during program load. + +eBPFCore registers a new NMR client for each map instance. This per-instance registration serves two purposes: +1. **Discovery**: NMR locates the provider implementing the requested custom map type. +2. **Lifetime management**: The NMR binding prevents the extension from unloading while the map exists, ensuring + callbacks remain valid for the map's lifetime. + +This pattern matches the existing program-to-provider binding model used elsewhere in eBPF-for-Windows. **Map Creation Flow**: ``` -User calls bpf_create_map(BPF_MAP_TYPE_CUSTOM, ...) +User calls bpf_create_map(BPF_MAP_TYPE_, ...) ↓ ebpfapi validates parameters ↓ -ebpfcore checks if map_type >= 4096 +ebpfcore identifies this as a custom map type ↓ ebpfcore registers new map NMR client for the map instance ↓ NMR finds and attaches to provider implementing the custom map type ↓ -ebpfcore creates actual map instance +ebpfcore invokes process_map_create callback + ↓ +ebpfcore creates actual map instance using base map type ↓ Map handle returned to user ``` @@ -117,10 +238,11 @@ Map handle returned to user - No impact on verification (online or offline), as the verifier only cares about the actual map definitions. 
## Map Lifecycle + **Provider Binding**: eBPF core maintains map lifecycle and coordinates with extensions for map creation, deletion, -and othr map operations. -- eBPF core creates a corresponding map entry for each custom map. -- For Map CRUD operations, corresponding dispatch functions provided by the extension will be invoked. +and other map operations. +- eBPF core creates a corresponding map entry for each custom map, using the base map type for storage. +- For map CRUD operations, corresponding dispatch functions provided by the extension will be invoked. - Map lifetime managed by eBPF core, including proper cleanup coordination. - Map pinning handled by eBPF core as it impacts map lifetime. @@ -129,12 +251,25 @@ and othr map operations. ## Map CRUD APIs ### User-mode APIs -All existing libbpf APIs work unchanged with custom maps -- `bpf_create_map()` / `bpf_map_create()` - Creates custom maps when type >= 4096 -- Map operations - Operations are routed to custom maps, whicn in turn invokes provider callbacks. +All existing libbpf APIs work unchanged with custom maps: +- `bpf_create_map()` / `bpf_map_create()` -- Creates custom maps when the type is a custom map type. +- Map operations -- Operations are routed to custom maps, which in turn invoke provider callbacks. + +When `EBPF_MAP_OPERATION_HELPER` is not set in the flags (i.e., the operation is from user mode), the caller context +is the same as the original user-mode process. This allows providers to implicitly use the handle table of the +current process when resolving parameters like file descriptors. ### eBPF Helper Functions -As with user mode APIs, helper functions also automatically work with custom maps in a similar manner. +As with user-mode APIs, eBPF helper functions also work with custom maps automatically. 
+ +If the extension is implementing a helper function that takes a custom map as input, when the helper function is +invoked, it will **not** get the map context that it originally passed to eBPFCore. Instead, it gets a pointer to +a separate map structure that eBPFCore maintains. Using this pointer and the `map_context_offset` provided in the +`ebpf_map_client_data_t`, the extension retrieves its map context via the `MAP_CONTEXT()` macro defined in +`ebpf_extension.h`. Extensions should validate that the map context is not NULL and handle it appropriately. ## Memory Management and RCU Semantics -Since the actual map is implemented in eBPFCore, it will automatically use Epoch-based APIs for memory allocation. +Since the base map is implemented in eBPFCore, it automatically uses epoch-based APIs for memory allocation. +Extensions can also use the epoch-based memory management APIs provided in the client dispatch table for their +own allocations (e.g., allocating per-entry kernel objects). See +[Epoch based memory management](EpochBasedMemoryManagement.md) for more details. From 5148704c6faf85a903ab642a6bb0211427bad8ce Mon Sep 17 00:00:00 2001 From: Anurag Saxena <43585259+saxena-anurag@users.noreply.github.com> Date: Mon, 23 Mar 2026 08:51:19 -0700 Subject: [PATCH 11/15] Apply suggestions from code review Co-authored-by: Dave Thaler --- docs/CustomMaps.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index 3d508d9adc..75a133033e 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -1,7 +1,7 @@ # Custom Maps Design This document describes the design for custom maps in eBPF-for-Windows. Custom maps are program type specific or global -map types that can be implemented by eBPF extensions (e.g. BPF_MAP_TYPE_XSKMAP) through a new NMR +map types that can be implemented by eBPF extensions (e.g., BPF_MAP_TYPE_XSKMAP) through a new NMR (Network Module Registrar) provider interface. 
Custom maps enable extensions to register and manage their own map types beyond those provided by the core eBPF runtime. @@ -72,7 +72,7 @@ typedef struct _ebpf_map_provider_dispatch_table { } ebpf_base_map_provider_dispatch_table_t; ``` -An extension (provider) needs to implement the above dispatch table. eBPF runtime invokes these functions in the +An extension (provider) needs to implement the above dispatch table. The eBPF runtime invokes these functions in the following scenarios: 1. **Map Creation** -- eBPF runtime invokes `process_map_create` to validate the key and value sizes, allocate a From 3a1dec4e36e3172bb197e5eabf74a9651cbf1b7a Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Mon, 23 Mar 2026 11:55:05 -0700 Subject: [PATCH 12/15] address cr comments, add more details --- docs/CustomMaps.md | 231 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 201 insertions(+), 30 deletions(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index 75a133033e..6bcc2e4494 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -72,46 +72,145 @@ typedef struct _ebpf_map_provider_dispatch_table { } ebpf_base_map_provider_dispatch_table_t; ``` -An extension (provider) needs to implement the above dispatch table. The eBPF runtime invokes these functions in the -following scenarios: +### Provider Dispatch Function Signatures -1. **Map Creation** -- eBPF runtime invokes `process_map_create` to validate the key and value sizes, allocate a - provider-defined per-map context, and optionally return a different `actual_value_size`. When `process_map_create` - is invoked, the extension allocates a map context and returns a pointer to it (called `map_context`) back to the - eBPF runtime. Subsequent callbacks for this map receive this `map_context` as an input parameter. +The function pointer types used in the dispatch table are defined in `ebpf_extension.h`. An extension (provider) +needs to implement the dispatch table. 
The eBPF runtime invokes these functions as described below. -2. **Map Deletion** -- eBPF runtime invokes the `process_map_delete` callback to notify the extension that the map - is being deleted. The extension should free its per-map context. +`process_map_create`, `process_map_delete`, and `associate_program_function` are required to be non-NULL. If the +extension sets `updates_original_value` to true, the CRUD callback fields (`process_map_find_element`, +`process_map_add_element`, `process_map_delete_element`) must also be non-NULL, otherwise eBPFCore will fail the +map creation. If `updates_original_value` is false, these CRUD fields can be optionally NULL. -3. **Associate Program** -- eBPF runtime invokes `associate_program_function` before a custom map is associated with - a program. The extension can validate whether the map type is compatible with the given program type. +--- -4. **CRUD operations** -- For each CRUD operation, eBPF runtime invokes the corresponding dispatch function. - Extensions can optionally transform the value being stored or retrieved (see callback invocation semantics below). +#### `ebpf_process_map_create_t` — Map Creation (required) -In the above dispatch table, `process_map_create`, `process_map_delete`, and `associate_program_function` are required -to be non-NULL. If the extension sets `updates_original_value` to true, the CRUD callback fields -(`process_map_find_element`, `process_map_add_element`, `process_map_delete_element`) must also be non-NULL, otherwise -eBPFCore will fail the map creation. If `updates_original_value` is false, these CRUD fields can be optionally NULL. +eBPF runtime invokes `process_map_create` to validate the key and value sizes, allocate a provider-defined per-map +context, and optionally return a different `actual_value_size`. The extension allocates a map context and returns a +pointer to it (`map_context`) back to the eBPF runtime. 
Subsequent callbacks for this map receive this `map_context` +as an input parameter. -**Callback Invocation Semantics:** +Note: When a map lookup happens from user mode, the value is copied into the buffer provided by the user, whereas +when a map lookup happens from a BPF program, a pointer to the value is provided to the program, and the program +can read or modify the value in place. Therefore, for maps where an extension intends to *modify* the actual value +being stored in the map, map CRUD operations from BPF programs are disallowed by the eBPF runtime. -- `process_map_find_element`: Called *after* reading from the base map. If the provider sets `updates_original_value` - to true, the extension can transform the retrieved value (e.g., kernel pointer → user-visible value) via `out_value` - before returning to the caller. If `updates_original_value` is false, `out_value` will be NULL. +```c +typedef ebpf_result_t (*ebpf_process_map_create_t)( + _In_ void* binding_context, + uint32_t map_type, + uint32_t key_size, + uint32_t value_size, + uint32_t max_entries, + _Out_ uint32_t* actual_value_size, + _Outptr_ void** map_context); +``` + +--- -- `process_map_add_element`: Called *before* writing to the base map. If the provider sets `updates_original_value` - to true, the extension can transform the user-provided value (e.g., user fd → kernel pointer) via `out_value`, - which eBPFCore then stores in the base map. If `updates_original_value` is false, `out_value` will be NULL. +#### `ebpf_process_map_delete_t` — Map Deletion (required) -- `process_map_delete_element`: Called *before* the entry is deleted from the base map. This allows the extension - to perform cleanup (e.g., releasing kernel resources). The `flags` parameter indicates the context: - `EBPF_MAP_OPERATION_UPDATE` if the delete is part of a replace operation, - `EBPF_MAP_OPERATION_MAP_CLEANUP` if the map itself is being destroyed, and - `EBPF_MAP_OPERATION_HELPER` if invoked from a BPF program. 
When `EBPF_MAP_OPERATION_UPDATE` or - `EBPF_MAP_OPERATION_MAP_CLEANUP` is set, the provider must not fail the deletion. +eBPF runtime invokes `process_map_delete` to notify the extension that the map is being deleted. The extension +should free its per-map context. + +```c +typedef void (*ebpf_process_map_delete_t)( + _In_ void* binding_context, + _In_ _Post_invalid_ void* map_context); +``` -**Example: Object Map insert flow** +--- + +#### `ebpf_map_associate_program_type_t` — Associate Program (required) + +eBPF runtime invokes `associate_program_function` before a custom map is associated with a program. The extension +can validate whether the map type is compatible with the given program type. + +```c +typedef ebpf_result_t (*ebpf_map_associate_program_type_t)( + _In_ void* binding_context, + _In_ void* map_context, + _In_ const ebpf_program_type_t* program_type); +``` + +--- + +#### `ebpf_process_map_find_element_t` — Find Element (optional) + +Called *after* reading from the base map. If the provider sets `updates_original_value` to true, the extension can +transform the retrieved value (e.g., kernel pointer → user-visible value) via `out_value` before returning to the +caller. If `updates_original_value` is false, `out_value` will be NULL and `out_value_size` will be 0. + +```c +typedef ebpf_result_t (*ebpf_process_map_find_element_t)( + _In_ void* binding_context, + _In_ void* map_context, + size_t key_size, + _In_reads_opt_(key_size) const uint8_t* key, + size_t in_value_size, + _In_reads_(in_value_size) const uint8_t* in_value, + size_t out_value_size, + _Out_writes_opt_(out_value_size) uint8_t* out_value, + uint32_t flags); +``` + +--- + +#### `ebpf_process_map_add_element_t` — Add/Update Element (optional) + +Called *before* writing to the base map. If the provider sets `updates_original_value` to true, the extension can +transform the user-provided value (e.g., user fd → kernel pointer) via `out_value`, which eBPFCore then stores in +the base map. 
If `updates_original_value` is false, `out_value` will be NULL and `out_value_size` will be 0. + +```c +typedef ebpf_result_t (*ebpf_process_map_add_element_t)( + _In_ void* binding_context, + _In_ void* map_context, + size_t key_size, + _In_reads_opt_(key_size) const uint8_t* key, + size_t in_value_size, + _In_reads_(in_value_size) const uint8_t* in_value, + size_t out_value_size, + _Out_writes_opt_(out_value_size) uint8_t* out_value, + uint32_t flags); +``` + +--- + +#### `ebpf_process_map_delete_element_t` — Delete Element (optional) + +Called *before* the entry is deleted from the base map. This allows the extension to perform cleanup (e.g., releasing +kernel resources). The `flags` parameter indicates the context: `EBPF_MAP_OPERATION_UPDATE` if the delete is part of +a replace operation, `EBPF_MAP_OPERATION_MAP_CLEANUP` if the map itself is being destroyed, and +`EBPF_MAP_OPERATION_HELPER` if invoked from a BPF program. When `EBPF_MAP_OPERATION_UPDATE` or +`EBPF_MAP_OPERATION_MAP_CLEANUP` is set, the provider must not fail the deletion. + +```c +typedef ebpf_result_t (*ebpf_process_map_delete_element_t)( + _In_ void* binding_context, + _In_ void* map_context, + size_t key_size, + _In_reads_opt_(key_size) const uint8_t* key, + size_t value_size, + _In_reads_(value_size) const uint8_t* value, + uint32_t flags); +``` + +--- + +#### Flags + +The following flags are used with the CRUD dispatch functions: +```c +#define EBPF_MAP_OPERATION_HELPER 0x01 /* Called by a BPF program. When not set, the provider + * function is called in the context of the original + * user mode process. */ +#define EBPF_MAP_OPERATION_UPDATE 0x02 /* Update operation. */ +#define EBPF_MAP_OPERATION_MAP_CLEANUP 0x04 /* Map cleanup operation. */ +``` + +#### Example: Object Map insert flow For a custom map that stores kernel objects (similar to how XSKMAP might work), the insert operation works as follows: 1. 
User calls `bpf_map_update_elem()` with a user-mode handle (e.g., 4-byte fd) as the value. @@ -268,6 +367,78 @@ a separate map structure that eBPFCore maintains. Using this pointer and the `ma `ebpf_map_client_data_t`, the extension retrieves its map context via the `MAP_CONTEXT()` macro defined in `ebpf_extension.h`. Extensions should validate that the map context is not NULL and handle it appropriately. +## Provider vs. Internal Dispatch Table Signatures + +The eBPF runtime maintains an internal dispatch table (`ebpf_map_metadata_table_properties_t`, defined in +[ebpf_maps.c](https://github.com/microsoft/ebpf-for-windows/blob/main/libs/execution_context/ebpf_maps.c)) +that implements the map CRUD operations for all built-in map types. This table has different function signatures +from the provider dispatch table (`ebpf_base_map_provider_dispatch_table_t`, defined in +[ebpf_extension.h](https://github.com/microsoft/ebpf-for-windows/blob/main/include/ebpf_extension.h)) +that extensions implement. The key differences are explained below. 
+ +### Internal Dispatch Table (`ebpf_map_metadata_table_properties_t`) +```c +typedef struct _ebpf_map_metadata_table_properties { + ebpf_result_t (*create_map)( + _In_ const ebpf_map_definition_in_memory_t* map_definition, + ebpf_handle_t inner_map_handle, + _Outptr_ ebpf_core_map_t** map); + void (*delete_map)(_In_ _Post_invalid_ ebpf_core_map_t* map); + ebpf_result_t (*associate_program)( + _Inout_ ebpf_map_t* map, _In_ const ebpf_program_t* program); + ebpf_result_t (*find_entry)( + _Inout_ ebpf_core_map_t* map, _In_opt_ const uint8_t* key, + uint64_t flags, _Outptr_ uint8_t** data); + ebpf_result_t (*update_entry)( + _Inout_ ebpf_core_map_t* map, _In_opt_ const uint8_t* key, + _In_ const uint8_t* value, ebpf_map_option_t option); + ebpf_result_t (*update_entry_with_handle)( + _Inout_ ebpf_core_map_t* map, _In_ const uint8_t* key, + uintptr_t value_handle, ebpf_map_option_t option); + ebpf_result_t (*update_entry_per_cpu)( + _Inout_ ebpf_core_map_t* map, _In_ const uint8_t* key, + _In_ const uint8_t* value, ebpf_map_option_t option); + ebpf_result_t (*delete_entry)(_Inout_ ebpf_core_map_t* map, _In_ const uint8_t* key); + ebpf_result_t (*next_key_and_value)( + _Inout_ ebpf_core_map_t* map, _In_ const uint8_t* previous_key, + _Out_ uint8_t* next_key, _Inout_opt_ uint8_t** next_value); + // ... plus ring buffer / perf event array specific operations, and flags. +} ebpf_map_metadata_table_properties_t; +``` + +### Why the Signatures Differ + +The internal and provider dispatch tables serve fundamentally different roles, which necessitates different +function signatures: + +1. **Abstraction Boundary**: Internal functions operate directly on eBPF core's internal map structure + (`ebpf_core_map_t*`), giving them full access to the map's data, metadata, and configuration. Provider + functions operate through opaque context pointers (`binding_context` and `map_context`) because extensions + must not depend on or access eBPF core's internal data structures. 
This maintains a clean ABI boundary + and enables independent versioning of the core runtime and extensions. + +2. **Notification vs. Implementation**: Internal dispatch functions **are** the map CRUD implementation -- + they directly manipulate the underlying data structure (hash table, array, etc.). Provider dispatch + functions are **notifications** -- eBPF core performs the actual CRUD using the base map type, and + invokes the provider callbacks to allow the extension to validate, transform, or track the operation. + For example, during an update, eBPF core first calls the provider's `process_map_add_element`, then + inserts the (potentially transformed) value into the hash table. + +3. **Value Transformation (in/out pattern)**: Provider functions use an explicit `in_value` / `out_value` + pattern with separate size parameters, allowing the extension to receive the existing stored value and + optionally provide a different value to be stored. Internal functions directly read and write values + in-place since they own the memory. + +4. **Flag Semantics**: Provider functions use `EBPF_MAP_OPERATION_*` flags (`EBPF_MAP_OPERATION_HELPER`, + `EBPF_MAP_OPERATION_UPDATE`, `EBPF_MAP_OPERATION_MAP_CLEANUP`) to inform the extension about the + calling context. Internal functions use different mechanisms such as `ebpf_map_option_t` for update + semantics (ANY, NOEXIST, EXIST) and separate function pointers for per-CPU operations. + +5. **Subset of Operations**: The provider dispatch table does not include `next_key_and_value`, + `update_entry_with_handle`, `update_entry_per_cpu`, ring buffer operations, or perf event array + operations. Key enumeration and per-CPU handling are transparently managed by the base map + implementation in eBPF core, and handle-based updates are not applicable to custom map extensions. + ## Memory Management and RCU Semantics Since the base map is implemented in eBPFCore, it automatically uses epoch-based APIs for memory allocation. 
Extensions can also use the epoch-based memory management APIs provided in the client dispatch table for their From ecf70a99bfd1960d8161ce16b0f970d1ea1b001c Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Mon, 6 Apr 2026 13:01:28 -0700 Subject: [PATCH 13/15] update doc --- docs/CustomMaps.md | 44 +++++++++++++++++++------------------------- 1 file changed, 19 insertions(+), 25 deletions(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index 6bcc2e4494..a53e275520 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -67,8 +67,8 @@ typedef struct _ebpf_map_provider_dispatch_table { _Notnull_ ebpf_process_map_delete_t process_map_delete; _Notnull_ ebpf_map_associate_program_type_t associate_program_function; ebpf_process_map_find_element_t process_map_find_element; - ebpf_process_map_add_element_t process_map_add_element; - ebpf_process_map_delete_element_t process_map_delete_element; + ebpf_preprocess_map_element_addition_t preprocess_map_element_addition; + ebpf_preprocess_map_element_deletion_t preprocess_map_element_deletion; } ebpf_base_map_provider_dispatch_table_t; ``` @@ -79,7 +79,7 @@ needs to implement the dispatch table. The eBPF runtime invokes these functions `process_map_create`, `process_map_delete`, and `associate_program_function` are required to be non-NULL. If the extension sets `updates_original_value` to true, the CRUD callback fields (`process_map_find_element`, -`process_map_add_element`, `process_map_delete_element`) must also be non-NULL, otherwise eBPFCore will fail the +`preprocess_map_element_addition`, `preprocess_map_element_deletion`) must also be non-NULL, otherwise eBPFCore will fail the map creation. If `updates_original_value` is false, these CRUD fields can be optionally NULL. 
--- @@ -157,14 +157,14 @@ typedef ebpf_result_t (*ebpf_process_map_find_element_t)( --- -#### `ebpf_process_map_add_element_t` — Add/Update Element (optional) +#### `ebpf_preprocess_map_element_addition_t` — Add/Update Element (optional) Called *before* writing to the base map. If the provider sets `updates_original_value` to true, the extension can transform the user-provided value (e.g., user fd → kernel pointer) via `out_value`, which eBPFCore then stores in the base map. If `updates_original_value` is false, `out_value` will be NULL and `out_value_size` will be 0. ```c -typedef ebpf_result_t (*ebpf_process_map_add_element_t)( +typedef ebpf_result_t (*ebpf_preprocess_map_element_addition_t)( _In_ void* binding_context, _In_ void* map_context, size_t key_size, @@ -178,7 +178,7 @@ typedef ebpf_result_t (*ebpf_process_map_add_element_t)( --- -#### `ebpf_process_map_delete_element_t` — Delete Element (optional) +#### `ebpf_preprocess_map_element_deletion_t` — Delete Element (optional) Called *before* the entry is deleted from the base map. This allows the extension to perform cleanup (e.g., releasing kernel resources). The `flags` parameter indicates the context: `EBPF_MAP_OPERATION_UPDATE` if the delete is part of @@ -187,7 +187,7 @@ a replace operation, `EBPF_MAP_OPERATION_MAP_CLEANUP` if the map itself is being `EBPF_MAP_OPERATION_MAP_CLEANUP` is set, the provider must not fail the deletion. ```c -typedef ebpf_result_t (*ebpf_process_map_delete_element_t)( +typedef ebpf_result_t (*ebpf_preprocess_map_element_deletion_t)( _In_ void* binding_context, _In_ void* map_context, size_t key_size, @@ -214,7 +214,7 @@ The following flags are used with the CRUD dispatch functions: For a custom map that stores kernel objects (similar to how XSKMAP might work), the insert operation works as follows: 1. User calls `bpf_map_update_elem()` with a user-mode handle (e.g., 4-byte fd) as the value. -2. eBPFCore invokes `process_map_add_element` with the handle in `in_value`. +2. 
eBPFCore invokes `preprocess_map_element_addition` with the handle in `in_value`. 3. The extension validates the handle and converts it to a kernel pointer, writing it into `out_value`. 4. eBPFCore stores the kernel pointer (from `out_value`) in the underlying hash map. 5. On lookup, eBPFCore retrieves the kernel pointer and invokes `process_map_find_element`, which converts it @@ -227,7 +227,7 @@ If the extension returns an `actual_value_size` different from the user-specifie storage format. For example: - User declares a map with `value_size=4` (to store socket fds). - Extension's `process_map_create` returns `actual_value_size=8` (to store kernel pointers). -- On insert: user passes 4-byte fd → `process_map_add_element` converts to 8-byte pointer → stored in base map. +- On insert: user passes 4-byte fd → `preprocess_map_element_addition` converts to 8-byte pointer → stored in base map. - On lookup: 8-byte pointer retrieved → `process_map_find_element` converts to 4-byte fd → returned to user. Without these callbacks, eBPFCore cannot perform the size/format translation. @@ -406,35 +406,29 @@ typedef struct _ebpf_map_metadata_table_properties { } ebpf_map_metadata_table_properties_t; ``` -### Why the Signatures Differ +### Differences in Signatures -The internal and provider dispatch tables serve fundamentally different roles, which necessitates different -function signatures: +This section enumerates the differences between the custom map provider dispatch table and the internal dispatch table +used by eBPF core for implementing native maps. -1. **Abstraction Boundary**: Internal functions operate directly on eBPF core's internal map structure - (`ebpf_core_map_t*`), giving them full access to the map's data, metadata, and configuration. Provider - functions operate through opaque context pointers (`binding_context` and `map_context`) because extensions - must not depend on or access eBPF core's internal data structures.
This maintains a clean ABI boundary - and enables independent versioning of the core runtime and extensions. - -2. **Notification vs. Implementation**: Internal dispatch functions **are** the map CRUD implementation -- +1. **Notification vs. Implementation**: Internal dispatch functions **are** the map CRUD implementation -- they directly manipulate the underlying data structure (hash table, array, etc.). Provider dispatch - functions are **notifications** -- eBPF core performs the actual CRUD using the base map type, and - invokes the provider callbacks to allow the extension to validate, transform, or track the operation. - For example, during an update, eBPF core first calls the provider's `process_map_add_element`, then + functions act like hooks in the actual map CRUD implementation -- eBPF core performs the actual CRUD using the base + map type, and invokes the provider callbacks to allow the extension to validate, transform, or track the operation. + For example, during an update, eBPF core first calls the provider's `preprocess_map_element_addition`, then inserts the (potentially transformed) value into the hash table. -3. **Value Transformation (in/out pattern)**: Provider functions use an explicit `in_value` / `out_value` +2. **Value Transformation (in/out pattern)**: Provider functions use an explicit `in_value` / `out_value` pattern with separate size parameters, allowing the extension to receive the existing stored value and optionally provide a different value to be stored. Internal functions directly read and write values in-place since they own the memory. -4. **Flag Semantics**: Provider functions use `EBPF_MAP_OPERATION_*` flags (`EBPF_MAP_OPERATION_HELPER`, +3. **Flag Semantics**: Provider functions use `EBPF_MAP_OPERATION_*` flags (`EBPF_MAP_OPERATION_HELPER`, `EBPF_MAP_OPERATION_UPDATE`, `EBPF_MAP_OPERATION_MAP_CLEANUP`) to inform the extension about the calling context.
Internal functions use different mechanisms such as `ebpf_map_option_t` for update semantics (ANY, NOEXIST, EXIST) and separate function pointers for per-CPU operations. -5. **Subset of Operations**: The provider dispatch table does not include `next_key_and_value`, +4. **Subset of Operations**: The provider dispatch table does not include `next_key_and_value`, `update_entry_with_handle`, `update_entry_per_cpu`, ring buffer operations, or perf event array operations. Key enumeration and per-CPU handling are transparently managed by the base map implementation in eBPF core, and handle-based updates are not applicable to custom map extensions. From 08945c69e10b81a6199f0e318fd909a0a84ff60f Mon Sep 17 00:00:00 2001 From: Anurag Saxena Date: Mon, 13 Apr 2026 23:49:33 -0700 Subject: [PATCH 14/15] cr comments --- docs/CustomMaps.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index a53e275520..a8bd6ce3e5 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -63,10 +63,10 @@ is the transformed version (e.g., a kernel pointer) that should not be directly ```c typedef struct _ebpf_map_provider_dispatch_table { ebpf_extension_header_t header; - _Notnull_ ebpf_process_map_create_t process_map_create; - _Notnull_ ebpf_process_map_delete_t process_map_delete; - _Notnull_ ebpf_map_associate_program_type_t associate_program_function; - ebpf_process_map_find_element_t process_map_find_element; + _Notnull_ ebpf_preprocess_map_create_t preprocess_map_create; + _Notnull_ ebpf_postprocess_map_delete_t postprocess_map_delete; + _Notnull_ ebpf_preprocess_map_associate_program_type_t preprocess_associate_program_type; + ebpf_postprocess_map_find_element_t postprocess_map_find_element; ebpf_preprocess_map_element_addition_t preprocess_map_element_addition; ebpf_preprocess_map_element_deletion_t preprocess_map_element_deletion; } ebpf_base_map_provider_dispatch_table_t; @@ -77,14 +77,14 @@ 
typedef struct _ebpf_map_provider_dispatch_table { The function pointer types used in the dispatch table are defined in `ebpf_extension.h`. An extension (provider) needs to implement the dispatch table. The eBPF runtime invokes these functions as described below. -`process_map_create`, `process_map_delete`, and `associate_program_function` are required to be non-NULL. If the -extension sets `updates_original_value` to true, the CRUD callback fields (`process_map_find_element`, +`preprocess_map_create`, `postprocess_map_delete`, and `preprocess_associate_program_type` are required to be non-NULL. If the +extension sets `updates_original_value` to true, the CRUD callback fields (`postprocess_map_find_element`, `preprocess_map_element_addition`, `preprocess_map_element_deletion`) must also be non-NULL, otherwise eBPFCore will fail the map creation. If `updates_original_value` is false, these CRUD fields can be optionally NULL. --- -#### `ebpf_process_map_create_t` — Map Creation (required) +#### `ebpf_preprocess_map_create_t` — Map Creation (required) eBPF runtime invokes `process_map_create` to validate the key and value sizes, allocate a provider-defined per-map context, and optionally return a different `actual_value_size`. The extension allocates a map context and returns a @@ -97,7 +97,7 @@ can read or modify the value in place. Therefore, for maps where an extension in being stored in the map, map CRUD operations from BPF programs are disallowed by the eBPF runtime. ```c -typedef ebpf_result_t (*ebpf_process_map_create_t)( +typedef ebpf_result_t (*ebpf_preprocess_map_create_t)( _In_ void* binding_context, uint32_t map_type, uint32_t key_size, @@ -109,26 +109,26 @@ typedef ebpf_result_t (*ebpf_process_map_create_t)( --- -#### `ebpf_process_map_delete_t` — Map Deletion (required) +#### `ebpf_postprocess_map_delete_t` — Map Deletion (required) eBPF runtime invokes `process_map_delete` to notify the extension that the map is being deleted. 
The extension should free its per-map context. ```c -typedef void (*ebpf_process_map_delete_t)( +typedef void (*ebpf_postprocess_map_delete_t)( _In_ void* binding_context, _In_ _Post_invalid_ void* map_context); ``` --- -#### `ebpf_map_associate_program_type_t` — Associate Program (required) +#### `ebpf_preprocess_map_associate_program_type_t` — Associate Program (required) eBPF runtime invokes `associate_program_function` before a custom map is associated with a program. The extension can validate whether the map type is compatible with the given program type. ```c -typedef ebpf_result_t (*ebpf_map_associate_program_type_t)( +typedef ebpf_result_t (*ebpf_preprocess_map_associate_program_type_t)( _In_ void* binding_context, _In_ void* map_context, _In_ const ebpf_program_type_t* program_type); @@ -136,14 +136,14 @@ typedef ebpf_result_t (*ebpf_map_associate_program_type_t)( --- -#### `ebpf_process_map_find_element_t` — Find Element (optional) +#### `ebpf_postprocess_map_find_element_t` — Find Element (optional) Called *after* reading from the base map. If the provider sets `updates_original_value` to true, the extension can transform the retrieved value (e.g., kernel pointer → user-visible value) via `out_value` before returning to the caller. If `updates_original_value` is false, `out_value` will be NULL and `out_value_size` will be 0. 
```c -typedef ebpf_result_t (*ebpf_process_map_find_element_t)( +typedef ebpf_result_t (*ebpf_postprocess_map_find_element_t)( _In_ void* binding_context, _In_ void* map_context, size_t key_size, From 43dc3840d8db327838bab286af40197c46b8d0e5 Mon Sep 17 00:00:00 2001 From: Anurag Saxena <43585259+saxena-anurag@users.noreply.github.com> Date: Tue, 14 Apr 2026 11:20:48 -0700 Subject: [PATCH 15/15] Update docs/CustomMaps.md Co-authored-by: Dave Thaler --- docs/CustomMaps.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/CustomMaps.md b/docs/CustomMaps.md index a8bd6ce3e5..e7f35035d5 100644 --- a/docs/CustomMaps.md +++ b/docs/CustomMaps.md @@ -111,7 +111,7 @@ typedef ebpf_result_t (*ebpf_preprocess_map_create_t)( #### `ebpf_postprocess_map_delete_t` — Map Deletion (required) -eBPF runtime invokes `process_map_delete` to notify the extension that the map is being deleted. The extension +eBPF runtime invokes `postprocess_map_delete` to notify the extension that the map has been deleted. The extension should free its per-map context. ```c