Corentin xcpng 9/xenserver rebased patchqueue by CocoElAbricot · Pull Request #1 · xcp-ng/linux

CocoElAbricot · 2026-03-19T15:47:43Z

First rebase of XS9-preview patchqueue on top of linux 6.12.74

casasnovas · 2026-03-19T16:43:24Z

Hey Corentin,

This is a big chunk of work, thanks for diving into this!

As we discussed on matrix, I think it is worth re-doing the rebase following the practices laid out on:

This will allow you to record information on conflict resolutions, dropped commits, and also double check that your changes build properly, so that the review can be much smoother.

I had started reviewing the branch, so I will paste my unfinished review comments here in case they are helpful when you re-do your work on top of the UEK8 branch:

8c2cd4e net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment

Dropped because upstream commit 4b0ebbc ("net: gro: move L3 flush
checks to tcp_gro_receive and udp_gro_receive_segment") is ancestor of your
rebased branch, ack.

f3240e2 x86/xen/time: Reduce Xen timer tick

Dropped because upstream commit ad16248 ("x86/xen/time: Reduce Xen
timer tick") is ancestor of your branch, ack.

97e223a sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails

Sould be dropped because upstream commit fc42892 ("sched/fair: Bump
sd->max_newidle_lb_cost when newidle balance fails") is an ancestor and the original
commit was reverted in 8134361 ("sched/fair: Revert max_newidle_lb_cost bump")

d175d44 disable-mitigations-by-default

I am assuming XenServer folks disabled those mitigations because in the
context of the dom0 kernel they don't apply and incur an unwanted
performance cost?

What's the rationale for not rebasing it?

2d854a0 expose-xsversion

Dropped, though a comment on why would be good for the record. I agree
this is likely not needed, but maybe worth checking it's not used by XAPI
for example.

d076860 blkback-kthread-pid

Conflicts in your rebased branch caused by 217759b ("xen: port block
device access to file"), all good.

1764637 tg3-alloc-repeat
a81ba45 net: Do not scrub ignore_df within the same name space

Why is this one missing on your rebased branch? It applies cleanly.

427d6df enable-fragmention-gre-packets

--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -525,8 +525,8 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,

        gre_build_header(skb, tunnel_hlen, flags, proto,
                         tunnel_id_to_key32(tun_info->key.tun_id),
-                        test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
-                        htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+                        test_bit(IP_TUNNEL_SEQ_BIT, flags) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+       skb->ignore_df = 1;

Nit: it would have been nice to not change the coding styles of lines where
you had conflicts for easier review that they haven't changed in, here the
only real change should be the extra line with ignore_df = 1.

a8bbfd1 call-kexec-before-offlining-noncrashing-cpus

--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -361,6 +367,13 @@ void panic(const char *fmt, ...)
         */
        kgdb_panic(buf);

+#ifdef CONFIG_XEN
+       if (xen_initial_domain()) {
+               xen_kexec_exec_t xke = { KEXEC_TYPE_CRASH };
+               if (HYPERVISOR_kexec_op(KEXEC_CMD_kexec, &xke))
+       pr_warn("HYPERVISOR_kexec_op failed\n");
+       }

It looks like you added a change from XenServer with the extra conditional
plus pr_warn(), why?

bb88d6e lpfc-fallback-to-sli-2

The driver changed significantly enough from the base branch that I
understand it's been dropped. On top of that the original author says it
is temporary band-aid to make a specific device work, so probably not
needed.

97379fb fix-hypercall-preemption

Why are you adding back all this code to common.c in your rebased commit
df7c38a ("fix-hypercall-preemption")? The code has been moved into
arch/x86/xen/enlighten_pv.c in d30f467 ("x86/xen: Move Xen upcall
handler").

I'm surprised this would build fine as we have multiple global symbols with
the same name that will get linked into vmlinux. And the one in
enlighten_pv.c does not contain the fix.

Now, if you really needed to add a bunch of code as a pre-requisite to a
commit you are rebasing/cherry-picking, I think it would be worth isolating
in it in their own commit with a good description.

b6653be skip cpuidle driver init if cpuidle function disabled

--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -407,6 +407,18 @@ struct klp_modinfo {
 };
 #endif

+#ifdef CONFIG_KALLSYMS
+/*
+ * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
+ */
+
+struct module_notes_attrs {
+       struct kobject *dir;
+       unsigned int notes;
+       struct bin_attribute attrs[];
+};
+#endif
+

In the above hunk, please keep the annotation __counted_by(notes) which
was present on the original struct (removed in a subsequent hunk), as it
allows runtime bounds checking on flexible arrays, which is something we
might want to enable if our dom0 kernel if not already enabled.

f0c7aa5 CA-392853: fix kdump kernel cannot find ACPI RSDP
d2100f8 CA-415346: export module symbol offsets
669c430 x86,xen: correct dma_get_required_mask() for Xen PV guests
7f78a04 pci: export pci_probe_reset_function()
e16408f xen-pciback: provide a "reset" sysfs file to try harder at an SBR

The original code patched pcistub_device_release and
pcistub_put_pci_dev when yours is patching another function.

Upon closer look, upstream commit 770352e ("xen-pciback: provide a
"reset" sysfs file to try harder at an SBR") already is an ancestor of your
branch, so you should drop this rebased commit.

toolstack. Relative to the VBD, the xenstore key is: queue-NNN/kthread-pid where NNN is the queue number starting from 0. Signed-off-by: Ross Lagerwall <[email protected]>

interface. These may fail if the interface is reconfigured some time after boot due to external memory fragmentation. These buffers must be contiguous (for DMA) and they cannot be pre-allocated at boot because they are dependent on the number of queues, which is configurable (apart from wasting memory if the interface is not up). Use the __GFP_RETRY_MAYFAIL flag when performing these allocations so that direct reclaim and compaction has a better chance of working, making the allocations less likely to fail. Signed-off-by: Ross Lagerwall <[email protected]>

Commit 239c78d ("net: clear local_df when passing skb between namespaces") cleared local_df/ignore_df for tunnels. This broke, last resort, fragmentation of GRE tunneled packets. Clear ignore_df, only when packet is being injected to another namespace. Signed-off-by: Anoob Soman <[email protected]>

CocoElAbricot · 2026-03-26T14:58:37Z

Het @casasnovas Here is the updated patchqueue on top of UEK8/u1
The original patchqueue is in the branch kernel/xenserver-6.6.98-19/base

Dropped commits:
commit b3c34f5ee600 net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment
already upstream in: 4b0ebbc

commit 97e223a sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails
reverted in upstream by: 8134361 ("sched/fair: Revert max_newidle_lb_cost bump")

commit f3240e2 x86/xen/time: Reduce Xen timer tick
Dropped because upstream commit ad16248 ("x86/xen/time: Reduce Xen timer tick")

commit bb88d6e lpfc-fallback-to-sli-2
The driver changed significantly enough from the base branch that I
understand it's been dropped. On top of that the original author says it
is temporary band-aid to make a specific device work, so probably not
needed.

commits 02f7f1d, b152f10, e0fc31b, de9285f, ff47f40 and ac5a534 are part of GFS2 related patchset, have been dropped because XCP-ng don't support GFS2

casasnovas

Nice work, my early comments inlined.

Can you also explain what sort of testing you did so far (build tested + dom0 boot with this kernel)?

net/ipv4/ip_gre.c

arch/x86/xen/enlighten_pv.c

casasnovas · 2026-03-26T15:36:38Z

arch/x86/xen/enlighten_pv.c

 	bool inhcall = __this_cpu_read(xen_in_preemptible_hcall);

-	__this_cpu_write(xen_in_preemptible_hcall, false);
+	raw_cpu_write(xen_in_preemptible_hcall, false);


Can you please explain why you made this change here? I don't find it in the original XS patch.

Removed This changed to match the original patch

I wasn't necessarily saying it should be reverted, it was more to understand the rationale as maybe it is indeed required, but we need to have an audit trail of why we thought it was.

casasnovas · 2026-03-26T15:38:33Z

kernel/module/sysfs.c

+	 * The caller checked the pos and count against our size.
+	 */
+	memcpy(buf, bin_attr->private + pos, count);
+	return count;


I think you inadvertently added this module_notes_read function from your conflict resolution.

Removed module_notes_read function

Nit: I think there's now one extra new line I think (I am being picky so that we don't get unnecessary changes to decrease likeliness of conflicts with future patches).

now it should be correct

casasnovas · 2026-03-26T15:43:59Z

drivers/xen/xen-pciback/pci_stub.c

+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
+		if (pdev != dev && (!pdev->driver
+		    || strcmp(pdev->driver->name, "pciback")))


You've altered the indentation from original commit here.

It still isn't right and changed from the original commit AFAICS.

It's now correct

drivers/xen/xen-pciback/pci_stub.c

casasnovas · 2026-03-26T15:53:58Z

drivers/xen/xen-pciback/pci_stub.c

+		if (dev->quarantined)
+			dev->unsafe = 0;
+		if (!xen_pv_domain())
+			xen_reset_device(dev);


In your extra commit description you mention you've inlined these, but can you explain why?

After some digging and AIUI, upstream commit 88801d0 ("xen/pci: Add a function to reset device for xen") slightly reworked how to do a pci_reset and you're trying to mimic it here - why not simply calling pcistub_reset_device_state directly?

Also I think your inlining is not correct, the conditional for !xen_pv_domain() calling xen_reset_device() was called unconditionally, not only when the return of __pci_reset_function_locked is zero (so using the original helper instead of deleting + inlining here should also fix this manual error).

As an extra suggestion, you could also rename pcistub_reset_device_state, to be defensive against future addition of calls to that function, i.e. by renaming it, we'll get build errors allowing us to call our own wrappers or decide to call the original one, but it won't go silently unnoticed.

I have changed __pcistub_reset_function() and add a call to pcistub_reset_device_state() instead of deleting it.

casasnovas · 2026-03-26T16:27:55Z

kernel/module/Kconfig

 menuconfig MODULES
 	bool "Enable loadable module support"
 	modules
-	select EXECMEM


Why do you remove this select?

This was a mistake, reverted the change.

casasnovas · 2026-03-26T16:35:35Z

Het @casasnovas Here is the updated patchqueue on top of UEK8/u1
…
Dropped commits:
...

What about these two?

1e245e7 ("pv-iommu-support.patch")
664ea14 ("Reserve the crashkernel region in the E820. This prevents oustanding DMA")

In:

7f2b042 ("pciback-mask-root-port-comp-timeout.patch")

Can you prefix your own commit descriptions with your name and enclose them in square brackets? Also, please wrap the lines so they are more readable :)

Same comment for:

75761b9 ("xen-pciback: provide a "reset" sysfs file to try harder at an SBR")
9dcf6f3 ("CA-415346: export module symbol offsets")
b6142b9 ("At some point since kernel 4.19 the preemption of hypercalls has fallen into a s…")
6de9272 ("Enable lockdown integrity by default")
9b7239b ("Filter and check hypercalls from userspace")
1ebe51c ("module-hash-revocation.patch") and
157ef6b ("CP-46343: common data structure padding")

where you're using different styles/conventions.

As another general comment when you add extra information to the commit description, explain the context and the why rather than the what (the what is usually already visible from the diff, but your intent and the information you had at the time you made a decision are important, like referencing upstream commits that lead to your conflict).

6de9272 ("Enable lockdown integrity by default")

A reference to the commit changing the call_int_hook would have been handy during review, please add it to your commit description: 260017f ("lsm: use default hook return value in call_int_hook()").

Example extra commit description ticking all boxes for me:

[ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based):
- 260017f31a8c `("lsm: use default hook return value in call_int_hook()") simplified the `call_int_hook`
macro, so align our usage in `security_lock_kernel_down()`]

157ef6b ("CP-46343: common data structure padding")

After giving it a bit of thought, we should probably go all-in with the UEK way, so maybe converting the xs_padding pointers to the UEK macros so that we have a consistent way to add padding bytes would be prefered.

I haven't had a chance to check each hunk and missed xs_padding to make sure there is a UEK one, will do when we agree on how to go forward.

fallen into a state of disrepair, partly due to dead code, partly as a result of the dynamic preemption feature whereby kernel preemption mode is selected at kernel boot time. This patch re-enables preemptible hypercalls and will do the following: 1) Update specific feature selection from !PREEMPTION to !PREEMPT. The following table shows the relationship between different preemption features and their indicators/selectors (Y = "=Y", N = "is not set", . = absent): | np-s | np-d | vp-s | vp-d | fp-s | fp-d CONFIG_PREEMPT_DYNAMIC N Y N Y N Y CONFIG_PREEMPTION . Y . Y Y Y CONFIG_PREEMPT N N N N Y Y CONFIG_PREEMPT_VOLUNTARY N N Y Y N N CONFIG_PREEMPT_NONE Y Y N N N N Unless PREEMPT is set, we need to enable the fixes. 2) Update flag access from __this_cpu_XXX() to raw_cpu_XXX(). The long-running hypercalls are flagged by way of a per-cpu variable which is set before and cleared after the relevant calls. This elicits a warning "BUG: using __this_cpu_write() in preemptible [00000000] code", but xen_pv_evtchn_do_upcall() deals specifically with this. 3) Update irqentry_exit_cond_resched() to raw_irqentry_exit_cond_resched(). The code will call irqentry_exit_cond_resched() if the flag (as noted above) is set, but the dynamic preemption feature will livepatch that function to a no-op unless full preemption is selected. The code is therefore updated to call raw_irqentry_exit_cond_resched(). [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): following commit d30f467 ("x86/xen: Move Xen upcall handler"), this code now lives in arch/x86/xen/enlighten_pv.c. The fix is applied there rather than arch/x86/entry/common.c where it originally resided.]

When kernel runs as Xen PV dom0, it does not have access to the required MSRs and so cpuidle is explicitly disabled in xen_arch_setup(). Skip intel_idle initialization early if cpuidle is disabled to avoid spurious warnings reading MSRs. [ 11.706584] <TASK> [ 11.706629] ? ex_handler_msr+0x128/0x140 [ 11.706716] ? fixup_exception+0x85/0x340 [ 11.706803] ? exc_general_protection+0x136/0x400 [ 11.706907] ? asm_exc_general_protection+0x22/0x30 [ 11.707055] ? xen_do_read_msr+0x81/0xa0 [ 11.707182] xen_read_msr+0x20/0x30 [ 11.707340] intel_idle_init+0xa5c/0xc90 [ 11.707642] ? _raw_spin_unlock_irqrestore+0x18/0x20 [ 11.708159] ? __pfx_intel_idle_init+0x10/0x10 [ 11.708451] do_one_initcall+0x41/0x2e0 [ 11.708580] kernel_init_freeable+0x1c0/0x2c0 [ 11.708748] ? __pfx_kernel_init+0x10/0x10 [ 11.709013] kernel_init+0x16/0x1b0 [ 11.709193] ret_from_fork+0x2d/0x50 [ 11.709389] ? __pfx_kernel_init+0x10/0x10 [ 11.709656] ret_from_fork_asm+0x1b/0x30 [ 11.709844] </TASK> Signed-off-by: Chunjie Zhu <[email protected]>

kexec/kdump must pass acpi_rsdp (physical address of ACPI RSDP table) to the crash kernel, especially in EFI case, otherwise, the crash kernel fails to locate ACPI RSDP table. Consequently, ACPI init runs into error, and APIC mmio read page fault happens, finally crash kernel gets stuck. Signed-off-by: Chunjie Zhu <[email protected]>

CocoElAbricot · 2026-03-27T14:45:48Z

I have updated the commits description to have uniform style.

1e245e7 ("pv-iommu-support.patch")
664ea14 ("Reserve the crashkernel region in the E820. This prevents oustanding DMA")
These 2 commits are part of Xenserver specific PVIOMMU patchset, as discussed with the team earlier they are not needed in XCP-ng

crashdump needs it to navigate around vmcore to dump driver information Signed-off-by: Chunjie Zhu <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): re-introduce __counted_by(notes) annotation which was present on the original struct (removed in a subsequent hunk), as it allows runtime bounds checking on flexible arrays]

On systems where DMA addresses and physical addresses are not 1:1 (such as Xen PV guests), the generic dma_get_required_mask() will not return the correct mask (since the xen swiotlb does not provide a get_required_mask callback, it falls back to using 32 bits). Some device drivers (such as mpt3sas) use dma_get_required_mask() to set device DMA masks to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. Under Xen, the required DMA mask can then always be set to 64-bits. Signed-off-by: David Vrabel <[email protected]>

The xen-pciback needs to test if a per-function reset is available so it can provide an interface for a safe bus reset instead. Signed-off-by: David Vrabel <[email protected]>

…g at any time. Any access by Dom0, DomU or Xen to the PCI device may cause a Fatal PCI Express error. Root ports can be configured to signal BIOS code that such an error has occured and the BIOS code can injected NMI interrupts after detecting the Fatal PCI Express error. The injected NMI interrupt may lead to a host crash as Xen may not be able to determine the cause of the NMI interrupt. Thus a guest with a PCI device passed through may be able to trigger a host crash. Disabling Fatal Error reporting in the PCI Express root port prevents the error from propogating to system firmware and removes that attack vector for triggering host crashes.

some devices to fallback to an alternate reset method. Signed-off-by: Ross Lagerwall <[email protected]>

"slabs", each of which is physically contiguous, in 32-bit address space, to service DMA for devices that are limited to a 32-bit address space. Ideally we want a total of 128M (a separate patch updates this from the kernel default of 64M), but we are content with less, and will retry three times, halving the allocation each time, before giving up. A unique Xen problem results from memory being exchanged with the hypervisor, whereby the kernel offers an amount of memory to the hypervisor, and the hypervisor returns different memory, in this case contiguous, low-address memory. This enables VMs (including dom0) to remain of the exact same size, which has significant benefits in terms of VM and memory management. If, however, not enough of the desired memory is available, the hypervisor will return what it's got, and we have a partial success. In this case retrying is guaranteed to fail, since the hypervisor has already exchanged everything it can, and we would end up giving up. To avoid this we record how much memory we actually got by way of an extra paramemeter ("contig_pages"), and if this is enough (at time of writing we whink down to 1/8th of the default request is OK) we take it and run.

In kernel version v4.19.19, the default number of RSS queues is determined using num_online_cpus(). However, in kernel v6.6.x, the logic is updated to use CPU topology information to determine the number of physical cores. For Xen dom0 kernels, this change introduces a problem: the dom0 kernel cannot retrieve valid CPU topology data from Xen. As a result, the default number of RSS queues is always set to 1, which degrades networking performance of guests using xen-netback. This patch introduces a fallback mechanism: if CPU topology information is unavailable, it reverts to using num_online_cpus(). To avoid excessive queue allocation, the number of RSS queues is capped at 8. Signed-off-by: Ming Lu <[email protected]>

After some recent upstream changes a lot of what used to be done in build.c is now done in header.S and setup.ld. The existing code lays out the PE header like this CONFIG_EFI_MIXED=n CONFIG_EFI_MIXED=y ``` 0x0000 +--------+ --- 0x0000 +--------+ --- | | | | | | 0x1000 +--------+ | 0x1000 +--------+ | | | | | | | | .setup | | | .setup | | | | | setup_size +--------+ | setup_size | | | | .compat| | 0x4000 +--------+ --- 0x4000 +--------+ --- | .text | | .text | | | | | ... ... | | | | 0xabb000 +--------+ 0xabb000 +--------+ | .data | | .data | +--------+ +--------+ ``` The .text section doesn't move and compat eats backwards into .setup. This looks a little weird to me but ok. The code in build.c typically copies more than 0x3000 bytes of setup data so it sure looks like putting .compat at 0x3000 is causing something to be overwritten. But I have left this behavior alone. The changes in this PR extend setup_size defined in setup.ld by 0x1000 bytes and then puts .sbat in the resulting gap between .setup and .text or .compat and .text. CONFIG_EFI_MIXED=n CONFIG_EFI_MIXED=y ``` 0x0000 +--------+ --- 0x0000 +--------+ --- | | | | | | 0x1000 +--------+ | 0x1000 +--------+ | | | | | | | | .setup | | | .setup | | | | | setup_size +--------+ | setup_size | | | | .compat| | 0x5000 +--------+ | 0x5000 +--------+ | | .sbat | | | .sbat | | 0x6000 +--------+ --- 0x6000 +--------+ --- | .text | | .text | | | | | ... ... | | | | 0xabb000 +--------+ 0xabb000 +--------+ | .data | | .data | +--------+ +--------+ ``` This change is careful to allocate new space for .sbat and to leave .compat to continue doing what it did before. Even though the .sbat data will be much smaller than 0x1000 bytes, the UEFI standard apparently wants each section to be 4kB aligned with no gaps between sections so the size of each section except the last needs to be a multiple of 0x1000. With CONFIG_EFI_MIXED=n (XenServer default) ``` $ objdump -s -j .sbat arch/x86/boot/bzImage 0 .setup 00004000 0000000000001000 0000000000001000 00001000 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .sbat 00001000 0000000000005000 0000000000005000 00005000 2**4 CONTENTS, ALLOC, LOAD, READONLY, DATA 2 .text 00ab5000 0000000000006000 0000000000006000 00006000 2**4 CONTENTS, ALLOC, LOAD, READONLY, CODE 3 .data 00000400 0000000000abb000 0000000000abb000 00abb000 2**4 CONTENTS, ALLOC, LOAD, DATA ``` ``` $ objdump -s -j .sbat arch/x86/boot/bzImage arch/x86/boot/bzImage: file format pei-x86-64 Contents of section .sbat: 5000 73626174 2c312c53 42415420 56657273 sbat,1,SBAT Vers 5010 696f6e2c 73626174 2c312c68 74747073 ion,sbat,1,https 5020 3a2f2f67 69746875 622e636f 6d2f7268 ://github.com/rh 5030 626f6f74 2f736869 6d2f626c 6f622f6d boot/shim/blob/m 5040 61696e2f 53424154 2e6d640a 6c696e75 ain/SBAT.md.linu 5050 782e7873 2c312c43 6c6f7564 20536f66 x.xs,1,Cloud Sof 5060 74776172 65204772 6f75702c 6c696e75 tware Group,linu 5070 782c312c 6d61696c 746f3a73 65637572 x,1,mailto:secur 5080 65406369 74726978 2e636f6d 0a000000 [email protected].... 5090 00000000 00000000 00000000 00000000 ................ 50a0 00000000 00000000 00000000 00000000 ................ 50b0 00000000 00000000 00000000 00000000 ................ ... 5fd0 00000000 00000000 00000000 00000000 ................ 5fe0 00000000 00000000 00000000 00000000 ................ 5ff0 00000000 00000000 00000000 00000000 ................ ``` With CONFIG_EFI_MIXED=y ``` Sections: Idx Name Size VMA LMA File off Algn 0 .setup 00003000 0000000000001000 0000000000001000 00001000 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .compat 00001000 0000000000004000 0000000000004000 00004000 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 2 .sbat 00001000 0000000000005000 0000000000005000 00005000 2**4 CONTENTS, ALLOC, LOAD, READONLY, DATA 3 .text 00ab9000 0000000000006000 0000000000006000 00006000 2**4 CONTENTS, ALLOC, LOAD, READONLY, CODE 4 .data 00000400 0000000000abf000 0000000000abf000 00abf000 2**4 CONTENTS, ALLOC, LOAD, DATA ``` I have tested that kernels built with CONFIG_EFI_MIXED=y and CONFIG_EFI_MIXED=n both boot fine. Signed-off-by: Kevin Lampis <[email protected]>

It can be disabled by passing lockdown_disable on the kernel command-line, only if Secure Boot is not enabled. Signed-off-by: Ross Lagerwall <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): - 260017f `("lsm: use default hook return value in call_int_hook()") simplified the `call_int_hook` macro, so align our usage in `security_lock_kernel_down()`]

…ways enabled unless Secure Boot is disabled and sig_enforce=false is passed. This is implemented by setting CONFIG_MODULE_SIG_FORCE but allowing it to be disabled from the command-line with this patch. The command-line setting is overridden in arch_get_ima_policy() so if Secure Boot is enabled, sig_enforce will always be true. Signed-off-by: Kevin Lampis <[email protected]> Signed-off-by: Ross Lagerwall <[email protected]>

Shim is supposed to create an EFI table containing the MoK configuration which the kernel accesses via the efi_mokvar_* functions. In some cases (possibly due to Xen), this table is not available. The configuration is also made available via EFI variables and the kernel has fallback paths that use the variables if the table is not available. Fall back to using the EFI variable in this case too. With this change, MoK keys correctly get added to the machine keying which is trusted and can be used to verify modules (rather than the untrusted platform keyring). Signed-off-by: Ross Lagerwall <[email protected]>

…abled. By default the kernel does not allow to access (read and/or write) any writeable file in debugfs filesystem. However is useful for user space to access these file for statistics and debugging issues. So, if lockdown mode is enabled provide read-only access to these files. Signed-off-by: Frediano Ziglio <[email protected]>

These hypercall filter should build against these headers. If the version of Xen changes, the headers here should be updated along with the hypercall filter. Signed-off-by: Ross Lagerwall <[email protected]>

Signed-off-by: Chunjie Zhu <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): Dropped xs_kabi_padding evry time UEK_KABI_RESERVE(n) was already present, keeped the hunks that were missing]

Signed-off-by: Chunjie Zhu <[email protected]>

The standard implementation of pci_reset_function() does not try an SBR if there are other sibling devices. This is a common configuration for multi-function devices (e.g., GPUs with a HDMI audio device function). If all the functions are co-assigned to the same domain at the same time, then it is safe to perform an SBR to reset all functions at the same time. Add a "reset" sysfs file with the same interface as the standard one (i.e., write 1 to reset) that will try an SBR if all sibling devices are unbound or bound to pciback. Note that this is weaker than the requirement for functions to be simultaneously co-assigned, but the toolstack is expected to ensure this. Signed-off-by: David Vrabel <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): commit 88801d0 ("xen/pci: Add a function to reset device for xen") slightly reworked how to do a pci_reset, integrate these changes into __pcistub_reset_function(), then call __pcistub_reset_function() instead of pcistub_reset_device_state() ]

not expecting the LUN change. Lest it may trigger udev rules which in turn trigger more LUN change events. Signed-off-by: Gerald Elder-Vass <[email protected]>

Xen cannot determine whether a hypercall originates from userspace or kernel space. A malicious userspace may use Xen to attack the dom0 kernel. To avoid this, instead of allowing all hypercalls verbatim, block all hypercalls by default. Allow through certain hypercalls with checks. Checking involves: 1) Any pointers need to point at valid userspace memory. For this, we also need to know the size of the buffer. 2) Buffers containing pointers need to be bounced to avoid userspace changing the memory after it has been checked. After the hypercall completes, the memory needs to be bounced back. In addition, for the kexec_load call, verify the kernel to maintain Secure Boot integrity. This patch is technical debt and should be replaced by a stable hypervisor ABI and a generic mechanism for the dom0 kernel to check hypercalls made from userspace. Signed-off-by: Ross Lagerwall <[email protected]> Signed-off-by: Frediano Ziglio <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): - 260017f `("lsm: use default hook return value in call_int_hook()") simplified the `call_int_hook` macro, so align our usage in `security_locked_down_nowarn()`]

casasnovas assigned casasnovas and unassigned casasnovas Mar 19, 2026

casasnovas self-requested a review March 19, 2026 16:34

Vates Git Importer and others added 4 commits March 20, 2026 10:19

expose-xsversion.patch

972d665

Write each kthread's pid into xenstore so that it can be ioniced by the

ea024a5

toolstack. Relative to the VBD, the xenstore key is: queue-NNN/kthread-pid where NNN is the queue number starting from 0. Signed-off-by: Ross Lagerwall <[email protected]>

CocoElAbricot force-pushed the corentin-xcpng-9/xenserver-rebased-patchqueue branch from 86b7451 to 07f59c1 Compare March 26, 2026 13:55

CocoElAbricot changed the base branch from corentin-xcpng-9/base to Corentin-UEK8/u1/base March 26, 2026 13:56

casasnovas requested changes Mar 26, 2026

View reviewed changes

Vates Git Importer and others added 4 commits March 27, 2026 10:13

enable-fragmention-gre-packets.patch

807f1fc

CocoElAbricot force-pushed the corentin-xcpng-9/xenserver-rebased-patchqueue branch 2 times, most recently from 4be73cd to cc9d604 Compare March 27, 2026 14:42

CocoElAbricot force-pushed the corentin-xcpng-9/xenserver-rebased-patchqueue branch from cc9d604 to 654f209 Compare March 27, 2026 14:51

chunjiez and others added 9 commits March 27, 2026 16:26

pci: export pci_probe_reset_function()

32e73a8

The xen-pciback needs to test if a per-function reset is available so it can provide an interface for a safe bus reset instead. Signed-off-by: David Vrabel <[email protected]>

pciback-mask-root-port-comp-timeout.patch

eede887

Some devices advertise FLReset+ but do not support it. Add a quirk for

d7a7706

some devices to fallback to an alternate reset method. Signed-off-by: Ross Lagerwall <[email protected]>

CA-135938-nfs-disconnect-on-rpc-retry.patch

305268d

sunrpc-force-disconnect-on-connection-timeout.patch

054c3b0

xen-ioemu-inject-msi.patch

f51d90e

Vates Git Importer and others added 17 commits March 27, 2026 16:26

0001-xen-swiotlb-size-128MiB.patch

f91f3ce

Import the Xen public headers in preparation for filtering hypercalls

b746c52

These hypercall filter should build against these headers. If the version of Xen changes, the headers here should be updated along with the hypercall filter. Signed-off-by: Ross Lagerwall <[email protected]>

module-hash-revocation.patch

61e3de7

CP-46343: common data structure padding

53e8e98

Signed-off-by: Chunjie Zhu <[email protected]> [ Corentin: rebase from XS-9 kernel (v6.6.98 based) to UEK8 kernel (v6.12 based): Dropped xs_kabi_padding evry time UEK_KABI_RESERVE(n) was already present, keeped the hunks that were missing]

CP-46343: reserve cpuid leaves for future use

2c0ab80

Signed-off-by: Chunjie Zhu <[email protected]>

disable-mitigations-by-default.patch

43d58b7

call-kexec-before-offlining-noncrashing-cpus.patch

404294d

SDEV_EVT_LUN_CHANGE_REPORTED events should only be reported when we're

4df0651

not expecting the LUN change. Lest it may trigger udev rules which in turn trigger more LUN change events. Signed-off-by: Gerald Elder-Vass <[email protected]>

CocoElAbricot force-pushed the corentin-xcpng-9/xenserver-rebased-patchqueue branch from 654f209 to 01e920f Compare March 27, 2026 15:41

Conversation

CocoElAbricot commented Mar 19, 2026

Uh oh!

casasnovas commented Mar 19, 2026

Uh oh!

CocoElAbricot commented Mar 26, 2026

Uh oh!

casasnovas left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

casasnovas Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

casasnovas commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

CocoElAbricot commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

casasnovas Mar 26, 2026 •

edited

Loading

casasnovas commented Mar 26, 2026 •

edited

Loading