The bug is that Firecracker opens the metrics FIFO with O_NONBLOCK. It used to open the FIFO read/write (O_RDWR), but now it opens it write-only (O_WRONLY). Opening a FIFO write-only with O_NONBLOCK fails with ENXIO when nothing has the FIFO open for reading. It used to work because of non-standard Linux behavior: a read/write open succeeds even with no other process attached, because the opener counts as its own reader, so the kernel accepts the nonblocking write side.
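The difference between the two open modes can be reproduced outside Firecracker. This is a minimal sketch, assuming a Linux host. The FIFO path and the `try_open` helper are made up for illustration and are not taken from Firecracker or from our code.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Open path with the given flags; return (fd, None) or (None, errno name)."""
    try:
        return os.open(path, flags), None
    except OSError as exc:
        return None, errno.errorcode[exc.errno]


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical FIFO standing in for the metrics FIFO; nothing is reading it.
    fifo_path = os.path.join(tmpdir, "metrics.fifo")
    os.mkfifo(fifo_path)

    # Write-only + nonblocking with no reader attached: the open itself fails.
    wr_fd, wr_err = try_open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)

    # Read/write + nonblocking: succeeds on Linux, since the opener is also a reader.
    rw_fd, rw_err = try_open(fifo_path, os.O_RDWR | os.O_NONBLOCK)

    print("O_WRONLY | O_NONBLOCK:", wr_err or "ok")
    print("O_RDWR   | O_NONBLOCK:", rw_err or "ok")

    # Close whichever descriptors were actually opened.
    for fd in (wr_fd, rw_fd):
        if fd is not None:
            os.close(fd)
```

On Linux the first open should report ENXIO and the second should succeed. POSIX leaves read/write opens of a FIFO undefined, so other systems may behave differently.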
We can ask Firecracker to go back to opening the FIFO read/write to fix it. Fixing it on our end looks tough: we create the metrics FIFO as part of the API call that starts the VM, but with the write-only open the VM can't start unless something is already reading the FIFO.
We probably had potential metrics/log loss here anyway: if the reader doesn't open the FIFO fast enough, the FIFO buffer fills.
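To illustrate the buffer-fill case, here is a rough sketch, again assuming Linux. It opens a throwaway FIFO read/write and nonblocking, never reads from it, and writes until the kernel refuses. The path and the chunk size are arbitrary, and what Firecracker actually does when a write fails this way is not shown here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical FIFO standing in for the metrics FIFO.
    fifo_path = os.path.join(tmpdir, "metrics.fifo")
    os.mkfifo(fifo_path)

    # O_RDWR so the open succeeds without a separate reader (Linux behavior).
    fd = os.open(fifo_path, os.O_RDWR | os.O_NONBLOCK)

    chunk = b"x" * 1024  # small enough that each write is all-or-nothing
    buffered = 0
    buffer_full = False
    try:
        # The cap is only a safety net in case the buffer never fills.
        while buffered < 64 * 1024 * 1024:
            buffered += os.write(fd, chunk)
    except BlockingIOError:
        # EAGAIN: the FIFO buffer is full and nobody has drained it.
        buffer_full = True
    finally:
        os.close(fd)

    print(f"buffered {buffered} bytes before the write would block: {buffer_full}")
```

Once the buffer is full, every further nonblocking write fails until a reader drains the FIFO, so anything the writer does not retry is lost.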