
@bdkabiruddin

Description

This PR addresses issue #599 by enabling trellis-cli to work on Linux using Lima with QEMU.

Currently, trellis-cli relies on vmType: vz and macOS-specific networking, which makes VMs inaccessible on Linux (the host cannot route to the VM's IP).

Changes Implemented

1. Dynamic VM Type

  • Updated pkg/lima/instance.go and config.yml to switch between vz (macOS) and qemu (Linux).
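A minimal sketch of the selection logic, for illustration only. The helper name `vmTypeFor` is hypothetical and the actual code in pkg/lima/instance.go may be structured differently; falling back to vz on any non-Linux OS is an assumption made to keep the existing macOS behavior as the default.

```go
package main

import (
	"fmt"
	"runtime"
)

// vmTypeFor returns the Lima vmType for a given GOOS value.
// Hypothetical helper: Linux gets "qemu", everything else keeps
// the existing "vz" default (assumption).
func vmTypeFor(goos string) string {
	if goos == "linux" {
		return "qemu"
	}
	return "vz"
}

func main() {
	// The resulting value would be written into the generated Lima config.
	fmt.Printf("vmType: %s\n", vmTypeFor(runtime.GOOS))
}
```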

2. Bridged Networking (The "TAP" Solution)

Since Lima on Linux (QEMU) uses user-mode networking by default, the VM's IP is not reachable from the host, so I implemented a bridged network approach:

  • Host Side: The CLI automatically checks for and creates a tap0 interface (assigned 192.168.56.1) on the host using sudo.
  • VM Side: A provision script creates a corresponding interface inside the VM assigned to 192.168.56.5.
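The host-side check could look roughly like the sketch below. It only prints the commands rather than running them. The function names, the /24 prefix length, and the use of the `USER` environment variable are assumptions for illustration; the `ip tuntap`, `ip addr`, and `ip link` invocations are standard iproute2 commands, but the PR may issue different ones.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// tapExists reports whether a network interface with this name is present.
func tapExists(name string) bool {
	_, err := net.InterfaceByName(name)
	return err == nil
}

// tapSetupCommands returns the privileged commands that would create the
// TAP device, assign the host address, and bring the link up.
// Hypothetical helper; the /24 prefix passed in by main is an assumption.
func tapSetupCommands(name, cidr, user string) [][]string {
	return [][]string{
		{"sudo", "ip", "tuntap", "add", "dev", name, "mode", "tap", "user", user},
		{"sudo", "ip", "addr", "add", cidr, "dev", name},
		{"sudo", "ip", "link", "set", "dev", name, "up"},
	}
}

func main() {
	const name = "tap0"
	if tapExists(name) {
		fmt.Println(name, "already exists; nothing to do")
		return
	}
	// Dry run: print each command instead of executing it.
	for _, cmd := range tapSetupCommands(name, "192.168.56.1/24", os.Getenv("USER")) {
		fmt.Println(cmd)
	}
}
```

Checking for the interface first keeps the sudo prompt out of the common case where tap0 is already configured.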

3. QEMU Wrapper

Lima does not currently expose a way to inject specific -netdev tap arguments.

  • Implemented a temporary qemu-system-x86_64 wrapper script in pkg/lima/manager.go.
  • This script injects the TAP network device arguments before passing control to the real QEMU binary.
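As a rough illustration of the wrapper idea, the sketch below generates a shell script that appends TAP arguments and then execs the real binary. The function names, the netdev id `trellis0`, and the choice of `virtio-net-pci` are assumptions; the real script in pkg/lima/manager.go also handles the help/version flags that Lima probes, which this sketch leaves out. It also assumes the QEMU path contains no single quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// tapArgs returns the extra QEMU arguments that attach a TAP-backed NIC.
// The id "trellis0" and the virtio-net-pci device are illustrative choices.
func tapArgs(tapName string) []string {
	return []string{
		"-netdev", "tap,id=trellis0,ifname=" + tapName + ",script=no,downscript=no",
		"-device", "virtio-net-pci,netdev=trellis0",
	}
}

// wrapperScript renders a POSIX shell script that forwards all original
// arguments to the real QEMU binary and appends the TAP arguments.
func wrapperScript(realQEMU, tapName string) string {
	return fmt.Sprintf("#!/bin/sh\nexec '%s' \"$@\" %s\n",
		realQEMU, strings.Join(tapArgs(tapName), " "))
}

func main() {
	fmt.Print(wrapperScript("/usr/bin/qemu-system-x86_64", "tap0"))
}
```

Using `exec` replaces the shell with QEMU, so signals and the exit code reach Lima unchanged.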

4. IP Detection

  • Updated Instance.IP() to prioritize the custom bridged subnet (192.168.56.x) on Linux.
  • Falls back to the default NAT IP if the bridge is missing or if running on macOS.
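The prioritization can be sketched as follows. `pickIP` is a hypothetical helper, not the actual body of Instance.IP(), and treating the bridged subnet as 192.168.56.0/24 is an assumption based on the addresses above.

```go
package main

import (
	"fmt"
	"net"
)

// pickIP prefers an address in the bridged subnet on Linux and otherwise
// returns the fallback (the default NAT IP). Hypothetical helper.
func pickIP(goos string, candidates []string, fallback string) string {
	if goos != "linux" {
		return fallback
	}
	// Assumed bridged subnet; the PR only states "192.168.56.x".
	_, bridged, err := net.ParseCIDR("192.168.56.0/24")
	if err != nil {
		return fallback
	}
	for _, c := range candidates {
		if ip := net.ParseIP(c); ip != nil && bridged.Contains(ip) {
			return c
		}
	}
	return fallback
}

func main() {
	addrs := []string{"192.168.5.15", "192.168.56.5"}
	fmt.Println(pickIP("linux", addrs, "192.168.5.15"))
	fmt.Println(pickIP("darwin", addrs, "192.168.5.15"))
}
```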

Verification

  • Tested on Ubuntu 22.04.
  • Confirmed macOS logic remains untouched via runtime.GOOS checks.

@swalkinshaw
Member

This is a creative solution. Thanks for proposing it. Using Lima as the Linux solution is certainly more consistent and easier to implement than my other idea of using LXD.

I'll try to test this out using a VM within a VM. A couple of notes:

  • Obviously it would be much nicer if Lima allowed arbitrary QEMU command-line args to be specified in a config so we wouldn't need the wrapper hack. Maybe you or I can open an issue and see what they think of that.
  • Implementing the fake "help/version commands" for compat is one of the main things I'd rather avoid, in case those diverge and break.

How is the performance of the QEMU-based VM and this TAP networking mode? I assume the TAP part has no perf penalty.

@swalkinshaw
Member

Related: lima-vm/lima#358 and lima-vm/lima#4371

@bdkabiruddin
Author

Thanks @swalkinshaw for the feedback and for reviewing this!

Regarding your points:

  1. Wrapper Hack & Upstream Issue:
    I completely agree that the wrapper is a workaround. Native support in lima.yaml for arbitrary QEMU arguments would be the ideal solution. I will open a feature request in the lima-vm/lima repository proposing a configuration option (e.g., qemu.extraArgs) to handle this natively. If and when that lands, we can remove the wrapper logic entirely.

  2. Compatibility:
    I share the concern about maintainability. The wrapper currently intercepts only the specific help/version flags to prevent crashes during Lima's checks. For all other invocations, it passes execution straight through to the real binary, so the risk of divergence should be low.

  3. Performance:
    In my testing on Ubuntu 22.04 (KVM enabled), the performance feels near-native. The TAP interface adds negligible overhead, and site response times are snappy.

Additional Context:
I'd like to highlight that currently, trellis-cli is effectively non-functional on Linux because of the networking isolation. While this solution involves a wrapper, merging it would immediately unblock a large number of Linux developers who are currently unable to work.

I also think that testing this in other Linux environments (beyond my Ubuntu setup) is a good idea to ensure stability across different distributions.
