Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds support for BGP unnumbered.
Depends on
The Main Event
Unnumbered peers are added as a new type of peer with their own set of API functions. When an unnumbered peer is added it goes into the pending state. The new peer request is handed over to an unnumbered peer manager where IPv6/NDP router advertisements (RA) are used to discover the address of the peer over the interface provided. When a router is discovered on the interface the pending peer then turns into a real peer. We use the IPv6 link-local address discovered in the RA and create a BGP session using that address. From here the extant BGP machinery works in the same way it currently does.
Neighbor discovery is a one-shot process when the unnumbered peer is created. Once we determine the peer address through an RA, that is what we use. If the peer address changes, the session needs to be deleted and recreated so discovery will happen again. No attempt is made to track neighbor state beyond initial session establishment. This may be revisited in a future release. One shot semantics has the benefit of being simple and stable. I don't believe we want to be overly sensitive to NDP RA dynamics as that would undercut the intent of things like BGP hold time and TCP connection tracking. Having a persistent monitor of NDP dynamics will need to be designed carefully.
Unnumbered support is designed to work in a point to point setting. We do not currently handle situations where there are multiple reachable routers reachable over a broadcast domain that the unnumbered interface is connected to. We simply attempt to establish a session with the first peer we find. This is how things have generally worked for unnumbered in my experience. We could develop something more robust, but this becomes a rather complex state machine where we have to try to establish sessions with every peer we discover. This could lead to doomed BGP sessions iterating through a state machine indefinitely.
Testing
This PR also adds a new testing substrate in the
falcon-labfolder. BGP unnumbered requires multicast ICMPv6/NDP to work. This means we need a testing environment that routers connected over a broadcast domain, and some other unnumbered implementation to peer with to ensure we are not deluding ourselves with self-peering. Using illumos zones would have been nice here, but none of the BGP implementations I'm aware of on illumos besides mgd fully support unnumbered. FRR on the other hand has great support for unnumbered. So peering with a linux-based router puts us in falcon territory.Another really important thing to test for unnumbered is mg-lower interaction. We need test coverage on nexthop interface mapping for data plane route installation. The interop test environment is currently upper-half only. Due to license nonsense, the interop topology machinery is also stuck in purgatory. So in this PR I decided to add a new falcon-based testing substrate that only uses unencumbered peers and the oxide nodes are softnpu vms so we have a full lower half with dendrite, P4 and the whole 9 yards. This also means the testing machinery can actually live in this repository which is a nice win. The best way to get acquainted with this new testing substrate is starting by reading through the first test.