Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BGP integration #182

Merged
merged 15 commits into from
Oct 4, 2024
Merged

Conversation

mkalcok
Copy link
Contributor

@mkalcok mkalcok commented Sep 12, 2024

This is relatively large piece of work that enables integration between MicroOVN and external BGP daemons. It's split into multiple commits for easier review, please review them individually. Commit messages and accompanied documentation pages try to explain intended usage and implementation details as much as possible.

Note: This is an experimental feature that, among other things, includes OVN patches not yet accepted in the upstream.

@mkalcok mkalcok requested a review from a team as a code owner September 12, 2024 16:08
@mkalcok mkalcok force-pushed the bgp-service branch 7 times, most recently from c4042a7 to 837db1e Compare September 16, 2024 11:31
Copy link
Member

@fnordahl fnordahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Martin, first and foremost thank you so much for your work on this feature.

Making all of the in-flight pieces come together like this is non-trivial work, and I really appreciate the care you have taken for everything from UX, adding documentation and end to end tests.

Being able to observe all the pieces together like this is of utmost importance for us to figure out the details of what the final implementation will look like, and you have unlocked our ability to be able to experiment and iterate our way towards a great end to end feature.

With some small adjustments we can probably put this in the hands of interested early adopters too and unlock their thoughts and ideas on how we could make this useful.

I have some comments in-line, please have a look.

Given the size of the PR I have prioritized the areas where I saw the most potential for improvement, so there are other parts where I still may have opinions but we need to prioritize what to address now and what to address as we iterate towards the final product in the coming cycle.

snap/hooks/install Show resolved Hide resolved
mkdir -p "$SNAP_COMMON/run/frr"

# Seed default config files for FRR
cp -r "$SNAP"/etc/frr/* "$frr_etc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline at end of file please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 208 to 243
rtrlib:
build-packages:
- cmake
- make
- gcc
- libssh-dev
stage-packages:
- libssh-4
- zlib1g
prime:
- lib/librtr.so*
- usr/lib/$SNAPCRAFT_ARCH_TRIPLET/libssh.so*
source: https://github.com/rtrlib/rtrlib.git
source-type: git
source-tag: v0.8.0
plugin: cmake
cmake-parameters:
- -DCMAKE_BUILD_TYPE=Release
libyang:
build-packages:
- cmake
- make
- gcc
- libpcre2-dev
stage-packages:
- libpcre2-8-0
source: https://github.com/CESNET/libyang.git
source-type: git
source-tag: v2.1.128
plugin: cmake
cmake-parameters:
- -DCMAKE_INSTALL_PREFIX:PATH=/usr
- -DENABLE_LYD_PRIV=ON
- -DENABLE_CACHE=ON
- -DCMAKE_BUILD_TYPE:String="Release"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: What is the reasoning behind building these two dependencies from source?

There appear to be packages with similar versions in the archive (librtr-dev and libyang2-dev). I see that the noble version for libyang2-dev is a bit off (2.1.30 vs. 2.1.128). Is that it?

If so I guess it's fine, we could always move back to build-packages once an edge version of core26 is made available, as Oracular appear to have a newer version (2.1.148).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. The reason is that FRR requires for libyang 2.1.128

@@ -263,7 +411,7 @@ parts:
export CGO_LDFLAGS_ALLOW="(-Wl,-wrap,pthread_create)|(-Wl,-z,now)"

# Check that `golangci-lint` is happy with the code.
golangci-lint run --verbose
golangci-lint run --verbose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this white space change intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, it sneaked in when I was "undoing" the linter skip :D. I'll fix it

@@ -347,3 +495,9 @@ parts:
- bin/jq
- lib/*/libjq.so*
- lib/*/libonig.so*

layout:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are building FRR from source, is the layout necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The /var/tmp/frr path is hardcoded so we won't get around that.

We could compile with custom --sysconfigdir, but if we ever switch to FRR from package, we'll have one less thing to do if we just leave both of these layouts in place now.

@@ -0,0 +1,91 @@
========================
OVN integration with BGP
========================
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I absolutely love how you have taken the care to properly document this, even though this is for a experimental feature branch. Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How else would I know what I submitted here 2 weeks ago? 😆

Comment on lines +35 to +50
# microovn_start_bgp_unnumbered CONTAINER INTERFACE ASN VRF
#
# configure FRR bundled with MicroOVN in the CONTAINER, to
# start BGP in the unnumbered mode, listening on the INTERFACE
# in the VRF with ASN.
function microovn_start_bgp_unnumbered() {
local container=$1; shift
local interface=$1; shift
local asn=$1; shift
local vrf=$1; shift

lxc_exec "$container" "microovn.vtysh \
-c \"configure\" \
-c \"router bgp $asn vrf $vrf\" \
-c \"neighbor $interface interface remote-as internal\""
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented on another commit, I'd like MicroOVN to take responsibility of this configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the addition of the asn option, we now have tests that test both "automatic" and "manual" FRR configuration.

echo "# ($MICROOVN_BGP_CONTAINER) waiting on established BGP with $container" >&3
wait_until "microovn_bgp_established $MICROOVN_BGP_CONTAINER $vrf_device $container"
done
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check redistribution of routes? At this stage it could be as simple as adding a couple of NAT entries and checking that our bare bones BGP config allows the route to this address show up in the routing table in the remote BGP peer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we be testing FRR/OVN at that point? I don't really mind too much, especially if it's trivial, however I'm bit wary about unnecessary testing stuff that's outside of our responsibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not yet added.

@@ -4,6 +4,8 @@ setup_file() {
load test_helper/microovn.bash
load test_helper/bgp_utils.bash

# Ensure that required VRF kernel module is available on the host
lsmod | awk '{print $1}'| grep -i "\<vrf\>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment on using LXD profile to load the module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, yes, that was indeed better approach. This was removed.

@@ -130,6 +130,8 @@ jobs:
sudo snap set lxd daemon.group=adm
sudo lxd init --auto
snap list
sudo apt install -y linux-modules-extra-$(uname -r)
sudo modprobe vrf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a precedent in the code base for installing kernel modules through LXD:

linux.kernel_modules: openvswitch

Using that would make loading automatic regardless of environment, and presumably it would also provide a useful error message if the module is not available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point! thank you. I didn't realize that we can use this mechanism. The lxc launch fails with

Error: Failed to load kernel module 'vrf': Failed to run: modprobe -b vrf: exit status 1

if the module is not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had bit of a dilemma here as I didn't want to enforce vrf kernel module for all containers and I didn't want to duplicate the profile either. I ended up using launch_containers_args function that allows us to pass -c linux.kernel_modules=vrf,openvswitch arg that overrides the profile just for the containers that need the vrf.

My concern with requiring VRF for all containers was that it' only needed in single test suite and reuqires installation of quite heavy package linux-modules-extra. Given that we distribute our test runs across many machines, most of them would install this package unnecessarily.

@mkalcok mkalcok force-pushed the bgp-service branch 2 times, most recently from ed7952d to 7581f22 Compare September 25, 2024 13:45
var allErrors error
// Find and remove Logical Router used for BGP redirect
logicalRouters, err := ovnCmd.NBCtlCluster(ctx, "--bare", "--columns", "name",
"find", "logical_router", fmt.Sprintf("external-ids:%s=true", BgpManagedTag),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hits too broadly and would remove other nodes configuration upon disabling bgp on a single node, I have confirmed that this is happening.

}

// Cleanup ovn-bridge mappings for external networks
ovnBridgeMapping, err := ovnCmd.VSCtl(ctx, s, "get", "Open_vSwitch", ".", "external-ids:ovn-bridge-mappings")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call creates an error when it does not exist, which is the case when a user has ran microovn enable bgp without options only to then do microovn disable bgp.

mkalcok added 15 commits October 2, 2024 23:10
This change adds `frr` part to the MicroOVN snap with following
FRR components enabled:
 * Zebra daemon
 * BGP daemon
 * VTYSH CLI application for communication with daemons

Due to issues with privlege separation module, we are unable to use
FRR from ubuntu repositories at this time. Biggest issue is that
FRR specifies allowed users and groups at build time [0]. During
the runtime, `zebra` daemon tries to make sure that the user is part
of the `frrvty` group but this fails due to the sctrict confiment.
As a result, we have to build the FRR from sources ourselves.

Note: BGP daemon sidesteps this issue by using -S option [1].

[0] https://docs.frrouting.org/en/stable-10.0/installation.html#cmdoption-configure-enable-vty-group
[1] https://docs.frrouting.org/en/stable-10.0/bgp.html#cmdoption-bgpd-S

Signed-off-by: Martin Kalcok <[email protected]>
Logical Router traffic redirect for BGP and BFD daemons
is supported starting with version 24.09. Therefore, we
need to build OVN from source.

Signed-off-by: Martin Kalcok <[email protected]>
These patches come from series posted by @fnordahl [0] and
they have been slightly modified to fit on top of OVN's branch
"branch-24.09".

[0] https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/

Signed-off-by: Martin Kalcok <[email protected]>
Due to IPV6 RAs from Logical Router not reaching
external networks [0], we need to include a patch
that removes blocking of RAs from BGP redirect ports.

BGP unnumbered relies on unsolicited RAs to discover
its peers and currently easies path to achieve that, is
to remove blocking of RAs from the BGP redirect ports.

In the future the proper solution would be to fix OVN
and allow periodic RAs from Logical Router Ports to reach
external networks.

Signed-off-by: Martin Kalcok <[email protected]>
Current implementation of `NBCtl` function supports only local
unix socket communication. This change splits adds a `NBCtlCluster`
function that connects to the network endpoints of the NB cluster.
This function utilizes snap's 'ovn-nbctl' command wrapper to automatically
set correct network endpoints and TLS certificates.

Signed-off-by: Martin Kalcok <[email protected]>
This change adds option to run `microovn (enable|disable) bgp`
commands. When executed, these commands start or stop BGP
and Zebra services provided by FRR.

This change also inclues abillity to pass additional config
options to `microovn enable` commands in format
"--config key=value".

Signed-off-by: Martin Kalcok <[email protected]>
This change enables users to specify external connections specific
to each MicroOVN node. External connection is defined by a physical
port name and IPv4 address of the Logical Router Port that will be
connected to the network provided by the physical port.

If these external connections are defined, MicroOVN will set up new
OVS ports (one for each external connection) and redirect BGP+BFD
traffic received from external connection to them.
These ports reside in a separate (shared) VRF to which OVN
redistributes addresses of the locally accessible NAT and Load
Balancers. Users can bind BGP daemons (like the bundled FRR) to these
ports and use them to announce these addresses to their BGP peers.

Signed-off-by: Martin Kalcok <[email protected]>
When users disable bgp service, we need to clean up
resources that were created for BGP redirection. This
change utilises "external-ids" column on various
OVN/OVS resources to tag them when they are created.
During the teardown, we can then find those resources
based on the external-ids and remove them.

Signed-off-by: Martin Kalcok <[email protected]>
While most of the resources are managed by OVN and on service (or
machine) restart will be re-instated, some of the resources need
manual attention.

Specifically, the interfaces that are used for BGP redirection,
need to be re-assigned to the VRF and they need to have their
IP address reconfigured.

In the future implementations, it'd be best if we could offload
management of interfaces to some other tool, for example, Netplan.

Signed-off-by: Martin Kalcok <[email protected]>
Add support for config option '--config asn=<number>' when
enabling BGP integration. When this option is supplied,
MicroOVN will autoconfigure FRR's BGP daemons on each
BGP-redirect interface in "unnumbered" mode that allows
automatic discovery of neighbors.

Signed-off-by: Martin Kalcok <[email protected]>
Explanation and How-To guide for using the MicroOVN's
BGP integration.

Signed-off-by: Martin Kalcok <[email protected]>
Added helper function that installs APT packages
in containers.

Signed-off-by: Martin Kalcok <[email protected]>
These tests verify that when BGP is enabled in
the MicroOVN via `microovn enable bgp` with
appropriate `--config` options, external BGP
daemons are able to establish connection with
the daemon that listens on the OVN side.

Signed-off-by: Martin Kalcok <[email protected]>
This module is required for BGP intgration testing.

Signed-off-by: Martin Kalcok <[email protected]>
This branch of MicroOVN is based on OVN 24.09. That means that
it can no longer directly upgrade from current default MicroOVN
snap channel (22.03).

These changes set the base channel for upgrade tests to 24.03/beta

Note: Since the MicroOVN 24.03 does not have a 'stable' channel yet,
we need to use 'beta'

Signed-off-by: Martin Kalcok <[email protected]>
@mkalcok
Copy link
Contributor Author

mkalcok commented Oct 2, 2024

Thanks for catching these issues. I fixed them and updated tests.
However I found something that I can't explain yet. In tests that enable BGP on multiple chassis (regardless of whether it's with single or with two external neighbours), some of the BGP sessions are stuck in Waiting for peer OPEN (n/a) state and never reach Established. So far I can't say why that is, as there doesn't appear to be anything wrong with the setup. I'll dig deeper into it, currently a workaround is to introduce few seconds of sleep between enabling the BGP services on MicroOVN nodes.
This is of course not a valid solution, so I'll put the PR into Draft stage, but I think it's otherwise ready for another round of review.

@mkalcok mkalcok marked this pull request as draft October 2, 2024 21:20
@fnordahl
Copy link
Member

fnordahl commented Oct 3, 2024

Thanks for catching these issues. I fixed them and updated tests. However I found something that I can't explain yet. In tests that enable BGP on multiple chassis (regardless of whether it's with single or with two external neighbours), some of the BGP sessions are stuck in Waiting for peer OPEN (n/a) state and never reach Established. So far I can't say why that is, as there doesn't appear to be anything wrong with the setup. I'll dig deeper into it, currently a workaround is to introduce few seconds of sleep between enabling the BGP services on MicroOVN nodes. This is of course not a valid solution, so I'll put the PR into Draft stage, but I think it's otherwise ready for another round of review.

I have not looked yet, but from your description of BGP sessions failing to form, my hunch would be FRRouting/frr#8668 (comment).

Essentially there can only be one unnumbered BGP peer per interface, and the test uses a single L2 broadcast domain for all the containers, so I think this is what we see here.

I guess we could have the test create more LXD networks, one per host-ToR link, to simulate the point to point nature of a physical cable?

@mkalcok
Copy link
Contributor Author

mkalcok commented Oct 3, 2024

Thanks for the suggestion, but unfortunately, I don't think that this is the problem. Tests already create individual networks for each ToR <-> Chassis connection. (Given that this is indeed pretty complex setup, I included more details about it on top of the setup_file for bgp tests)

@fnordahl
Copy link
Member

fnordahl commented Oct 3, 2024

Thanks for the suggestion, but unfortunately, I don't think that this is the problem. Tests already create individual networks for each ToR <-> Chassis connection. (Given that this is indeed pretty complex setup, I included more details about it on top of the setup_file for bgp tests)

Indeed you do, that's awesome, and sorry for missing that. I'll have a peek and see if I spot anything.

@mkalcok mkalcok marked this pull request as ready for review October 3, 2024 15:13
@mkalcok
Copy link
Contributor Author

mkalcok commented Oct 3, 2024

I changed my mind on the topic of merging this PR. Given that this is going to be an experimental feature branch (not main or stable), I think it's OK to merge this even with the unresolved issue mentioned in the comment above. The issue seems to manifest only when BGP is enabled in quick succession on multiple chassis. Any deployments configured by hand (or with slight backoff) should work fine.

If this ends up merged, I'll open a LP ticket regarding this issue and we can look at it later. (I have a slight suspicion that it may be even an upstream OVN issue)

Copy link
Member

@fnordahl fnordahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few unresolved conversations on this PR, being a feature branch I think that is fine as we don't know which parts of this will survive into the final version yet. Let's revisit once we integrate back to main.

LGTM, thank you for your work on this!

@fnordahl fnordahl merged commit fa34817 into canonical:experimental/24.09-bgp Oct 4, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants