In-cluster Route Reflection

Calico v3.3 integrates BGP route reflection more tightly into the Calico user experience and data model.  Calico nodes can now act as route reflectors themselves, instead of requiring additional infrastructure for that function, and we’ve streamlined the configuration of scalable route reflection.  Clusters can grow to hundreds or thousands of nodes while the BGP exchanges remain efficient, and without needing O(n) BGP peering configurations.  In this blog, I’ll explain how that works and how you can use the new feature.

Calico, routing and BGP route reflection

Calico is a routed network, in the sense that every packet going to or from a workload is routed by that workload’s host.  By default, Calico uses the Border Gateway Protocol (BGP) to distribute routing information so that workloads can communicate across hosts.

Every Calico node needs to have a BGP pathway to every other node (and also to any other BGP speaker within the same autonomous system),

  • either directly with a full node-to-node mesh – which means that each node makes n – 1 BGP connections, if n is the number of hosts
  • or indirectly via one or more BGP route reflectors – which means that each node only has BGP connections to a small number (typically 2 for redundancy) of route reflectors, and the route reflectors copy the route updates from each connected node to all of the other nodes.

As n gets large, the full mesh becomes inefficient, because the total number of BGP sessions grows with the square of n.  Using BGP route reflection avoids that and enables Calico clusters to scale to thousands of nodes – just as BGP route reflection in the Internet is an important factor in global Internet scaling.
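
To put rough numbers on that, here is a back-of-envelope comparison of the total number of BGP sessions in each design.  It assumes r route reflectors that are themselves cluster nodes, every other node peering with all r of them, and the reflectors fully meshed with one another.  These are illustrative assumptions, not figures taken from Calico.

```latex
\begin{align*}
S_{\text{mesh}}(n) &= \frac{n(n-1)}{2} \\
S_{\text{rr}}(n, r) &= r(n-r) + \frac{r(r-1)}{2} \\
S_{\text{mesh}}(1000) &= \frac{1000 \cdot 999}{2} = 499{,}500 \\
S_{\text{rr}}(1000, 2) &= 2 \cdot 998 + 1 = 1{,}997
\end{align*}
```

Per node, that is 999 sessions in the full mesh versus 2 on each non-reflector node, while each of the two reflectors still carries 999.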

Sometimes – as in typical on-prem deployments – there are already top of rack routers in the planned architecture, separate from the Calico hosts, that provide BGP route reflection.  Then each Calico host peers with those routers and the job is done, with no need for a full mesh or for in-cluster route reflection. But when external route reflectors are not already available, it’s useful for Calico to provide that option.

We’ve previously published a container image (calico/routereflector) that provides a standalone route reflector outside the cluster.  That worked fine, but

  • it needed extra resources – an extra machine or container for every route reflector
  • it needed additional peering configuration that was different from how BGP peerings are configured from and between Calico nodes.

A lot of community feedback asked if we could solve the first problem by providing the route reflector function directly on some of the Calico nodes, and in addressing that we solved the second problem too.  Now any Calico node can act as a route reflector for other nodes, while still doing its regular job of hosting local workloads, doing the route and security programming for those, and exporting its own workload routes to its BGP peers.

Make it so!

So how do we tell a Calico node to behave as a route reflector?  The minimum is just to set one new field in the Calico Node resource:

apiVersion: projectcalico.org/v3
kind: Node
metadata:
  name: node-hostname
spec:
  bgp:
    asNumber: 64512
    ipv4Address: 10.244.0.1/24
    ipv6Address: 2001:db8:85a3::8a2e:370:7334/120
    routeReflectorClusterID: 10.0.0.1

A non-empty routeReflectorClusterID tells the node:

  • that it is a route reflector (as well as a regular Calico node)
  • that it should treat its BGP peers as route reflector clients (which has various detailed consequences in the BGP protocol), except when a BGP peer has the same routeReflectorClusterID (in which case that will be a normal peering).

However, you will probably also want to add a label, to make it easy to configure peerings between this route reflector and other nodes, for example:

apiVersion: projectcalico.org/v3
kind: Node
metadata:
  name: node-hostname
  labels:
    routeReflector: 10.0.0.1
spec:
  bgp:
    asNumber: 64512
    ipv4Address: 10.244.0.1/24
    ipv6Address: 2001:db8:85a3::8a2e:370:7334/120
    routeReflectorClusterID: 10.0.0.1

In Calico v3.3 we’ve also added fields to the BGPPeer resource to facilitate peering between labeled Calico nodes:

  • nodeSelector: <ns>, to say that the peering should exist on all nodes whose labels satisfy the selector <ns> (instead of specifying a particular node by name, or that the peering is for all nodes)
  • peerSelector: <ps>, to say that this peering should be to all nodes whose labels satisfy the selector <ps> (instead of specifying a single peer by its IP address).

Using those fields, this single BGPPeer resource configures all the BGP peerings required between the route reflectors and the non-route-reflector Calico nodes:

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: cluster1
spec:
  nodeSelector: "all()"
  peerSelector: "has(routeReflector)"
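
One related step that this post does not spell out: when all nodes peer through route reflectors, the default node-to-node mesh is usually switched off so that it is not maintained alongside those peerings.  A minimal sketch, assuming the cluster-wide BGPConfiguration resource named default and its nodeToNodeMeshEnabled field; check the reference documentation for your Calico version before applying it:

```yaml
# Sketch only: turns off the default full node-to-node mesh.
# Assumes route reflector peerings (such as the BGPPeer above) are already in place.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
```

Applying this only after the route reflector peerings are established should avoid a window in which nodes have no BGP path to each other.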

More generally, you can do things like partitioning your cluster into node groups, with each group peering with a route reflector for that group, and then configuring a full mesh between the route reflectors.  It’s simply a matter of labeling your nodes however you require, then using BGPPeer to configure peerings between the labels.
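
As a sketch of that partitioned layout, the resources below assume a hypothetical rack label on every node (values rack-1 and rack-2) in addition to the routeReflector label shown earlier.  The label key, its values and the resource names are all invented for illustration, and the assignment of routeReflectorClusterID values across the groups is left out.

```yaml
# Hypothetical example: the "rack" label and all metadata names are assumptions.
# Regular nodes in rack-1 peer with the route reflector(s) in rack-1.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack-1-to-reflectors
spec:
  nodeSelector: "rack == 'rack-1' && !has(routeReflector)"
  peerSelector: "rack == 'rack-1' && has(routeReflector)"
---
# Regular nodes in rack-2 peer with the route reflector(s) in rack-2.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack-2-to-reflectors
spec:
  nodeSelector: "rack == 'rack-2' && !has(routeReflector)"
  peerSelector: "rack == 'rack-2' && has(routeReflector)"
---
# All route reflectors peer with each other, forming a mesh between the groups.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: reflector-mesh
spec:
  nodeSelector: "has(routeReflector)"
  peerSelector: "has(routeReflector)"
```

Adding another group then only needs the new nodes to be labeled and one more per-group BGPPeer; the reflector mesh resource picks up any newly labeled route reflector without changes.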

calico/node as a standalone route reflector

calico/node can also serve as a standalone route reflector, fulfilling exactly the same purpose that calico/routereflector used to.  To do that with Kubernetes, start and configure an instance of calico/node just as described above, but on a host that is not a member of the Kubernetes cluster.  With orchestrators more generally, the approach is to start and configure calico/node as above and simply instruct the orchestrator not to schedule any workloads on that host.

With calico/node providing route reflector function – whether standalone or in-cluster – the same unified configuration model covers all of the following:

  • peerings from non-route-reflector Calico nodes to the route reflectors, and/or to any other upstream BGP speakers
  • the peerings that are needed on the route reflectors back to the regular Calico nodes
  • the peerings that are needed between route reflectors (for clustering and redundancy)
  • arbitrary additional BGP peerings from the route reflectors, such as to upstream BGP speakers outside the cluster.

In summary, the calico/node approach and configuration model are cleaner and more powerful than calico/routereflector.  Therefore we’re now deprecating the calico/routereflector container image and the GitHub repo behind it, and recommend that deployments migrate to using calico/node instead.

Conclusion

I hope this has been a clear and useful presentation of the newly integrated route reflector function in Calico v3.3.  Please do try it out, and share your experiences with the Calico community on Slack. And if you have a Tigera Essentials subscription, your Tigera support engineer will be happy to work with you to assist in configuring this new feature and advise on upgrade/migration strategies.  Happy route reflecting!

Neil is a core developer at Project Calico and Tigera. Outside tech he mostly spends his time on choral singing.