Some time ago we wrote a blog entry about our Docker prototype, in which we showed that Calico runs successfully in a Docker environment (with no changes to the Felix and ACL Manager components), providing multi-host networking between containers.
After we published it, one of the questions we were asked was whether the same prototype code would work on Google Compute Engine (GCE), which would give us a great platform for easily testing Calico across multiple hosts and containers. How hard could it be, we thought?
It turns out that the key issue with running on GCE is that it’s an L3 routed network, as opposed to an L2 switched fabric. Adding further complexity, each VM (and hence each compute host) in GCE is on its own /32 subnet, with the GCE fabric providing default routing between hosts (yes, that means that the default gateway is not in the same subnet as the VM!). On the one hand, this means that GCE can handle much of the routing that Calico would normally set up. On the other hand, for GCE to be a useful Calico test platform, we needed to ensure that the Calico-driven routing was being correctly distributed by BIRD. Moreover, to demonstrate the Calico function in full, we needed to make sure that if the “Calico routes” weren’t being distributed properly, GCE wouldn’t just route between VMs for us.
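To make the /32 point concrete, here is a minimal sketch using Python's standard `ipaddress` module. The addresses and the `on_link` helper are hypothetical, chosen only for illustration; they are not taken from the prototype or from any real GCE project. The sketch shows that a /32 subnet contains only the VM's own address, so any gateway is necessarily outside it, whereas in a conventional /24 the gateway would be on-link.

```python
import ipaddress

# Hypothetical addresses, for illustration only.
GATEWAY = ipaddress.ip_address("10.240.0.1")
vm_slash32 = ipaddress.ip_interface("10.240.0.2/32")  # GCE-style: one address per subnet
vm_slash24 = ipaddress.ip_interface("10.240.0.2/24")  # conventional L2-style subnet


def on_link(interface, address):
    """Return True if `address` lies inside the subnet configured on `interface`."""
    return address in interface.network


if __name__ == "__main__":
    for vm in (vm_slash32, vm_slash24):
        print(
            f"{vm}: subnet {vm.network} holds {vm.network.num_addresses} "
            f"address(es); gateway on-link: {on_link(vm, GATEWAY)}"
        )
```

With the /32 interface the gateway is not on-link, so the host has to be told explicitly how to reach it before a default route through it is usable. That is the property that makes GCE behave differently from a switched L2 fabric.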
Consequently, to get this working we did need to make some tweaks to the original prototype; in particular, we needed to update our BIRD configuration so that it would work alongside the GCE fabric. We also needed to work out how to configure GCE so that it really was Calico providing the routing between containers, and not the GCE fabric. More information on this GCE prototype, including how to set it up, is in the GitHub repository.
In the process, we also made some changes to the original prototype to improve the packaging and instructions. The key changes in this new prototype are as follows.
- First and foremost, the original prototype used downloadable images rather than Dockerfiles. That was because the build needed to download some internal packages, but we’ve sorted that out and replaced the images with Dockerfiles, simplifying and speeding up the install process. That also means that if you download the prototype, you’ll be able to examine and edit the test code much more easily.
- The procedures to download and run the demo have been simplified and rewritten, so it’s quicker and easier to get things up and running. As part of this, the instructions now collect logs from all components running in containers on the host, so if you do make a mistake setting it up you’ll find it easier to figure out what is going on.
- There is a range of improvements and fixes to the prototype code, covering both the plugin code and the script to network a container, plus a new script to network a pre-existing container.
All of this code, including the simplified and improved documentation, is publicly available on GitHub.