The following is a guest blog post by Sarah Christoff, a member of our MVP Program. You can find more writings by Sarah on her blog, Cat on a Computer.
Every meeting I attend is centered around “Containers”, “Kubernetes”, and other words that I initially figured were synonymous with synergy. However, my assumptions were wrong, and after reading a bunch of articles to familiarize myself with Containers and Docker, many questions remain. What is a container, and why does everyone care?
Long ago (in 2006), Google engineers played around with this thing called “Process Containers”, which were meant to do seemingly just that: containerize processes. The name was later changed to Control Groups (cgroups) to avoid confusion. cgroups limit the usage of resources (CPU, memory, disk I/O), measure a group’s resource usage, and control the group of processes. Namespace isolation prevents the processes in one namespace from seeing any processes outside of it. Because of this, each process gets the idea that it is completely alone on the machine. Namespaces and cgroups are the two key kernel features that Docker uses to run its platform.
Simply put, the Linux kernel is used to create a well-managed, isolated group of processes (a container) that your application runs within.
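To make cgroups a little less abstract: on Linux, a process’s cgroup membership is listed in /proc/<pid>/cgroup, one line per hierarchy, in the form hierarchy-ID:controllers:path. Here is a minimal sketch that parses that format. The function name and the sample text are made up for illustration; the sample only mimics what a process inside a Docker container might see.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_text(text: str) -> List[CgroupEntry]:
    """Parse the contents of a /proc/<pid>/cgroup file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        # Split at most twice, because the path itself may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            CgroupEntry(
                hierarchy_id=int(hierarchy_id),
                # cgroup v2 entries leave the controller list empty.
                controllers=[c for c in controllers.split(",") if c],
                path=path,
            )
        )
    return entries


# Hypothetical sample text, not captured from a real machine.
SAMPLE = """\
4:memory:/docker/abc123
3:cpu,cpuacct:/docker/abc123
0::/docker/abc123
"""

for entry in parse_cgroup_text(SAMPLE):
    print(entry.hierarchy_id, entry.controllers, entry.path)
```

On a real Linux box you could feed it the contents of /proc/self/cgroup instead of the sample text.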
Containerization or Virtualization
Before containers, virtual machines were designed as software to run on top of physical servers. Hypervisors act as the middlemen between the virtual machines and the physical server itself, and allow multiple operating systems to run on a single machine. Depending on your application, this could be beneficial or harmful. Multiple operating systems means all the packages, apps, and binaries that come with each operating system are installed as well, which can be redundant.
Container engines, such as Docker or rkt, use the same operating system kernel to run multiple containers. Each container runs as an isolated process (or group of processes). Because the containers share the host’s kernel instead of each carrying a full operating system, this takes up less space than virtualization, but depending on your application’s needs the shared kernel could be detrimental.
Note: Hearing Container Runtime a lot? It’s often used interchangeably with container engine. A container runtime allows users to manage resources like CPU and RAM, as well as start and stop containers, by providing APIs and tooling that abstract the low-level stuff.
Container orchestration helps you stay true to the “Cattle, not Pets” analogy that infrastructure and applications strive to follow. Container orchestration increases redundancy by replicating your application across many environments (cloud providers, datacenters, etc.), monitors container health, and simplifies container networking.
There are several different container orchestration tools, such as Docker Swarm, Nomad, and Mesos, but for simplicity’s sake we’ll focus on Kubernetes.
Kubernetes steps in to ensure that all these containers stay alive, that you have as many of them as you want, and that they have the resources you’ve allocated to them. A Kubernetes cluster can easily be made larger or smaller, and can run across many different cloud providers (Google Cloud, Azure, AWS, etc.).
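The “keep as many alive as you want” part boils down to comparing what you asked for with what is actually running, then fixing the difference. Below is a toy sketch of that idea. It is not Kubernetes code; the reconcile function and the web-N container names are invented for illustration.

```python
from typing import List, Tuple


def reconcile(desired: int, running: List[str], next_id: int) -> Tuple[List[str], int]:
    """Return the running containers (and next free id) after one reconcile pass."""
    running = list(running)  # work on a copy so the caller's list is untouched
    # Too few: start replacements until we reach the desired count.
    while len(running) < desired:
        running.append(f"web-{next_id}")
        next_id += 1
    # Too many: stop the extras.
    while len(running) > desired:
        running.pop()
    return running, next_id


# Start from nothing and ask for three containers.
running, next_id = reconcile(desired=3, running=[], next_id=0)
print(running)  # ['web-0', 'web-1', 'web-2']

# Simulate a container dying, then reconcile again.
running.remove("web-1")
running, next_id = reconcile(desired=3, running=running, next_id=next_id)
print(running)  # ['web-0', 'web-2', 'web-3']

# Scale down to two.
running, next_id = reconcile(desired=2, running=running, next_id=next_id)
print(running)  # ['web-0', 'web-2']
```

The real thing runs this kind of loop continuously and across many machines, but the desired-versus-actual comparison is the heart of it.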
Kubernetes Vocabulary Words of the Day:
Node: could be synonymous with instance, or server. A node is a machine (virtual or physical) that runs your pods and the containers inside them, along with the kubelet and a container runtime.
Master: this machine handles all the minion machines. It also holds onto etcd, the master API server, and the controller manager server. In short, this is the boss that gives out orders to all the other lil’ nodes.
Minion: runs tasks given to it by the Kubernetes master.
Kubelet: the “node agent” that runs on each node, but can be thought of as the nurse. Kubelet monitors the health of the containers running on its node, and interacts with the master node.
kubectl: a command line interface for interacting with the Kubernetes cluster.
Pod: a group of one or more containers that are deployed together on the same node and must remain together at all times. You could think of this as being your application container plus any helper containers it can’t live without, all wrapped together.
Container: is a stand-alone, isolated process that runs your software and the necessary libraries within itself.
Replica Set: ensures that the specified number of pods are running at any given time, creating or deleting pods if necessary.
Label: a key/value pair attached to pods (and other objects) so that things like the replication controller can keep track of them.
Namespace: (similar in spirit to a Linux namespace, but a different thing) a named grouping of pods, services, and other resources within a cluster, which makes them easier to keep track of.
Service: a stable way to reach a group of pods, which are usually picked out by their labels.
Cluster: is a grouping of nodes which typically consists of a master and a few minions in the same data center.
etcd: a distributed key/value data store, which Kubernetes uses to hold the state of the cluster (including all your secrets) so it can access it at any given time.
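Labels do a lot of the work in that list: controllers and services find “their” pods by matching labels against a selector. Here is a minimal sketch of equality-based matching, where a pod matches if it carries every key/value pair in the selector. The pod names and labels are hypothetical, and real selectors support more than plain equality.

```python
from typing import Dict, List

# Hypothetical pods, each with a name and a set of labels.
PODS = [
    {"name": "web-0", "labels": {"app": "web", "env": "prod"}},
    {"name": "web-1", "labels": {"app": "web", "env": "staging"}},
    {"name": "db-0", "labels": {"app": "db", "env": "prod"}},
]


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """A pod matches when every key/value pair in the selector is on the pod."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods: List[dict], selector: Dict[str, str]) -> List[str]:
    """Return the names of all pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(pod["labels"], selector)]


print(select(PODS, {"app": "web"}))                 # ['web-0', 'web-1']
print(select(PODS, {"app": "web", "env": "prod"}))  # ['web-0']
```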
There are two primary container networking standards (that I know of): Container Network Interface (CNI) and Container Network Model (CNM).
Note: Both CNI and CNM are networking specifications, or standards (basically a rule book to follow). Projects such as Libnetwork, Calico, Flannel, etc. use these guidelines to build their networking tools.
Container Network Model (CNM)
This was the initial model released by the makers of Docker. Libnetwork was created to be the implementation of the Container Network Model. Think of an interface as the bouncer before you enter a club, making sure you have the right credentials to get inside before you enter. Libnetwork is the interface between the Docker daemon and the network drivers. Network drivers are either native to Docker (such as bridge, host, and MACVLAN, which are built into Libnetwork or Docker) or third-party plugins.
The CNM is built with three main parts –
Network Sandbox: Contains the configuration of the container’s network stack, such as routing tables, container’s interface, DNS settings, etc.
Endpoint: Joins the sandbox to the network, while also providing connectivity for services exposed by the container.
Network: A group of endpoints that are able to talk to each other.
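One way to picture how the three parts fit together is as plain data structures. This is only a sketch of the relationships described above, not Libnetwork’s actual API; the class fields, the connect and can_talk helpers, and the container and network names are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    name: str
    network_name: str  # the network this endpoint joins the sandbox to


@dataclass
class Network:
    name: str
    endpoints: List[Endpoint] = field(default_factory=list)


@dataclass
class Sandbox:
    container: str
    dns_servers: List[str] = field(default_factory=list)
    endpoints: List[Endpoint] = field(default_factory=list)


def connect(sandbox: Sandbox, network: Network) -> Endpoint:
    """Create an endpoint that joins the sandbox to the network."""
    endpoint = Endpoint(name=f"{sandbox.container}-{network.name}", network_name=network.name)
    sandbox.endpoints.append(endpoint)
    network.endpoints.append(endpoint)
    return endpoint


def can_talk(a: Sandbox, b: Sandbox) -> bool:
    """Two sandboxes can talk if they have endpoints on a common network."""
    networks_a = {e.network_name for e in a.endpoints}
    networks_b = {e.network_name for e in b.endpoints}
    return bool(networks_a & networks_b)


frontend, backend = Network("frontend"), Network("backend")
web, db, cache = Sandbox("web"), Sandbox("db"), Sandbox("cache")

connect(web, frontend)
connect(web, backend)   # a sandbox can have several endpoints
connect(db, backend)
connect(cache, frontend)

print(can_talk(web, db))    # True: both are on "backend"
print(can_talk(db, cache))  # False: no network in common
```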
Container Network Interface (CNI)
CNI was built by CoreOS, and is used by Kubernetes, rkt, and plugins such as Calico and Weave. CNI was built to be a simple way for the container runtime and the networking plugins to interact. The container runtime first assigns the container an ID and creates a network namespace for it, then passes these along with the CNI configuration parameters to the network plugin. The network plugin then attaches the container to a network.
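In practice, that handoff is done with environment variables plus a JSON config on standard input. The sketch below only builds what a runtime would hand over for an ADD operation; it does not execute any plugin. The environment variable names and the top-level config keys come from the CNI specification, while the function name, the network name, the subnet, the version string, and the /opt/cni/bin path are example values chosen for illustration.

```python
import json
from typing import Dict, Tuple


def build_cni_add_request(
    container_id: str, netns_path: str, ifname: str = "eth0"
) -> Tuple[Dict[str, str], str]:
    """Build the environment and stdin payload for a CNI ADD call."""
    env = {
        "CNI_COMMAND": "ADD",             # operation to perform
        "CNI_CONTAINERID": container_id,  # ID the runtime assigned
        "CNI_NETNS": netns_path,          # path to the container's network namespace
        "CNI_IFNAME": ifname,             # interface to create inside the container
        "CNI_PATH": "/opt/cni/bin",       # where to look for plugin binaries (example path)
    }
    config = {
        "cniVersion": "0.3.1",   # example spec version
        "name": "example-net",   # hypothetical network name
        "type": "bridge",        # name of the plugin binary to run
        "ipam": {"type": "host-local", "subnet": "10.22.0.0/16"},
    }
    return env, json.dumps(config)


env, stdin_payload = build_cni_add_request("abc123", "/var/run/netns/abc123")
print(env["CNI_COMMAND"], env["CNI_CONTAINERID"])
print(stdin_payload)
```

A runtime would then run the plugin binary named by "type" with this environment and feed it the JSON on stdin; the plugin does the actual attaching.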
Popular Network Plugins
This is where all of these plugins like Calico, Flannel, and Weave, come into play – they are network plugins (or CNI plugins). These adhere to the rules set by CNI or CNM and build upon them to provide container networking, routing, and security.
Calico is highly scalable, integrates with Libnetwork, Kubernetes, and more, and is widely used across multiple infrastructures. Calico was originally created to find a better solution to networking cloud workloads. Calico provides a Layer Three (Network Layer) approach to networking, basing its design on how the internet itself works. Calico uses a vRouter, which replicates the functionality of Internet Protocol (IP) routing. Calico also uses Border Gateway Protocol (BGP) to advertise the IPs of each workload. Since Calico is built on the same kind of protocols and routing that the internet runs on, it is designed to handle large amounts of traffic and to be secured and scaled easily.
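The layer-three idea is easier to see with a tiny routing table. Once each workload’s IP has been advertised as a route pointing at the host it lives on, delivering a packet is just a longest-prefix lookup. This is a toy illustration of IP routing in general, not Calico’s implementation; the addresses and host names are made up.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical routes: prefix -> next hop. The /32 entries stand in for
# individual workload IPs that have been advertised by their hosts.
ROUTES = {
    "10.1.0.0/16": "default-gateway",
    "10.1.1.5/32": "host-a",
    "10.1.2.9/32": "host-b",
}


def next_hop(destination: str, routes: Dict[str, str]) -> Optional[str]:
    """Pick the next hop using longest-prefix match, or None if nothing matches."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes.items():
        network = ipaddress.ip_network(prefix)
        # Prefer the most specific (longest) matching prefix.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


print(next_hop("10.1.1.5", ROUTES))     # host-a
print(next_hop("10.1.9.9", ROUTES))     # default-gateway
print(next_hop("192.168.0.1", ROUTES))  # None
```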
Flannel is an overlay network that assigns each host a subnet (10.0.0.0/24, or something of that nature), and gives each container on that host an IP address from it. Flannel uses packet encapsulation, which wraps the original network packet in another layer so it can travel between hosts, using either User Datagram Protocol (UDP) or VXLAN. Flannel also allows for TLS to encrypt the channel between Flannel and etcd.
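The addressing scheme behind an overlay like Flannel can be sketched with Python’s ipaddress module: carve one big cluster network into a subnet per host, then hand out container IPs from each host’s subnet. This is a simplified illustration, not Flannel’s code; the 10.0.0.0/16 range, the node names, and the container_ips helper are assumptions for the example, and a real allocator would also track which addresses are already in use.

```python
import ipaddress
from itertools import islice
from typing import List

# Assumed cluster-wide range and hosts (example values only).
CLUSTER_NETWORK = ipaddress.ip_network("10.0.0.0/16")
HOSTS = ["node-a", "node-b", "node-c"]

# Give each host the next free /24 out of the cluster range.
HOST_SUBNETS = dict(zip(HOSTS, CLUSTER_NETWORK.subnets(new_prefix=24)))


def container_ips(host: str, count: int) -> List[str]:
    """Return the first `count` usable addresses in the host's subnet."""
    subnet = HOST_SUBNETS[host]
    return [str(ip) for ip in islice(subnet.hosts(), count)]


for host, subnet in HOST_SUBNETS.items():
    print(host, subnet, container_ips(host, 2))
```

Because every host owns a distinct subnet, a container’s IP alone tells the overlay which host to send the encapsulated packet to.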
Weave provides a networking solution for containerized applications. Weave assigns a range of subnet addresses, and assigns each of these IP addresses to a container. Like Flannel, Weave also uses UDP; however, it will merge multiple containers’ packets into one before sending them out. Weave also makes use of VXLAN. Weave also supports a name service between containers, so that you can easily access your containers via host.weave.local.
There’s still a lot in the container realm to discover, explore, and understand. But this is just to make you sound like you’ve left 2012 at your next meeting, and you’ve made it to 2017 – the year of knowledge (and the Rooster)!