How to stabilize Calico's IP-in-IP tunnels in virtual environments
When you work with bleeding-edge technology, you can expect the unexpected.
As most of us know, software is never free of bugs, and because the technology involved is so diverse, most of us won’t be able to fix these bugs ourselves. Instead, we can develop and deploy workarounds while we wait for specialists to release a fix.
In this post, I’d like to share how we resolved a connectivity issue between pods in our Kubernetes cluster that was caused by a tunneling issue in Calico.
Public Cloud
At Cyso Cloud we run a public cloud based on the free and open-source cloud computing platform OpenStack. Initially, we built this public cloud so our users could set up and manage their own infrastructure, but there has also been internal demand for a similar service. We eat our own dog food.
Currently, we’re working on a continuous deployment pipeline to run OpenStack in containers. For fast iterations, we deploy to virtual hardware on our own public cloud. The containers are orchestrated by Kubernetes, and inter-container connectivity is handled by Calico. Because we run on virtual hardware, we use Calico’s IP in IP tunneling.
IP in IP is an IP tunnelling protocol that encapsulates one IP packet in another IP packet. (source: wikipedia.org)
Problem
The problem we faced using Calico’s IP in IP tunnels in a virtual environment was that Kubernetes pods sometimes couldn’t connect to one another during the initialization phase. Somehow the IP in IP tunnels between these pods weren’t properly initialized, causing the pods to get stuck in a crash loop. After troubleshooting and many deployment runs, we discovered that sending ICMP packets to the affected pods from other pods within the Kubernetes cluster resolved the IP in IP network issues we were having.
The success rate of our continuous deployments went from 60% to 100%.
Workaround
Our current workaround is to deploy a pod on each Kubernetes node that sends a single ICMP packet to each pod in the cluster. To deploy these pods, we used some core features of Kubernetes, which narrowed the workaround down to a single configuration file of no more than 30 lines.
Our resulting configuration after some iterations:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: pokepods
  namespace: kube-system
  labels:
    app: pokepods
spec:
  template:
    metadata:
      labels:
        app: pokepods
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sh"]
        args: ["-c", "PATH=$PATH:/host/usr/bin; while true; do kubectl get pods --all-namespaces -o go-template='{{ range .items }}{{ if (and (.status.podIP) (ne .metadata.namespace \"kube-system\")) }}ping -c 1 -w 1 {{ .status.podIP }} || true;{{ end }}{{ end }}' |sh; sleep 5; done"]
        volumeMounts:
        - mountPath: /host/usr/bin
          name: kubectl-path
          readOnly: true
      volumes:
      - name: kubectl-path
        hostPath:
          path: /usr/bin
To deploy the pods in the Kubernetes cluster:
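The command itself did not survive in this copy of the post. A minimal sketch, assuming the manifest above is saved as pokepods.yaml (a hypothetical filename) and that kubectl is already configured to talk to the cluster:

```shell
# Hypothetical helper: the function name and the default filename are
# assumptions for illustration, not taken from the original post.
deploy_pokepods() {
  manifest="${1:-pokepods.yaml}"
  # 'kubectl create -f' submits the manifest to the cluster's API server.
  kubectl create -f "$manifest"
}

# Usage: deploy_pokepods pokepods.yaml
```

Afterwards, kubectl get daemonset pokepods --namespace kube-system should list the new DaemonSet.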
Break-it-down
Kubernetes configuration can be written in manifest files in YAML or JSON format. The first three fields in the following excerpt, apiVersion, kind, and metadata, are required for all Kubernetes configurations. They state which API version the manifest complies with and that we want to deploy a DaemonSet resource in the kube-system namespace.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: pokepods
  namespace: kube-system
  labels:
    app: pokepods
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod (source: kubernetes.io)
The image we want to use for our container is busybox. This image contains all the basic Unix tools we need to execute a shell script and ping other pods. The busybox image is publicly available on Docker Hub, which frees us from building and registering our own container image.
When all you have is a hammer
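The busybox container runs a single process, configured in the command and args fields. This process loops through all the running pods in the Kubernetes cluster and sends each of them an ICMP packet to jump-start the IP in IP tunnel configured through Calico. Because busybox does not ship kubectl, the host’s /usr/bin directory is mounted read-only into the container and added to the PATH.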
        command: ["/bin/sh"]
        args: ["-c", "PATH=$PATH:/host/usr/bin; while true; do kubectl get pods --all-namespaces -o go-template='...' |sh; sleep 5; done"]
        volumeMounts:
        - mountPath: /host/usr/bin
          name: kubectl-path
          readOnly: true
      volumes:
      - name: kubectl-path
        hostPath:
          path: /usr/bin
Written out over several lines, the go-template passed to kubectl reads:
{{ range .items }}
{{ if (and (.status.podIP) (ne .metadata.namespace "kube-system")) }}
ping -c 1 -w 1 {{ .status.podIP }} || true;
{{ end }}
{{ end }}
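To make the mechanism concrete: the template renders one ping command per matching pod, and that rendered text is what gets piped to sh. The sketch below mimics the rendering step offline with a hypothetical render_pings function that reads "namespace podIP" pairs from standard input. It is a stand-in for the kubectl go-template, not a replacement for it, and the sample namespaces and addresses are made up.

```shell
# Hypothetical stand-in for the go-template: reads "namespace podIP" lines
# and prints the shell commands the template would render for them.
# (The real template emits the commands on one line, separated by ';'.)
render_pings() {
  while read -r namespace pod_ip; do
    # Skip pods without an IP and everything in kube-system,
    # mirroring the template's 'if (and ... (ne ...))' condition.
    [ -n "$pod_ip" ] || continue
    [ "$namespace" != "kube-system" ] || continue
    printf 'ping -c 1 -w 1 %s || true;\n' "$pod_ip"
  done
}

# Sample input: only the first and third pod produce a ping command.
printf '%s\n' 'default 10.0.0.5' 'kube-system 10.0.0.9' 'openstack 10.0.1.7' | render_pings
```

The trailing "|| true" keeps the generated script going when a single pod does not answer.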
There are always improvements to be made, but since this is a workaround I’ve left them out:
- Filter out pods that are already in ready status
- Log failed pings
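As a rough sketch of the second item, assuming a hypothetical poke helper that wraps the same ping invocation and writes failures to standard error so they show up in the container log:

```shell
# Hypothetical helper; not part of the original DaemonSet.
poke() {
  pod_ip="$1"
  # Same invocation as the workaround: one packet, one-second deadline.
  if ! ping -c 1 -w 1 "$pod_ip" >/dev/null 2>&1; then
    echo "ping failed: $pod_ip" >&2
  fi
  # Always succeed so one unreachable pod never aborts the loop.
  return 0
}
```

How such a helper would be wired into the rendered commands is left open here; the point is only that a failed ping can be recorded without stopping the loop.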
Read more
To read more about the subject please consider the following links:
- IP in IP en.wikipedia.org
- Calico IP-in-IP docs.projectcalico.org
- Kubernetes DaemonSet kubernetes.io
- OpenStack openstack.org
- Busybox busybox.net
- Golang template package golang.org
- Alternative workaround developed by GiantSwarm github.com/giantswarm
- Cyso Cloud cyso.cloud
FAQ
A quick way to hunt down the available fields you can use within templates is to view the accompanying resource in JSON format using the Kubernetes client:
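The command is missing from this copy of the post. A likely form, assuming pods are the resource of interest as in the template above (the wrapper function name is hypothetical):

```shell
# Print all pods as JSON. The field paths in the output (for example
# .status.podIP and .metadata.namespace) are the ones a go-template can use.
show_pod_fields() {
  kubectl get pods --all-namespaces -o json
}

# Usage: show_pod_fields
```

Each object under "items" in the output corresponds to one iteration of the template’s range loop.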