#flannel

bkoehn@diaspora.koehn.com

Last night I installed the new #Canal #CNI (#Calico + #Flannel) on the new #k3s #Kubernetes cluster in the same way I've always done it on the old #k8s cluster, neglecting the clear instructions to apply any changes from the original configuration to the new one. Those changes included little things like telling Flannel which interface to use, what IP range to allocate, and other trivialities. Wow did I blow that cluster to bits. Following the directions and deleting a few very confused pods fixed the issue.

Anyway, it's working now, and I have a better process in place to manage CNI upgrades.

bkoehn@diaspora.koehn.com

Alright, after a bit more puttering about I've got my #k3s #Kubernetes cluster networking working. Details follow.

From an inbound perspective, all the nodes in the cluster are completely unavailable from the internet, firewalled off using #hetzner's firewalls. This provides some reassurance that they're tougher to hack, and makes it harder for me to mess up the configuration. All the nodes are on a private network that allows them to communicate with one another, and that's their exclusive form of communication. All the nodes are allowed any outbound traffic. The servers are labeled in Hetzner's console to automatically apply firewall rules.

In front of the cluster is a Hetzner firewall that is configured to forward public internet traffic to the nodes on the private network (meaning the load balancer has public IPv4 and IPv6 addresses, and a private IPv4 address that it uses to communicate with the worker nodes). The load balancer does liveness checks on each node and can prevent non responsive nodes from receiving requests. The load balancer uses the PROXY protocol to preserve source #IP information. The same Hetzner server labels are used to add worker nodes to the load balancer automatically.

The traffic is forwarded to an #nginx Daemonset which k3s keeps running on every node in the cluster (for high availability), and the pods of that DaemonSet keep themselves in sync using a ConfigMap that allows tweaks to the nginx configuration to be applied automatically. Nginx listens on the node's private IP ports and handles #TLS termination for #HTTP traffic and works with Cert-Manager to maintain TLS certificates for websites using #LetsEncrypt for signing. TLS termination for #IMAP and #SMTP are handled by #Dovecot and #Postfix, respectively. Nginx forwards (mostly) cleartext to the appropriate service to handle the request using Kubernetes Ingress resources to bind ports, hosts, paths, etc. to the correct workloads.

The cluster uses #Canal as a #CNI to handle pod-to-pod networking. Canal is a hybrid of Calico and Flannel that is both easy to set up (basically a single YAML) and powerful to use, allowing me to set network policies to only permit pods to communicate with the other pods that they need, effectively acting as an internal firewall in case a pod is compromised. All pod communication is managed using standard Kubernetes Services, which behind the scenes simply create #IPCHAINS to move traffic to the correct pod.

The configuration of all this was a fair amount of effort, owing to Kubernetes' inherent flexibility in the kinds of environments it supports. But by integrating it with the capabilities that Hetzner provides I can fairly easily create an environment for running workloads that's redundant and highly secure. I had to turn off several k3s "features" to get it to work, disabling #Traefik, #Flannel, some strange load balancing capabilities, and forcing k3s to use only the private network rather than a public one. Still, it's been easier to work with than a full-blown Kubernetes installation, and uses considerably fewer server resources.

Next up: storage! Postgres, Objects, and filesystems.