#kubernetes

gehrke_test@libranet.de

Our production services have been running on #Kubernetes for almost a year now. Today we noticed that since yesterday the web servers have suddenly been working in parallel (round robin) instead of in failover mode. Which is how it was actually planned all along.

We'll probably need another year to figure out why that is.
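
A starting point, sketched here assuming a Service named webserver (the name is a placeholder), would be to check what the Service is actually balancing across and whether session affinity is set:

# List the endpoints the Service currently spreads traffic across
kubectl get endpoints webserver -o wide

# Check whether session affinity is configured (None means plain load balancing)
kubectl get service webserver -o jsonpath='{.spec.sessionAffinity}{"\n"}'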

#EinmalMitProfis

[Image: cheap Pope joke about Kubernetes]

gehrke_test@libranet.de

Yesterday I was reminded once again of how far I've drifted from the basics. Driven mostly by features, we migrated the production Tomcats from #openjdk 11 to 17. We had tested functionally beforehand, but I honestly hadn't read up much on the background before doing it.

Now the pods suddenly use a factor of 2 to 4 less RAM than before, and the garbage collection visibly behaves more dynamically. Dude!
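
For reference, a quick way to see which collector and heap limits the JVM actually picked up inside a pod (the pod name below is a placeholder):

# Prints the collector chosen at startup, e.g. "Using G1"
kubectl exec tomcat-0 -- java -Xlog:gc -version

# Shows the effective heap sizing and container-awareness flags
kubectl exec tomcat-0 -- java -XX:+PrintFlagsFinal -version | grep -E 'MaxHeapSize|MaxRAMPercentage|UseContainerSupport'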

#Java #Kubernetes #Tomcat #EinmalMitProfis

bkoehn@diaspora.koehn.com

I made a #kubernetes pod that updates my mail server’s DANE record whenever LetsEncrypt issues a new one, because boredom. Curiously, I used ChatGPT to generate the script that uses openssl to generate the TLSA record and it actually worked perfectly. Once I had the certificate and the TLSA record, a simple call to nsupdate with a TSIG key and BIND worked all its DNSSEC magic.

#!/bin/bash

# Build the TLSA record: usage 3 (DANE-EE), selector 0 (full certificate),
# matching type 1 (SHA-256 of the DER-encoded certificate)
TLSA="3 0 1 $(openssl x509 -in certificate.pem -outform DER | openssl sha256 | cut -d' ' -f2)"

# TLSA record name for the SMTP service
NAME="_25._tcp.smtp.koehn.com"

# nsupdate script: replace any existing TLSA records with the freshly computed one
UPDATE="
server ns1.koehn.com
zone koehn.com
update delete $NAME TLSA
update add $NAME 300 TLSA $TLSA
send"

# Push the update, authenticated with the TSIG key in $KEY
echo "$UPDATE" | nsupdate -vy "$KEY"

bkoehn@diaspora.koehn.com

I finally upgraded #k3s on the cluster today, and it was so painless it felt like cheating. Download the new binary, move it in place, and restart the service. In #Kubernetes this takes an hour with endless plans and scripts and draining and reloading everything. In k3s it took five minutes.
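
For anyone curious, the manual upgrade amounts to roughly this (a sketch assuming a default single-binary install in /usr/local/bin; on agent nodes the service is k3s-agent):

# Fetch the new release binary and swap it in place
curl -Lo /tmp/k3s https://github.com/k3s-io/k3s/releases/latest/download/k3s
chmod +x /tmp/k3s
sudo mv /tmp/k3s /usr/local/bin/k3s

# Restart the service to pick up the new binary
sudo systemctl restart k3s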

gehrke_test@libranet.de

Couldn't sleep anymore from 2:00 on. Old-age early rising, who doesn't know it.

Got up, made coffee, and had a look at what other cool talks #CLT23 has to offer. Ah, something about #kubernetes and #security. Very nice.

That was a mistake! Now I'll probably never be able to sleep again...

#k8s #CLT #ccc #Chaos #PoweredByRSS


Chaos Computer Club - Media (Inoffiziell) - 2023-03-11 23:00:00 GMT

Hacking Kubernetes Cluster and Secure it with Open Source (clt23)
https://mirror.selfnet.de/CCC/events/clt/2023/h264-hd/clt23-99-deu-Hacking_Kubernetes_Cluster_and_Secure_it_with_Open_Source_hd.mp4

bkoehn@diaspora.koehn.com

I worked quite hard to solve this problem, and I’m happy with its (eventual) simplicity. When you have (for example) two Kubernetes containers in a pod (or two processes that can share a named pipe) and you need to run a process on one of them from the other one, I have just the tool for you. It’s basically ssh without all the pesky networking, using named pipes instead of TCP streams.

https://unix.stackexchange.com/a/735642/157130
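
The linked answer has the real tool; the core idea, sketched very roughly here assuming both containers mount a shared volume at /shared, is a pair of FIFOs: one carries the command, the other carries the output.

# In the "server" container: read commands from one FIFO, return output on another
mkfifo /shared/cmd /shared/out
while true; do
  cmd=$(cat /shared/cmd)              # blocks until the other container writes
  bash -c "$cmd" > /shared/out 2>&1   # run it and hand the output back
done

# In the "client" container: send a command and collect the result
echo "uname -a" > /shared/cmd
cat /shared/out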

#linux #kubernetes #docker #fifo #bash

gehrke_test@libranet.de

Another one of those things I still have to get used to in the container world:

dev-44f679674-7fghj:/> grep 'PRETTY_NAME' /etc/os-release  
PRETTY_NAME="openSUSE Leap 15.4"

dev-44f679674-7fghj:/> uname -a
Linux dev-44f679674-7fghj 5.4.0-126-generic #142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

#kubernetes

bkoehn@diaspora.koehn.com

Now that I have working #NFS in #Kubernetes, can I point multiple instances of #Postfix at the queues for high availability? Nope.

Postfix file locking and NFS

For the Postfix mail queue, it does not matter how well NFS file locking works. The reason is that you cannot share Postfix queues among multiple running Postfix instances. You can use NFS to switch a Postfix mail queue from one NFS client to another one, but only one NFS client can access a Postfix mail queue at any particular point in time.

I could, however, create multiple Postfix instances, each with its own queue, using a StatefulSet. I don't think it buys me very much in terms of reliability, though; with my workload, a few seconds of downtime while k3s spins up a new Postfix pod isn't very impactful.
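
For the record, a sketch of what that StatefulSet would look like (names and image are placeholders; the volumeClaimTemplate is what gives each replica its own queue volume):

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postfix
spec:
  serviceName: postfix
  replicas: 2
  selector:
    matchLabels:
      app: postfix
  template:
    metadata:
      labels:
        app: postfix
    spec:
      containers:
      - name: postfix
        image: example.org/postfix:latest   # placeholder image
        volumeMounts:
        - name: queue
          mountPath: /var/spool/postfix
  volumeClaimTemplates:                     # one PersistentVolumeClaim per replica
  - metadata:
      name: queue
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 5Gi
EOF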

bkoehn@diaspora.koehn.com

Devoted some time to continuing the teardown of my #Kubernetes #k8s infrastructure at #Hetzner and the move to my #k3s infrastructure at #ssdnodes. Moving everything is pretty easy; the actual work is moving files and databases, plus a bit of downtime. As I wind down the old infrastructure I can save some money by shutting down nodes as the workload decreases. I've shut down two nodes so far. Might free up another tonight if I can move #Synapse and Diaspora.

bkoehn@diaspora.koehn.com

After a few hours of work, I have high-availability storage on my #k3s #Kubernetes cluster.

The cluster runs on bare Ubuntu VMs; each of the three servers has 48GB of RAM and 720GB of SSD storage. The provider I'm using doesn't supply extra SAN storage, so the on-VM storage is all I have, and I have to handle any redundancy myself.

Enter Longhorn. Longhorn is a FOSS project from Rancher that allows you to use local storage inside your Kubernetes cluster, and keeps replicas available on other nodes in case one of your servers is unavailable. The system is trivial to set up and highly efficient, and acts as a StorageClass that you can use when requesting storage for a pod. It can also schedule snapshots and backups to an offsite S3 instance for additional safety. It even has experimental support for shared volumes via NFS!
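
In practice that looks roughly like this (the version in the install URL and the claim name are placeholders):

# Install Longhorn into the cluster
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.4.1/deploy/longhorn.yaml

# Request a replicated volume through the StorageClass Longhorn provides
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
EOF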

For object storage I've configured a modern Minio cluster. Minio is a FOSS S3-compatible server that also uses local storage, keeping multiple instances around for high availability. It's also quite easy to configure and use, with an incredibly rich feature set and a lovely UI. It doesn't have its own backups, but it's easy to replicate with a simple cron job.
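
That replication job is just the MinIO client on a schedule; a sketch with placeholder aliases and bucket names:

# crontab entry: mirror the bucket to the offsite instance every night at 03:00
0 3 * * * mc mirror --overwrite --remove local/backups offsite/backups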

I'm slowly moving workloads over to the new cluster, and will migrate the Diaspora pod in a few days (expect an hour or so of downtime during the migration). The new cluster is more secure, more stable, and much less likely to go down than the old one was.