Do you actually need Kubernetes? Netflix needs it, and a few others probably do too. If you need to learn it for work, go right ahead, but chances are good that YOU don’t need to scale to 10,000 nodes at home. I didn’t want Kubernetes (or k8s, or k3s, or Minikube) at home since I am pretty limited on resources and want as much bang for my virtual buck as I can get.
I built mine out across Proxmox host(s) with Debian 10 VMs (not containers) and TrueNAS virtual machines (also Debian 10), but it doesn’t really matter; just try to stay with the same version of Docker across them for your own sanity. You can probably do this with multiple Raspberry Pis, but mine are otherwise occupied. I don’t know if you can (or should) combine different machine types into one cluster. That seems like it would be bad, since many things need different images for the Pi and you would have to account for that. I’m setting up a Pi 1 with an 8TB Easystore to live at my parents’ house for backups, and it has to use different images for MinIO and possibly Kuma. It’s not fast, but it doesn’t need to be.
Steps:
- Base OS
- Docker (clone them here if needed)
- Swarm init
- Swarm join cluster
- Test it!
- NFS Mounts for data (optional, depends on your needs)
- Traefik
- Configure deployments
- Swarmpit (optional)
- Swarmprom (optional)
- Portainer (optional)
- Apps and apps and apps (this is why you’re here)
- Proxied services outside of the swarm (optional)
- Add basic authentication to an app that doesn’t have any (optional)
- Apps that are docker but NOT public (optional)
Code and scripts are all here!
https://github.com/8layer8/swarm-public/tree/main
While you don’t need all of this, it does help to have something to start with.
Base OS
Do a basic Debian 10 install and set up disks, networks and hostnames as you need them. Unselect the GUI; you really only want ssh and base utilities. Once it is up, ssh into it and run (as root, since sudo is not installed yet):
apt-get update
apt-get -y install apt-transport-https ca-certificates curl gnupg-agent software-properties-common sudo vim mc
Docker (clone them here if needed)
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
apt-get update
apt-get -y install docker-ce docker-ce-cli
sudo systemctl start docker
sudo systemctl enable docker
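# Replace brad with your own username
sudo usermod -aG docker brad
# Pick up the new group in your current shell (or just log out and back in)
newgrp docker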
sudo curl -L https://github.com/docker/compose/releases/download/1.18.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
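Since keeping the same Docker version on every node is the sanity-saver here, a quick check can help once the boxes are built. This is only a sketch: the helper name check_docker_versions and the hostnames swarm1, swarm2 and swarm3 are made up for illustration, and it assumes you have key-based ssh to each node.

```shell
# Reads "host version" pairs on stdin, prints each distinct version once,
# and returns non-zero unless exactly one distinct version was seen.
check_docker_versions() {
    versions=$(awk 'NF >= 2 { print $2 }' | sort -u)
    count=$(printf '%s\n' "$versions" | grep -c .)
    printf '%s\n' "$versions"
    [ "$count" -eq 1 ]
}

# Example usage (hypothetical hostnames, assumes key-based ssh):
# for h in swarm1 swarm2 swarm3; do
#     echo "$h $(ssh "$h" docker version --format '{{.Server.Version}}')"
# done | check_docker_versions || echo "Docker versions differ across nodes"
```

If more than one version prints, that is the list of versions you need to reconcile.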
Swarm init (master host)
# If you are going to clone the box for a worker node, stop here and clone it first!
# Use the IP or FQDN of the host
sudo docker swarm init --advertise-addr 192.168.0.123
# This prints a "docker swarm join" command containing a token, keep it somewhere
Swarm join cluster (worker nodes)
# If needed, print the join command again on the master node:
docker swarm join-token worker
# Then run it on any/all worker nodes like:
docker swarm join --token SWMTKN-1-02wqm9v19hey4js984yg8hrhg9tmai80pbwtwiwiery8b-8e9857y49herughkc6jhtc47y5y 192.168.0.123:2377
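If you would rather not copy and paste the token by hand, something like the following may work. It is a sketch, not a tested part of this setup: build_join_cmd is a hypothetical helper, worker1 is a placeholder hostname, and it assumes the default swarm port 2377 and ssh access from the master to the worker.

```shell
# Build the worker join command from a token and the manager address.
build_join_cmd() {
    token=$1
    manager=$2
    printf 'docker swarm join --token %s %s:2377\n' "$token" "$manager"
}

# Example usage, run on the master node:
# TOKEN=$(docker swarm join-token -q worker)   # -q prints only the token
# ssh worker1 "$(build_join_cmd "$TOKEN" 192.168.0.123)"
```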
Test it!
docker node ls
docker service create --name webserver -p 8080:80 nginx
# You should now be able to hit ANY node on port 8080 and get a response, it MAY take a few minutes, just chill.
# Scale it up or down:
docker service scale webserver=3
docker service scale webserver=1
# See that it is working:
docker service ls
# Delete it:
docker service rm webserver
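The "it MAY take a few minutes" wait on the test service can be scripted instead of refreshing a browser (run it before you delete the service, obviously). A minimal sketch: retry is a hypothetical helper, and the IP and port are the ones used in the example above.

```shell
# Run a command up to N times, sleeping DELAY seconds between failed attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example usage: poll one node for up to about 5 minutes
# retry 30 10 curl -fsS -o /dev/null http://192.168.0.123:8080/ && echo "webserver is up"
```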
NFS Mounts for data
There’s a way for Docker to handle NFS mounts itself; this is not it (yet).
Debian:
apt update
apt -y install nfs-common
vim /etc/fstab
192.168.0.253:/mnt/pool_alpha/vm_storage /mnt/pool_alpha/vm_storage nfs defaults 0 0
192.168.0.253:/mnt/pool_alpha/video/ /mnt/pool_alpha/video nfs defaults 0 0
(etc)
mkdir -p /mnt/pool_alpha/vm_storage
mkdir -p /mnt/pool_alpha/video
* Match up the NFS owners from a working node; this can be squirrelly! You don't want to just open it all up, but your NFS shouldn't be open to the internet anyway. It's your call. Sometimes you have to launch a service with no mount points defined, open a shell inside the container, cd to the directory where the mount will go, and run ls -lan to see the numeric UID and GID. Then chown the root of the NFS share to those numbers and try again with the NFS mount defined.
# Mount the shares
mount -a
# if they don't mount at boot, create this file:
/etc/network/if-up.d/fstab
#!/bin/sh
mount -a
# Make it executable
sudo chmod +x /etc/network/if-up.d/fstab
# reboot the box and make sure the mounts work as expected before getting much further!
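After the reboot, a small check like this can confirm every NFS entry in fstab actually mounted. It is a sketch under assumptions: nfs_mount_points and check_nfs_mounts are hypothetical helper names, and it relies on the mountpoint command from util-linux being present (it is on a stock Debian install).

```shell
# Print the mount point (second field) of every NFS entry in an fstab-style file.
nfs_mount_points() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs/ { print $2 }' "${1:-/etc/fstab}"
}

# Report any NFS mount point that is not currently mounted.
# Returns non-zero if at least one is missing.
check_nfs_mounts() {
    status=0
    for mp in $(nfs_mount_points "${1:-/etc/fstab}"); do
        if ! mountpoint -q "$mp"; then
            echo "NOT MOUNTED: $mp"
            status=1
        fi
    done
    return "$status"
}

# Example usage:
# check_nfs_mounts && echo "all NFS shares are mounted"
```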
Firewall
Your firewall plumbing will vary; I’m assuming you are putting this behind a home firewall.
All you need to forward is TCP/80 and TCP/443 to your master node. If you can, and you understand what you’re doing, you can forward 80 and 443 to ALL of the docker swarm boxes, since swarm will handle the networking no matter which box you land on. Not every firewall can do this, so the safe way is to just point them at the master node. Even though nothing really lives on port 80 on Traefik, it issues the redirects to the https site(s), and it also has to be there for Let’s Encrypt challenge responses to make it to where they need to go. You have to have port 80 open! Very little traffic will use it, but it has to be there for things to work properly.
Traefik
While swarm will publish ports across all members, it doesn’t handle much else (certs, authentication, etc.), so setting up Traefik to do all the plumbing automatically is the way to go.
I set mine up according to https://dockerswarm.rocks/traefik/
It can be a little funky getting it to play nice with Let’s Encrypt and wildcard domains, so my sanitized configs for using Traefik + Let’s Encrypt + DigitalOcean wildcard DNS are in the git repo.
Configure Deployments
There are dozens of ways to do this, but this is for home, and I wanted it simple.
Simple way:
Use a directory on the NFS shares to keep all of your docker compose files and start/stop scripts. Ideally, only the master will run them, but if it all falls apart, rebuilding a new cluster takes a few minutes: mount the NFS, init the swarm, run the scripts and you’re back in business.
I set up:
/mnt/pool_alpha/vm_storage/docker-compose/traefik
├── start.sh
├── stop.sh
└── traefik.yml
Then each service stack has its own directory, configs and scripts under docker-compose:
audioserve
compression
cura
homeserver
kuma
traefik
etc.
Docker-compose you say? Yes, well, Docker swarm and docker-compose are very closely related. You can start with docker-compose.yml files, make a minor addition to it, and deploy it into the swarm.
I made a wrapper script to start and stop all the services, and start and stop a node to evacuate all the services from it. Docker cleanup scripts, log viewer scripts, etc. all go here so any node can do them. (Scripts at end of post)
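To give a feel for what such a wrapper can look like, here is a minimal sketch. It is NOT the script from the repo: list_stacks and run_all are hypothetical helpers, and it assumes every stack directory holds its own start.sh and stop.sh as laid out above. The drain commands at the bottom are standard docker node update usage for evacuating a node.

```shell
# List every directory under BASE that contains the named script.
list_stacks() {
    base=$1
    script=$2
    for dir in "$base"/*/; do
        if [ -f "${dir}${script}" ]; then
            basename "$dir"
        fi
    done
    return 0
}

# Run start.sh or stop.sh inside every stack directory, one after another.
# Note: directory names containing spaces are not handled.
run_all() {
    base=$1
    script=$2
    for name in $(list_stacks "$base" "$script"); do
        echo "== $name: $script"
        (cd "$base/$name" && "./$script")
    done
}

# Example usage:
# run_all /mnt/pool_alpha/vm_storage/docker-compose start.sh
# run_all /mnt/pool_alpha/vm_storage/docker-compose stop.sh
#
# Evacuate a node, then put it back in rotation:
# docker node update --availability drain <node>
# docker node update --availability active <node>
```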
Swarmpit
Stupid name, nice utility to manage your swarm. Also easy to set up. I followed this:
https://dockerswarm.rocks/swarmpit/
My scripts (again) are at the end.
Swarmprom
Add Grafana and Prometheus to your swarm, all the workers report in, easy peasy!
https://dockerswarm.rocks/swarmprom/
Hey look! Grafana and Prometheus! I almost have a Bingo…
Portainer
You can use scripts, Swarmpit or Portainer to launch services. I like using the scripts because I can clone them out of a git repo (and back in) and run them all to build a new cluster with everything in minutes. You can paste configs into Swarmpit and roll things back with the GUI. Portainer likes to take control, so I just use it as a GUI for troubleshooting, but YMMV:
https://dockerswarm.rocks/portainer/
Apps and apps and apps
Ok, so by now you have the Swarm running on multiple nodes, Swarmpit and Grafana and Portainer and Traefik and working certificates and everything! So, let’s add an application to hang on the internet.
Setting up Kuma Uptime as an external example application:
mkdir /mnt/pool_alpha/vm_storage/docker-compose/kuma
cd /mnt/pool_alpha/vm_storage/docker-compose/kuma
touch start.sh
touch stop.sh
touch kuma.yml
chmod +x *.sh
kuma.yml:
version: '3.7'
services:
  kuma:
    image: louislam/uptime-kuma:latest
    environment:
      - PUID=1020
      - PGID=1020
      - TZ=America/New_York
    networks:
      - net
      - traefik-public
    volumes:
      - /mnt/pool_alpha/vm_storage/kuma:/app/data
    deploy:
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public
        - traefik.http.routers.kuma-http.rule=Host(`${DOMAIN?Variable not set}`)
        - traefik.http.routers.kuma-http.entrypoints=http
        - traefik.http.routers.kuma-http.middlewares=https-redirect
        - traefik.http.routers.kuma-https.rule=Host(`${DOMAIN?Variable not set}`)
        - traefik.http.routers.kuma-https.entrypoints=https
        - traefik.http.routers.kuma-https.tls=true
        - traefik.http.routers.kuma-https.tls.certresolver=le
        - traefik.http.services.kuma.loadbalancer.server.port=3001
networks:
  net:
    driver: overlay
    attachable: true
  traefik-public:
    external: true
start.sh:
#Connect via SSH to your Docker Swarm manager node.
#Create an environment variable with the domain where you want to access your kuma instance, e.g.:
export DOMAIN=kuma.mydomain.com
#Make sure that your DNS records point that domain to your public IPs
#Make sure your firewall allows port 80 and 443 to (at least) one of the IPs of the Docker Swarm mode cluster.
# Deploy the app:
docker stack deploy -c kuma.yml kuma
# Below this is just information, you don't *need* any of it
echo "Access at: https://${DOMAIN}"
sleep 10
docker stack ps kuma
sleep 10
docker stack ps kuma
docker service logs kuma_kuma
stop.sh:
docker stack rm kuma
On your master node, cd into /mnt/pool_alpha/vm_storage/docker-compose/kuma
and run ./start.sh
Sit back and wait, and you should be able to hit that service, by name, with real certs, in a minute or two.
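The ${DOMAIN?Variable not set} bits in kuma.yml should make the deploy fail if you forgot to export DOMAIN. If you want a friendlier message before docker ever runs, a pre-flight check along these lines could go at the top of start.sh. This is a sketch; require_domain is a hypothetical helper and not part of the repo scripts.

```shell
# Refuse to continue unless DOMAIN is set to something non-empty.
require_domain() {
    if [ -z "${DOMAIN:-}" ]; then
        echo "DOMAIN is not set; run: export DOMAIN=kuma.mydomain.com" >&2
        return 1
    fi
    # Show the Traefik rule the compose file will end up with
    echo "Deploying with Host(\`${DOMAIN}\`)"
}

# Example usage inside start.sh:
# require_domain && docker stack deploy -c kuma.yml kuma
```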
Go look at my git repo for more examples:
https://github.com/8layer8/swarm-public/tree/main/docker-compose
Proxied services outside of the swarm
With Traefik 1.x, hanging a non-docker application onto it was complex. Traefik 2.x made it so difficult that I gave up after 2 days. The *easy* way to do this is to set up another app that is just an Nginx image with a reverse proxy config on it. This makes your app available to the outside and does all the Let’s Encrypt heavy lifting for you, and Traefik doesn’t fuss about it at all. Making a config is pretty straightforward on nginxconfig.io, and then your compose.yml file just maps in the files for the proxy to use. Once you set up one, you can clone it and tweak the few things that change between them.
---
version: '3.7'
services:
  openvas:
    image: nginx:latest
    environment:
      - PUID=1020
      - PGID=1000
      - TZ=America/New_York
    volumes:
      - /mnt/pool_alpha/vm_storage/proxies/openvas/nginx.conf:/etc/nginx/nginx.conf:ro
      - /mnt/pool_alpha/vm_storage/proxies/openvas/sites-enabled/openvas.8layer8.com.conf:/etc/nginx/sites-enabled/openvas.8layer8.com.conf
      - /mnt/pool_alpha/vm_storage/proxies/openvas/sites-available/openvas.8layer8.com.conf:/etc/nginx/sites-available/openvas.8layer8.com.conf
      - /mnt/pool_alpha/vm_storage/proxies/openvas/nginxconfig.io/general.conf:/etc/nginx/nginxconfig.io/general.conf
      - /mnt/pool_alpha/vm_storage/proxies/openvas/nginxconfig.io/security.conf:/etc/nginx/nginxconfig.io/security.conf
      - /mnt/pool_alpha/vm_storage/proxies/openvas/nginxconfig.io/proxy.conf:/etc/nginx/nginxconfig.io/proxy.conf
      - /mnt/pool_alpha/vm_storage/proxies/server.crt:/etc/nginx/ssl/server.crt
      - /mnt/pool_alpha/vm_storage/proxies/server.key:/etc/nginx/ssl/server.key
      - /mnt/pool_alpha/vm_storage/proxies/dhparam.pem:/etc/nginx/dhparam.pem
      #- /mnt/pool_alpha/vm_storage/proxies/openvas/static:/var/www
    networks:
      - net
      - traefik-public
    deploy:
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public
        - traefik.http.routers.openvas-proxy-http.rule=Host(`openvas.8layer8.com`)
        - traefik.http.routers.openvas-proxy-http.entrypoints=http
        - traefik.http.routers.openvas-proxy-http.middlewares=https-redirect
        - traefik.http.routers.openvas-proxy-https.rule=Host(`openvas.8layer8.com`)
        - traefik.http.routers.openvas-proxy-https.entrypoints=https
        - traefik.http.routers.openvas-proxy-https.tls=true
        - traefik.http.routers.openvas-proxy-https.tls.certresolver=le
        - traefik.http.services.openvas-proxy.loadbalancer.server.port=80
        - traefik.http.middlewares.openvas-auth.basicauth.users=brad:$$apr1$$vyr.UUVe$$iVBZogF6TZPx3LMR4BKuV1
        - traefik.http.routers.openvas-proxy-https.middlewares=openvas-auth
networks:
  net:
    driver: overlay
    attachable: true
  traefik-public:
    external: true
start.sh:
#Connect via SSH to a Docker Swarm manager node.
docker stack deploy -c proxies.yml proxies
sleep 10
docker stack ps proxies
sleep 10
docker stack ps proxies
# Service names are <stack>_<service>; with the compose file above that is proxies_openvas
docker service logs proxies_openvas
stop.sh:
docker stack rm proxies
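For a rough idea of what ends up in the sites-enabled file, here is a sketch that writes a bare-bones reverse proxy server block. It is an assumption-heavy illustration, not the nginxconfig.io output used above: write_proxy_conf is a hypothetical helper, and the upstream address in the usage line is a placeholder for wherever your non-docker app really lives. Nginx listens on port 80 here because that is the port the Traefik loadbalancer label points at.

```shell
# Write a minimal Nginx reverse proxy server block to a file.
# Arguments: server name, upstream URL, output file.
write_proxy_conf() {
    server_name=$1
    upstream=$2
    out=$3
    cat > "$out" <<EOF
server {
    listen 80;
    server_name ${server_name};

    location / {
        proxy_pass ${upstream};
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
    }
}
EOF
}

# Example usage (placeholder upstream address):
# write_proxy_conf openvas.8layer8.com http://192.168.0.50:9392 \
#     /mnt/pool_alpha/vm_storage/proxies/openvas/sites-enabled/openvas.8layer8.com.conf
```

The generated nginxconfig.io files carry a lot more hardening than this, so treat the sketch as a way to understand the moving parts rather than a replacement.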
Add basic authentication to an app that doesn’t have any
Generate a password and escape it properly:
sudo apt install apache2-utils
echo $(htpasswd -nb brad mypassword) | sed -e s/\\$/\\$\\$/g
brad:$$apr1$$yLCU9Fxl$$V1G.kbqrTKLpXilRYkqeT/
Add these to (or edit them in) the deploy: labels: section of your compose file, using your own router name in place of catapp:
- traefik.http.middlewares.test-auth.basicauth.users=brad:$$apr1$$yLCU9Fxl$$V1G.kbqrTKLpXilRYkqeT/
- traefik.http.routers.catapp.middlewares=test-auth
#Redeploy your app with stop.sh and start.sh
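The reason for doubling every dollar sign is that compose files treat a single $ as the start of a variable, while $$ is read as a literal $. The escaping step on its own can be sketched like this; escape_dollars is a hypothetical helper name, and the hash in the usage note is a made-up placeholder.

```shell
# Double every "$" on stdin so the value survives compose-file interpolation.
escape_dollars() {
    sed -e 's/\$/$$/g'
}

# Example usage:
# htpasswd -nb brad mypassword | escape_dollars
```

Feeding it a line such as brad:$apr1$abc$def gives back brad:$$apr1$$abc$$def, which is the form the label needs.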
Add apps that are docker but NOT public
For INTERNAL ONLY stuff, the easiest way is to just not put it on Traefik and publish the port instead. You can hit the port on any of the hosts (thanks to swarm) and it will work. I left the Traefik stuff in but commented out so you can see the difference; it is not needed in general:
# cura.yml internal only
version: '3.7'
services:
  cura:
    image: mindcrime30/docker-cura:4.12.0
    environment:
      - PUID=1020
      - PGID=1020
      - TZ=America/New_York
    networks:
      - net
      # - traefik-public
    ports:
      - 5800:5800
    volumes:
      - /mnt/pool_alpha/vm_storage/cura/config:/config
      - /mnt/pool_alpha/shared:/storage
      - /mnt/pool_alpha/vm_storage/cura/output:/output
    deploy:
      labels:
        - needs.something.to.deploy=true
        # - traefik.enable=true
        # - traefik.docker.network=traefik-public
        # - traefik.constraint-label=traefik-public
        # - traefik.http.routers.cura-http.rule=Host(`${DOMAIN?Variable not set}`)
        # - traefik.http.routers.cura-http.entrypoints=http
        # - traefik.http.routers.cura-http.middlewares=https-redirect
        # - traefik.http.routers.cura-https.rule=Host(`${DOMAIN?Variable not set}`)
        # - traefik.http.routers.cura-https.entrypoints=https
        # - traefik.http.routers.cura-https.tls=true
        # - traefik.http.routers.cura-https.tls.certresolver=le
        # - traefik.http.services.cura.loadbalancer.server.port=5800
networks:
  net:
    driver: overlay
    attachable: true
  # traefik-public:
  #   external: true
Then you can just hit http://any.swarm.box.ip:5800/ and it will get you there.