Setting Up a K3s HA Cluster with Cloudflare Tunnel and Tailscale on a Mixed Network Topology 🚧
Setting up a highly available Kubernetes cluster can be challenging, especially when the nodes sit on different network topologies. This post walks through building a K3s cluster from scratch (no IaC) with a mix of public and private nodes, using Tailscale for secure networking, and covers common pitfalls along the way.
Let's explore how to build a resilient K3s cluster that can handle node failures while maintaining secure communication between nodes.
TL;DR:
Setting up a K3s HA cluster requires careful consideration of network topology, proper etcd configuration, and secure communication between nodes. Using Tailscale simplifies cross-network communication while maintaining security, and Cloudflare Tunnel provides secure ingress access.
Understanding the Requirements
Before diving into the setup, let's understand what we're trying to achieve:
- A highly available K3s cluster with multiple master nodes running on budget VPS instances
- Mix of nodes with public and private IPs
- Secure communication between nodes using Tailscale
- Proper handling of etcd cluster formation
- Network routes for pod-to-pod communication
Prerequisites
- Multiple servers (at least 3 for true HA)
- Tailscale installed on all nodes
- Basic understanding of Kubernetes networking
- Access to modify firewall rules
Network Preparation
Setting up Tailscale
First, we need to configure Tailscale on all nodes to handle Kubernetes networking:
# Enable IP forwarding
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# Configure Tailscale to accept routes and advertise the default
# K3s pod CIDR (10.42.0.0/16) and service CIDR (10.43.0.0/16)
sudo tailscale up \
  --accept-routes \
  --advertise-routes=10.42.0.0/16,10.43.0.0/16 \
  --hostname="$(hostname -f)"
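Keep in mind that advertised routes still have to be approved in the Tailscale admin console (or via auto-approvers) before peers will use them. Afterwards, a quick check on each node confirms its Tailscale state:
# Check the node's Tailscale IP and that the other nodes show up as peers
tailscale ip -4
tailscale status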
Firewall Configuration
Open the necessary ports for K3s and etcd communication. Since inter-node traffic flows over Tailscale, you can also scope these rules to the tailscale0 interface rather than opening the ports globally:
# Allow Tailscale traffic
sudo ufw allow in on tailscale0
sudo ufw allow out on tailscale0
# K3s required ports
sudo ufw allow 6443 # Kubernetes API
sudo ufw allow 2379 # etcd client
sudo ufw allow 2380 # etcd peer
sudo ufw allow 10250 # kubelet
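With the firewall in place, a quick port check between nodes over their Tailscale IPs catches mistakes early (the peer IP below is a placeholder):
# From one node, check that a peer answers on the K3s and etcd ports over Tailscale
nc -zv <peer-tailscale-ip> 6443
nc -zv <peer-tailscale-ip> 2380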
Setting Up the Nodes
Setting up the First Master Node
The first master node initializes the cluster and etcd:
# Get Tailscale IP
TAILSCALE_IP=$(tailscale ip -4)
# Create K3s config
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml <<EOF
node-ip: "${TAILSCALE_IP}"
node-external-ip: "${TAILSCALE_IP}"
advertise-address: "${TAILSCALE_IP}"
tls-san:
- "${TAILSCALE_IP}"
- "$(hostname -f)"
cluster-init: true
token: "your-secure-token" # Set your own token
flannel-iface: "tailscale0"
disable:
- traefik
etcd-expose-metrics: true
EOF
# Install K3s
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --cluster-init" sh -
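Before joining the other masters, it's worth confirming the first server came up cleanly. A quick sanity check (the node-token path only matters if you prefer K3s's generated token over the one set in config.yaml):
# The K3s service should be active and the node should register itself
sudo systemctl status k3s --no-pager
sudo k3s kubectl get nodes -o wide

# K3s also writes a generated join token here
sudo cat /var/lib/rancher/k3s/server/node-token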
Adding Additional Master Nodes
For each additional master node:
# Get Tailscale IP
TAILSCALE_IP=$(tailscale ip -4)
MASTER1_TAILSCALE_IP="<first-node-tailscale-ip>"
# Create K3s config
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml <<EOF
server: "https://${MASTER1_TAILSCALE_IP}:6443"
token: "your-secure-token" # Same as first node
node-ip: "${TAILSCALE_IP}"
node-external-ip: "${TAILSCALE_IP}"
advertise-address: "${TAILSCALE_IP}"
tls-san:
- "${TAILSCALE_IP}"
- "$(hostname -f)"
flannel-iface: "tailscale0"
disable:
- traefik
EOF
# Install K3s
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server" sh -
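If a node hangs while joining, tailing its K3s logs usually shows whether it can reach the first server over Tailscale (assuming the default systemd-based install from the script above):
# Watch join progress and errors on the new node
sudo journalctl -u k3s -f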
Verifying the Cluster
After setting up all nodes, verify the cluster status (the IP columns of the wide output are trimmed below):
# Check node status
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION
<node_1_censored> Ready control-plane,etcd,master 156m v1.31.4+k3s1
<node_2_censored> Ready control-plane,etcd,master 3h1m v1.31.4+k3s1
<node_3_censored> Ready control-plane,etcd,master 160m v1.31.4+k3s1
If nothing went wrong, you will see three nodes in the Ready state.
Try rebooting a node and re-running the kubectl get nodes -o wide
command. Watching the node come back on its own (personally, for me) gives satisfaction. ✨
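To watch the failover yourself, keep a watch running from one of the other masters while a node reboots (a quick optional check):
# From a surviving master: the rebooted node should flip to NotReady and back to Ready
kubectl get nodes -w

# The API server on the surviving masters should keep answering throughout
kubectl get --raw='/readyz?verbose'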
Setting Up Ingress
As you may have noticed, we disabled Traefik in the earlier snippets, which means the cluster is currently running without an ingress controller. Let's set one up!
Fortunately, we don't need to build one from scratch, because someone has already written a controller we can use: the Cloudflare Tunnel Ingress Controller.
First, add the helm repository:
helm repo add strrl.dev https://helm.strrl.dev
helm repo update
Then install with helm:
helm upgrade --install --wait \
-n cloudflare-tunnel-ingress-controller --create-namespace \
cloudflare-tunnel-ingress-controller \
strrl.dev/cloudflare-tunnel-ingress-controller \
--set=cloudflare.apiToken="<cloudflare-api-token>",cloudflare.accountId="<cloudflare-account-id>",cloudflare.tunnelName="<your-favorite-tunnel-name>"
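Before creating any Ingress objects, check that the controller is running and has registered its ingress class (the namespace and class name match what we use below):
# The controller pod should be Running
kubectl get pods -n cloudflare-tunnel-ingress-controller

# A 'cloudflare-tunnel' ingress class should now exist
kubectl get ingressclass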
Once the ingress controller is installed, we're ready to expose a pod to the internet. Let's add a deployment:
tee httpbin.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: httpbin
---
apiVersion: v1
kind: Service
metadata:
name: httpbin
labels:
app: httpbin
service: httpbin
annotations:
tailscale.com/expose: "true"
spec:
ports:
- name: http
port: 80
targetPort: 80
selector:
app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: httpbin
spec:
replicas: 2
selector:
matchLabels:
app: httpbin
version: v1
template:
metadata:
labels:
app: httpbin
version: v1
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- httpbin
topologyKey: kubernetes.io/hostname
weight: 100
serviceAccountName: httpbin
containers:
- image: kennethreitz/httpbin:latest
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 80
EOF
# Apply the configuration
kubectl apply -f httpbin.yaml
# Create the ingress rule
kubectl create ingress httpbin-tunnel \
--rule="httpbin.alif.web.id/*=httpbin:80" \
--class cloudflare-tunnel
And voilà! It's done ✨
You can now access it through https://httpbin.alif.web.id.
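A few quick checks confirm the whole chain end to end; httpbin's /get endpoint makes a handy smoke test:
# Pods should land on different nodes thanks to the anti-affinity rule
kubectl get pods -l app=httpbin -o wide

# The ingress should show the cloudflare-tunnel class and the hostname
kubectl get ingress httpbin-tunnel

# And the service should answer from the internet
curl -s https://httpbin.alif.web.id/get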
Architecture Overview
Below is our final architecture design:
[Architecture diagram: K3s masters meshed over Tailscale, with public ingress via Cloudflare Tunnel]
Common Issues and Solutions
etcd Cluster Formation
If you encounter etcd-related issues:
- Ensure all nodes can reach each other on ports 2379 and 2380
- Verify Tailscale routes are properly configured
- Check that node IPs are consistent in the configuration
- Check etcd health with etcdctl endpoint health (see the sketch after this list)
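The etcdctl check deserves a quick example. This is a minimal sketch assuming K3s's default TLS paths; etcdctl itself has to be installed separately since K3s doesn't bundle it:
# Point etcdctl at the embedded etcd using K3s's client certs
# (cert paths assume K3s defaults and may vary by version)
sudo etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key \
  endpoint health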
Network Connectivity
For networking issues:
- Test connectivity between nodes using nc -zv (see the sketch after this list)
- Verify Tailscale status with tailscale status
- Check firewall rules on all nodes
- Ensure pod CIDR ranges don't overlap with your network
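The first two checks boil down to a couple of commands (peer names and IPs below are placeholders):
# Check whether a direct (non-relayed) connection to the peer exists
tailscale ping <peer-hostname>

# Port-level check over Tailscale, e.g. the kubelet port
nc -zv <peer-tailscale-ip> 10250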
Cloudflare Tunnel
Common Cloudflare Tunnel issues:
- Verify your API token has sufficient permissions
- Check the controller logs with kubectl logs in the cloudflare-tunnel-ingress-controller namespace (see the sketch after this list)
- Ensure DNS records are properly configured in Cloudflare
- Monitor tunnel status in Cloudflare Zero Trust dashboard
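For the log check, the pod name isn't fixed, so list the pods first (the namespace matches the Helm install earlier; the pod name below is a placeholder):
# Find the controller pod, then tail its logs
kubectl get pods -n cloudflare-tunnel-ingress-controller
kubectl logs -n cloudflare-tunnel-ingress-controller <controller-pod-name> -f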
Things I Learned (TIL)
- Always use odd numbers of master nodes (3, 5, etc.) for proper etcd quorum
- Monitor etcd health regularly using built-in tools
- Use consistent networking across all nodes when possible
- Document your configuration - it's recommended to use IaC like Terraform or Ansible
- Back up etcd regularly - K3s stores its etcd data in /var/lib/rancher/k3s/server/db/
- Keep your tokens secure - rotate them periodically
- Monitor resource usage - especially on smaller VPS instances
Maintenance Tips
- Regular Updates
  - Keep the K3s version up to date
  - Update the Cloudflare Tunnel controller regularly
  - Monitor security advisories
- Backup Strategy
  # Back up etcd data
  sudo k3s etcd-snapshot save --name pre-upgrade
- Health Checks
  # Check cluster health (componentstatuses is deprecated but still works for a quick glance)
  kubectl get componentstatuses
  kubectl cluster-info
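On top of the manual snapshot above, K3s can also take scheduled etcd snapshots on its own. A minimal sketch for the server config.yaml, assuming the etcd-snapshot-schedule-cron and etcd-snapshot-retention options available in recent K3s releases (adjust the schedule and retention to taste):
# /etc/rancher/k3s/config.yaml on the servers: snapshot every 6 hours, keep the last 10
etcd-snapshot-schedule-cron: "0 */6 * * *"
etcd-snapshot-retention: 10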
Conclusion
Setting up a highly available K3s cluster with mixed network topology requires careful planning and configuration. Using Tailscale significantly simplifies the networking setup while maintaining security, and Cloudflare Tunnel provides a secure way to expose services to the internet. Regular monitoring and backups ensure cluster reliability over time.
Remember that while this setup works well for many use cases, your specific requirements might need additional configuration or security measures. Always follow security best practices and keep your cluster components updated.
This article is based on real experience setting up a K3s cluster with both public and private nodes. The configuration shown uses K3s v1.31.4+k3s1, but the concepts should apply to other versions as well. Last updated: January 2025.