Setting up K3s HA Cluster with Cloudflare Tunnel and Tailscale on Mixed Network Topology 🔧

Setting up a highly available Kubernetes cluster can be challenging, especially when dealing with nodes across different network topologies. This post walks through setting up a K3s cluster from scratch (no IaC), with both public and private nodes, using Tailscale for secure networking and handling common pitfalls along the way.

Let's explore how to build a resilient K3s cluster that can handle node failures while maintaining secure communication between nodes.

TL;DR:

Setting up a K3s HA cluster requires careful consideration of network topology, proper etcd configuration, and secure communication between nodes. Using Tailscale simplifies cross-network communication while maintaining security, and Cloudflare Tunnel provides secure ingress access.

Understanding the Requirements

Before diving into the setup, let's understand what we're trying to achieve:

  • A highly available K3s cluster with multiple master nodes running on budget VPSes
  • Mix of nodes with public and private IPs
  • Secure communication between nodes using Tailscale
  • Proper handling of etcd cluster formation
  • Network routes for pod-to-pod communication

Prerequisites

  1. Multiple servers (at least 3 for true HA)
  2. Tailscale installed on all nodes
  3. Basic understanding of Kubernetes networking
  4. Access to modify firewall rules

Network Preparation

Setting up Tailscale

First, we need to configure Tailscale on all nodes to handle Kubernetes networking:

# Enable IP forwarding
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# Configure Tailscale with proper routes
# (10.42.0.0/16 and 10.43.0.0/16 are the default K3s pod and service CIDRs)
sudo tailscale up \
  --accept-routes \
  --advertise-routes=10.42.0.0/16,10.43.0.0/16 \
  --hostname=$(hostname -f)
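
Note that advertised routes don't take effect until they're approved in the Tailscale admin console (or auto-approved via autoApprovers in your ACL policy). A quick sanity check from any node:

# Show this node's Tailscale IPv4 address
tailscale ip -4

# List peers and confirm the node is connected to the tailnet
tailscale status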

Firewall Configuration

Open necessary ports for K3s and etcd communication:

# Allow Tailscale traffic
sudo ufw allow in on tailscale0
sudo ufw allow out on tailscale0

# K3s required ports
sudo ufw allow 6443  # Kubernetes API
sudo ufw allow 2379  # etcd client
sudo ufw allow 2380  # etcd peer
sudo ufw allow 10250 # kubelet
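
On nodes with public IPs, the rules above expose these ports to the whole internet. A tighter sketch, assuming you only need them reachable over the tailnet (100.64.0.0/10 is Tailscale's CGNAT range):

# Restrict the control-plane ports to Tailscale addresses only
sudo ufw allow from 100.64.0.0/10 to any port 6443 proto tcp
sudo ufw allow from 100.64.0.0/10 to any port 2379:2380 proto tcp
sudo ufw allow from 100.64.0.0/10 to any port 10250 proto tcp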

Setting Up the Nodes

Setting up the First Master Node

The first master node initializes the cluster and etcd:

# Get Tailscale IP
TAILSCALE_IP=$(tailscale ip -4)

# Create K3s config
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml <<EOF
node-ip: "${TAILSCALE_IP}"
node-external-ip: "${TAILSCALE_IP}"
advertise-address: "${TAILSCALE_IP}"
tls-san:
  - "${TAILSCALE_IP}"
  - "$(hostname -f)"
cluster-init: true
token: "your-secure-token"  # Set your own token
flannel-iface: "tailscale0"
disable:
  - traefik
etcd-expose-metrics: true
EOF

# Install K3s (cluster-init is already set in config.yaml, so no extra flags are needed)
curl -sfL https://get.k3s.io | sh -s - server
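
Before adding more masters, it's worth confirming the first node came up cleanly. Something like:

# Follow the K3s service logs until the node settles
sudo journalctl -u k3s -f

# The single node should report Ready
sudo k3s kubectl get nodes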

Adding Additional Master Nodes

For each additional master node:

# Get Tailscale IP
TAILSCALE_IP=$(tailscale ip -4)
MASTER1_TAILSCALE_IP="<first-node-tailscale-ip>"

# Create K3s config
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml <<EOF
server: "https://${MASTER1_TAILSCALE_IP}:6443"
token: "your-secure-token"  # Same as first node
node-ip: "${TAILSCALE_IP}"
node-external-ip: "${TAILSCALE_IP}"
advertise-address: "${TAILSCALE_IP}"
tls-san:
  - "${TAILSCALE_IP}"
  - "$(hostname -f)"
flannel-iface: "tailscale0"
disable:
  - traefik
EOF

# Install K3s
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server" sh -
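
If a join hangs, first make sure the new node can actually reach the first server over Tailscale, then watch the K3s logs for etcd membership or TLS errors:

# Check API server reachability over the tailnet
nc -zv ${MASTER1_TAILSCALE_IP} 6443

# Follow the join process on the new node
sudo journalctl -u k3s -f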

Verifying the Cluster

After setting up all nodes, verify the cluster status:

# Check node status
$ kubectl get nodes -o wide
NAME                 STATUS   ROLES                       AGE    VERSION
<node_1_censored>    Ready    control-plane,etcd,master   156m   v1.31.4+k3s1
<node_2_censored>    Ready    control-plane,etcd,master   3h1m   v1.31.4+k3s1
<node_3_censored>    Ready    control-plane,etcd,master   160m   v1.31.4+k3s1

If nothing went wrong, you will see three nodes in the Ready state.
Try rebooting a node and re-running kubectl get nodes -o wide. Watching the node come back on its own is (for me, at least) deeply satisfying. ✨

Setting Up Ingress

As you may have noticed, we disabled Traefik in the earlier config, which means our cluster is running without an ingress controller. Let's set one up!

Fortunately, we don't need to build one from scratch: there's a community-maintained Cloudflare Tunnel ingress controller we can use.

First, add the Helm repository:

helm repo add strrl.dev https://helm.strrl.dev
helm repo update

Then install it with Helm:

helm upgrade --install --wait \
  -n cloudflare-tunnel-ingress-controller --create-namespace \
  cloudflare-tunnel-ingress-controller \
  strrl.dev/cloudflare-tunnel-ingress-controller \
  --set=cloudflare.apiToken="<cloudflare-api-token>",cloudflare.accountId="<cloudflare-account-id>",cloudflare.tunnelName="<your-favorite-tunnel-name>" 
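
Before creating any Ingress resources, check that the controller came up in the namespace used above:

# The controller (and the cloudflared pods it manages) should be Running
kubectl get pods -n cloudflare-tunnel-ingress-controller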

Once the ingress controller is created, we are ready to expose a pod to the internet. Let's add a deployment:

tee httpbin.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
  annotations:
    tailscale.com/expose: "true"
spec:
  ports:
  - name: http
    port: 80
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 2
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - httpbin
              topologyKey: kubernetes.io/hostname
            weight: 100
      serviceAccountName: httpbin
      containers:
      - image: kennethreitz/httpbin:latest
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
EOF

# Apply the configuration
kubectl apply -f httpbin.yaml

# Create the ingress rule
kubectl create ingress httpbin-tunnel \
  --rule="httpbin.alif.web.id/*=httpbin:80" \
  --class cloudflare-tunnel

And voilà! It's done. ✨

You can now access it through https://httpbin.alif.web.id.
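
A quick end-to-end check (httpbin's /get endpoint echoes the request back as JSON):

# Should print the request headers, origin IP, and URL
curl -s https://httpbin.alif.web.id/get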

Architecture Overview

Below is our final architecture design:

[Figure: K3s cluster architecture diagram]

Common Issues and Solutions

etcd Cluster Formation

If you encounter etcd-related issues:

  • Ensure all nodes can reach each other on ports 2379 and 2380
  • Verify Tailscale routes are properly configured
  • Check that node IPs are consistent in the configuration
  • Check etcd health with etcdctl endpoint health (see the sketch after this list)
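
K3s runs an embedded etcd, so etcdctl isn't installed by default; if you install it yourself, point it at the K3s-managed certificates. A sketch, assuming the default K3s certificate paths:

sudo ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key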

Network Connectivity

For networking issues:

  • Test connectivity between nodes using nc -zv (example after this list)
  • Verify Tailscale status with tailscale status
  • Check firewall rules on all nodes
  • Ensure pod CIDR ranges don't overlap with your network
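
For example, from one node against another node's Tailscale IP (placeholder below):

# Each port should report "succeeded" if the firewall and routes are correct
nc -zv <other-node-tailscale-ip> 6443
nc -zv <other-node-tailscale-ip> 2379
nc -zv <other-node-tailscale-ip> 2380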

Cloudflare Tunnel

Common Cloudflare Tunnel issues:

  • Verify your API token has sufficient permissions
  • Check tunnel logs with kubectl logs -n cloudflare-tunnel-ingress-controller (example after this list)
  • Ensure DNS records are properly configured in Cloudflare
  • Monitor tunnel status in Cloudflare Zero Trust dashboard
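
The exact pod names depend on your Helm release, so list them first and then tail the one you're interested in:

# Find the controller/cloudflared pods, then follow their logs
kubectl get pods -n cloudflare-tunnel-ingress-controller
kubectl logs -n cloudflare-tunnel-ingress-controller <pod-name> --tail=50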

Things I Learned (TIL)

  1. Always use odd numbers of master nodes (3, 5, etc.) for proper etcd quorum
  2. Monitor etcd health regularly using built-in tools
  3. Use consistent networking across all nodes when possible
  4. Document your configuration - it's recommended to use IaC like Terraform or Ansible
  5. Back up etcd regularly - K3s stores its data in /var/lib/rancher/k3s/server/db/ (a scheduled-snapshot sketch follows this list)
  6. Keep your tokens secure - rotate them periodically
  7. Monitor resource usage - especially on smaller VPS instances
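
For point 5, K3s can also snapshot etcd on a schedule. A sketch for the server config (flag names per the K3s docs; tune the cron expression and retention to taste):

# Add to /etc/rancher/k3s/config.yaml on server nodes
etcd-snapshot-schedule-cron: "0 */6 * * *"  # every 6 hours
etcd-snapshot-retention: 10                 # keep the last 10 snapshots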

Maintenance Tips

  1. Regular Updates

    • Keep K3s version up to date
    • Update Cloudflare Tunnel controller regularly
    • Monitor security advisories
  2. Backup Strategy

    # Back up etcd data (snapshots land in /var/lib/rancher/k3s/server/db/snapshots/)
    k3s etcd-snapshot save --name pre-upgrade
    k3s etcd-snapshot ls   # list existing snapshots
    
  3. Health Checks

    # Check cluster health (kubectl get componentstatuses is deprecated since v1.19)
    kubectl get --raw='/readyz?verbose'
    kubectl cluster-info
    

Conclusion

Setting up a highly available K3s cluster with mixed network topology requires careful planning and configuration. Using Tailscale significantly simplifies the networking setup while maintaining security, and Cloudflare Tunnel provides a secure way to expose services to the internet. Regular monitoring and backups ensure cluster reliability over time.

Remember that while this setup works well for many use cases, your specific requirements might need additional configuration or security measures. Always follow security best practices and keep your cluster components updated.


This article is based on real experience setting up a K3s cluster with both public and private nodes. The configuration shown uses K3s v1.31.4+k3s1, but the concepts should apply to other versions as well. Last updated: January 2025.