k8s Infrastructure remarks

App Suite 8 and the automation in this repo were tested against different k8s flavors, including kubesprayed k8s, IONOS managed k8s, and AWS EKS.

Short overview:

  • For production, and for labs "like production", we recommend kubesprayed vanilla k8s: https://github.com/kubernetes-sigs/kubespray

    Rancher products are known to work, too. Other CNCF-certified k8s distributions / installers are expected to work as well but should be tested. (Problematic as of now are "compliance enforcing" k8s distributions like OpenShift and OKD.)

  • For labs down to the scale of "on a developer's laptop" (which is possible if the laptop has enough memory), we recommend kubesprayed k8s, or k3s, in a single-node cluster on a virtual machine. For the OS of the VM, Debian 12 seems to work well and causes fewer irritations than its predecessor.

    • k8s via kubespray installs in a few minutes on a VM on a modern laptop
    • k3s installs even faster (in less than a minute)

    Reasonable labs are possible starting at 16 GB of memory for the VM; if you want to experiment with roles and scaling, plan for about 32 GB for the VM.

  • We recommend not using any of these:

    • minikube
    • kind
    • k3d

    When experimenting with these, we usually run into issues which are hard to debug and which turn out to be caused by the "platform tooling", not by the as8 application or our automation itself. These products lack documentation and transparency.

kubespray

Our primary reference platform for as8-deployment and for running as8 in general is a kubesprayed k8s. We lean towards Cilium for the CNI, but it should be fine with other CNIs as well.

Single node clusters for labs are possible (for minimum sizing see the comments above about the "developer laptop"). Multi-node clusters of "arbitrary" size are, of course, possible; upstream limits apply, e.g. https://kubernetes.io/docs/setup/best-practices/cluster-large/#size-of-master-and-master-components.
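For a single-node lab, the inventory can be reduced to one host carrying all roles. A minimal sketch of inventory/mycluster/hosts.yaml, assuming a VM called node1 reachable at 192.168.9.2 (both are placeholders; the group names follow recent kubespray sample inventories, so compare with inventory/sample in your kubespray checkout):

all:
  hosts:
    node1:
      ansible_host: 192.168.9.2
      ip: 192.168.9.2
      access_ip: 192.168.9.2
  children:
    kube_control_plane:
      hosts:
        node1:
    kube_node:
      hosts:
        node1:
    etcd:
      hosts:
        node1:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico_rr:
      hosts: {}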

Follow the installation procedure as documented upstream: https://github.com/kubernetes-sigs/kubespray#quick-start. We like to use a slightly modified command line ...

% ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root --extra-vars '@extra_vars.yml' cluster.yml

... pulling some extra vars from a file extra_vars.yml which looks like

cluster_name: something.local
dns_domain: cluster.local
supplementary_addresses_in_ssl_keys: ['192.168.9.2']
kubeconfig_localhost: true
metrics_server_enabled: true
local_path_provisioner_enabled: true
local_path_provisioner_storage_class: "local-path"
kube_network_plugin: "cilium"
ping_access_ip: false
upstream_dns_servers: ['172.17.0.1']

which are explained as follows, roughly in order of relevance:

  • kube_network_plugin: we want to use cilium instead of kubespray's default calico
  • upstream_dns_servers: we observe in our labs that using a "standard" VM like the Debian cloud images, in combination with cloud-init, resolvconf, systemd-resolved, coredns, etc., leads to broken setups with circular DNS forwarding loops which make coredns die. The resolution is to inject a valid upstream DNS (forwarding) server here.
  • supplementary_addresses_in_ssl_keys: we provide the external IP(s) of the control plane node(s) so that they end up in the API server certificate. Otherwise kubectl will not work against the endpoint(s).
  • metrics_server_enabled: allows us to run e.g. kubectl top node (see the check below)
  • ping_access_ip: depending on firewalls, this may fail. We just like to disable this test.
  • cluster_name: allows for identifying the cluster in a merged KUBECONFIG file
  • dns_domain: we require running on the k8s default DNS domain, cluster.local.
  • local_path_provisioner*: allows us to use local path storage for labs (not recommended for prod usage)
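With kubeconfig_localhost: true, kubespray also drops an admin kubeconfig on the Ansible host, typically under inventory/mycluster/artifacts/admin.conf (the exact path may differ between kubespray versions). A quick sanity check of the fresh cluster and of the metrics server then looks like

% export KUBECONFIG=$PWD/inventory/mycluster/artifacts/admin.conf
% kubectl get nodes -o wide
% kubectl top node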

Rancher k3s

We recommend running k3s in a VM (e.g. Debian stable). Most of k3s's bundled components should be disabled (flannel, traefik, etc.); install Cilium instead.

References:

  • https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/
  • https://docs.k3s.io/installation/network-options

Hints:

  • k3s installation:
    curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC='--flannel-backend=none --disable-network-policy --disable=servicelb --disable=traefik --disable-helm-controller' sh -
    
  • cilium installation: you need to specify the kubeconfig location. If you are fine with "latest" for your lab, you can invoke cilium install without a version parameter.
    # install the cilium CLI as described on the upstream docs page
    # then
    export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
    cilium install
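
  • verification: before deploying anything, it is worth waiting for Cilium to report readiness (cilium status --wait is part of the cilium CLI; k3s ships a bundled kubectl):
    export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
    cilium status --wait
    kubectl get nodes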
    

EKS

You can also use a managed k8s like AWS EKS. Keep in mind though that we don't do any platform-specific integrations. (For the AWS example, this refers to RDS, S3, EBS, etc.)

We recommend using the eksctl installation method.

An eks.yaml file can look like

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: eks-for-as8
  region: eu-central-1

nodeGroups:
  - name: eks-for-as8-ng-1
    instanceType: m5.large
    desiredCapacity: 2

Strictly speaking, this only reproduces the default sizing, but it gives you a deterministic cluster name rather than some random name that sounds like an Ubuntu release name.

Creating a cluster is then done with (after obtaining eksctl)

% eksctl create cluster -f eks.yaml
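By default, eksctl create cluster writes the new cluster's credentials into your local kubeconfig, so a first check (assuming that default behaviour) can be as simple as

% kubectl get nodes -o wide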

You'll need the batteries_disable_storage flag in lab.yml because we don't have EBS integration (yet?) and thus no StorageClass on EKS. Accordingly, we have done only super-minimal functional testing of installations there. Input on proper AWS services integration (EBS, S3, RDS, etc.) would be appreciated.
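For illustration, and assuming the flag is a plain top-level boolean in lab.yml (check this repo's lab.yml reference for the exact structure), the relevant fragment would be

batteries_disable_storage: true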

Before destroying the cluster, it is recommended to uninstall all Helm charts and to delete all (application) namespaces. This speeds up the teardown dramatically and makes it more robust.
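A rough sketch of such a cleanup, with my-release and my-namespace as placeholders for whatever you actually deployed:

% helm list --all-namespaces
% helm uninstall my-release --namespace my-namespace
% kubectl delete namespace my-namespace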

Destroying the cluster is then done with

% eksctl delete cluster -f eks.yaml

IONOS managed k8s

... is known to work as well. Instructions missing, input welcome.