Saturday, May 05, 2018

Kubernetes on a VirtualBox cluster


Installing Kubernetes (1.9) on a 3-VM cluster in VirtualBox

This is a labour of frustration as much as of love. Kubernetes, k8s for short, is the holy grail for wannabe cloud engineers these days. Given that, and the endless hype around it, you would expect installing it on a local cluster from first principles to be reasonably straightforward. Alas, that's not the case, and after jumping through hoops for a whole workday and then some, I think I finally have something of a recipe. In case you've hit upon this article after having gone through that rigmarole yourself, you might want to head straight to the last section of this guide to figure out how to sanitize an existing Kubernetes installation, or clean it up altogether, before attempting a fresh start.

The VMs


My laptop has reasonably good specs - 16 GiB of RAM, 500 GB of non-SSD storage, i5 5th generation dual-core processor with hyperthreading enabled. I run CentOS 7.4 (7.4.1708 to be precise) for my VMs on Oracle VirtualBox. The Linux kernel version on the VMs is 3.10.0 (3.10.0-693.21.1.el7.x86_64 to be precise).
Each VM has 3 GiB of RAM, 2 vCPUs, and around 30 GiB of disk space. I installed CentOS with the "Server with GUI" option, along with some development tools.
Make sure the three nodes are assigned static IPs, or, if you have administrative access to your router, map fixed IPs to the MAC addresses of your VMs' bridged interfaces. Then edit /etc/hosts on each of the VMs to map the hostnames to the fixed IPs of these nodes. My /etc/hosts has the following lines added on all three nodes, one for each node. (Yes, I have a funky naming scheme for my VMs - short names starting with E):

10.0.0.10  ebro ebro
10.0.0.11  enzo enzo
10.0.0.12  elba elba
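
A quick check on each node that the names resolve to the right addresses:

$ for h in ebro enzo elba; do getent hosts $h; done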

Finally, disable SELinux with the following changes:

$ setenforce 0
$ cat /etc/sysconfig/selinux | sed 's/^SELINUX=\([^ ]*\)\(.*\)/SELINUX=disabled\2/' >/tmp/selinux
$ [ $? -eq 0 ] && cp /tmp/selinux /etc/sysconfig/selinux || (echo "Failed"; exit 1)
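
To confirm the runtime change took effect (getenforce will only report "Disabled" after a reboot; for now it should say "Permissive"):

$ getenforce
Permissive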

Network settings


I have run into issues with my CentOS VMs when they have a single network interface with a bridged network. Web connectivity becomes flaky, and syslog fills up with error messages about duplicate IPv6 addresses being detected. There might be specific fixes for these problems that I am not aware of; I could mitigate them by using two network interfaces - one NAT-ed and the other bridged. I use the bridged interface for the Kubernetes API server to listen on, and the NAT-ed interface for internet access.
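
If you'd rather script the VirtualBox side of this, something along the following lines should set up the two adapters (run with the VM powered off; enp0s25 here is my host's wired interface - substitute your own, and repeat for each VM):

$ VBoxManage modifyvm "ebro" --nic1 nat
$ VBoxManage modifyvm "ebro" --nic2 bridged --bridgeadapter2 enp0s25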

Kubernetes and Docker versions


The latest and greatest Kubernetes (1.10) doesn't work with the latest and greatest Docker (18.03). Moreover, k8s 1.10 doesn't seem quite well-rounded yet. So I chose Kubernetes 1.9 (1.9.7-0), and for that I installed Docker 17.03. If you try to install Docker Community Edition, you'll in all likelihood end up at this site (https://docs.docker.com/install/linux/docker-ce/centos/#install-docker-ce-1), and the steps there will install Docker 18.03 instead (yes, even when you explicitly specify 17.03*). So instead, download the Docker 17.03 RPMs manually. The following should work:

$ wget https://yum.dockerproject.org/repo/main/centos/7/Packages/docker-engine-17.03.1.ce-1.el7.centos.x86_64.rpm
$ wget https://yum.dockerproject.org/repo/main/centos/7/Packages/docker-engine-selinux-17.03.1.ce-1.el7.centos.noarch.rpm
$ rpm -U docker-engine-selinux-17.03.1.ce-1.el7.centos.noarch.rpm docker-engine-17.03.1.ce-1.el7.centos.x86_64.rpm

You might see some errors, like the following, coming from the installation of docker-engine-selinux; you can ignore them:

setsebool:  SELinux is disabled.
Re-declaration of type docker_t
Failed to create node
Bad type declaration at /etc/selinux/targeted/tmp/modules/400/docker/cil:1
/usr/sbin/semodule:  Failed!

Finally, enable docker:

$ systemctl enable docker
$ systemctl start docker
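
A quick sanity check that the expected version is running:

$ docker version --format '{{.Server.Version}}'
17.03.1-ce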

Preparing for the Kubernetes install


The first thing you need to do on all your nodes is turn off swap. Use the following commands:

$ swapoff -a
$ cat /etc/fstab | sed '/.*swap/s/^\(.*\)/# \1/' > /tmp/fstab
$ [ $? -eq 0 ] && cp /tmp/fstab /etc/fstab || (echo "Failed"; exit 1)
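
You can verify that swap is indeed off:

$ free -h | grep -i swap
Swap:            0B          0B          0B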

Next, if dnsmasq is running locally on your system, stop and disable it, as kube-dns won't come up while dnsmasq is running:

$ systemctl stop dnsmasq
$ systemctl disable dnsmasq

Next, load the br_netfilter module and make sure bridged traffic goes through iptables:

$ modprobe br_netfilter
$ sysctl net.bridge.bridge-nf-call-ip6tables=1
$ sysctl net.bridge.bridge-nf-call-iptables=1
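
Note that neither the module load nor the sysctl settings survive a reboot. One way to make them persistent (the file names under /etc/modules-load.d and /etc/sysctl.d are my own choice; any name ending in .conf will do):

$ echo br_netfilter > /etc/modules-load.d/k8s.conf
$ cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl --system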

Some preemptive clean up:

$ rm -rf /etc/kubernetes/*
$ systemctl disable firewalld && systemctl stop firewalld

Installing Kubernetes and doing some tweaks


Add the appropriate repos first:

$ cat > /etc/yum.repos.d/kubernetes.repo <<EOF
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
    https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg 
EOF

Next, check the versions available:

$ yum list kubeadm --showduplicates | sort -r

Pick kubeadm version 1.9.7-0 if you can see it, or the latest 1.9.x version that's listed. Now install these and other related packages:

$ yum install kubeadm-1.9.7-0 kubelet-1.9.7-0 kubectl-1.9.7-0 kubernetes-cni-0.6.0-0

The above line should also pull in socat (~ 1.7.3.2) as a dependency. Also enable the kubelet service (without starting it yet):

$ systemctl enable kubelet
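
At this point you can double-check that the expected versions landed on the node:

$ rpm -q kubeadm kubelet kubectl kubernetes-cni socat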

Next, run the following command to find out which cgroup driver your Docker installation uses, so that the kubelet can be configured to match:

$ docker info | grep cgroup
Cgroup Driver: cgroupfs

Now check the contents of the file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and edit it in case it refers to a different cgroup driver. If you see the following line:

Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"

change it to the following, so that it matches the Docker cgroup driver:

Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"

Then run:

$ systemctl daemon-reload

Finally, on the VM that you want to designate as the master run the following command:

$ kubeadm init --apiserver-advertise-address=10.0.0.10 --pod-network-cidr=10.244.0.0/16 # use the right IP

The --pod-network-cidr option is needed if you want to use Flannel for your pod network; refer to the official documentation for other types of pod networks. Of course, use whatever is the stable IP of your master node in place of 10.0.0.10.
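
Note that kubeadm only brings up the control plane; the Flannel pod network itself still has to be applied once the master is up (kube-dns stays in Pending until then). At the time of writing, the manifest lived at the URL below - check the Flannel repository for the current location:

$ export KUBECONFIG=/etc/kubernetes/admin.conf
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml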

Once kubeadm brings up your master node, you should see a line such as the one below in its output:

$ kubeadm join --token 689f6f.1865da38377c3e55 10.0.0.10:6443 --discovery-token-ca-cert-hash sha256:1250d71d8ed1b11de1b156bc23fcac9e80eaafa60395f75ba88ed550c64e42f4

On the remaining nodes, run the above from the command line; they will join your cluster. Also, copy the file /etc/kubernetes/admin.conf from your master node to the same location on your worker nodes, so that you can run kubectl commands from the worker nodes too.

On the master (or a worker node), run the following commands to see whether the other nodes have joined:

$ export KUBECONFIG=/etc/kubernetes/admin.conf
$ kubectl get nodes
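
Once all three nodes have joined, the output should look something along these lines (names, ages and exact versions will of course differ):

NAME      STATUS    ROLES     AGE       VERSION
ebro      Ready     master    15m       v1.9.7
enzo      Ready     <none>    4m        v1.9.7
elba      Ready     <none>    3m        v1.9.7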

Also check the status of the pods that Kubernetes creates:

$ kubectl get pods --all-namespaces

Cleaning up Kubernetes on an existing cluster


The following should work for resetting Kubernetes on a cluster before starting out afresh. Run the drain and delete commands from a node where kubectl works, once for each node in the cluster, and kubeadm reset on every node:

$ kubectl drain <node>  --delete-local-data --force --ignore-daemonsets
$ kubectl delete node <node>
$ kubeadm reset

If you have to remove the installation altogether, follow the above up with the following:

$ yum remove kubeadm kubelet kubectl kubernetes-cni socat
$ systemctl disable kubelet
$ rm -rf /var/lib/etcd /etc/kubernetes/*
$ docker ps -q | xargs docker rm -f  # Very careful, this deletes all containers
