I have a single-node Kubernetes system running on a 4GB arm64 Jetson Nano, using kind (Kubernetes in Docker). This is the smallest configuration I can reasonably expect to have a functioning Kubernetes control plane on. The device regularly throws errors of this type in its etcd log files:

    2021-05-23 19:00:21.333267 W | wal: sync duration of 2.25553095s, expected less than 1s
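If you'd rather not trawl the log files by hand, etcd runs as a static pod on the kind control-plane node, so something like this pulls out the warnings (this assumes kind's default cluster name of "kind", which determines the pod name):

    # etcd is a static pod on the control-plane node; the pod name below
    # assumes kind's default cluster name of "kind"
    kubectl -n kube-system logs etcd-kind-control-plane | grep 'sync duration'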
This particular device is using an SD card as its backing store for Docker. etcd has a very heavy write load, and when its writes are too slow it will spit out this error. The solution (once I am ready to fund it) will involve restarting the cluster on an SSD storage device, which is generally regarded as fast enough for etcd to stay happy.
Is there a sweet spot, cheaper than an SSD but more performant than an SD card? Marcel Wiget has a 3-node CM4 cluster using the onboard eMMC memory of the Pi CM4 modules for backing store.
When etcd is unhappy, your Kubernetes cluster is unhappy. That process is responsible for managing consensus among nodes, and when it can't keep up, all manner of things are said to fail. If your cluster has mysterious timeouts, check etcd first.
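One quick way to ask etcd how it's feeling, since kind builds its node with kubeadm, is to exec etcdctl inside the etcd pod. A sketch, again assuming the default "kind" cluster name; the certificate paths are the standard kubeadm ones:

    # run etcdctl inside the etcd static pod; cert paths are kubeadm defaults
    kubectl -n kube-system exec etcd-kind-control-plane -- etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      endpoint health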
IBM Cloud has a tutorial on using fio to benchmark your storage to help predict whether it will be fast enough to run etcd. You are looking to benchmark writes, specifically the 99th percentile time for the fdatasync(2) system call, which needs to stay under 10ms.
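As a rough sketch of what that tutorial does, the invocation looks like this; the small block size approximates etcd's WAL write pattern, and fio's output includes the fsync/fdatasync percentile table you're after:

    # measure fdatasync latency the way etcd's WAL experiences it;
    # point --directory at the storage under test, then read the
    # fsync/fdatasync percentiles in the output (99th should be under 10ms)
    fio --rw=write --ioengine=sync --fdatasync=1 \
        --directory=test-data --size=22m --bs=2300 --name=etcd-probe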
Are there ways to reduce the write load of etcd, and what are the tradeoffs? If you're willing to rebuild the cluster from scratch every time there's a crash or a reboot, you can look at the unsafe-no-fsync flag (see the sketch below), or put the etcd backing store on tmpfs. Both are suitable only for very ephemeral clusters, like the sorts of setups you would do for CI/CD testing.
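As a sketch of the first option, assuming an etcd build recent enough to carry the flag, kind can pass it through via a kubeadm patch at cluster creation time:

    # hand kind a config that asks kubeadm to start etcd with
    # --unsafe-no-fsync; assumes your etcd build has the flag
    cat <<EOF | kind create cluster --config=-
    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    kubeadmConfigPatches:
    - |
      kind: ClusterConfiguration
      etcd:
        local:
          extraArgs:
            unsafe-no-fsync: "true"
    EOF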
With only 4GB of memory, the Nano is no powerhouse - in use it eats up about 1.5GB of runtime memory for the processes that keep Kubernetes going, leaving only about 2.5GB free for actual application memory. The biggest process, etcd, makes very heavy use of memory-mapped files. The working plan is to set up a few low-impact OpenFaaS functions on the Nano once it's stabilized.
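If you want to see where that 1.5GB goes on your own host, a blunt sketch:

    # blunt accounting of where the memory went: overall free memory,
    # then the biggest resident processes (expect etcd, kubelet, apiserver)
    free -h
    ps -eo rss,comm --sort=-rss | head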
kind has had arm64 support only fairly recently. It's best thought of as a quick and convenient way to set up a temporary cluster, especially given its warnings that there are no provisions for cluster upgrades and no specific security hardening.
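That disposability is the point; assuming a working Docker on the box, a throwaway cluster is two commands:

    # spin up a scratch cluster, poke at it, throw it away
    kind create cluster --name scratch
    kind delete cluster --name scratch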
Acknowledgements
Thanks to the following folks for assistance in this bring-up process:
- Alex Ellis (tutorials, OpenFaaS, "arkade" package manager)
- Matteo Olivi (IBM tutorial on fio)
- Ben Elder (kind and arm64 support)
- Thomas Strömberg (etcd on ramfs, "half kidding")
- the Portainer team (easy access to log files)
- Jason DeTiberus (warnings about cascading failures)
- Marcel Wiget (CM4 cluster with eMMC)
None of this should be thought of as recommendations for a production cluster, but everyone should have access to a teaching cluster that's woefully underpowered, so that they can see what fails when they run out of some scarce resource.