etcdctl snapshot restore: always specify --data-dir

Let’s consider the scenario of etcd backup and restore, a popular topic in the CKA (Certified Kubernetes Administrator) exam. The Kubernetes documentation (at the time of writing) mentions --data-dir as an optional parameter to consider during an etcd restore from a backup. But running etcdctl snapshot restore without the --data-dir flag has a surprising side effect: it uses the current working directory as the new data directory. This behavior is not well documented. Let me demonstrate below.

I shall use a single-node kubeadm cluster, like the one you would encounter during the CKA exam.

➜  k get nodes
NAME           STATUS   ROLES           AGE     VERSION
controlplane   Ready    control-plane   7m16s   v1.27.0

Like all kubeadm clusters, it runs etcd as a static pod on the control-plane in the kube-system namespace.

➜  k get pods -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-5d78c9869d-blxqk               1/1     Running   0          10m
coredns-5d78c9869d-h8rqq               1/1     Running   0          10m
etcd-controlplane                      1/1     Running   0          10m
kube-apiserver-controlplane            1/1     Running   0          10m
kube-controller-manager-controlplane   1/1     Running   0          10m
kube-proxy-4876j                       1/1     Running   0          10m
kube-proxy-v8zfw                       1/1     Running   0          10m
kube-scheduler-controlplane            1/1     Running   0          10m

To demonstrate this, I shall deploy nginx as a Deployment in this cluster:

➜ kubectl apply -f https://k8s.io/examples/application/deployment.yaml
deployment.apps/nginx-deployment created

It creates a Deployment with a ReplicaSet controlling 2 Pods:

➜  kubectl get all | grep nginx
pod/nginx-deployment-cbdccf466-2k77w   1/1     Running   0          69s
pod/nginx-deployment-cbdccf466-xfrlj   1/1     Running   0          69s
deployment.apps/nginx-deployment   2/2     2            2           69s
replicaset.apps/nginx-deployment-cbdccf466   2         2         2       69s

As you know, all the information about the current state of the cluster is stored in etcd. You may look this up as follows:

➜ ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key get /registry/ --prefix --keys-only |grep pods/default/nginx
/registry/pods/default/nginx-deployment-cbdccf466-2k77w
/registry/pods/default/nginx-deployment-cbdccf466-xfrlj

➜ ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key get /registry/ --prefix --keys-only |grep replicasets/default/nginx
/registry/replicasets/default/nginx-deployment-cbdccf466

➜ ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key get /registry/ --prefix --keys-only |grep deployments/default/nginx
/registry/deployments/default/nginx-deployment
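
The flags above get repetitive. etcdctl also reads its connection settings from environment variables (each flag maps to an ETCDCTL_-prefixed variable), so you can export them once per shell session. A minimal sketch, assuming the same kubeadm certificate paths as above:

```shell
# Each etcdctl flag has an ETCDCTL_-prefixed environment variable
# equivalent, so exporting these once keeps later commands short.
# Paths assume a standard kubeadm layout.
export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/apiserver-etcd-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/apiserver-etcd-client.key

# With those set, the lookups reduce to, e.g.:
# etcdctl get /registry/ --prefix --keys-only | grep deployments/default/nginx
```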

To demonstrate the point, I shall clean up the above deployment from etcd. But before that, let’s take a snapshot of etcd to restore later:

➜  ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key   snapshot save /opt/snapshot-04Dec2023.db
Snapshot saved at /opt/snapshot-04Dec2023.db

I shall just delete the Deployment’s key, which in turn shall clean up the ReplicaSet and the Pods (their ownerReferences point back to the deleted Deployment, so the garbage collector removes them):

➜  ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/apiserver-etcd-client.crt --key /etc/kubernetes/pki/apiserver-etcd-client.key del /registry/deployments/default/nginx-deployment
1

➜  k get all
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/kubernetes     ClusterIP   10.96.0.1        <none>        443/TCP        29m

At this point, let’s say you are tasked with restoring this cluster to its previous state, a.k.a. bringing back the beloved nginx deployment. What would you do? Before going any further, let’s understand how etcd works on a kubeadm cluster. As I mentioned before, kubeadm runs etcd as a static pod on the control-plane. You may check out the respective manifest located at /etc/kubernetes/manifests/etcd.yaml. It configures a volume of type hostPath and mounts it into the etcd container. Below is a snippet from the manifest file:

volumes:
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data

Restoring the database would ideally mean restoring the snapshot to a new directory and updating the manifest to mount that new directory as the hostPath. Running the etcdctl restore command without --data-dir forces etcdctl to create a folder named default.etcd in the current working directory and restore the snapshot into this directory. This is well and good as long as you are aware of it and correctly update the manifest hostPath to $(pwd)/default.etcd.

➜  ETCDCTL_API=3 etcdctl snapshot restore /opt/snapshot-04Dec2023.db
2024-01-01 16:24:30.552088 I | mvcc: restore compact to 788
2024-01-01 16:24:30.559555 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32

➜  ls -ltr
total 8
-rw-rw-rw- 1 root root    0 Dec 13 05:39 sample.yaml
-rw-rw-rw- 1 root root 1717 Dec 13 05:39 etcd-backup-and-restore.md
drwx------ 3 root root 4096 Jan  1 16:24 default.etcd

But, for whatever reason, if you need to rerun etcdctl restore from the same directory, you shall encounter the below error:

➜  ETCDCTL_API=3 etcdctl snapshot restore /opt/snapshot-04Dec2023.db
Error: data-dir "default.etcd" exists

So, we had better be diligent about it and choose the restore directory explicitly. Here are the recommended steps:

# mention the --data-dir while restoring the snapshot.
➜  ETCDCTL_API=3 etcdctl snapshot restore --data-dir /var/lib/etcd-backup /opt/snapshot-04Dec2023.db
2024-01-01 16:34:49.878275 I | mvcc: restore compact to 788
2024-01-01 16:34:49.883925 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32

And update the etcd manifest to point to the restored directory:

volumes:
  - hostPath:
      path: /var/lib/etcd-backup #updated
      type: DirectoryOrCreate
    name: etcd-data
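
If you prefer to script this edit rather than open the manifest by hand, a sed substitution anchored on the exact old path works. A sketch, shown here against a local stand-in file (on the control-plane you would target /etc/kubernetes/manifests/etcd.yaml directly):

```shell
# Sketch: rewrite the hostPath of the etcd-data volume. A local copy of the
# snippet stands in for /etc/kubernetes/manifests/etcd.yaml here.
cat > etcd.yaml <<'EOF'
volumes:
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
EOF

# Anchor the match on the full old path so no other line is touched.
sed -i 's|path: /var/lib/etcd$|path: /var/lib/etcd-backup|' etcd.yaml

grep 'path:' etcd.yaml
# → path: /var/lib/etcd-backup
```

Because static pod manifests are watched by the kubelet, saving the file is enough; no kubectl apply is needed.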

This shall trigger a recreation of the etcd pod, and the nginx deployment shall be back in action. Here is the proof:


➜  kubectl get all | grep nginx
pod/nginx-deployment-cbdccf466-2k77w   1/1     Running   0          7m51s
pod/nginx-deployment-cbdccf466-xfrlj   1/1     Running   0          7m51s
deployment.apps/nginx-deployment   2/2     2            2           7m51s
replicaset.apps/nginx-deployment-cbdccf466   2         2         2       7m51s

Published Dec 12, 2023