5.5.2 Ensure Node Auto-Repair is enabled for GKE nodes

Information

Nodes in a degraded state are an unknown quantity and so may pose a security risk.

Kubernetes Engine's node auto-repair feature helps you keep the nodes in the cluster in a healthy, running state. When enabled, Kubernetes Engine makes periodic checks on the health state of each node in the cluster. If a node fails consecutive health checks over an extended time period, Kubernetes Engine initiates a repair process for that node.

Solution

Using Google Cloud Console

- Go to Kubernetes Engine by visiting:

https://console.cloud.google.com/kubernetes/list

- Select the Kubernetes cluster containing the node pool for which auto-repair is disabled.
- Select the Node pool by clicking on the name of the pool.
- Navigate to the Node pool details pane and click EDIT.
- Under the Management heading, check the Enable auto-repair box.
- Click SAVE.
- Repeat steps 2-6 for every cluster and node pool with auto-repair disabled.

Using Command Line

To enable node auto-repair for an existing cluster's node pool:

gcloud container node-pools update <node_pool_name> --cluster <cluster_name> --zone <compute_zone> --enable-autorepair
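
When a cluster has several node pools, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the benchmark: the function name enable_autorepair_all_pools is hypothetical, and it assumes a zonal cluster (as in the command above) and an authenticated gcloud with the target project already selected. It lists the pool names and runs the same update on each one, including pools where auto-repair is already enabled.

```shell
# Hypothetical helper (not from the benchmark): enable auto-repair on every
# node pool of one zonal cluster.
# Usage: enable_autorepair_all_pools <cluster_name> <compute_zone>
enable_autorepair_all_pools() {
  cluster="$1"
  zone="$2"

  # Print the node pool names, one per line, and process them in turn.
  gcloud container node-pools list \
    --cluster "$cluster" --zone "$zone" --format="value(name)" |
  while IFS= read -r pool; do
    # Skip blank lines in the listing.
    [ -n "$pool" ] || continue
    echo "Enabling auto-repair on node pool: $pool"
    # Same command as above; stdin is redirected so the update call
    # cannot consume the remaining pool names from the pipe.
    gcloud container node-pools update "$pool" \
      --cluster "$cluster" --zone "$zone" --enable-autorepair < /dev/null
  done
}
```

For example, enable_autorepair_all_pools <cluster_name> <compute_zone> would apply the setting to each pool in that cluster. For a regional cluster, --zone would need to be replaced with --region.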

Impact:

If multiple nodes require repair, Kubernetes Engine might repair them in parallel. Kubernetes Engine limits the number of repairs depending on the size of the cluster (bigger clusters have a higher limit) and the number of broken nodes in the cluster (the limit decreases if many nodes are broken).

Node auto-repair is not available on Alpha Clusters.

See Also

https://workbench.cisecurity.org/benchmarks/19166

Item Details

Category: RISK ASSESSMENT

References: 800-53|RA-5, CSCv7|3.1

Plugin: GCP

Control ID: ed7039ff2405f655de403618dfe9db85e39aa140f08774aa9781e1d003209056