kubeadm refuses to upgrade if a node is unhealthy #539

Closed
anguslees opened this issue Nov 13, 2017 · 12 comments · Fixed by kubernetes/kubernetes#56130

@anguslees (Member)

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-beta.0.30+7c3a25cf07b734-dirty", GitCommit:"7c3a25cf07b734006bafe7267d3ca7e90d376bee", GitTreeState:"dirty", BuildDate:"2017-11-13T06:23:45Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/arm"}

Environment:

  • Kubernetes version (use kubectl version):

Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:46:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/arm"}

  • Cloud provider or hardware configuration: baremetal (arm + amd64)

What happened?

./kubeadm upgrade plan --skip-preflight-checks
[preflight] Skipping pre-flight checks
[upgrade] Making sure the cluster is healthy:
[upgrade/health] Checking API Server health: Healthy
[upgrade/health] Checking Node health: More than one Node unhealthy
[upgrade/health] FATAL: The cluster is not in an upgradeable state due to: there are NotReady Nodes in the cluster: [<node name>]

... and yes, I currently have a (non-master) node offline and drained.

What you expected to happen?

Some method of upgrading the cluster, despite a node being unhealthy. In a large cluster, it is almost inconceivable that every node will be healthy, and that's just fine.

I suggest removing this check. The existing MaxUnavailable and PodDisruptionBudget rules should provide sufficient safety, assuming jobs are updated using normal mechanisms and nodes are cleanly drained where necessary. If those things aren't happening, then we should fix that problem instead.

How to reproduce it (as minimally and precisely as possible)?

Drain a node, or otherwise have one unhealthy. Try to upgrade.

Anything else we need to know?

@luxas (Member) commented Nov 13, 2017

If we meet in the middle, would you be happy with making the health checks skippable with -f, @anguslees?
I don't want to remove this check completely.

luxas added this to the v1.9 milestone on Nov 13, 2017
@anguslees (Member, Author) commented Nov 13, 2017

Yep, happy to do so. So that I know what warning to give, can you outline an issue that this health check protects us from?

@kad (Member) commented Nov 14, 2017

Side suggestion: maybe make this check compatible with the interfaces of the preflight checks, so it can be run with RunChecks()? That would allow us later on (soonish...) to unify the mechanism for individually controlling checks.
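
A minimal sketch of the kind of unified interface being proposed here; the names (Checker, runChecks, APIServerHealthCheck) are hypothetical and are not kubeadm's actual preflight API:

```go
package upgrade

import (
	"errors"
	"fmt"
	"io"
)

// Checker mirrors the shape of a preflight-style check: it returns
// warnings (non-fatal) and errors (fatal unless the check is ignored).
// The names here are illustrative, not kubeadm's real interface.
type Checker interface {
	Name() string
	Check() (warnings, errorList []error)
}

// APIServerHealthCheck is a hypothetical health check wrapped in the same
// interface so that a single RunChecks-style runner can drive it.
type APIServerHealthCheck struct {
	Healthz func() error // e.g. a closure that GETs /healthz
}

func (c APIServerHealthCheck) Name() string { return "APIServerHealth" }

func (c APIServerHealthCheck) Check() (warnings, errorList []error) {
	if err := c.Healthz(); err != nil {
		return nil, []error{fmt.Errorf("API server is not healthy: %v", err)}
	}
	return nil, nil
}

// runChecks runs every check, prints warnings, and treats errors as
// fatal unless the check's name is in the ignore set.
func runChecks(checks []Checker, out io.Writer, ignore map[string]bool) error {
	var fatal []error
	for _, c := range checks {
		warnings, errs := c.Check()
		for _, w := range warnings {
			fmt.Fprintf(out, "[check/%s] WARNING: %v\n", c.Name(), w)
		}
		if ignore[c.Name()] {
			continue // this check's errors are demoted from fatal to ignored
		}
		fatal = append(fatal, errs...)
	}
	return errors.Join(fatal...) // nil when there were no fatal errors (Go 1.20+)
}
```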

@luxas (Member) commented Nov 14, 2017

@kad sounds reasonable to me, but it doesn't have to be done in the same PR.
@anguslees I'm happy to see a PR to make it skippable with -f.

We currently have the following checks:

  • Make sure the API server answers OK on /healthz -- kinda required.
  • Make sure all nodes are Ready. We shouldn't upgrade, for example, a cluster where the master is unhealthy; instead, let the admin fix the problem first. This also ties into the CNI config: for the cluster to be functional you have to install a CNI network, so this guards against, for instance, upgrading a cluster without a CNI network installed (see the sketch after this list).
  • Make sure the required Static Pod manifests are there if Static Pods are used...
  • ...or make sure the right control plane DaemonSets are healthy if self-hosting.
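
For reference, the node-readiness part of such a check boils down to inspecting each Node's Ready condition. A minimal sketch using the core/v1 API types (illustrative, not the actual kubeadm code):

```go
package upgrade

import (
	corev1 "k8s.io/api/core/v1"
)

// notReadyNodes returns the names of nodes whose Ready condition is not True.
// A node with no Ready condition at all is also treated as not ready.
func notReadyNodes(nodes []corev1.Node) []string {
	var notReady []string
	for _, node := range nodes {
		ready := false
		for _, cond := range node.Status.Conditions {
			if cond.Type == corev1.NodeReady && cond.Status == corev1.ConditionTrue {
				ready = true
				break
			}
		}
		if !ready {
			notReady = append(notReady, node.Name)
		}
	}
	return notReady
}
```

The node list itself would come from the API server via client-go (e.g. clientset.CoreV1().Nodes().List), which is why the /healthz check is the first thing verified.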

@anguslees (Member, Author) commented Nov 15, 2017

> for example, a cluster where the master is unhealthy

Isn't this (better) covered by the other apiserver /healthz check?
Why can't I upgrade the control plane with a non-functional CNI network (assuming the apiserver is otherwise reachable and responding)?

This is what I'm trying to get at: I can trivially construct hypothetical situations where every worker node is down and - assuming the admin is otherwise OK with this situation - there's no reason for kubeadm to prevent upgrading the control plane. For the master nodes themselves, I can just as easily construct situations where (e.g.) a minority of master nodes are down and - again, assuming the admin is otherwise OK with this - there's no reason for kubeadm to prevent upgrading the masters. More importantly, I can easily construct hypothetical situations where upgrading the control plane is required to deploy a bug fix that is a critical part of recovering the cluster from a partial outage.

I can't think of any situation where verifying that all the nodes are healthy is useful. Adding a global -f flag makes me uncomfortable, since anyone with more than a toy cluster is going to have to use that flag every time they use kubeadm - and I wouldn't want to disable other (actually useful) checks just because of this one. So if I'm going to implement an ignore flag, it'll be something more specific, like --ignore=nodehealth.

I guess what I'm trying to say is: kubeadm is not a monitoring tool, and it shouldn't be doing monitoring "just because". The only checks it should perform are the ones required as part of the upgrade method (e.g. if the apiserver isn't responding at all, then that's a problem for a self-hosted upgrade). I know I'm well into rant territory now, but there is also no acknowledgement in the above discussion that a naive "all healthy" criterion is simply unachievable in the real world, and this makes me feel uncertain about whether I'm going to be able to use kubeadm in practical situations. As a concerned contributor, I'm not sure how I can best help if there's going to be pushback when these sorts of well-intentioned-but-wrong issues are identified :(

@kad (Member) commented Nov 15, 2017

@anguslees before you start introducing new flags, please wait a couple of days for me to finalize the solution for #480. We will have a flag that is initially meant to cover pre-flight checks, but if we use the same interface for all kinds of checks everywhere else, it will become possible to individually control checks even in situations like this.

@anguslees (Member, Author)

@kad: hah, that's basically exactly what I implemented yesterday, only I added it just for the healthChecks in health.go. I'll hold off on sending the PR for a few days.

@luxas (Member) commented Nov 18, 2017

@anguslees Would you be OK with converting this to a preflight check, making it skippable using @kad's ignore-specific-preflight-check mechanism, and only requiring that the master nodes are healthy?

@anguslees (Member, Author) commented Nov 19, 2017

@luxas yep, sure.

Just to continue beating this dead horse, however: If the apiserver is responding healthy, does it matter whether all nodes tagged with the "master" role are healthy?

I suspect the actual requirement is that we don't take the masters below the etcd majority quorum (assuming we're in some future kubeadm-HA world). I presume the size of the etcd cluster is (going to be) stored in the kubeadm config, or do we have some other plan here?

With the current upgrade approach (single master, and in-place master upgrades), do we just need a single healthy master? (In particular, with a single master we are going to go below the majority quorum, 1 -> 0, during the upgrade, since the upgrade process is so different from a rolling-update HA approach.)
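
For context on the quorum arithmetic: an n-member etcd cluster needs floor(n/2) + 1 members up to retain a majority, so a 3-member cluster tolerates one member down while a single-member cluster tolerates none. A tiny illustrative helper (not part of kubeadm):

```go
package upgrade

// etcdQuorum returns the minimum number of members that must stay up
// for an n-member etcd cluster to keep a majority: floor(n/2) + 1.
// n=1 -> 1 (no member may go down), n=3 -> 2 (one may go down),
// n=5 -> 3 (two may go down).
func etcdQuorum(n int) int {
	return n/2 + 1
}

// maxUnavailableMasters is how many members can be lost while still
// keeping quorum.
func maxUnavailableMasters(n int) int {
	return n - etcdQuorum(n)
}
```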

@luxas (Member) commented Nov 20, 2017

We need the master to be Ready so that we know the CNI network is set up there. Otherwise we can't rely on normal cluster functionality when upgrading. Especially once self-hosting comes around, we need this requirement.

@anguslees (Member, Author)

OK, rewritten on top of @kad's (unmerged) ignore-specific-preflight-check PR; it now only requires the master nodes to be Ready.
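
A sketch of what checking only the master nodes could look like, assuming the node-role.kubernetes.io/master label that kubeadm applied to masters at the time (illustrative, not the merged code):

```go
package upgrade

import (
	corev1 "k8s.io/api/core/v1"
)

// masterNodes filters a node list down to the nodes carrying the master
// role label; readiness is then checked only on these nodes.
func masterNodes(nodes []corev1.Node) []corev1.Node {
	// Label kubeadm set on master nodes in this era; an assumption here.
	const masterLabel = "node-role.kubernetes.io/master"
	var masters []corev1.Node
	for _, node := range nodes {
		if _, ok := node.Labels[masterLabel]; ok {
			masters = append(masters, node)
		}
	}
	return masters
}
```

Combined with a readiness helper like the notReadyNodes sketch earlier in the thread, the check would reduce to notReadyNodes(masterNodes(nodes)).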

@luxas (Member) commented Nov 22, 2017

@kad's PR is now merged :)

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Nov 24, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Make healthchecks skippable, and check masters only

**What this PR does / why we need it**:

Previously kubeadm would abort if _any_ node was not Ready.  This is obviously infeasible in a non-trivial (esp. baremetal) cluster.

This PR makes two changes:
- Allows kubeadm healthchecks to be selectively skipped (made non-fatal) with --ignore-checks-errors.
- Check only that the *master* nodes are Ready.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#539

**Special notes for your reviewer**:

Builds on #56072

**Release note**:

```release-note
kubeadm health checks can also be skipped with `--ignore-checks-errors`
```