Make healthchecks skippable, and check masters only #56130
Conversation
Force-pushed from 51aa112 to 7dd3c1d.
```go
@@ -106,16 +107,25 @@ func apiServerHealthy(client clientset.Interface) error {
	return nil
}

// nodesHealthy checks whether all Nodes in the cluster are in the Running state
func nodesHealthy(client clientset.Interface) error {
```
I would actually keep this check and produce warning in case some nodes are not in healthy state.
Agree if possible to plumb through a warning -- users may want a heads up if their cluster isn't at full strength.
Technically, it is easy to add a warning of course, and I'm happy to do so if you insist.
I strongly disagree in principle, however: kubeadm is not a monitoring service. If admins want to be informed that they have a node down, they shouldn't have to run kubeadm to discover that. Furthermore, I think the general posture of "we should protect the kubeadm user from unspecified problems that don't actually prevent an upgrade" has a real danger of making kubeadm fragile and not useful for real work (e.g. kubernetes/kubeadm#539 made kubeadm unusable for me). IMO kubeadm must be a do-what-I-say tool.
/approve
/kind cleanup
LGTM overall after a rebase
Force-pushed from 7dd3c1d to 1db7eed.
Force-pushed from 1db7eed to 8dc77a5.
/lgtm
/retest
/lgtm cancel
@anguslees This needs attention ASAP.
[MILESTONENOTIFIER] Milestone Pull Request: Current. @anguslees @cblecker @justinsb @kad @luxas @mikedanese
Force-pushed from 8dc77a5 to 3da5985.
/retest
1 similar comment
/retest
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: anguslees, jbeda, luxas. Associated issue: 539
/test all
[submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue.
@anguslees since we changed the flag name to I see the release note
@xiangpengzhao I fixed it here and in release notes
What this PR does / why we need it:
Previously kubeadm would abort if any node was not Ready. This is obviously infeasible in a non-trivial (esp. baremetal) cluster.
This PR makes two changes: it makes the health checks skippable, and it checks node health only on master nodes.
Which issue(s) this PR fixes: Fixes kubernetes/kubeadm#539
Special notes for your reviewer:
Builds on #56072
Release note: