Add a kubelet metric to track certificate expiration. #51031
8b1d7ab to b0d77d0
glog.V(2).Infof("Waiting %v for next certificate rotation", m.rotationDeadline.Sub(time.Now()))
for m.rotationDeadline.After(time.Now()) {
	metrics.ClientCertificateExpiration.Set(m.rotationDeadline.Sub(time.Now()).Seconds())
	time.Sleep(1 * time.Minute)
Debug code? Or did you mean to make this spin every minute?
I did mean for this to spin every minute, so that it can update the metric.
You don't need to do this. Just expose the deadline as a gauge, with the gauge value set to the expected time in seconds, and only update the value when the cert manager starts or updates.
Spinning a goroutine for this is unusual.
@liggitt we should also expose this for the masters for serving certs.
And add serving cert lifetime in the kubelet
@smarterclayton Do you want a metric on the apiserver that shows the analogous value for the apiserver's certificate?
Yes, each of the components that has a serving cert and each of the clients that request client certs want this.
6ab3358 to 7447bb7
/assign @crassirostris
Overall LGTM. But be aware that MustRegister will panic if, e.g., the name is incorrect (since you form it dynamically), maybe use
@crassirostris Your point about MustRegister vs Register is good, but the name is formed from static values, not from user data. I'd rather it crash and get debugged than have the metric disappear without anyone noticing.
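The trade-off being discussed can be illustrated with a small stand-in registry. This is a sketch only: `registry`, `register`, and `mustRegister` are hypothetical names that mimic the behavior described in the thread (one variant returns an error, the other panics), and the metric name is made up. They are not the Prometheus client API.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a hypothetical stand-in for a metrics registry.
type registry struct {
	names map[string]bool
}

func newRegistry() *registry { return &registry{names: map[string]bool{}} }

// register returns an error for an empty or already-registered name.
func (r *registry) register(name string) error {
	if name == "" {
		return errors.New("metric name is empty")
	}
	if r.names[name] {
		return fmt.Errorf("metric %q already registered", name)
	}
	r.names[name] = true
	return nil
}

// mustRegister panics instead of returning the error, so a bad
// registration fails loudly at startup.
func (r *registry) mustRegister(name string) {
	if err := r.register(name); err != nil {
		panic(err)
	}
}

func main() {
	r := newRegistry()
	r.mustRegister("certificate_expiration_seconds") // hypothetical metric name

	// With the error-returning variant the caller has to check the result;
	// if the error is dropped, the metric is simply missing.
	if err := r.register("certificate_expiration_seconds"); err != nil {
		fmt.Println("register failed:", err)
	}
}
```

When the name is built from static values, the panicking variant surfaces a mistake the first time the process starts, which is the behavior the author argues for.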
/assign @Random-Liu
7447bb7 to bb54bfd
bb54bfd to f1fef11
But won't it crash anyway, just higher up the stack rather than in this place? I understand that now
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: jcbsmpsn, smarterclayton. No associated issue.
/retest Review the full test history for this PR.
/test all [submit-queue is verifying that this PR is safe to merge]
@jcbsmpsn: The following test failed, say
Automatic merge from submit-queue (batch tested with PRs 51031, 51705, 51888, 51727, 51684).
Fix #51964