Make HPA tolerance a flag #52275
Conversation
Hi @mattjmcnaughton. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/unassign
/ok-to-test
(force-pushed from c949523 to 1cf1e0c)
Couple nits, otherwise looks good
-	tolerance = 0.1
+	const (
+		// DefaultTolerance is used when calculating when to scale up/scale down.
+		DefaultTolerance = 0.1
Move this to replica_calculator.go, since we actually use it there.
Yup, good point!
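For context on why the constant belongs in replica_calculator.go: the tolerance is a dead band around a metric usage ratio of 1.0, and the replica calculator is where that ratio is computed. The following is a minimal sketch of that logic with simplified, assumed names (desiredReplicas is not the real function signature), not the exact Kubernetes code:

```go
package podautoscaler

import "math"

// DefaultTolerance is the band around a usage ratio of 1.0 inside which
// the autoscaler takes no action.
const DefaultTolerance = 0.1

// desiredReplicas scales the current replica count by the ratio of the
// observed metric to its target, unless the ratio is within tolerance
// of 1.0, in which case the deviation is treated as noise.
func desiredReplicas(currentReplicas int32, observed, target, tolerance float64) int32 {
	usageRatio := observed / target
	if math.Abs(1.0-usageRatio) <= tolerance {
		return currentReplicas // within tolerance: do nothing
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}
```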
@@ -164,6 +166,7 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet, allControllers []string, disabled
 	fs.DurationVar(&s.HorizontalPodAutoscalerSyncPeriod.Duration, "horizontal-pod-autoscaler-sync-period", s.HorizontalPodAutoscalerSyncPeriod.Duration, "The period for syncing the number of pods in horizontal pod autoscaler.")
 	fs.DurationVar(&s.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-upscale-delay", s.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "The period since last upscale, before another upscale can be performed in horizontal pod autoscaler.")
 	fs.DurationVar(&s.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-downscale-delay", s.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "The period since last downscale, before another downscale can be performed in horizontal pod autoscaler.")
+	fs.Float64Var(&s.HorizontalPodAutoscalerTolerance, "horizontal-pod-autoscaler-tolerance", s.HorizontalPodAutoscalerTolerance, "Desired pod tolerance outside of which controller will upscale/downscale.")
This needs to be reworded slightly for clarity. People get easily confused about the mechanics of the HPA controller, so we need to make sure things are crystal clear.
Ah yes, I was thinking about that - I will take another stab - please let me know if you have any suggestions :)
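While the wording is being settled, a concrete example may help readers confused by the mechanics. This is a self-contained sketch; the 100m-per-pod target and the replica count of 4 are illustrative numbers, not values from this PR:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		tolerance = 0.1   // the value this PR makes configurable
		target    = 100.0 // illustrative target: 100m CPU per pod
		replicas  = 4     // illustrative current replica count
	)
	// Two observed per-pod averages, in milli-CPU.
	for _, observed := range []float64{105, 150} {
		ratio := observed / target
		if math.Abs(1.0-ratio) <= tolerance {
			fmt.Printf("ratio %.2f: within tolerance, no scaling\n", ratio)
		} else {
			fmt.Printf("ratio %.2f: scale %d -> %d replicas\n",
				ratio, replicas, int(math.Ceil(ratio*replicas)))
		}
	}
}
```

With the default tolerance of 0.1, the 1.05 ratio is ignored as noise, while the 1.50 ratio scales 4 replicas to ceil(1.5 * 4) = 6.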
(force-pushed from 3f1627d to cfb8c0a)
@DirectXMan12 - thanks for the feedback - I pushed an update incorporating your changes; take a look whenever you get the chance.
one last change, and then this is good
@@ -77,6 +78,7 @@ func NewCMServer() *CMServer {
 	HorizontalPodAutoscalerSyncPeriod: metav1.Duration{Duration: 30 * time.Second},
 	HorizontalPodAutoscalerUpscaleForbiddenWindow: metav1.Duration{Duration: 3 * time.Minute},
 	HorizontalPodAutoscalerDownscaleForbiddenWindow: metav1.Duration{Duration: 5 * time.Minute},
+	HorizontalPodAutoscalerTolerance: podautoscaler.DefaultTolerance,
whoops, I missed this the last time around. Normally, this location is the place for "defaults". We actually probably just want the default value here (i.e. just have the value here, and don't reference podautoscaler), and then rename DefaultTolerance to something like defaultTestingTolerance...
Ah gotcha, I wasn't sure about that. Thanks for catching.
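Spelled out, the suggested arrangement looks roughly like this; the struct shape is heavily elided and the file layout is an assumption based on the review comment, not the exact final code:

```go
package options

// Sketch: the controller-manager keeps its own literal default rather
// than importing one from the podautoscaler package.
type CMServer struct {
	HorizontalPodAutoscalerTolerance float64
	// ... many other fields elided ...
}

func NewCMServer() *CMServer {
	return &CMServer{
		HorizontalPodAutoscalerTolerance: 0.1, // the default lives here
	}
}
```

On the podautoscaler side, the exported DefaultTolerance would then shrink to an unexported, test-only defaultTestingTolerance = 0.1, per the rename suggested above.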
@@ -164,6 +166,7 @@ func (s *CMServer) AddFlags(fs *pflag.FlagSet, allControllers []string, disabled
 	fs.DurationVar(&s.HorizontalPodAutoscalerSyncPeriod.Duration, "horizontal-pod-autoscaler-sync-period", s.HorizontalPodAutoscalerSyncPeriod.Duration, "The period for syncing the number of pods in horizontal pod autoscaler.")
 	fs.DurationVar(&s.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-upscale-delay", s.HorizontalPodAutoscalerUpscaleForbiddenWindow.Duration, "The period since last upscale, before another upscale can be performed in horizontal pod autoscaler.")
 	fs.DurationVar(&s.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "horizontal-pod-autoscaler-downscale-delay", s.HorizontalPodAutoscalerDownscaleForbiddenWindow.Duration, "The period since last downscale, before another downscale can be performed in horizontal pod autoscaler.")
+	fs.Float64Var(&s.HorizontalPodAutoscalerTolerance, "horizontal-pod-autoscaler-tolerance", s.HorizontalPodAutoscalerTolerance, "Horizontal pod autoscaler will not attempt upscaling/downscaling if desired replicas/current replicas is within tolerance of 1.")
now that I'm more awake, I'd probably go for something like "The minimum change (from 1.0) in the desired-to-actual metrics ratio for the horizontal pod autoscaler to consider scaling".
Perfect :)
(force-pushed from cfb8c0a to d7a29f0)
@DirectXMan12 updated with your suggested changes - this is ready for another look whenever you get the chance. Thanks!
/lgtm
Looks like the bazel failure is #52433. Retest once master re-opens.
/test pull-kubernetes-e2e-kops-aws
/assign @mikedanese (re suggested approvers)
/test all
/test pull-kubernetes-e2e-gce-etcd3
Ping @mikedanese whenever you get a chance to look at this - thanks :)
Fix kubernetes#18155

Make HPA tolerance configurable as a flag. This change allows us to use
different tolerance values in production/testing.

Signed-off-by: mattjmcnaughton <[email protected]>
(force-pushed from 42d7507 to abd4668)
@DirectXMan12 would you mind giving another LGTM? My rebase cancelled your previous one.
/assign @lavalamp (cmd/kube-controller-manager/OWNERS)
Thanks so much :)
/lgtm
Why aren't we making this an HPA spec element? Was that discussed? @kubernetes/sig-autoscaling-api-reviews
Great question @mikedanese - the original TODO in the code base suggested making it a flag or an HPA spec element. In my mind, tolerance conceptually fits with sync period and upscale/downscale forbidden window, which are also flags. But I'm definitely open to discussion :)
Hi @mikedanese just wanted to follow up on this - what's the best process for doing an autoscaling API review, if you think it's necessary? (I'm pretty new to k8s contributing, so it's a bit of a foreign process to me haha) Thanks!
@mattjmcnaughton @mikedanese we've discussed things like this in the past, and generally tended to err on the side of "that seems like an implementation detail, and implementation details shouldn't be in the API". While I'm certainly willing to have that discussion, I think making it a flag is probably fine.
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DirectXMan12, mattjmcnaughton, mikedanese
Associated issue: 18155

The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing /approve in a comment.
/retest
/retest
Review the full test history for this PR.
/test all
[submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
What this PR does / why we need it:
Make HPA tolerance configurable as a flag. This change allows us to use
different tolerance values in production/testing.

Which issue this PR fixes:
Fixes #18155

Release note:

Signed-off-by: mattjmcnaughton <[email protected]>