try to fix e2e test failure #1119
/retest
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e-ha
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e-ha
/test pull-metrics-server-test-e2e-ha
/cc @serathius, this PR tries to fix the e2e test failure.
Before changing the test, please ensure that this is not an issue with metrics-server.
OK, when creating a pod, let me do more tests and verify the data directly from the
/hold
c31f6d9 to b92bb62
b92bb62 to dcb60e4
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e-ha
/test pull-metrics-server-test-e2e-helm
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e-helm
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e
/test pull-metrics-server-test-e2e
@dgrisonnet any ideas?
What I wanted to check in #1119 (comment) was whether kubelet (cAdvisor under the hood) was the culprit. To do so, we can compare the data we get from kubelet with data from either Another option that might be better would be to grab CPU profiles from the cpu-consumer container, but since I don't expect this program to have pprof endpoints, it might be a bit harder. We could, for example, get the profile by installing the parca-agent (https://www.parca.dev/docs/parca-agent) and running it in CI, but that's quite a bit of work.
I will continue to analyze it.
Thanks! Let me know if you need any help; this can be a little tricky.
It is indeed a bit tricky. Using parca, I still don't know how to export the statistics. I might need to look into it.
I am not sure if collecting raw data is supported yet. As far as I can tell from the docs, they only allow sending the profiles to a compatible server. Let me check if I can find something else we could use.
We might be able to use
OK, let me do some research on it.
e92b499 to 6041de6
/cc @dgrisonnet, I tried to use the kubectl-flame plugin to get the CPU statistics of the cpu-consumer process, but it does not work properly. The error message is similar to
OK, when I have some time on my hands, I'll try to have a look, but I would want us to investigate further where the issue is coming from.
6041de6 to 4b55064
Hi @dgrisonnet, as we discussed above, this is not an issue with metrics-server.
/assign @dgrisonnet
/cc @serathius Could we consider merging this PR?
test/e2e_test.go
Outdated
@@ -94,6 +94,7 @@ var _ = Describe("MetricsServer", func() {
 	if err != nil {
 		panic(err)
 	}
+	time.Sleep(15 * time.Second)
Please comment why this sleep is needed.
In fact, the change that actually makes the difference is the CPU limit:
Limits: map[corev1.ResourceName]resource.Quantity{
	corev1.ResourceCPU: mustQuantity("100m"),
},
I will roll back the sleep.
@@ -535,6 +536,9 @@ func consumeCPU(client clientset.Interface, podName string) error {
 	Requests: map[corev1.ResourceName]resource.Quantity{
 		corev1.ResourceCPU: mustQuantity("100m"),
 	},
+	Limits: map[corev1.ResourceName]resource.Quantity{
Can you leave a TODO to investigate and remove limit?
done
4b55064 to 51963ef
/cc @serathius I have updated.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: serathius, yangjunmyfm192085
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
What this PR does / why we need it:
The pull-metrics-server-test-e2e and pull-metrics-server-test-e2e-ha tests have been failing frequently because the test case "returns accurate CPU metric" fails. The most likely reason is that the metrics obtained right after the pod is successfully deployed are inaccurate, so we need to wait for one scrape cycle.
https://prow.k8s.io/job-history/gs/kubernetes-jenkins/pr-logs/directory/pull-metrics-server-test-e2e
https://prow.k8s.io/job-history/gs/kubernetes-jenkins/pr-logs/directory/pull-metrics-server-test-e2e-ha
Which issue(s) this PR fixes (optional, in "fixes #<issue number>(, fixes #<issue_number>, ...)" format, will close the issue(s) when PR gets merged):
Fixes #