Add client side event spam filtering #47367
Conversation
marking do-not-merge until i add a unit test. while i think server-side spam filtering is important, this prevents our internal agents from abusing the apiserver. we have experienced a significant amount of abuse in a few clusters, both intentional (miners creating replica sets with huge replica counts) and unintentional (user pods in constant crash-loop backoff). in either scenario, if an event keeps recurring, TTL doesn't help, and we need to reduce the frequency of traffic from our agents. this client-side spam detection does that.
this is a production stability problem for us right now, so marking for the 1.7 milestone.
Rationale:
@@ -362,11 +471,11 @@ func NewEventCorrelator(clock clock.Clock) *EventCorrelator {

// EventCorrelate filters, aggregates, counts, and de-duplicates all incoming events
func (c *EventCorrelator) EventCorrelate(newEvent *v1.Event) (*EventCorrelateResult, error) {
	aggregateEvent, ckey := c.aggregator.EventAggregate(newEvent)
How expensive is EventAggregate? I assume that's why it was lower before - maybe we should filter twice, instead of once?
event aggregate is cheap. we had it lower before because i had not thought through spam detection well enough when it was first written. we need spam filtering to happen after the event we plan to send to the server is aggregated, so that the spam filter operates on the aggregated event itself.
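The flow under discussion is roughly the following sketch; `EventAggregate` matches the diff above, while the other names (`eventObserve`, `filterFunc`, the result fields) are paraphrased for illustration rather than copied from the patch:

```go
// Sketch of the correlation pipeline after this change: aggregate first, then
// de-duplicate, then spam-filter, so the rate limiter sees the event that
// would actually be sent to the apiserver.
func (c *EventCorrelator) EventCorrelate(newEvent *v1.Event) (*EventCorrelateResult, error) {
	// 1. collapse similar events into a single aggregate event and compute its cache key
	aggregateEvent, ckey := c.aggregator.EventAggregate(newEvent)

	// 2. de-duplicate against previously observed events (a repeat becomes a count bump / PATCH)
	observedEvent, patch, err := c.logger.eventObserve(aggregateEvent, ckey)

	// 3. apply spam filtering to the event we actually intend to send; if the
	//    per-source+object budget is exhausted, skip it client-side
	if c.filterFunc(observedEvent) {
		return &EventCorrelateResult{Skip: true}, nil
	}
	return &EventCorrelateResult{Event: observedEvent, Patch: patch}, err
}
```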
looks good. just some readability recommendations.
if interval > maxInterval {
	// enough time has transpired, we create a new record
	record = spamRecord{}
}
This is somewhat confusing. I'd move L138-146 inside the if block at L134 and only overwrite `record` if `interval < maxInterval`. That way we don't create a spamRecord on L123, lose it on L135, and make another new one on L145.
Also, if the record is outside the interval, we should remove it from the cache. Unless the eventKey is the same for the new record we are adding, in which case it'll just be an overwrite. Looking at getEventKey(), it looks like the key would be the same for repeated records. Just need to check.
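As a sketch of what that restructuring might look like (the cache accessor and field names such as `f.cache.Get` and `lastUpdated` are assumed here for illustration, not taken from the patch):

```go
// create the spamRecord exactly once: reuse the cached record only while it is
// still inside the interval, otherwise start from a fresh zero-value record
record := spamRecord{}
if value, found := f.cache.Get(eventKey); found {
	cached := value.(spamRecord)
	if interval := now.Time.Sub(cached.lastUpdated.Time); interval < maxInterval {
		record = cached
	}
	// else: fall through with the fresh record; the stale cache entry could also
	// be evicted here, per the comment above
}
```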
maxSyncInterval := time.Duration(f.syncIntervalInSeconds) * time.Second
syncInterval := now.Time.Sub(record.lastSynced.Time)
if syncInterval > maxSyncInterval {
Invert the if condition and set filter = true below, and you can remove L158. That way we avoid setting it to false, then true, then false again.
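In other words, roughly this shape, where the flag is decided in one place; this is a paraphrase of the suggestion rather than the actual diff:

```go
// suggested shape: decide the flag in one place instead of toggling it
filter := false
maxSyncInterval := time.Duration(f.syncIntervalInSeconds) * time.Second
if now.Time.Sub(record.lastSynced.Time) < maxSyncInterval {
	// still within the sync window: keep filtering this event
	filter = true
}
// (the later unconditional reset of filter can then be removed)
```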
Force-pushed from de86cfc to 20cfdb6
@smarterclayton @sjenning -- updated per review comments. the values chosen for spam filtering appear to work well for the pathological replication controller scenario. we may need to further tweak the interval when i look at some of the other worst offenders we encountered. we also NEED to stop a replication controller from trying to go from 0 to 500 replicas in every sync interval; while those requests do not cause a write when denied by quota, we need to stop making calls just to be told no. either way, for the pathological scenario:
previously, this created 4807 PATCH requests to the server; with client-side spam filtering:
Actual data we see in one 300-node cluster:
But, over that same time period, if I break
So, if I understand correctly, this would never hit either of the top 2 contributors to events in this cluster.
looking at more of the data that showed our event spam, nodes are prone to inducing event spam:
@eparis -- i need to parse through more of the
for example, i am seeing ~6x reduction in traffic with
Force-pushed from 20cfdb6 to 05f5d1c
in other news:
gives you a number of terrifically horrible events reported back now with the CRI. dissecting them more, it's hard to know what to do here when there are many similar bad pods. deleting the pods in crash-loop back-off just causes more of them to come from the backing job. the start (or failed start) of any pod can cause a lot of events.
a number of those events just suck. also, i am not sure which users will really care about the container id versus just the container name. and man, do those oci errors stink coming back from runc... i am going to try and see if i can clean this up more. that said, depending on what is in your pod, it's normal to get a fair number of events on pod start. one way i'm thinking of handling this is to get the kubelet itself to log fewer events if the pod has a large restart count. will wait for more sample data from @eparis to compare, as that cluster was also on kube 1.5 and may vary as well.
We've talked a couple of times about cleaning up those events, together with the logging spam from the Kubelet, but never got around to it. Introducing a client-side event filter/aggregator before sending to the server is a good plan to me. On the other hand, we should re-evaluate the events exposed by the Kubelet today. IMHO, we fail to export a lot of valuable events to users, for example sys OOM kills, etc.
We do report the container ID in the ContainerStatus though, and it is useful for debugging in some cases. Maybe we can trim the ID to make it shorter?
re: #47367 (comment) We just talked about this at sig-node. Can we push the containerID into logging rather than events? Events are for the user, not necessarily for debuggers like us.
Then why do we even report them in the ContainerStatus? :-)
@derekwaynecarr Awesome, thanks.
@mml -- it is non-trivial to plumb a kube feature gate at this level of the stack, as it seems like bad practice for client-go to reference a kube feature gate. I do not see a compelling reason for this not to be the default behavior, and it is consistent with the conversations/comments discussed here (https://docs.google.com/document/d/13BeJlrEcJhSKgsHOHWmHdJGqXrjSm_o9XxOtwcN6yNg/edit)
agree
also agree. this is an evolution of a feature already present and intended to prevent event overload, which was not filtering sufficiently.
Given the agreement, and that this is a production issue for large clusters, I'm inclined to approve. Given a choice between write exhaustion and dropped events, I'd move to dropped events. /approve based on criteria and general consensus on the thread
However, I would still like to see an lgtm from the reviewer.
My concern with having this on by default is that there is no easy workaround if it really does create a problem, and such problems will be hard to find with test plans. AFAICT, we don't even expose knobs for the burst or refill rate. I would really like us to prioritize making this more configurable, or less necessary, in 1.9. Can we please file follow-up bugs about that now? /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: derekwaynecarr, mml, smarterclayton
Associated issue: 47366
The full list of commands accepted by this bot can be found here.
Agree this needs a bit more of a knob. There is an issue for per-client server-side rate limiting being worked on: #50925. As to soak, we've been running this in production for a month now on 4 large clusters and have not yet observed lost events (although we do deserve to file an AAR on this that evaluates outgoing vs limited). This reduced event traffic to a negligible concern, since the vast majority of events were redundant and repeating.
/test all
/retest
/retest
/test all [submit-queue is verifying that this PR is safe to merge]
@derekwaynecarr: The following test failed.
Automatic merge from submit-queue
Automatic merge from submit-queue (batch tested with PRs 51840, 53542, 53857, 53831, 53702). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet sync pod throws more detailed events

**What this PR does / why we need it**: If there are errors in the kubelet sync pod iteration, it is difficult to determine the problem. This provides more specific events for errors that occur in the syncPod iteration to help perform problem isolation. Fixes #53900

**Special notes for your reviewer**: It is safer to dispatch more specific events now that we have an event budget per object enforced via #47367

**Release note**:
```release-note
kubelet provides more specific events when unable to sync pod
```
Automatic merge from submit-queue (batch tested with PRs 56308, 54304, 56364, 56388, 55853). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Send events on certain service repair controller errors

**What this PR does / why we need it**: This PR enables sending events when the api-server service IP and port allocator repair controllers find an error repairing a cluster ip or a port respectively.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #54303

**Special notes for your reviewer**: In case of an error, events will be emitted [every 3 minutes](https://github.com/kubernetes/kubernetes/blob/master/pkg/master/controller.go#L93) for each failed Service. Even so, event spam protection has been merged (#47367) to mitigate the risk of excessive events.

**Release note**:
```release-note
api-server provides specific events when unable to repair a service cluster ip or node port
```
kubernetes#47367 introduced client-side event spam filtering to reduce the API server/etcd pressure of processing repeated events. However, it uses only Source and InvolvedObject as the key, so it can also drop important and unique events, slowing down diagnosis significantly. kubernetes#103918 tries to resolve this by adding a new SpamKeyFunc to customize the key used. But this is still not ideal, because every user needs to dig into the code of client-go to figure out why their event does not appear. If the EventBroadcaster is initialized in a library, it may be even harder to set the SpamKeyFunc. I propose to only drop PATCH events, which is the intent of the original proposal. This patch deprecates SpamKeyFunc and effectively fixes it to the return value of EventAggregate(). Users who don't set SpamKeyFunc should get strictly more events than before, so this is not breaking for them. Other users are likely just adding Reason to the key. We already do this now.
PR kubernetes#47367 introduced client-side event spam filtering to reduce the API server/etcd pressure of processing repeated events. However, it uses only Source and InvolvedObject as the key, so it can also drop important and unique events, slowing down diagnosis significantly. PR kubernetes#103918 tries to resolve this by adding a new SpamKeyFunc to customize the key used. But this is still not ideal, because every user needs to dig into the code of client-go to figure out why their event does not appear. If the EventBroadcaster is initialized in a library, it may be even harder to set the SpamKeyFunc. I propose to only drop PATCH events by default, which is the intent of the original proposal. Users who don't set SpamKeyFunc should get strictly more events.
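To make the concern concrete, a spam key built only from Source and InvolvedObject looks roughly like the sketch below; the exact field list in client-go may differ, and the Reason-augmented variant is just an example of what a custom SpamKeyFunc typically adds:

```go
package eventspam

import (
	"strings"

	v1 "k8s.io/api/core/v1"
)

// approximate shape of a source+object spam key: two events about the same
// object from the same component collapse to one key even if their Reason or
// Message differ, so the later one can be rate-limited away
func getSpamKey(event *v1.Event) string {
	return strings.Join([]string{
		event.Source.Component,
		event.Source.Host,
		event.InvolvedObject.Kind,
		event.InvolvedObject.Namespace,
		event.InvolvedObject.Name,
		string(event.InvolvedObject.UID),
		event.InvolvedObject.APIVersion,
	}, "")
}

// what a custom SpamKeyFunc typically adds: the Reason, so distinct failure
// modes get their own budget instead of sharing one bucket
func spamKeyWithReason(event *v1.Event) string {
	return getSpamKey(event) + event.Reason
}
```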
What this PR does / why we need it:
Add client-side event spam filtering to stop excessive traffic to the api-server from internal cluster components.
this pr defines a per-source+object event budget of a 25-event burst with a refill rate of 1 every 5 minutes (see the sketch below).
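A rough illustration of that budget using client-go's token bucket rate limiter follows; the burst and refill numbers come from this description, while the loop is just a toy driver rather than the actual events_cache code:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// 25-event burst, refilled at 1 token every 5 minutes (qps = 1/300);
	// in the real filter each unique source+object key gets its own bucket
	const burst = 25
	const refillQPS = 1.0 / 300.0
	limiter := flowcontrol.NewTokenBucketRateLimiter(refillQPS, burst)

	sent, dropped := 0, 0
	for i := 0; i < 100; i++ { // 100 identical events arriving back to back
		if limiter.TryAccept() {
			sent++ // would still be sent (or PATCHed) to the apiserver
		} else {
			dropped++ // filtered client-side as spam
		}
	}
	// roughly: sent=25 dropped=75, until the bucket refills minutes later
	fmt.Printf("sent=%d dropped=%d\n", sent, dropped)
}
```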
i tested this pr against the following scenarios:
Scenario 1: Node with 50 crash-looping pods
Before:
After:
Observation:
Scenario 2: replication controller limited by quota
Before:
After:
Which issue this PR fixes
fixes #47366
Special notes for your reviewer:
this was a significant problem in a kube 1.5 cluster we are running where events were co-located in a single etcd. it was normal for this cluster to have larger numbers of unhealthy pods as well as denials by quota.
Release note: