In a comment on #92, @dadux mentioned a few issues they were seeing with the DefaultRole being returned for pods under load. Wanted to get this out there to track.
We've tried running off this branch in our dev environments, but still get a lot of "default role" under load (our build-engineering team spinning up 100s of concurrent jobs)
I want to dig into the lifecycle of the callback handlers a bit more closely. The only way I think this could be occurring is if:
1. The indexer gets an Add/Update event for a new pod that contains the IP but not the annotation (at this point the pod is added to the IP -> Pod store).
2. A request comes in for credentials, GetRoleMapping is hit and returns the partial pod representation (missing the explicit annotation), and falls back to the default role.
3. Some time later, an Update event is received with the fully represented Pod that contains the appropriate annotation information.
As per #92 (comment), I'm thinking of moving the fallback to the default role further down the chain, so that extractRoleARN returns an error when the annotation is missing. That will trigger the exponential backoff retry, which should be able to catch the updated annotation. There will be some latency impact, but hopefully it's acceptable.
Hey folks, I think I'm hitting this issue. I have a cronjob which kicks off a pod once a minute, and it fairly frequently fails the first time a pod comes up in that minute, and then the second or third retry usually works.
I've just put in a retry on startup so it should back off and retry for a while. Will see if that makes any difference.