DefaultRole being returned for new pods under load #100

jrnt30 · 2017-09-22T13:43:28Z

In a comment on #92, @dadux mentioned a few issues they were seeing with the DefaultRole being returned for pods under load. Wanted to get this out there so we can track it.

We've tried running off this branch in our dev environments, but still get a lot of "default role" under load (our build-engineering team spinning up 100s of concurrent jobs)

I want to dig into the lifecycle of the callback handlers a bit more closely. The only way I think this could be occurring is the following sequence:

1. The indexer gets an Add/Update event for a new pod that contains the IP but not the annotation (this adds the pod to the IP -> Pod store).
2. A request comes in for credentials; GetRoleMapping is hit, returns the partial pod representation (missing the explicit annotation), and falls back to the default role.
3. Some time later, an Update event is received with the fully represented Pod that contains the appropriate annotation information.
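
To make the suspected sequence concrete, here is a minimal sketch of it in Go. It is not the project's actual code: `Pod`, `Store`, `OnAddOrUpdate`, and `RoleForIP` are hypothetical stand-ins for the indexer callbacks and GetRoleMapping, and the annotation key is assumed. It only shows how a lookup that lands between the two events would see the default role.

```go
package main

import "fmt"

// roleAnnotation is the annotation key assumed for this sketch.
const roleAnnotation = "iam.amazonaws.com/role"

// Pod is a simplified, hypothetical stand-in for the Kubernetes pod object.
type Pod struct {
	Name        string
	IP          string
	Annotations map[string]string
}

// Store is a hypothetical IP -> Pod index with a default-role fallback.
// It is not safe for concurrent use; the sketch only illustrates event ordering.
type Store struct {
	byIP        map[string]Pod
	defaultRole string
}

func NewStore(defaultRole string) *Store {
	return &Store{byIP: make(map[string]Pod), defaultRole: defaultRole}
}

// OnAddOrUpdate mimics an Add/Update callback: any pod with an IP is indexed,
// whether or not its annotations have arrived yet.
func (s *Store) OnAddOrUpdate(p Pod) {
	if p.IP == "" {
		return
	}
	s.byIP[p.IP] = p
}

// RoleForIP mimics the lookup path: if the pod is known but has no role
// annotation, it silently falls back to the default role.
func (s *Store) RoleForIP(ip string) (role string, found bool) {
	p, ok := s.byIP[ip]
	if !ok {
		return "", false
	}
	if r, ok := p.Annotations[roleAnnotation]; ok {
		return r, true
	}
	return s.defaultRole, true
}

func main() {
	s := NewStore("default-role")

	// Step 1: event carries the IP but not the annotation.
	s.OnAddOrUpdate(Pod{Name: "job-1", IP: "10.0.0.5"})

	// Step 2: a credentials request arrives in between.
	role, _ := s.RoleForIP("10.0.0.5")
	fmt.Println(role) // prints "default-role"

	// Step 3: the fully populated pod arrives later.
	s.OnAddOrUpdate(Pod{
		Name:        "job-1",
		IP:          "10.0.0.5",
		Annotations: map[string]string{roleAnnotation: "build-job-role"},
	})
	role, _ = s.RoleForIP("10.0.0.5")
	fmt.Println(role) // prints "build-job-role"
}
```

If this is what is happening, the fallback in step 2 is indistinguishable from a pod that legitimately has no annotation, which would explain why it only shows up under load.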


jtblin · 2017-11-27T02:16:25Z

As per #92 (comment), I'm thinking of moving the fallback to the default role further down the chain, so that we return an error from extractRoleARN when the annotation is missing. That will trigger the exponential backoff retry, which should be able to catch the updated annotation. There will be some latency impact, but hopefully it's acceptable.

jagregory · 2018-02-19T03:23:36Z

Hey folks, I think I'm hitting this issue. I have a cronjob which kicks off a pod once a minute, and it fairly frequently fails the first time a pod comes up in that minute, and then the second or third retry usually works.

I've just put in a retry on startup so it should back off and retry for a while. Will see if that makes any difference.
