hotfix: create new kubernetes client per request #163
Conversation
If more than one job is started simultaneously with the same secretConfig, the client will be reused, but the first job that succeeds will close it.
Oops, I meant to open this PR on our own fork: getsentry#1. Now that this is open, does anyone have ideas on what a more proper fix would look like? Ideally clients are reused - I'm thinking maybe a job identifier or something could be used in addition to secretConfig.
Having looked at it, the code does seem problematic at first glance. Without having actively tried this plugin recently, I am a bit surprised that it works at all after the 1st request, since it seems it closes the client regardless of success or failure of the secrets retrieval. What am I missing? (Lines 61 to 63 in 4ee3471)
I'd have thought the client should just not be closed by the caller at all, given it is intended to be re-used. The very heavily used plugin at https://github.com/gocd/kubernetes-elastic-agents doesn't close the client except when the factory decides to clear it out: https://github.com/gocd/kubernetes-elastic-agents/blob/d898615d213ea5467b0e1960064c4227fa10b4f8/src/main/java/cd/go/contrib/elasticagent/KubernetesClientFactory.java#L86-L92. Have you tried removing the close() call?
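
For reference, a minimal sketch of that factory-owned lifecycle, assuming a fabric8 KubernetesClient; class and method names here (clientConfig, clearOutExistingClient, createClientFor) are illustrative rather than the linked plugin's exact code. Callers never close the client; only the factory closes it, when the config it was built from changes.

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import java.util.Map;

public class KubernetesClientFactory {
    private KubernetesClient client;
    private Map<String, String> clientConfig; // config the cached client was built with

    public synchronized KubernetesClient client(Map<String, String> config) {
        if (client != null && config.equals(clientConfig)) {
            return client; // reuse the cached client across requests
        }
        clearOutExistingClient();          // close only when the config changes
        client = createClientFor(config);
        clientConfig = config;
        return client;
    }

    private void clearOutExistingClient() {
        if (client != null) {
            client.close();                // the factory owns the client lifecycle
            client = null;
        }
    }

    private KubernetesClient createClientFor(Map<String, String> config) {
        // In reality this would wire up cluster URL, token, CA cert, etc. from config.
        return new KubernetesClientBuilder().build();
    }
}
```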
You aren't missing anything; I just think that secretConfig is more tied to a stage execution than you might imagine:
Our feedback loop is kind of laborious, but we had a stage with 2+ jobs and only k8s secrets defined at pipeline level, which resulted in the same secretConfig for each job. Maybe this just isn't a common enough scenario? I was also surprised not to find anything in the issue tracker or the gocd Google group discussions. Removing the close() would probably work too, but I think a better fix here would be to only look at the relevant pieces of the secretConfig for client reuse, and even better, maintain a mapping so you can cache multiple clients. We're happy with the hotfix for now though, and I know basically 0 Java, haha.
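
As a rough illustration of that idea (not the plugin's actual code; the class, method names, and key fields below are hypothetical examples), a cache keyed only by the connection-relevant parts of the config could look something like this:

```java
import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.ConfigBuilder;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedClientProvider {
    // One client per distinct connection config; clients for different clusters coexist.
    private final Map<String, KubernetesClient> clients = new ConcurrentHashMap<>();

    public KubernetesClient clientFor(String clusterUrl, String token, String namespace) {
        // Key only on the fields that actually affect the connection, not the whole secretConfig.
        String key = clusterUrl + "|" + namespace + "|" + Integer.toHexString(token.hashCode());
        return clients.computeIfAbsent(key, k -> {
            Config config = new ConfigBuilder()
                    .withMasterUrl(clusterUrl)
                    .withOauthToken(token)
                    .withNamespace(namespace)
                    .build();
            return new KubernetesClientBuilder().withConfig(config).build();
        });
    }

    // Callers never close a returned client; only this provider tears them down.
    public void closeAll() {
        clients.values().forEach(KubernetesClient::close);
        clients.clear();
    }
}
```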
Aight, thanks. Let me take a closer look at the secret config later and set up a local env. It should be possible to release at least the simple fix so you don't need to maintain a fork.
Can you have a try with https://github.com/gocd/gocd-kubernetes-based-secrets-plugin/releases/tag/v1.2.2-169 and create a new issue if you see any problems? I was easily able to replicate the issues here with simpler setups. Leaking clients is still potentially an issue (#166), but I note that the much more heavily used plugin at https://github.com/gocd/kubernetes-elastic-agents suffers from the same issue. At least prior to my change, that plugin didn't suffer from the same "uniqueness" challenges on its config as this one (which I have changed here to mitigate leakage). I can only conclude this plugin isn't so frequently used :-/
@chadlwilson sorry for the delay - I've been meaning to try out your new release. It works on my end! Thanks for your time.
If more than 1 job is started simultaneously with the same secretConfig (easy to reproduce - pretty much any stage which has 2 or more jobs, and no k8s secrets referenced at the job level), the client will be reused, but the first job that succeeds will close it, invariably leading to the remainder of client calls failing because the underlying okhttp threadpool is closed.
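
A minimal sketch of the per-request approach this hotfix takes, assuming a fabric8 client; lookupSecret and buildClientFrom are illustrative names rather than the plugin's real API. Each request builds its own client and closes it with try-with-resources, so one job finishing can no longer tear down a client (and its okhttp thread pool) that another job is still using.

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import java.util.Base64;
import java.util.Map;

public class PerRequestSecretLookup {
    public String lookupSecret(Map<String, String> secretConfig, String namespace,
                               String secretName, String key) {
        // A fresh client per request: nothing another job does can close it under us.
        try (KubernetesClient client = buildClientFrom(secretConfig)) {
            String base64Value = client.secrets()
                    .inNamespace(namespace)
                    .withName(secretName)
                    .get()
                    .getData()
                    .get(key);
            return new String(Base64.getDecoder().decode(base64Value));
        } // the client (and its connection pool) is closed here, once per request
    }

    private KubernetesClient buildClientFrom(Map<String, String> secretConfig) {
        // The real plugin configures this from the secret config
        // (cluster URL, token, CA cert, namespace); omitted here for brevity.
        return new KubernetesClientBuilder().build();
    }
}
```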