pinned jenkins plugins not working because operator evaluates against latest public list. #797
Comments
Hi @Harguer I'm not able to replicate the issue. Could you please share the version of the lts jenkins image that you are using and also can you try to remove everything you have in the jenkins CRD and let the operator deploy a vanilla version from the latest helm chart? |
Hi @brokenpip3, thanks a lot for looking at it :) Pod running for 18 days
configuration of the current jenkins CR, where i have pinned all the plugins (the dependencies too)
the result is pinned here in the jenkins CR
Now, deleting the pod:
New pod start crashing in a loop:
Inspecting the jenkins pods, you can see that there is no reason for this jenkins to be terminated, it just dies (killed by the operator)
But from the jenkins operator, you can see the reason: jenkins-operator compares with a fresh list of plugins here https://github.com/jenkinsci/kubernetes-operator/blob/master/pkg/configuration/base/plugin.go#L15-L46
This is an important issue; it could make people stop using jenkins operator. Every time jenkins dies, you need to update the plugins. I have tried pinning all plugins directly in the dockerfile, but with the same result: jenkins-operator will eventually get restarted and then it can crash because some plugin needs to be updated. It has happened during weekends or at night with my prod jenkins, and it creates incidents, which is not good for my team, believe me :') |
Ok I see several things we can try/improve here :)
Please don't, this is a common mistake, you don't need to pin the dependencies otherwise you are going to have this kind of issue all the time :)
Can you describe the pods?
I don't think this is true, all of those are warnings during a reconcile loop, it's the operator that is killing the pod because it's checking the plugins update. Looking at the logs, most likely it's a jvm issue, with all of these plugins only:
is not enough, try doubling the size of limits and requests here and let me know
It can happen that a new version of jenkins requires specific versions of some plugins, but in general jenkins will not crash because there is a plugin update. Try these suggestions and let me know :) |
We're hitting this problem as well since we migrated to the operator with tag 60b8ee5. We have all the plugin versions pinned in the CR, but a Jenkins restart (for whatever reason) will most likely not start correctly because of an automatic plugin update, entering a restart loop. I don't completely understand the reasoning for the operator to compare with a fresh list of plugins; I would hope for it to respect the CR versions, otherwise we can't follow a GitOps approach (jenkins resource declaration in git != what's actually running in prod). Even if we follow the steps above to get a shorter plugin list to pin, is there any guarantee that one of those pinned plugins won't get automatically bumped to the latest version? |
Nothing will automatically bump the plugin to the latest version. Let's take the @Harguer logs/config as example, he has in the base plugin (by default) the
but again this is something that should be avoided, you need to pin only the plugins you want to install without touching the dependencies, only adding to the gitops state what you need and do not bother if you see some plugins (not listed in your
Can you please share operator/jenkins logs, k8s events of the jenkins pods and the jenkins crd that you are using? so I can take a look and help if needed |
Thank you for the help! According to the docs by default the tool will take the latest version of the dependencies even if they are pinned in the plugin list that the operator is providing
This explains the behavior, and I understand that the recommended way to set it up is to not pin dependencies, but I think it's a valid use case to be able to control everything that gets installed in the pod
|
Using
There are tools for that, see for example how we're doing it in jenkins-infra/docker-jenkins-lts and jenkins-infra/docker-jenkins-weekly, ex https://github.com/jenkins-infra/docker-jenkins-lts/blob/main/.github/workflows/update.yaml |
First of all thanks a lot for digging deeper on this 💪
I did some tests:
$ cat /base-plugins.txt
kubernetes:3883.v4d70a_a_a_df034
kubernetes-credentials-provider:1.209.v862c6e5fb_1ef
workflow-job:1282.ve6d865025906
workflow-aggregator:590.v6a_d052e5a_a_b_5
git:5.0.0
job-dsl:1.81
configuration-as-code:1569.vb_72405b_80249
Using this simple list of plugins, I ran jenkins-plugin-cli twice:
$ jenkins-plugin-cli --verbose -f base-plugins.txt --no-download -l --latest=false > /tmp/latest-false
$ jenkins-plugin-cli --verbose -f base-plugins.txt --no-download -l --latest=true > /tmp/latest-true
and then compared the results:
$ diff /tmp/latest-false /tmp/latest-true|wc -l
104
$ diff /tmp/latest-false /tmp/latest-true|grep -i kubernetes-client-api
< kubernetes-client-api 6.4.1-208.vfe09a_9362c2c
> kubernetes-client-api 6.4.1-215.v2ed17097a_8e9
Indeed we have differences for all the dependencies, like for instance the
Thanks for your contribution! |
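For anyone who wants to see which plugins differ between the two runs without reading raw diff output, here is a minimal Python sketch. It assumes the listings contain `name version` lines like the two diff lines above; the helper names `parse_listing` and `changed_plugins` are made up for this example, and section headers or log lines in the real output are simply skipped.

```python
def parse_listing(text):
    """Parse 'name version' lines into a dict, skipping anything else."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumption: plugin lines have exactly two tokens; headers end with ':'.
        if len(parts) == 2 and not line.rstrip().endswith(":"):
            versions[parts[0]] = parts[1]
    return versions


def changed_plugins(pinned_text, latest_text):
    """Return {name: (pinned_version, latest_version)} for plugins that differ."""
    pinned = parse_listing(pinned_text)
    latest = parse_listing(latest_text)
    common = sorted(pinned.keys() & latest.keys())
    return {n: (pinned[n], latest[n]) for n in common if pinned[n] != latest[n]}


if __name__ == "__main__":
    latest_false = "kubernetes-client-api 6.4.1-208.vfe09a_9362c2c\ngit 5.0.0\n"
    latest_true = "kubernetes-client-api 6.4.1-215.v2ed17097a_8e9\ngit 5.0.0\n"
    for name, (old, new) in changed_plugins(latest_false, latest_true).items():
        print(f"{name}: {old} -> {new}")
```

In practice the two strings would be read from /tmp/latest-false and /tmp/latest-true.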
Thanks for your feedback!
By However your example was great because I thought about creating a github action that could be used by our community members that follow a gitops flow (like me) to install and configure this jenkins operator. Thanks! 🎖️ |
Hi @Harguer @rikycaldeira, I think I have a better understanding of the issue now. We switched from the old jenkins cli plugin sh to the new
and it will download the compatible version. I could be wrong but the old install-plugin bash script was not so smart and jenkins was just not able to boot properly. So the situation where as code we had the incompatible plugin and at runtime the correct one was never possible (in the past). |
- prepare to switch from `master` to `main`
- avoid to run workflow in case is not needed
- add a way to bump the lts via make
- use latest jenkins lts 2.387.1
- add the docker labels
- update base plugins
- fix #797
- Add more tests with bats
- Update base plugin to latest version
- Temporary revert #807
- Better nightly job
@Harguer @rikycaldeira could you please try the latest chart from master branch and let me know? thanks! |
any luck with the tests? Thanks! |
hey @brokenpip3 , thanks a lot for your help with this, i will try it and i will let you know :) |
gently ping :) |
yes that's what I modified, if you want to avoid that you need to set kubernetes-operator/chart/jenkins-operator/values.yaml Lines 120 to 124 in 3fe842f
|
This is the change:
This is what happens
No matter which value I specify for latestPlugins, latestP will end up being always true. |
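The thread does not show the root cause of the "always true" behavior, so the following is only an illustration of one common way a boolean option with a default of true can swallow an explicit false. It is not the operator's code; `latest_buggy`, `latest_fixed` and the `values` dict are hypothetical names for this sketch.

```python
def latest_buggy(values):
    # `or` treats an explicit False the same as a missing key,
    # so the default always wins and the result is always True.
    return values.get("latestPlugins") or True


def latest_fixed(values):
    # Only fall back to the default when the key is really absent.
    value = values.get("latestPlugins")
    return True if value is None else value


if __name__ == "__main__":
    user_values = {"latestPlugins": False}
    print("buggy:", latest_buggy(user_values))
    print("fixed:", latest_fixed(user_values))
```

The general point is that "not set" and "set to false" need to be distinguished before a default of true is applied.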
Yeah seems that is not working, even when I set this to false, still complaining about new plugins and my jenkins is breaking
|
see #827
What you mean by
can you elaborate? what happened? please share the logs, the events, the jenkins crd config etc otherwise will be hard to understand. |
Yes you are right, will fix it soon |
Hi @brokenpip3 , thanks for getting back and for your help with this :). jenkins configs:
Jenkins-master container, it breaks/restarts every 20 seconds or so:
jenkins-master container:
Jenkins-Operator, there is not much information about the error, just endless logs like this:
thanks again for looking at it and let me know if you need any additional details/information. |
@Harguer The master pod logs state the incompatibilities and the versions you need in order to fix your issue.
Keep doing the same investigation for the remaining base/plugins until there is no error on the master pod. Or just update the full list to the latest available versions (this is what I did).
For now I could work around this issue by unpinning/commenting out the incompatible plugin. Then I was able to re-deploy Jenkins using the same code and I noticed that the latest version (of the plugin I commented out) was automatically installed. I am waiting a couple of days to see how Jenkins will behave upon a new deletion. |
Hi @ingineru , thank you for looking into it. If this is something that won't be fixed, then I think there is no point in having jenkins-operator. As I explained before, sometimes jenkins pods die during weekends or at night and jenkins is down until someone manually fixes it (by doing the same thing you described, which TBH is horrible to do while your boss and developers are waiting for it). |
@Harguer I was not particularly looking into your case, I am just in a similar situation. UPDATE: Some of the plugins that are found to be incompatible can be excluded from the code and Jenkins will handle them by installing the latest available version (I did it for instance-identity and okhttp-api). I deleted the test instance master after several days and the automatic redeployment was successful. This is worth trying as a workaround (cc: @Harguer) |
I was able to successfully deploy by overriding the list of base plugins with a list generated by jenkins-plugin-cli. The following command:
will produce a fully pinned list of plugins with resolved dependencies on stdout. This list can be transformed to be compatible with the format expected in I'm using the image referenced in the command above with operator image |
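A minimal Python sketch of such a transformation, for illustration only. It assumes the generated list has one `name:version` entry per line (like the base-plugins.txt shown earlier in the thread) and that the target is the list of name/version entries used for plugins in the Jenkins CR. The function names `to_cr_entries` and `render_yaml` are made up for this example.

```python
def to_cr_entries(plugin_lines):
    """Turn 'name:version' lines into a list of {'name', 'version'} dicts."""
    entries = []
    for line in plugin_lines.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, sep, version = line.partition(":")
        if not sep or not version:
            # Refuse unpinned entries so nothing floats to "latest" silently.
            raise ValueError(f"plugin is not pinned: {line!r}")
        entries.append({"name": name, "version": version})
    return entries


def render_yaml(entries, indent="    "):
    """Render the entries as YAML list items (versions quoted as strings)."""
    lines = []
    for entry in entries:
        lines.append(f"{indent}- name: {entry['name']}")
        lines.append(f'{indent}  version: "{entry["version"]}"')
    return "\n".join(lines)


if __name__ == "__main__":
    resolved = "git:5.0.0\njob-dsl:1.81\n"
    print(render_yaml(to_cr_entries(resolved)))
```

Quoting the version keeps values such as 1.81 from being read as numbers by a YAML parser.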
hei @Harguer @ingineru @vlad-ivanov-name I think I fixed the always true latestPlugin bug, can you try this version? But also please be aware that using Also @vlad-ivanov-name please do not use that very old operator image, if you installed the operator with the |
This becomes less of an issue when using Terraform with Helm and Kustomization providers: the long list of plugins can be read from a file or some other data source.
Thanks! Would it be possible to link to quay.io from https://hub.docker.com/r/virtuslab/jenkins-operator or is it owned by a different person? |
No, that dockerhub account is owned by the previous team |
fixed with this version: https://github.com/jenkinsci/kubernetes-operator/releases/tag/v0.8.0-beta2, let me know if it's still an issue |
I used the quay.io image and the issue seems to be fixed (still waiting for some plugin updates to see how it behaves then). Thank you for solving it! |
It depends on the type of changes you are making; not 100% of the jenkins crd will trigger a restart. What did you change? |
Thank you @brokenpip3 , to me it seems to be working fine. I tested on one of my test Jenkins instances. It had been running for 40 days; then I pulled all the plugins and manually pinned them in the Jenkins CR, then I killed my Jenkins and it started crashing. Then I updated the Jenkins operator image to the latest one; it is also important to say that I had to update the Jenkins CRDs. |
It was a change in the list of pinned plugins that would have triggered the restart. The issue proved to be from my test environment in the end (the Jenkins LTS image was no longer whitelisted). Thanks again for your work! |
Changing/updating/incompatible plugins is definitely a real problem with jenkins-operator or jenkins itself. Every time we have a (scheduled or unscheduled) restart of jenkins we run into serious trouble, as some plugins won't install anymore. |
Are you using latestPlugins set to true or false? |
@pniederlag also make sure you are using the updated Jenkins operator image from quay.io and the latest CRDs. |
First of all, big thx @Harguer @brokenpip3 for your feedback on this!
I have toggled about a hundred times already. Currently using latestPlugins: false and an (autogenerated/curated) list of all plugins in basePlugins:
Some observations:
init.sh latest does not reflect spec.master.latestPlugins: false
/var/jenkins/scripts/init.sh contains --latest true (also spec.master.latestPlugins: false!)
basePlugins installed first
As basePlugins are installed prior to plugins, they might introduce some dependencies that later on have an effect on the list of plugins?
operator image
using quay.io/jenkins-kubernetes-operator/operator:v0.8.0-beta2 (which hopefully carries the fix?)
The message from the controller might be misleading
I verified snakeyaml-api 1.33-95.va_b_a_e3e47b_fa_4 yesterday morning, and added it to plugins. Later on that evening the above message showed up, making me think something is broken. Actually snakeyaml-api 2.2-111.vc6598e30cc6 was just released a couple of hours ago. To my knowledge there is nothing that makes snakeyaml-api 1.33-95.va_b_a_e3e47b_fa_4 incompatible, it is just not the most recent version as of now. ;)
Conclusions
Is there any deeper reasoning for having basePlugins and plugins, other than trying to be clear on hard requirements (which are already part of the operator) and some plugins the user wants to use on top of that?
How about merging base-plugins and user-plugins and running one install on the resulting list?
Jenkins plugin handling in general is not quite clear, as it seems hard to really nail a fixed set of working plugin versions |
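To make the merge idea concrete, here is a small Python sketch of combining both lists into one before a single install. It is only an illustration of the suggestion, not how the operator works: `merge_plugins` is a hypothetical helper, and letting the user list win on conflicts is an assumption made for this example.

```python
def merge_plugins(base, user):
    """Merge two {name: version} maps into one list for a single install run.

    Assumption for this sketch: the user's pin wins, and every disagreement
    is reported so it can be reviewed instead of being resolved silently.
    """
    merged = dict(base)
    conflicts = {}
    for name, version in user.items():
        if name in merged and merged[name] != version:
            conflicts[name] = (merged[name], version)
        merged[name] = version
    return merged, conflicts


if __name__ == "__main__":
    base = {"git": "5.0.0", "job-dsl": "1.81"}
    user = {"snakeyaml-api": "1.33-95.va_b_a_e3e47b_fa_4", "job-dsl": "1.81"}
    merged, conflicts = merge_plugins(base, user)
    for name in sorted(merged):
        print(f"{name}:{merged[name]}")
    print("conflicts:", conflicts)
```

With a single merged list, dependency resolution happens once, so base plugins cannot pull in versions that later clash with the user list.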
snakeyaml-api got updated, and we ran into the same thing. Note: We were running an older version of Jenkins with many other pinned plugins. You should check your other plugins, maybe one of them requires the new version. |
Yeah I think you still need to pin all other plugins (dependencies) to make it work with |
Double checking once more... we actually really do have SnakeYAML 2.2-111.vc6598e30cc65. Personally, I'd prefer if operator/jenkins would just bail and exit if configured plugins are not compatible with jenkins. I don't like to see versions installed other than the ones I define, even if it is just to get jenkins up and running. |
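A "bail and exit" check along those lines could look like the Python sketch below. It is a hypothetical pre-flight script, not something the operator provides: `check_pins` and both dicts are illustrative, and how the installed versions are collected is left open.

```python
import sys


def check_pins(pinned, installed):
    """Return a list of human-readable mismatches between pins and reality."""
    problems = []
    for name, want in sorted(pinned.items()):
        got = installed.get(name)
        if got is None:
            problems.append(f"{name}: pinned {want}, not installed")
        elif got != want:
            problems.append(f"{name}: pinned {want}, installed {got}")
    return problems


if __name__ == "__main__":
    pinned = {"snakeyaml-api": "1.33-95.va_b_a_e3e47b_fa_4"}
    installed = {"snakeyaml-api": "2.2-111.vc6598e30cc65"}
    problems = check_pins(pinned, installed)
    for problem in problems:
        print(problem, file=sys.stderr)
    # Fail loudly instead of running with versions nobody declared.
    sys.exit(1 if problems else 0)
```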
I am confused by the latestPlugins parameter functionality. |
Finally, with the upgrade to v0.8.0 for chart and operator it seems stuff might be solved... at least we don't see any warnings on incompatible versions. latestPlugins: false does kick in. Great, thx! |
Describe the bug
I'm not sure if this is a bug or an improvement, but this is affecting our prod jenkins a lot, and leadership is also thinking about moving from jenkins-operator to helm charts :(.
When my jenkins operator restarts, it starts crashing because there are new plugins that are not compatible with my initial setup and/or with my currently pinned jenkins lts version.
The problem is that in this section, jenkins is evaluating the plugins we pin in the jenkins CR (basePlugins and plugins) against the latest available. This brings instability to production jenkins because jenkins won't start due to plugin issues. https://github.com/jenkinsci/kubernetes-operator/blob/master/pkg/configuration/base/plugin.go#L15-L46
I have tried to pin all the plugins in the jenkins CR and I also tried to pin all plugins in a Dockerfile, but it won't work because of jenkins-operator.
I think if we put something like this
pinPlugins: <true/false>
in the jenkins CR and code it, we could solve this issue.
To Reproduce
Install base plugins, some plugins and the latest lts, let this run for a few days and kill the jenkins pod. The new jenkins pod will start crashing due to plugin dependencies.
Additional information
Kubernetes version tested:
1.21, 1.22, 1.23
Jenkins Operator version:
v0.7.1 and tag 60b8ee5
Add error logs about the problem here (operator logs and Kubernetes events).