Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security Realm settings being reset on new rollout #1702

Closed
michaelishri opened this issue Jul 13, 2023 · 16 comments
Closed

Security Realm settings being reset on new rollout #1702

michaelishri opened this issue Jul 13, 2023 · 16 comments
Assignees

Comments

@michaelishri
Copy link

Hi team,

We've noticed an issue with the s2i run script which causes our Security Realm to be wiped out whenever a new rollout is performed on OpenShift.

We have a need to use LDAP auth instead of the OpenShift OAuth flow. With the current public image, we were finding the LDAP settings were getting lost intermittently. It wasn't until a new rollout was done out of hours that we could pinpoint when it was being reset, after a new rollout (when we did a graceful reboot of Jenkins via the Jenkins UI, we didn't lose the settings).

In the end, it comes down to this logic on line 590 in jenkins/2/contrib/s2i/run.

if [ "${OPENSHIFT_ENABLE_OAUTH}" == "false" ]; then
  echo "OpenShift OAuth disabled: Using Jenkins built-in user database"
  xmlstarlet edit -L --update '//hudson/securityRealm' --value '' "${JENKINS_HOME}/config.xml"  2>/dev/null
  xmlstarlet edit -L --delete '//hudson/securityRealm/@plugin' "${JENKINS_HOME}/config.xml"  2>/dev/null
  xmlstarlet edit -L --update '//hudson/securityRealm/@class' --value 'hudson.security.HudsonPrivateSecurityRealm' "${JENKINS_HOME}/config.xml" 2>/dev/null
else
  echo "OpenShift OAuth enabled: true"
  # if OAUTH is enabled and the current realm is Jenkins local Database, we delete the realm, so the openshift plugins auto configures
  if xmlstarlet select -t -c  "/hudson/securityRealm[@class='hudson.security.HudsonPrivateSecurityRealm']" 2> /dev/null > /dev/null; then
    xmlstarlet edit -L --delete '//hudson/securityRealm' "${JENKINS_HOME}/config.xml"  2>/dev/null
  fi
  # In case openshift login plugin is set as a provider but without the defaulted values, we remove it to enforce plugin re-configuration
  # TODO replace with an || statement but by improving readability
  if xmlstarlet select -t -c  "/hudson/securityRealm[starts-with(@plugin,'openshift-login')][@class='org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm']" \
           "${JENKINS_HOME}/config.xml"  2> /dev/null > /dev/null  && \
     ! xmlstarlet select -t  -c  "/hudson/securityRealm/defaultedServiceAccountDirectory" "${JENKINS_HOME}/config.xml" 2> /dev/null > /dev/null; then
    echo "OpenShift OAuth enabled and provider incorrectly set: Resetting openshift login provider"
    xmlstarlet edit -L --delete '//hudson/securityRealm' "${JENKINS_HOME}/config.xml"  2>/dev/null
  fi
fi

It assumes when false is set for OPENSHIFT_ENABLE_OAUTH, it should wipe the security realm tree in config.xml.

To make this work for us, we've updated this script to the following, that allows us to specify alternative which simply skips the block.

if [ "${OPENSHIFT_ENABLE_OAUTH}" == "alternative" ]; then
  # Skip this block when an alternative is required by the user. We do not want to keep resetting their security settings each time a pod is recycled.
  echo "OpenShift OAuth disabled: Alternative security realm has been specified."
elif [ "${OPENSHIFT_ENABLE_OAUTH}" == "false" ]; then
  echo "OpenShift OAuth disabled: Using Jenkins built-in user database"
  xmlstarlet edit -L --update '//hudson/securityRealm' --value '' "${JENKINS_HOME}/config.xml"  2>/dev/null
  xmlstarlet edit -L --delete '//hudson/securityRealm/@plugin' "${JENKINS_HOME}/config.xml"  2>/dev/null
  xmlstarlet edit -L --update '//hudson/securityRealm/@class' --value 'hudson.security.HudsonPrivateSecurityRealm' "${JENKINS_HOME}/config.xml" 2>/dev/null
else
  echo "OpenShift OAuth enabled: true"
  # if OAUTH is enabled and the current realm is Jenkins local Database, we delete the realm, so the openshift plugins auto configures
  if xmlstarlet select -t -c  "/hudson/securityRealm[@class='hudson.security.HudsonPrivateSecurityRealm']" 2> /dev/null > /dev/null; then
    xmlstarlet edit -L --delete '//hudson/securityRealm' "${JENKINS_HOME}/config.xml"  2>/dev/null
  fi
  # In case openshift login plugin is set as a provider but without the defaulted values, we remove it to enforce plugin re-configuration
  # TODO replace with an || statement but by improving readability
  if xmlstarlet select -t -c  "/hudson/securityRealm[starts-with(@plugin,'openshift-login')][@class='org.openshift.jenkins.plugins.openshiftlogin.OpenShiftOAuth2SecurityRealm']" \
           "${JENKINS_HOME}/config.xml"  2> /dev/null > /dev/null  && \
     ! xmlstarlet select -t  -c  "/hudson/securityRealm/defaultedServiceAccountDirectory" "${JENKINS_HOME}/config.xml" 2> /dev/null > /dev/null; then
    echo "OpenShift OAuth enabled and provider incorrectly set: Resetting openshift login provider"
    xmlstarlet edit -L --delete '//hudson/securityRealm' "${JENKINS_HOME}/config.xml"  2>/dev/null
  fi
fi

If you're okay with this suggestion, I'm happy to raise a PR otherwise open to exploring alternative options.

@coreydaley
Copy link
Member

Thank you for opening an issue about this, we love it when our users share their enhancement ideas with us!

I like where you are going with this idea, but am not in love with the implementation.
What would you think about the following:
Create a new environment variable called something like JENKINS_AUTH_PROVIDER whose valid options would be openshift, local and external which would allow for the two existing paths plus the one you want to add, and if that environment variable is not set, it will enter the current block of code that uses the old environment variable, that way we could provide an extensible way to add more options in the future (if needed) but still support backwards compatibility.

I am open to any thoughts you might have about this implementation.

Thanks!
Corey

@michaelishri
Copy link
Author

Thanks @coreydaley, that sounds great. Not exactly the same, but my first version was similar to your suggestion but I was worried about backwards compatibility and settled on this instead for it's conciseness.

I'll put something together and post back here.

@coreydaley
Copy link
Member

I look forward to reviewing and collaborating on it

@michaelishri
Copy link
Author

michaelishri commented Jul 30, 2023

Hi @coreydaley,

I've been testing the various options based on your suggestion however it's not as straight forward as it appears.

Firstly, I thought this was as clean as I could get it,

# Sets the Jenkins Security Realm (login/authentication mechanism)
if [ "${JENKINS_AUTH_PROVIDER}" == "openshift" ]; then
  echo "Jenkins Auth Provider: Using OpenShift OAuth"
  set_security_realm_openshift_oauth
elif [ "${JENKINS_AUTH_PROVIDER}" == "local" ]; then
  echo "Jenkins Auth Provider: Using Jenkins built-in user database"
  set_security_realm_local
elif [ "${JENKINS_AUTH_PROVIDER}" == "external" ]; then
  echo "Jenkins Auth Provider: Using external auth provider"
  # Do nothing, allow the user to set their own.
else
  if [ "${OPENSHIFT_ENABLE_OAUTH}" == "false" ]; then
    echo "OpenShift OAuth disabled: Using Jenkins built-in user database"
    set_security_realm_local
  else
    echo "OpenShift OAuth enabled: true"
    set_security_realm_openshift_oauth
  fi
fi

However, after testing, I realised that the regardless of what is set, if the OpenShift Login plugin is enabled, it will override what we configure if OPENSHIFT_ENABLE_OAUTH isn't set to false. This meant I had to put a bunch of conditionals inside each condition and advise the user, which ended up looking like this:

# Sets the Jenkins Security Realm (login/authentication mechanism)
if [ "${JENKINS_AUTH_PROVIDER}" == "openshift" ]; then
  echo "Jenkins Auth Provider: Using OpenShift OAuth"
  set_security_realm_openshift_oauth
elif [ "${JENKINS_AUTH_PROVIDER}" == "local" ]; then
  if [ "${OPENSHIFT_ENABLE_OAUTH}" == "true" ]; then
    echo "JENKINS_AUTH_PROVIDER has been set to LOCAL however the OPENSHIFT_ENABLE_OAUTH has not been set to FALSE which means the OpenShift Login plugin will override if it is enabled."
    set_security_realm_local
  else
    echo "Jenkins Auth Provider: Using Jenkins built-in user database"
    set_security_realm_local
  fi
elif [ "${JENKINS_AUTH_PROVIDER}" == "external" ]; then
  if [ "${OPENSHIFT_ENABLE_OAUTH}" == "true" ]; then
    echo "JENKINS_AUTH_PROVIDER has been set to EXTERNAL however the OPENSHIFT_ENABLE_OAUTH has not been set to FALSE which means the OpenShift Login plugin will override if it is enabled."
    set_security_realm_local
  else
    echo "Jenkins Auth Provider: Using external auth provider"
    # Do nothing, allow the user to set their own.
  fi
else
  if [ "${OPENSHIFT_ENABLE_OAUTH}" == "false" ]; then
    echo "OpenShift OAuth disabled: Using Jenkins built-in user database"
    set_security_realm_local
  else
    echo "OpenShift OAuth enabled: true"
    set_security_realm_openshift_oauth
  fi
fi

Not as elegant as I was hoping. If the plugin worked in the opposite manner; it is active when OPENSHIFT_ENABLE_OAUTH is set to TRUE, I don't thing this would be a problem but there must've been a reason to make it work the way it currently works.

These were the test scenario's I started with but abandoned them once I realised what was going on, with the plugin enabled.

Environment Variable(s) Expected Outcome Actual Outcome
JENKINS_AUTH_PROVIDER = openshift + OPENSHIFT_ENABLE_OAUTH = true OpenShift OAuth ENABLED OpenShift OAuth ENABLED
JENKINS_AUTH_PROVIDER = openshift + OPENSHIFT_ENABLE_OAUTH = false OpenShift OAuth ENABLED OpenShift OAuth ENABLED
JENKINS_AUTH_PROVIDER = openshift + OPENSHIFT_ENABLE_OAUTH = other OpenShift OAuth ENABLED OpenShift OAuth ENABLED
⛔️ JENKINS_AUTH_PROVIDER = local + OPENSHIFT_ENABLE_OAUTH = true Jenkins Local Database OpenShift OAuth ENABLED
JENKINS_AUTH_PROVIDER = local + OPENSHIFT_ENABLE_OAUTH = false Jenkins Local Database Jenkins Local Database
⛔️ JENKINS_AUTH_PROVIDER = local + OPENSHIFT_ENABLE_OAUTH = other Jenkins Local Database OpenShift OAuth ENABLED
JENKINS_AUTH_PROVIDER = external + OPENSHIFT_ENABLE_OAUTH = true User configured Test not performed
JENKINS_AUTH_PROVIDER = external + OPENSHIFT_ENABLE_OAUTH = false User configured Test not performed
JENKINS_AUTH_PROVIDER = external + OPENSHIFT_ENABLE_OAUTH = other User configured Test not performed

It would be helpful to hear your thoughts on how to move this forward.

@coreydaley
Copy link
Member

I wonder if we could do something similar to this:

# Sets the Jenkins Security Realm (login/authentication mechanism)
if [ "${JENKINS_AUTH_PROVIDER}" == "external" ]; then
  export OPENSHIFT_ENABLE_OAUTH=false
  echo "Jenkins Auth Provider: Using external auth provider"
  # Do nothing, allow the user to set their own.
elif [ "${JENKINS_AUTH_PROVIDER}" == "local" ] || [ "${OPENSHIFT_ENABLE_OAUTH}" == "false" ]; then
  echo "Jenkins Auth Provider: Using Jenkins built-in user database"
  set_security_realm_local
elif [ "${JENKINS_AUTH_PROVIDER}" == "openshift" ] || [ "${OPENSHIFT_ENABLE_OAUTH}" != "false" ]; then
  echo "Jenkins Auth Provider: Using OpenShift OAuth"
  set_security_realm_openshift_oauth
else
  echo "OpenShift OAuth enabled: true"
  set_security_realm_openshift_oauth
fi

@michaelishri
Copy link
Author

I had considered it but thought it might be a bit weird for the user having explicitly set the OPENSHIFT_ENABLE_OAUTH variable and then having the container behave differently? I don't work with containers a whole lot so not heaps familiar with what's acceptable in this case.

@coreydaley
Copy link
Member

Hmmm, I am not seeing where anyone only using the OPENSHIFT_ENABLE_OAUTH env var would be affected by this change. Can you elaborate?

@coreydaley
Copy link
Member

bump

@michaelishri
Copy link
Author

Thanks for the bump, been wanting to get back to this.

What I'm unsure about is the "side-effect" of changing the environment variable under-the-hood. If I set and environment variable, I expect it to be there as I set it. If we do it for the user as part of this, it could make it quite tricky to troubleshoot if you didn't know about this repository (which I didn't until I google'd my error message).

Perhaps it's okay if we explicitly mention that it's being flipped because JENKINS_AUTH_PROVIDER variable has been set so if that's an acceptable solution, then happy to raise a PR.

@openshift-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 20, 2023
@openshift-bot
Copy link
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 20, 2023
@openshift-bot
Copy link
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this as completed Jan 20, 2024
Copy link
Contributor

openshift-ci bot commented Jan 20, 2024

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@larsknutson
Copy link

larsknutson commented Jan 22, 2024

Having the same issue - can this be reopened?

@larsknutson
Copy link

/remove-lifecycle rotten
/reopen

Copy link
Contributor

openshift-ci bot commented Jan 22, 2024

@larsknutson: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants