[Alerting] Investigate RBAC exemption on legacy alerts #74858
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
I've done some early investigation, and it seems there are a lot of moving parts to exempting the older Alerts from RBAC altogether, or over their Action execution, but I think I've found a path we could explore further. I just want to run it by the team before actually trying to implement it.

First off, regarding option 3 (a migration path): I'm not actively investigating that, as it sounds like there's no way we could rush that kind of thing for 7.10 (unless @legrego can pull a rabbit out of a hat).

Regarding options 1 & 2, it looks like either way we go, we would need to add arguments to the Actions client that could potentially be abused to bypass RBAC in other cases. I'd like to try and make option 1 work, as it means we only exempt RBAC in the minimal case needed for Alerts that were created by Solution users inside that solution to continue working after the upgrade.

**The more secure option, which is also more complicated**

As we won't be able to remove this exemption until 8.0.0 (which means both the potential security hole and the code debt will be around for a while), I'm looking for a pathway that allows the ActionsClient itself to make the assessment as to whether a certain execution should be exempt from RBAC. To achieve that, I don't think we can rely on the AlertsClient telling the ActionsClient that a specific execution is exempt, because that could be used by any and every other plugin in the future to get around security checks - a clear security hole. Instead, we would attach a reference from the action params back to the alert when the action is scheduled. This way, the ActionsClient can look at the source Saved Object type and ask the associated plugin (in our case, the Alerts plugin) whether the user should be allowed to execute the action without applying RBAC to the execution.

This approach has a couple of complications to consider:
For example, imagine this Alert SavedObject:
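The original snippet was lost from this copy of the thread; as a rough, hypothetical sketch (the field names and values here are illustrative, not the actual Kibana schema), an Alert saved object referencing its actions might look like:

```typescript
// Hypothetical shape of an Alert saved object -- illustrative only,
// not the actual Kibana alerting schema.
interface SavedObjectReference {
  name: string;
  type: string;
  id: string;
}

interface AlertSavedObject {
  id: string;
  type: "alert";
  attributes: {
    alertTypeId: string;
    consumer: string; // the solution that created the alert, e.g. "siem"
    actions: Array<{
      group: string;
      actionRef: string; // resolved via the references array below
      params: Record<string, unknown>;
    }>;
  };
  references: SavedObjectReference[];
}

const exampleAlert: AlertSavedObject = {
  id: "alert-1",
  type: "alert",
  attributes: {
    alertTypeId: "siem.signals",
    consumer: "siem",
    actions: [
      { group: "default", actionRef: "action_0", params: { message: "triggered" } },
    ],
  },
  references: [{ name: "action_0", type: "action", id: "action-123" }],
};
```

The relevant point is that the action is only reachable through a reference on the alert, which is what makes tracing an execution back to its source alert possible.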
We wouldn't expose
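The "ask the plugin that owns the source saved object" idea could be sketched roughly like this; all names here are hypothetical, not the actual Kibana API:

```typescript
// Hypothetical sketch: the actions plugin keeps a registry that maps a
// source saved-object type to an exemption check owned by the registering
// plugin. The ActionsClient consults it instead of trusting the caller.
type ExemptionCheck = (sourceId: string) => boolean;

class ActionExecutionSourceRegistry {
  private checks = new Map<string, ExemptionCheck>();

  // e.g. the alerts plugin registers a check for source type "alert"
  register(sourceType: string, check: ExemptionCheck): void {
    this.checks.set(sourceType, check);
  }

  // Called by the ActionsClient with the execution's source reference.
  // No registered check means RBAC always applies -- secure by default.
  isExemptFromRbac(sourceType: string, sourceId: string): boolean {
    const check = this.checks.get(sourceType);
    return check ? check(sourceId) : false;
  }
}

// Usage sketch: the alerts plugin only exempts alerts it knows are legacy.
const registry = new ActionExecutionSourceRegistry();
const legacyAlertIds = new Set(["alert-created-in-7.9"]);
registry.register("alert", (id) => legacyAlertIds.has(id));
```

The design point is that the exemption decision lives with the plugin that owns the saved object type, so no caller can simply assert "skip RBAC for me".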
**The less secure option, but a far simpler approach**

If we decide not to take the path of attaching a reference from the action params back to the alert when it's scheduled, then we would need to rely on the Alert to tell us (when it schedules the action execution) whether to apply RBAC or not. The problem with this is that there's nothing preventing other plugins from doing the same whenever they schedule actions for execution, and that seems like a rather blatant security hole that is very easy to abuse. If we take this approach, we'll need to add an argument to the Actions client. Point number 3 above is relevant to this one too - we'll have to migrate every Alert and mark it as well.

**What would be the impact of these two approaches?**

Both of the approaches I've laid out above will achieve the same end result in terms of the Alerts.
The bottom line is this - this remediation will ensure that any Alert created by a Solution user inside of a Solution that they have been granted privileges to will continue to work as expected after the upgrade, but those users will have their abilities limited until they are granted access to the Actions plugin by an administrator.
I think we're going to want something like this eventually anyway. We really need it for the event log, so that when we execute an action we can identify the "source" - which today is only alerts or an http request - and add it to the action execution event document. Today, we can't trace back from an action execution to the alert that caused it, directly. One reason I've deferred implementing this is that it's not clear what the "source" could be, long term, and I'm not sure of the best way to model both "an alert saved object" and "an http request" in the same piece of data. My plan had been to pass something "extra" to the action executor, somehow, to indicate this source, which sounds exactly like what you've figured out - action execution params feels like the right place (where else?!?!).
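One common way to model the two kinds of "source" in a single piece of data is a tagged union; a hedged sketch (the names are invented, not the event log's actual fields):

```typescript
// Hypothetical tagged union for an action execution "source": either a
// saved object (e.g. an alert) or an HTTP request.
type ExecutionSource =
  | { type: "savedObject"; soType: string; soId: string; spaceId: string }
  | { type: "httpRequest"; requestId: string };

// Sketch of rendering a source into an event-log-friendly description;
// the switch is exhaustive, so adding a new source kind forces a decision
// here at compile time.
function describeSource(source: ExecutionSource): string {
  switch (source.type) {
    case "savedObject":
      return `${source.soType}:${source.soId} (space ${source.spaceId})`;
    case "httpRequest":
      return `http request ${source.requestId}`;
  }
}
```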
Yup, agree with all of those thoughts. :)
The Action Params ar ethe only thing that really connects to an execution so that feels like the right place as far as I can tell. |
I've been thinking more recently that we probably want to change our "alert execution and subsequent action execution" model, where we queue actions to run during alert execution. Longer term, we need to be able to pipe data from one action into another (eg, create a github issue, and include a link to that issue in a subsequent email or slack message). For reference, today we just queue up all the actions that are going to be run for an action group, in parallel. Now imagine trying to serialize that queued execution - we will need some kind of "step runner" to manage this. "It's complicated." You know what would be a lot easier? Just run the actions in the same task manager task as the alert executor. This makes our task usage coarser - instead of many fine-grained tasks, we'll have fewer big hairy tasks. Lots of potential sharp edges here, but one thing I'm wondering about is: would that help in this case, if all the action executors were run in-process with the alert executor?
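The "run actions in the same task as the alert executor" idea, with output piping, could look roughly like this (a hypothetical sketch, not the actual task manager API):

```typescript
// Hypothetical sketch: instead of queueing each action as its own task
// manager task, run them serially in-process so one action's output can
// feed the next (e.g. create a github issue, then mention it in an email).
type ActionHandler = (
  params: Record<string, unknown>,
  prevOutput?: unknown
) => Promise<unknown>;

async function runActionsInProcess(
  actions: Array<{ handler: ActionHandler; params: Record<string, unknown> }>
): Promise<unknown[]> {
  const outputs: unknown[] = [];
  let prev: unknown = undefined;
  for (const { handler, params } of actions) {
    // Serial, in-process execution: coarser task granularity, but trivial
    // data flow between steps compared to a distributed "step runner".
    prev = await handler(params, prev);
    outputs.push(prev);
  }
  return outputs;
}
```

The trade-off sketched here is exactly the one described above: fewer, bigger tasks in exchange for not needing to serialize queued execution state.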
Yeah, this had occurred to me. Any objection?
Yes, it would help a lot, but that feels like a much bigger change to rush through on a 7.10 blocker :/
Ya, WORKSFORME. The event log already has slots for SO's space/type/id, which are used when we have SO's to reference. I've been thinking we can come up with some kind of "extension" pattern for things that aren't SO's, using spaceIds &| types which aren't valid for SO's, or perhaps an additional field indicating additional types of references, in the nested properties that hold the SO's. Eg, when an action is executed via HTTP request, the SO references in the event would include a SO reference that somehow indicates it was via HTTP, perhaps including an HTTP request id, etc.
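That "extension" pattern could be sketched as encoding non-SO sources with a reserved type that no real saved object can use (purely illustrative; the event log's actual field names may differ):

```typescript
// Hypothetical sketch of reusing the event log's saved-object reference
// slots for non-SO sources, via a reserved type that can never collide
// with a real saved-object type.
interface EventLogSoRef {
  namespace: string;
  type: string;
  id: string;
}

const HTTP_SOURCE_TYPE = "__http_request__"; // reserved, never a real SO type

// An HTTP-triggered execution gets a pseudo-reference carrying the request id.
function refForHttpRequest(requestId: string): EventLogSoRef {
  return { namespace: "default", type: HTTP_SOURCE_TYPE, id: requestId };
}

// Consumers can distinguish pseudo-references from real SO references.
function isHttpSourceRef(ref: EventLogSoRef): boolean {
  return ref.type === HTTP_SOURCE_TYPE;
}
```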
Yes, we've done literally no scoping/sizing for this sort of change - I'd guess it would greatly simplify the inner workings of alerting/actions, and allow us to remove some code / artifacts, but it will also mean we have to write some new code, and more importantly figure out the hard parts like ... retries and timeouts. Thought I'd throw it out though.
Have we considered whether it would be possible to run some kind of "fix-up script" after a migration to 7.10, to ... fix up things? Eg, something a customer would run against their own deployment. I'm not sure I remember hearing any mention of this. And the script could end up just being an http entrypoint we add to alerting, where all the "stuff" happens in the route handler. I have no idea what this script might actually do; perhaps, given the problem, it's not really possible. It's certainly not the kindest way to deal with this issue, but if such a script were possible, it would mean not having to deal with "legacy" issues in the main code base. And I think with the view that alerting is still "beta", it's not an unreasonable way to solve the problem (if solvable this way).
Rather than an same structure as above, but with
Not something the Alerting team could do - this would require some kind of Privileges and Features migrations, and there's nothing like that on the books for @elastic/kibana-security as far as I'm aware.
I did investigate in that direction. The reason I'd prefer not to go with an empty value equalling exemption is that it means the default path is to exempt from RBAC - which feels less secure and actually harder to maintain further down the line. I'd rather the exempt alerts be special-cased based on additional existing data, rather than simply exempt by default whenever we forget to set the version. That said, we can migrate the old ones to just be:

I don't mind going with the version rather than the
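The difference between the two policies discussed here can be made concrete with a small sketch (the sentinel value and function names are hypothetical):

```typescript
// Hypothetical sketch of the two policies: migrating old alerts to carry an
// explicit legacy marker (default-enforce) vs. treating a *missing* version
// as exempt (default-exempt, the rejected option).
const LEGACY_MARKER = "pre-7.10.0"; // sentinel written by a migration

// Default-enforce: only an explicitly marked alert is exempt from RBAC.
function isExemptExplicit(versionApiKeyLastmodified?: string): boolean {
  return versionApiKeyLastmodified === LEGACY_MARKER;
}

// Default-exempt: any alert without a version is exempt, so simply
// forgetting to set the field silently bypasses RBAC.
function isExemptByOmission(versionApiKeyLastmodified?: string): boolean {
  return versionApiKeyLastmodified === undefined;
}
```

Under the explicit scheme, a forgotten field fails closed; under the omission scheme, it fails open, which is the maintenance hazard described above.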
In theory we do know who created and last updated alerts - but not actions - via the fields in the SO mappings at `kibana/x-pack/plugins/alerts/server/saved_objects/mappings.json`, lines 56 to 61 (at 644e9c2).
I doubt that gets us very far here though. In lieu of some magic "fixer-upper" script, would some kind of script be helpful that could analyze their current alerts / actions, with some advice on how to "fix" things? I often see complex tools offer a "doctor" tool that does this kind of stuff. I'm sensitive to helping customers who have been using alerts keep their alerts running when they install 7.10 (or other future versions), but at the same time I don't want to have to overly complicate our code to deal with our beta issues for a looong time (forever? what happens to a customer who migrates from 7.8 to 8.0?), so grasping at straws here! :-)
I'm sorry to say I've been back and forth on this with Security, and I'm afraid it doesn't sound like there's something we can provide to do this... 🤷
This is going to sound strange coming from the security team, but I'm honestly not opposed to exploring the less secure option.

If the only objection here is the ability for other plugins to bypass security controls, then I'm comfortable with that tradeoff. We have a similar mechanism in place for the saved objects client, where you can create one without the Security Wrapper, thereby removing RBAC from the equation. It takes an explicit action to do this though, and isn't something that can be done accidentally. We can help educate consumers by giving the configuration option a scary name, akin to React's `dangerouslySetInnerHTML`.

I'm also not concerned about third-party plugins at this point either. If an attacker has the ability to install an arbitrary plugin, then they can do far more damage than just bypassing this specific check.
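The "scary name" idea could look something like this; the option and class names below are invented for illustration, not the real Actions plugin API:

```typescript
// Hypothetical sketch: opting out of RBAC must be an explicit, loudly-named
// choice, similar to creating a saved objects client without the security
// wrapper. An accidental bypass is unlikely and easy to spot in code review.
interface ActionsClientOptions {
  dangerouslyBypassRbacChecks?: boolean;
}

class ActionsClientSketch {
  constructor(private readonly options: ActionsClientOptions = {}) {}

  execute(actionId: string, username: string): string {
    if (!this.options.dangerouslyBypassRbacChecks) {
      // Normal path: RBAC is enforced before execution.
      return `checked RBAC for ${username}, then executed ${actionId}`;
    }
    // Opt-out path: only reachable via the scary-named option.
    return `executed ${actionId} WITHOUT RBAC`;
  }
}
```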
Nothing on the books, but I've been giving this some thought as time permits. I have a placeholder issue here which I need to flesh out more, but you're right that this isn't something we'd be able to pull off for 7.10: #68814 |
Sorry for the delay in updates, but I'm now back from my travels :) I have just opened a PR for the Alerting team to review, which I hope will address this issue.

The key challenge in exempting legacy alerts from RBAC is that Alerts and Actions are, by design, two independent plugins. This means that allowing Alerts to tell Actions to exempt certain operations from RBAC could open up the ability for any plugin to exempt an operation from RBAC, which would open a severe backdoor in our Actions plugin.

**The Chosen Approach**

Discussing the options with members of the team, the feeling was that the more secure approach is the preferable one, despite the additional cognitive load and technical debt it introduces. To that effect, the PR introduces the following:
My testing suggests this will allow all the alerts created prior to 7.10.0 to continue to run, as long as the user who created them is still privileged to the underlying Alert. This means that if a user somehow created a SIEM rule without having privileges to SIEM, then their Alert will not work after the upgrade.

**Things to note**

Just a couple of thoughts for moving forward:
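The migration behaviour described in the linked commit message can be summarized with a sketch (the `versionApiKeyLastmodified` field name comes from that commit; the surrounding shape and sentinel value are illustrative):

```typescript
// Hypothetical sketch of the 7.10.0 migration: every alert that existed
// before the upgrade is marked as legacy, and any later API key update
// re-stamps the current version, ending the legacy exemption.
interface AlertAttributes {
  name: string;
  meta?: { versionApiKeyLastmodified?: string };
}

// Anything reaching this migration predates 7.10.0, so mark it legacy.
function migrateAlertTo710(attributes: AlertAttributes): AlertAttributes {
  return {
    ...attributes,
    meta: { ...attributes.meta, versionApiKeyLastmodified: "pre-7.10.0" },
  };
}

// Updating the alert stamps the current version, at which point it is no
// longer treated as legacy and full RBAC applies.
function markApiKeyUpdated(attributes: AlertAttributes, version: string): AlertAttributes {
  return {
    ...attributes,
    meta: { ...attributes.meta, versionApiKeyLastmodified: version },
  };
}
```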
Thanks Larry :) The less secure option didn't actually prove possible in the end, as the actual execution of actions happens asynchronously to the request to execute them, making it impossible to tell the Actions client to skip auth without also storing that exemption in some manner. Additionally, the chosen approach feels pretty secure and, equally important, it doesn't entice plugins using the Actions plugin by offering them an easy way to skip security (which TBH was my bigger concern ;) ).
… until updated (#75563) Marks all Alerts with a `versionApiKeyLastmodified` field that tracks which version the alert's API Key was last updated in. We then use this field to exempt legacy alerts (created pre `7.10.0`), applying a _dialed down_ version of RBAC which should allow old alerts to continue to function after the upgrade until they are updated (at which point they will no longer be **Legacy**). More details here: #74858 (comment)
Following up from issue #70851, where we discussed the fact that under certain scenarios existing alerts will stop working after the 7.10 upgrade, we need to investigate ways to mitigate this.
A few ideas to explore: