Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Dashboard aware RBAC dataset permission #24789

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Jul 25, 2023

SUMMARY

This PR (alongside #24806) fixes #24782.

Previously users were granted access to a dataset/chart associated with a dashboard (per the RBAC dashboard roles) however it wasn't context aware, i.e., access was granted irrespective of the context of the request. Specifically this PR ensures that role based dashboard access only occurs within the confines of a dashboard.

TESTING INSTRUCTIONS

Added unit tests.

BEFORE

AFTER

ADDITIONAL INFORMATION

cc: @mdeshmu @sfirke

@codecov
Copy link

codecov bot commented Jul 25, 2023

Codecov Report

Merging #24789 (0c7cdf3) into master (9f7f2c6) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head 0c7cdf3 differs from pull request most recent head a80cced. Consider uploading reports for the commit a80cced to get more accurate results

@@            Coverage Diff             @@
##           master   #24789      +/-   ##
==========================================
- Coverage   68.99%   68.99%   -0.01%     
==========================================
  Files        1906     1906              
  Lines       74133    74122      -11     
  Branches     8208     8208              
==========================================
- Hits        51151    51140      -11     
  Misses      20859    20859              
  Partials     2123     2123              
Flag Coverage Δ
hive 54.17% <75.00%> (+0.01%) ⬆️
mysql 79.21% <100.00%> (-0.01%) ⬇️
postgres 79.31% <100.00%> (-0.01%) ⬇️
presto 54.07% <75.00%> (+0.01%) ⬆️
python 83.37% <100.00%> (-0.01%) ⬇️
sqlite 77.88% <100.00%> (-0.01%) ⬇️
unit 55.04% <25.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
superset/security/manager.py 94.43% <100.00%> (+0.48%) ⬆️

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@john-bodley john-bodley force-pushed the john-bodley--fix-24782-raise-for-access branch 2 times, most recently from 3c93b96 to a1f351f Compare July 25, 2023 04:49
mock_can_access.return_value = False
mock_can_access_schema.return_value = False
mock_is_owner.return_value = False

with self.assertRaises(SupersetSecurityException):
security_manager.raise_for_access(viz=test_viz)

@pytest.mark.usefixtures("load_birth_names_dashboard_with_slices")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is needed, i.e., test_get_dashboard_view__user_access_with_dashboard_permission might be suffice.

@@ -191,7 +190,7 @@ def test_get_dashboard_view__user_access_with_dashboard_permission(self):

request_payload = get_query_context("birth_names")
rv = self.post_assert_metric(CHART_DATA_URI, request_payload, "data")
self.assertEqual(rv.status_code, 200)
self.assertEqual(rv.status_code, 403)
Copy link
Member Author

@john-bodley john-bodley Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test (alongside others) was failing after the change which likely indicates that either i) RBAC was supposed to expand beyond the scope/context of the dashboard (and thus my understanding of said issue was misconstrued), or ii) was a mistake.

@john-bodley john-bodley added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Jul 25, 2023
@john-bodley john-bodley force-pushed the john-bodley--fix-24782-raise-for-access branch from a1f351f to 71366fa Compare July 25, 2023 05:17
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jul 25, 2023
@@ -159,41 +159,6 @@ def test_has_guest_access__unauthorized_guest_user__different_resource_type(self
has_guest_access = security_manager.has_guest_access(self.dash)
self.assertFalse(has_guest_access)

def test_chart_raise_for_access_as_guest(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional on my logic being correct, these tests are now superfluous as the irrespective of whether a guest is authorized or not, they don't have access to the chart, dataset, etc. unless it's in the context of RBAC. The logic previously outlined here condenses to the same logic which is defined in the security_tests.py.

@john-bodley john-bodley marked this pull request as ready for review July 25, 2023 05:53
@john-bodley john-bodley changed the title fix: Dashboard aware RBAC dataset permission fix(security): Dashboard aware RBAC dataset permission Jul 26, 2023
@john-bodley john-bodley changed the title fix(security): Dashboard aware RBAC dataset permission fix: Dashboard aware RBAC dataset permission Jul 26, 2023
@john-bodley john-bodley force-pushed the john-bodley--fix-24782-raise-for-access branch from 71366fa to 95356d1 Compare July 26, 2023 04:41
@betodealmeida
Copy link
Member

@john-bodley @Vitor-Avila did a fix in #24808 giving access to datasets that power native filters, when using Dashboard RBAC. Can we include his changes here, or would that use case already work with your changes?

@john-bodley
Copy link
Member Author

john-bodley commented Jul 26, 2023

Thanks @betodealmeida for sharing @Vitor-Avila's PR. Initially I can't say whether this change will resolve said issue as it's not apparent where (and with what arguments) the raise_for_access method is being invoked from regarding the native filters.

@Vitor-Avila would you mind using my branch to test your issue? Note assuming the approach outlined in this PR is correct (@villebro it's probably the definitive expert on this) and it doesn't address @Vitor-Avila's issue, I do sense the right solution there is make sure that the access for native filters is akin to that of dashboard charts.

Personally I'm a tad worried about the DASHBOARD_RBAC feature flag as I don't think all the various paths have been fully flushed out. Furthermore our default security manager/policy is already somewhat of a complex web of rules which only gets worse when you add another variable (the feature flag—which turns a lot of the existing logic on its head) to the mix.

Additionally at Airbnb we use a completely different security model—access is purely defined at the dataset level—and thus I struggle to grok what the logic should be, i.e., I likely lack the necessary context.

@Vitor-Avila
Copy link
Contributor

hey @john-bodley, thanks for the quick response!

My PR was actually not created with the DASHBOARD_RBAC FF in mind, but actually focusing on embedded access for dashboards. I'm not sure if the permission flow/logic is the same between guest user/dashboard RBAC access, but I know at least for embedded access you would only specify the dashboard uuid when generating the guest_token, and I believe the expectation is that the user should be able to view all charts and filters used in the specified dashboard. For embedded dashboards, it's also not possible to access Charts directly (at least not through the iframe) -- but likely through the API.

I'll test if my issue is also addressed with your branch and report back -- thank you! 🙏

@Vitor-Avila
Copy link
Contributor

Yup, I confirmed the issue still exists in this branch:
image

@eschutho mentioned you also have a PR to re-introduce the logic from can_access_based_on_dashboard: #24804

Should I try to re-create my changes on top of this modification?

@john-bodley
Copy link
Member Author

john-bodley commented Jul 26, 2023

Thanks @Vitor-Avila for confirming. I thought that might be the case as the dashboard information is likely lacking from the form-data when the native dashboard filter datasource check is happening. #24804 is simply a refactor and thus wouldn't change anything.

The challenge with the RBAC feature is opening up access only in the context of a dashboard. Furthermore I sense there's could be flaw with the model, i.e., a Gamma user (who has been granted access to a dashboard) could likely have broader access than an owner as it pertains to how access is determined for the datasets within the dashboard which are driving the native filters and charts. Again I need to provide a caveat with everything I say as I'm not overly familiar with how chart etc. access is defined.

@Vitor-Avila
Copy link
Contributor

@john-bodley sorry I mis-read #24804 and thought you were re-adding the logic for can_access_based_on_dashboard somewhere else, but that's not true - I was confused with the raise_for_dashboard_access function. My apologies for the confusion.

In my previous test, I had only checked the issue I'm trying to fix for embedded users, but I just tried with the DASHBOARD_RBAC access as well and confirmed it also happens for this flow:
image

Not sure if we can keep the can_access_based_on_dashboard function or if that's not compatible with the work you've been doing in these PRs.

@john-bodley
Copy link
Member Author

@Vitor-Avila I think we likely need to take a pause on this for now until collectively the community can agree on how the dashboard RBAC are supposed to work. Hopefully this can happen sometime next week once some of the Apache Superset committers who are well versed in this space are back online.

The challenge otherwise is we could find ourselves playing whack-a-mole with trying to fix issues whilst simultaneously creating other ones. Taking a more holistic (30,000') view of problem will likely help everyone better grok the intention and thus what access controls are required.

@Vitor-Avila
Copy link
Contributor

sure, that makes sense! It could also be a good opportunity to revisit if both flows (DASHBOARD_RBAC and guest user) should be handled in a more distinct way.

form_data
and (dashboard_id := form_data.get("dashboardId"))
and (dashboard := DashboardDAO.find_by_id(dashboard_id))
and self.can_access_dashboard(dashboard)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too familiar with this feature either, but just noticed that the previous version checks for datasource access and the new one checks for dashboard access only. Is that what we want?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eschutho the logic is as follows. If you don't have explicit access to dataset you then fallback to the form-data of either the visualization or query-context to try to decipher the context. If it's in the context of a dashboard, i.e., you're viewing a chart in a dashboard, one checks whether you have access to the dashboard—per the can_access_dashboard method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check that the purported chart whose data is being loaded actually exists on the dashboard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@villebro is the logic on line #1846 not suffice, i.e., extracting the dashboard ID from the form-data (which could stem from either a legacy visualization or query-context)?

@eschutho
Copy link
Member

@john-bodley are you ok with us merging #24808 while we wait for more feedback on RBAC for this PR? @Vitor-Avila's PR isn't for the DASHBOARD_RBAC flag specifically.

@john-bodley john-bodley force-pushed the john-bodley--fix-24782-raise-for-access branch from 0ce3494 to c1f8acb Compare August 4, 2023 04:43
@john-bodley john-bodley force-pushed the john-bodley--fix-24782-raise-for-access branch from c1f8acb to a80cced Compare August 4, 2023 18:18
@john-bodley john-bodley merged commit 7397ab3 into apache:master Aug 4, 2023
@john-bodley john-bodley deleted the john-bodley--fix-24782-raise-for-access branch August 4, 2023 18:53
@john-bodley john-bodley added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 4, 2023
michael-s-molina pushed a commit that referenced this pull request Aug 7, 2023
jfrag1 added a commit to preset-io/superset that referenced this pull request Aug 16, 2023
@jfrag1
Copy link
Member

jfrag1 commented Aug 16, 2023

@john-bodley I discovered that this PR introduces a security issue that can allow any user to get data from any datasource. I've opened #24996 to revert this, see the PR description there for more details

jinghua-qa added a commit to preset-io/superset that referenced this pull request Aug 16, 2023
john-bodley added a commit to john-bodley/superset that referenced this pull request Aug 17, 2023
michael-s-molina pushed a commit that referenced this pull request Aug 21, 2023
@mistercrunch mistercrunch added 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
Projects
None yet
9 participants