-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIRFLOW-85] Convert Airflow Webserver UI to FAB for RBAC support #3015
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome to see this as a PR!
Some comments, some questions from reading the code. I haven't had a chance to run it yet though, but I'll try and find time to do that next week.
UPDATING.md
Outdated
|
||
FAB has built-in authentication support for DB, OAuth, OpenID, LDAP, and REMOTE_USER. The default auth type is `AUTH_DB`. | ||
|
||
For any other authentication type (OAuth, OpenID, LDAP, REMOTE_USER), see [this link](http://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-methods) for how to configure variables in webserver_config.py file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to avoid having link targets saying "this link" or "click here", as it doesn't provide clues for screen readers and other assistive technologies. (I guess it doesn't matter much in this specific case, but it is a good habit to get in to.)
Something like:
see the [Authentication section of FAB docs](http://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-methods)
UPDATING.md
Outdated
|
||
For any other authentication type (OAuth, OpenID, LDAP, REMOTE_USER), see [this link](http://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-methods) for how to configure variables in webserver_config.py file. | ||
|
||
Once you modify your config file, run `airflow initdb` to generate new tables for RBAC support (these tables will have the prefix `ab_`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing any users created in the old style won't be migrated automatically and will need to be re-created. A note about this would be good.
airflow/bin/cli.py
Outdated
@@ -1116,6 +1129,34 @@ def kerberos(args): # noqa | |||
airflow.security.kerberos.run() | |||
|
|||
|
|||
def create_admin(args): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work if something if FAB is configured to use other auth backends than AUTH_DB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashb The create_admin is for auth_db only. The OAuth flow (which I tested) will also create an entry in the ab_user table, with a username assembled by FAB (provider name plus an id it parsed from the auth response). So if you know the username and enter with provider prefix, you could create an account via this command, but it's not very obvious.
I spent a bit of time on sorting out the OAuth flow today. The registration is best done via the UI and is a bit more involved. It requires the following config change
- Modify webserver_config.py
AUTH_TYPE = AUTH_OAUTH
OAUTH_PROVIDERS = [
{
'name':'google',
'whitelist': ['@COMPANY.COM'],
'token_key':'access_token',
'icon':'fa-google',
'remote_app': {
'base_url':'https://www.googleapis.com/oauth2/v2/',
'request_token_params':{
'scope': 'email profile'
},
'access_token_url':'https://accounts.google.com/o/oauth2/token',
'authorize_url':'https://accounts.google.com/o/oauth2/auth',
'request_token_url': None,
'consumer_key': CONSUMER_KEY,
'consumer_secret': SECRET_KEY,
}
}
]
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Admin"
This will allow the admin to self register an account with full access
2. Once the admin(s) is registered, modify webserver_config.py again:
AUTH_USER_REGISTRATION_ROLE = "Public"
This will allow all other users to freely register with the Public
role via OAuth.
Will add a note about this in the doc. I haven't had much time to test out the other auth flows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just found out that FAB provides a nice way to override/customize the oauth registration stored in the db, which will make this create_admin cli tool more usable (for example, username can be set to email address). See explanation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like we can configure FAB logins so this works for all backends, but if not is it worth guarding this function so that it says "can't be used with this auth backend" or a similar error (rather than silently creating a user record that won't do anything)
It also might be worth generalizing this to a create_user
command instead, and adding the role/group name(s) as a parameter -- this would let people create users via the CLI too which sounds like a plus for scripting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct, we can configure it such that this works for all backends. And great idea, will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to do this or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashb oops, yes! this slipped my mind hehe. working on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
airflow/configuration.py
Outdated
else: | ||
WEBSERVER_CONFIG = AIRFLOW_HOME + '/webserver_config.py' | ||
else: | ||
WEBSERVER_CONFIG = expand_env_var(os.environ['WEBSERVER_CONFIG']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this entire block not be removed if this was a normal setting in airflow.cfg, and then the default env of AIRFLOW__WEBSERVER__CONFIG
could be used.
airflow/configuration.py
Outdated
f.write(DEFAULT_WEBSERVER_CONFIG) | ||
|
||
if 'WEBSERVER_CONFIG' not in os.environ: | ||
if os.path.isfile(expand_env_var('~/webserver_config.py')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an odd file location to open -- having a webserver_config.py in a home directory is somewhat confusing. GIven the ways of overloading the file to read can we remove this and not look for this file automatically?
@@ -0,0 +1,28 @@ | |||
/*! ======================================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to point to the static assets from www/ instead of having to duplicate them in this tree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the start I wasn't familiar with the amount of modification required, so it was easier to copy/paste the entire folder and not worry about breaking old UI (this will also make deprecating the older webserver easier). In the end, maybe 1-2 static files were slightly modified, most untouched. so I don't feel too strong about going either way.
@@ -0,0 +1,57 @@ | |||
{# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with templates -- are these the same between www and www_rbac? If so can we use the www/templates/ instead of copying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The templates are not quite the same between www and www_rbac:
- they extends from different master templates (/appbuilder rather than /admin) and imports different html/jinja macros
- some hard-coded urls in the html files had to be adjusted for FAB's endpoints
There are some files that no longer apply for FAB, like query.html/chart.html. I'll take some time and remove the old css/js/html files
airflow/www_rbac/views.py
Outdated
return json.dumps(task_instances) | ||
|
||
@expose('/variables/<form>', methods=["GET", "POST"]) | ||
@has_access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too familiar with FAB yet, but is this action protected/restricted to a different ACL to other more generic views?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACLs are explicitly managed (see security.py on the default role-permission mappings), the only thing @has_access does is making this a secured method.
Generic views (which i assume you mean the model views) under the hood uses the same mechanism -- they have @has_access annotations on each of the CRUD methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess by generic I actually meant "less dangerous"/read only. i.e. is getting or editing variables a different permission you can assign to users/groups from say "viewing the list of dags" or any of the other actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I see, in op_vms
list in security.py. I ugess I just expected to see the permission defined here. Treat this as unfamiliarity with FAB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashb np, the decoupling of permissions (and roles) from actions is one of my favorite property of FAB's security model :)
airflow/www_rbac/views.py
Outdated
@expose('/variables/<form>', methods=["GET", "POST"]) | ||
@has_access | ||
@action_logging | ||
def variables(self, form): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this is in the AirflwoView rather than the VariablesView?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a direct copy/paste. You are right I don't think it should be here.
Update: this is not used anywhere. will remove completely.
@@ -206,7 +206,7 @@ def do_setup(): | |||
'configparser>=3.5.0, <3.6.0', | |||
'croniter>=0.3.17, <0.4', | |||
'dill>=0.2.2, <0.3', | |||
'flask>=0.11, <0.12', | |||
'flask>=0.12, <0.13', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add flask-appbuilder in here too somewhere don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops
Does this also include granting permissions to users at DAG Level ? Will something like this - |
@yaswanth477 that is not in the scope of this PR. This is only view-level access control. Next step will be dag-level access control. |
Say i was able to create all the tables for dag level control - Only user is honored with dag level control. Is the plan to use FAB to control access to dags - read/execute? I want to contribute more to this to atttain our target state.Any pointers would be helpful. |
@harikak84 yes, FAB is what I have in mind. Here is a super high-level sketch of how dag-level access control could be implemented (not final or only way): (1) Currently each endpoints is annotated with the @has_access decorator, which essentially tells FAB to create a single permission-view mapping for that endpoint in the db. To make this DAG-level, we'd have to re-implement the decorator so it has more granularity: instead of creating one perm-view mapping per endpoint, we'd need one per endpoint per DAG. (2) Modify the dag file parser so that it can parse a new 'access_control' config in the dag file. It would then add/delete restricted permission-view-mapping to the db (depending on if the starting point is no access or all access) (3) For any high-level pages (i.e the homepage, /list/taskinstance, /list/dagrun, etc.), they need to be be filtered. The implementation won't be trivial, but I do see a path for it as described above. Let me know if you have more questions/concerns. |
650be24
to
045a309
Compare
ping @ashb @Acehaidrey @bolkedebruin @mistercrunch @Fokko @mistercrunch do you have release access for FAB ? if not I can ping the owner. Thanks! |
When I try to install using
Also a lot of other packages are missing with |
When I try to make all the users just
I get the following error:
This looks like something on our side |
@jgao54 can you rebase on master? It looks very good, a very nice improvement on the UI and permission management! Unfortunately, I'm having some troubles with sqlite and the updated timezone stuff:
@bolkedebruin PTAL |
I get similar logs using Postgres. Those "registering class" at info (even assuming we fix the errors) are too chatty - we might need to tweak the log levels for FAB to so that we don't create 100s of log lines form the webserver every time we reload the workers. OR: Do we still need the automatic worker cycling that we do? Do either of you know it that added to work around a specific bug/problem that might not be needed any more? |
@ashb I don't really mind the lines at startup. I don't periodically restart the worker at the different airflow instances that I'm running. |
Thank you for reviewing! @Fokko Can we make the title Airflow instead of FAB? When I try to make all the users just Admin... When I go to the {Tree,Dag,..}view of the dag, the UI is full width. Is this an undocumented feature, of intentionally? Unfortunately, I'm having some troubles with sqlite and the updated timezone stuff: @ashb |
cc9137a
to
d01e46b
Compare
Modified the flask_appbuilder logging level in DEFAULT_LOGGING_CONFIG so it defaults to 'WARN'. |
The failed tests are resulting from tz issue, so I'm waiting on fab release. Otherwise I believe all the issues raised above are addressed. |
3c7b27d
to
37f3ce3
Compare
@bolkedebruin Added more UI tests. I missed all the UI tests from core.py. Thanks for catching that. |
we should really refactor www tests out of core.py ... |
+1 |
+1 LGTM, looking forward to this |
Merging! |
Awesome work! Can't wait having this in a release! it looks like it will be included in 1.10.0 (fix version in AIRFLOW-85 Jira ticket). Any idea about the planned release date? The Jira releases view could be the right place to provide such information. But it seems outdated for now. |
Great work ! @jgao54 or anyone aware of any DAG access control request ungoing project ? |
@gauthiermartin |
@jgao54 Thank you |
@jgao54 Anyidea when this is going to be released? |
@rsubra13 it's in the 1.10 release, which is currently in beta |
Thanks @criccomini ! Looking forward to it. |
@jgao54 I have been waiting for RBAC for such a long time, thanks for this functionality 👍 AUTH_TYPE = AUTH_LDAP |
Related, and a better place to ask: dpgaspar/Flask-AppBuilder#740 |
I was also checking the authentication using OPENID, even though i successfully sign in with Okta, the Airflow application says access denied with and present me the login screen and the Okta logo too doesnt appear in webserver_config.py |
@jgao54 Thanks I was able to resolve this issue, by the way i am facing an issue with MySQL with this feature, I have raised Jira Issue https://issues.apache.org/jira/browse/AIRFLOW-2459?filter=-2 |
Hi @jgao54 So I tried running
I checked the cli.py and it doesn't have a create_admin function. Am I doing it right? |
hey @jensenity , the command would be |
Thanks @feng-tao ! This feature is really amazing! Thank you for this! |
hey @jensenity , @jgao54 is the one who implemented RBAC feature.... |
Closes apache#3015 from jgao54/rbac
Hi @rushtokunal How did you fix the okta login issue |
I have enabled Airflow with LDAP credentials with FAB. But the problem is that all the users have the Admin role after self registration, because we have given the below value in webserver_config.py file AUTH_USER_REGISTRATION_ROLE = “Admin”. How can we dynamically assign the AUTH_USER_REGISTRATION_ROLE based on the users LDAP role? We have different users like tester, developer and operation user but with the above webserver config file all users are automatically assigned the Admin role via Flask_appbuilder.security under manager.py file. Is there any way to create the customize manager file and while login refer this customize file instead of Flask_appbuilder.security.manager.py file. Because I can not change directly in flask_appbuilder.security manager.py file and add the our customize role and assign in AUTH_USER_REGISTRATION_ROLE based on the users LDAP role |
@Abhishekchechani I have just created a PR in Flask-AppBuilder which implements this feature, see here. |
Make sure you have checked all steps below.
JIRA
Description
New UI with RBAC support.
Tests
Commits
My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
Passes
git diff upstream/master -u -- "*.py" | flake8 --diff
To test this PR:
python setup.py install
orpython setup.py develop
so Flask-AppBuilder at master can be installed.airflow.cfg
file, under[webserver]
, setrbac = True
.airflow initdb
to generate new tables for RBAC support.airflow create_admin
to create an admin user.airflow webserver
as usual.Note that the default auth type is DB. You can change AUTH_TYPE in the newly generated
webserver_config.py
file in AIRFLOW_HOME. See this link for instructions.Given the size of this PR, it will likely require a few iterations to get the rough edges sorted out. Love any help I can get with code-review and testing.