Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DI-1113. ADDENDUM. Authentication: Enable user impersonation for Superset to HiveServer2 using hive.server2.proxy.user (a.fernandez) #3697

Merged
merged 1 commit into from
Nov 6, 2017

Conversation

afernandez
Copy link
Contributor

Figured out how to make changes to configuration dictionary instead of having to patch the PyHive library. When connecting to Hive, will determine if impersonation is enabled, and if so, will send the currently logged on user via the hive.server2.proxy.user property in the configuration dictionary to SQLAlchemy's create_engine function.

Tested this on my local Superset instance, both Test Connection and Run SQL Query.

Copy link
Contributor Author

@afernandez afernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -3,27 +3,6 @@
from thrift import Thrift


old_Connection = hive.Connection
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer need to patch this.

engine_params['poolclass'] = NullPool

configuration = {}
configuration.update(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update configuration dictionary instead.

logging.info("Superset.testconn(). Masked URL: {0}".format(masked_url))

configuration = {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update configuration dictionary instead.

@@ -187,26 +187,16 @@ def select_star(cls, my_db, table_name, schema=None, limit=100,
return sql

@classmethod
def modify_url_for_impersonation(cls, url, impersonate_user, username):
def get_configuration_for_impersonation(cls, uri, impersonate_user, username):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the configuration dictionary instead of the URL's query dictionary so that we don't have to modify PyHive library.

@afernandez
Copy link
Contributor Author

Fixing unit tests right now.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 70.947% when pulling 2d592858a269dbdc008b8dc8aaec6d105b5f6d90 on afernandez:afernandez_impersonate into e121a85 on apache:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 70.947% when pulling 2d592858a269dbdc008b8dc8aaec6d105b5f6d90 on afernandez:afernandez_impersonate into e121a85 on apache:master.

@coveralls
Copy link

coveralls commented Oct 20, 2017

Coverage Status

Coverage increased (+0.2%) to 70.947% when pulling 2d592858a269dbdc008b8dc8aaec6d105b5f6d90 on afernandez:afernandez_impersonate into e121a85 on apache:master.

Copy link
Member

@mistercrunch mistercrunch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits otherwise LGTM


connect_args = (
request.json
.get('extras', {})
.get('engine_params', {})
.get('connect_args', {}))

if len(configuration.keys()) > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as "pythonesquier" if configurations:


logging.info("Superset.testconn(). Masked URL: {0}".format(masked_url))
configuration.update(db_engine.get_configuration_for_impersonation(uri, impersonate_user, username))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line if very long, goes way beyond 90chars


configuration = {}

if database is not None and uri is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if database and uri:

…rset to HiveServer2 using hive.server2.proxy.user (a.fernandez)
Copy link
Contributor Author

@afernandez afernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed comments

self.db_engine_spec.get_configuration_for_impersonation(str(url),
self.impersonate_user,
effective_username))
if configuration:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mistercrunch fixed the long line.

self.db_engine_spec.get_configuration_for_impersonation(str(url),
self.impersonate_user,
effective_username))
if configuration:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mistercrunch fixed this.


configuration = {}

if database and uri:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mistercrunch fixed this.

@coveralls
Copy link

coveralls commented Nov 3, 2017

Coverage Status

Coverage increased (+0.2%) to 71.461% when pulling 69d49ed on afernandez:afernandez_impersonate into 7f3edad on apache:master.

@mistercrunch
Copy link
Member

LGTM, you can sync with @graceguo-supercat to merge and deploy

@graceguo-supercat graceguo-supercat merged commit b059506 into apache:master Nov 6, 2017
@ottomata
Copy link

ottomata commented Jan 8, 2018

We use AUTH_TYPE = AUTH_REMOTE_USER for authentication, and set REMOTE_USER to the authenticated LDAP user's shell username via Apache mod_ldap in front of superset. I'd like for folks that authenticate this way to be able to run hive queries as their own user.

2 questions:

  • Why check 'auth' in url.query.keys() when deciding whether or not to set this. If impersonate_user and username is not None, shouldn't that be enough? I have to configure my Hive database connection with ?auth=NONE to get this to work. (To do so I also annoyingly have to specify a database in the URI, as hostname:10000?auth=NONE seems to try and parse the full 10000?auth=NONE as the integer port for the URI.)

  • When I do set ?auth=NONE in the URI, testing the database connection seems fine. But, saving the database connection fails with: __init__() got an unexpected keyword argument 'hive_server2_proxy_user', and fails to save the change. Upon saving, I see INFO:root:Database.get_sqla_engine(). Masked URL: hive://hostname:10000/default?auth=NONE&hive_server2_proxy_user=otto in the superset logs, which I think looks mostly good, buuuut, where are those underscores coming from? The code explicitly sets configuration['hive.server2.proxy.user'], so something must be inserting them.

@ottomata
Copy link

ottomata commented Jan 8, 2018

BTW, I've also tried to save the URI as hive://hostname:10000/default?hive.server2.proxy.user=otto, just to see if that works, but it gives the similar error message of __init__() got an unexpected keyword argument 'hive.server2.proxy.user' with INFO:root:Database.get_sqla_engine(). Masked URL: hive://hostname:10000/default?hive.server2.proxy.user=otto in the superset logs

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
…rset to HiveServer2 using hive.server2.proxy.user (a.fernandez) (apache#3697)
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
…rset to HiveServer2 using hive.server2.proxy.user (a.fernandez) (apache#3697)
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants