Fatal Error when creating DB with hive #24031

Closed
LIN-Yu-Ting opened this issue May 11, 2023 · 5 comments
Labels
data:connect:hive Related to Hive

@LIN-Yu-Ting

LIN-Yu-Ting commented May 11, 2023

Hi, I have a question about connecting Superset to a Hive server or Apache Spark SQL.

I have set up a Spark cluster with a Hive server, and I can already connect to it with beeline, as you can see in the following image.
(Screenshot: 2023-05-12, 5:51:21 AM)

I first tried starting a Superset instance on the same machine as the Spark master node. However, I got a Fatal Error message without any details. In the following image, you can see that the connection looks good.
(Screenshot: 2023-05-12, 5:44:13 AM)

However, after clicking Connect, a Database Creation Error appears.
(Screenshot: 2023-05-12, 5:52:50 AM)

What does it mean?

I am also attaching the ERROR message from the Superset instance.

2023-05-11 22:25:08,222:ERROR:flask_appbuilder.api:Object of type bytes is not JSON serializable
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flask_appbuilder/api/__init__.py", line 110, in wraps
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/superset/views/base_api.py", line 122, in wraps
    raise ex
  File "/usr/local/lib/python3.8/dist-packages/superset/views/base_api.py", line 113, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/superset/utils/core.py", line 1586, in time_function
    response = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/superset/utils/log.py", line 266, in wrapper
    value = f(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/superset/views/base_api.py", line 85, in wraps
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/superset/databases/api.py", line 345, in post
    return self.response(201, id=new_model.id, result=item)
  File "/usr/local/lib/python3.8/dist-packages/flask_appbuilder/api/__init__.py", line 769, in response
    _ret_json = jsonify(kwargs)
  File "/usr/local/lib/python3.8/dist-packages/flask/json/__init__.py", line 302, in jsonify
    f"{dumps(data, indent=indent, separators=separators)}\n",
  File "/usr/local/lib/python3.8/dist-packages/flask/json/__init__.py", line 132, in dumps
    return _json.dumps(obj, **kwargs)
  File "/usr/lib/python3.8/json/__init__.py", line 234, in dumps
    return cls(
  File "/usr/lib/python3.8/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.8/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.8/dist-packages/flask/json/__init__.py", line 51, in default
    return super().default(o)
  File "/usr/lib/python3.8/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
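
The last frames of the traceback can be reproduced outside Superset. The sketch below is only an illustration: the `payload` dictionary is a hypothetical stand-in for whatever the API tried to return, on the assumption that at least one of its values was a `bytes` object.

```python
import json

# Hypothetical stand-in for the response body; the real one has more fields.
# A single bytes value anywhere in the structure is enough to make dumps fail.
payload = {"id": 1, "result": {"driver": b"thrift"}}

try:
    json.dumps(payload)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # Object of type bytes is not JSON serializable
```

The traceback starts at the line that builds the 201 response, which would be consistent with the database record being saved before serialization fails.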

It seems that even with the Database Creation Error, a database connection is still created.
(Screenshot: 2023-05-12, 6:13:49 AM)

However, there is another error: databases and tables are not recognized correctly. I tried the dataset function, but it only detects database-level items.
(Screenshot: 2023-05-12, 6:17:49 AM)

@TinTin-DXQ

I have the same error.

@Siham-IT

Me too. I have the same problem with Apache Hive version 2.1.0.

@Siham-IT

Finally, we solved this problem by following Usiel's solution:

Usiel/PyHive@fb692d9?diff=split

by changing those two lines in the sqlalchemy_hive.py file:

class HiveDialect(default.DefaultDialect):
    name = b'hive'
    driver = b'thrift'

to this:

class HiveDialect(default.DefaultDialect):
    name = 'hive'
    driver = 'thrift'

Alternatively, you can simply use the latest version of PyHive, which already contains these changes.
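
To see whether an installed copy is affected before editing any files, a check along these lines may help. It is a minimal sketch: `bytes_attrs` is a hypothetical helper, and the two classes are stand-ins for the unpatched and patched `HiveDialect`, so the snippet runs without PyHive installed.

```python
def bytes_attrs(dialect_cls, attrs=("name", "driver")):
    """Return the listed attribute names whose values on dialect_cls are bytes."""
    return [a for a in attrs if isinstance(getattr(dialect_cls, a, None), bytes)]


class OldStyleDialect:
    # Hypothetical stand-in mirroring the unpatched class attributes.
    name = b"hive"
    driver = b"thrift"


class PatchedDialect:
    # Hypothetical stand-in mirroring the patched class attributes.
    name = "hive"
    driver = "thrift"


print(bytes_attrs(OldStyleDialect))  # ['name', 'driver']
print(bytes_attrs(PatchedDialect))   # []
```

With PyHive installed, the same helper could presumably be pointed at the real class via `from pyhive.sqlalchemy_hive import HiveDialect`; an empty list would suggest the fix is already in place.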

@vndroid

vndroid commented Jun 6, 2023

Amazing! Thanks a lot.

@bkyryliuk
Member

This seems to be a duplicate of #22316.
Give PyHive 0.7.0 a try; as I recall, there was a contribution fixing this issue.
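
A quick way to check which PyHive version is installed in the Superset environment is sketched below. The helper names are hypothetical, the comparison is deliberately rough (it ignores pre-release suffixes), and the 0.7.0 threshold simply mirrors the suggestion above rather than a confirmed fix version.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '0.7.0' or '0.7.1.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    # Pad so that '0.7' compares the same as '0.7.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_pyhive_version():
    """Return the installed PyHive version string, or None if it is absent."""
    try:
        return metadata.version("PyHive")
    except metadata.PackageNotFoundError:
        return None


def has_suggested_version(version, minimum=(0, 7, 0)):
    """Rough check that a version string is at least the suggested minimum."""
    return version is not None and version_tuple(version) >= minimum


print(installed_pyhive_version())
print(has_suggested_version(installed_pyhive_version()))
```

If the second line prints `False`, upgrading with `pip install --upgrade PyHive` in the same environment as Superset is probably the simplest next step.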
