Upload a CSV, not working with MySQL #4287

Closed
3 tasks done
britishbadger opened this issue Jan 25, 2018 · 20 comments

Comments

@britishbadger

britishbadger commented Jan 25, 2018

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.22.1

Expected results

CSV upload should create new table and load data into mysql

Actual results

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1615, in full_dispatch_request
    return self.finalize_request(rv)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1632, in finalize_request
    response = self.process_response(response)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1858, in process_response
    self.save_session(ctx.session, response)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 924, in save_session
    return self.session_interface.save_session(self, session, response)
  File "/usr/local/lib/python3.5/dist-packages/flask/sessions.py", line 363, in save_session
    val = self.get_signing_serializer(app).dumps(dict(session))
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 565, in dumps
    payload = want_bytes(self.dump_payload(obj))
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 847, in dump_payload
    json = super(URLSafeSerializerMixin, self).dump_payload(obj)
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 550, in dump_payload
    return want_bytes(self.serializer.dumps(obj))
  File "/usr/local/lib/python3.5/dist-packages/flask/sessions.py", line 85, in dumps
    return json.dumps(_tag(value), separators=(',', ':'))
  File "/usr/local/lib/python3.5/dist-packages/flask/json.py", line 123, in dumps
    rv = _json.dumps(obj, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/__init__.py", line 397, in dumps
    **kw).encode(obj)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 291, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 373, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.5/dist-packages/flask/json.py", line 80, in default
    return _json.JSONEncoder.default(self, o)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 268, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: OperationalError('(_mysql_exceptions.OperationalError) (1045, "Access denied for user \'superset\'@\'172.21.0.4\' (using password: YES)")',) is not JSON serializable
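
Note that the traceback ends in a TypeError rather than in the MySQL error itself: the frames show Flask's save_session trying to JSON-encode the session, and the session contained the exception object. A minimal sketch of that failure mode using only the standard library follows. The OperationalError class and the try_dump helper are stand-ins made up for illustration, not the real driver class or Superset code.

```python
import json


class OperationalError(Exception):
    """Stand-in for the MySQL driver's exception class (hypothetical)."""


def try_dump(value):
    """Try to JSON-encode a session-like dict holding `value` as a flash message."""
    try:
        return json.dumps({"_flashes": [["danger", value]]})
    except TypeError as exc:
        # Exception objects are not JSON serializable, so the encoder gives up.
        return "TypeError: " + str(exc)


err = OperationalError(
    1045, "Access denied for user 'superset'@'172.21.0.4' (using password: YES)"
)

failed = try_dump(err)    # passing the exception object itself fails
ok = try_dump(str(err))   # passing its string form encodes fine

print(failed)
print(ok)
```

The underlying problem reported by MySQL is still the 1045 "Access denied" error; the TypeError only hides it behind a second failure.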

Steps to reproduce

I'm using the following docker image : https://github.com/amancevice/superset

Add the following to mysql/superset_config.py:
UPLOAD_FOLDER='/tmp/'
I created a new database called TEST and ensured the user has permissions to do stuff with this schema:

GRANT ALL PRIVILEGES ON TEST.* TO 'superset'@'%' WITH GRANT OPTION;

Password was set to superset and I can prove external connection by using sql workbench (external client) to connect directly to the mysql container.

[Screenshot: screencapture-localhost-8088-csvtodatabaseview-form-1516889626860]

CSV file called x.csv contained the following

XXXXXX,YYYYYYY
1,2

@britishbadger
Author

I have also posted an issue on the docker image at amancevice/docker-superset#45

@amancevice

Does the user you are connecting to MySQL with have the correct permissions? I've never used this feature, but it looks like that's the root of your issue to me.

@britishbadger
Author

As I put in the issue, the user has all the privs it needs and I can successfully import data through mysql workbench using this newly created user.

@xrmx
Contributor

xrmx commented Jan 25, 2018

@britishbadger does testing the connection from the database page work fine?

@britishbadger
Author

yeah that produces the message "Seems OK"

@amancevice

@britishbadger
Author

I took a look at that post but I don't really understand its relevance. I can connect directly to the database from MySQL Workbench and the data loads in fine, indicating the user has all the right privileges.

@xrmx
Contributor

xrmx commented Jan 31, 2018

@britishbadger you did say you tried the connection not the data upload with workbench, it was a fair shot :)

@amancevice

Have you tried adding local_infile=1 to your connection string? As in mysql://user:pass@host:3306/db?local_infile=1

Oddly enough I just ran into this myself for a different Python project and this was the only thing that fixed it.
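
A minimal sketch of adding that parameter to an existing connection URI with the standard library, rather than editing the string by hand. The helper name with_query_param is hypothetical, and whether local_infile actually helps depends on the driver and the server settings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(uri, key, value):
    """Return `uri` with `key=value` set in its query string."""
    parts = urlsplit(uri)
    # Keep any existing parameters, dropping an older value for the same key.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != key]
    params.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


uri = with_query_param("mysql://user:pass@host:3306/db", "local_infile", "1")
print(uri)
```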

@britishbadger
Author

britishbadger commented Jan 31, 2018

Thanks for taking a look at this,

With the following connection string:
mysql://test:XXXXXXXXXX@mysql:3306/TEST?local_infile=1

I still get the following:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1615, in full_dispatch_request
    return self.finalize_request(rv)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1632, in finalize_request
    response = self.process_response(response)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1858, in process_response
    self.save_session(ctx.session, response)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 924, in save_session
    return self.session_interface.save_session(self, session, response)
  File "/usr/local/lib/python3.5/dist-packages/flask/sessions.py", line 363, in save_session
    val = self.get_signing_serializer(app).dumps(dict(session))
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 565, in dumps
    payload = want_bytes(self.dump_payload(obj))
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 847, in dump_payload
    json = super(URLSafeSerializerMixin, self).dump_payload(obj)
  File "/usr/local/lib/python3.5/dist-packages/itsdangerous.py", line 550, in dump_payload
    return want_bytes(self.serializer.dumps(obj))
  File "/usr/local/lib/python3.5/dist-packages/flask/sessions.py", line 85, in dumps
    return json.dumps(_tag(value), separators=(',', ':'))
  File "/usr/local/lib/python3.5/dist-packages/flask/json.py", line 123, in dumps
    rv = _json.dumps(obj, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/__init__.py", line 397, in dumps
    **kw).encode(obj)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 291, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 373, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.5/dist-packages/flask/json.py", line 80, in default
    return _json.JSONEncoder.default(self, o)
  File "/usr/local/lib/python3.5/dist-packages/simplejson/encoder.py", line 268, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: OperationalError('(_mysql_exceptions.OperationalError) (1045, "Access denied for user \'test\'@\'172.21.0.4\' (using password: YES)")',) is not JSON serializable

@xrmx
Contributor

xrmx commented Jan 31, 2018

@britishbadger give #4298 a try, as it's supposed to fix your issue

@britishbadger
Author

Thanks, I'll keep an eye on it or build it when I get some free time.

@TMorville

I have a very similar issue, where the "Upload a CSV" form yields this error:

/usr/local/lib/python3.6/site-packages/superset/app/static/uploads/data.csv'

even though the path for the data is different. Further, /usr/local/lib/python3.6/site-packages/superset/app/static/uploads does not seem to exist. There is no 'app' folder in the superset folder.

@shinabarger

Also getting a similar error message ([Errno 13] Permission denied: '/usr/local/lib/python3.5/site-packages/superset/app') despite updating to the newest version.
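
One way to narrow down errors like the two above is to check, from inside the environment Superset runs in, whether the target directory exists and is writable by the current user. This is only a diagnostic sketch: check_upload_dir is a hypothetical helper, and the second path is copied from the error message above.

```python
import os
import tempfile


def check_upload_dir(path):
    """Return a short status string saying whether `path` is usable for uploads."""
    if not os.path.isdir(path):
        return "missing"
    # Creating files in a directory needs write and search (execute) permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable"
    return "ok"


# The result depends on the machine and the user running the check.
print(check_upload_dir(tempfile.gettempdir()))
print(check_upload_dir("/usr/local/lib/python3.5/site-packages/superset/app"))
```

Run it as the same user as the Superset process (for example via docker exec), since a root shell will usually report every directory as writable.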

@guinny65

@shinabarger Are you running superset within a docker image?
I experienced the same issue when superset was not given a superset_config.py file, due to an error in the volume mapping.

@robertozerbini

I ran this command and it worked
sudo docker exec -u 0 -it superset chown superset /usr/local/lib/python3.5/dist-packages/superset

@welshamy

I had the same problem and solved it properly by changing the path where Superset uploads the CSV.

Here is the Stack Overflow answer that worked for me.
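
The linked answer is not reproduced in this thread. As a rough sketch of the general idea (pointing UPLOAD_FOLDER at a directory the Superset process can write to), a superset_config.py fragment could look like the following. The superset_uploads directory under the system temp dir is a made-up location chosen for illustration; any writable path would do.

```python
import os
import tempfile

# Assumption: a dedicated directory under the system temp dir (hypothetical choice).
UPLOAD_FOLDER = os.path.join(tempfile.gettempdir(), "superset_uploads") + os.sep

# Create it up front so the first upload does not fail on a missing directory.
os.makedirs(UPLOAD_FOLDER, exist_ok=True)
```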

@trmanish

@welshamy I am doing the same thing as described in the Stack Overflow answer, but it still throws the same error. This is for Superset installed on an Ubuntu server; on Mac I never got the CSV error.

Could you help? Here is what my superset_config.py looks like:


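import os


def get_env_variable(var_name, default=None):
    """Get the environment variable or raise an exception."""
    # NOTE: the post was cut off above the `if`; the import, signature and
    # try/except are reconstructed from how the helper is called below.
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        else:
            error_msg = 'The environment variable {} was missing, abort...'\
                        .format(var_name)
            raise EnvironmentError(error_msg)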


POSTGRES_USER = get_env_variable('POSTGRES_USER')
POSTGRES_PASSWORD = get_env_variable('POSTGRES_PASSWORD')
POSTGRES_HOST = get_env_variable('POSTGRES_HOST')
POSTGRES_PORT = get_env_variable('POSTGRES_PORT')
POSTGRES_DB = get_env_variable('POSTGRES_DB')

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = 'postgresql://%s:%s@%s:%s/%s' % (POSTGRES_USER,
                                                           POSTGRES_PASSWORD,
                                                           POSTGRES_HOST,
                                                           POSTGRES_PORT,
                                                           POSTGRES_DB)

REDIS_HOST = get_env_variable('REDIS_HOST')
REDIS_PORT = get_env_variable('REDIS_PORT')


class CeleryConfig(object):
    BROKER_URL = 'redis://%s:%s/0' % (REDIS_HOST, REDIS_PORT)
    CELERY_IMPORTS = ('superset.sql_lab', )
    CELERY_RESULT_BACKEND = 'redis://%s:%s/1' % (REDIS_HOST, REDIS_PORT)
    CELERY_ANNOTATIONS = {'tasks.add': {'rate_limit': '10/s'}}
    CELERY_TASK_PROTOCOL = 1


CELERY_CONFIG = CeleryConfig


# The file upload folder, when using models with files
UPLOAD_FOLDER = os.path.abspath(os.path.dirname(__file__)) + '/app/static/uploads/'

# The image upload folder, when using models with images
IMG_UPLOAD_FOLDER = os.path.abspath(os.path.dirname(__file__)) + '/app/static/uploads/'

Here is my docker-compose.yml file. Note that the development option is commented out, since I was getting a mkdir permission-denied error with it; commenting it out fixed that.


  postgres:
    image: postgres:10
    restart: unless-stopped
    environment:
      POSTGRES_DB: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_USER: superset
    ports:
      - 5432:5432
    volumes:
      - postgres:/var/lib/postgresql/data

  superset:
    build:
      context: ../../
      dockerfile: contrib/docker/Dockerfile
    restart: unless-stopped
    environment:
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_HOST: postgres
      POSTGRES_PORT: 5432
      REDIS_HOST: redis
      REDIS_PORT: 6379
      # If using production, comment development volume below
      SUPERSET_ENV: production
      # SUPERSET_ENV: development
    ports:
      - 8088:8088
    depends_on:
      - postgres
      - redis
    volumes:
      # this is needed to communicate with the postgres and redis services
      - ./superset_config.py:/home/superset/superset/superset_config.py
      # this is needed for development, remove with SUPERSET_ENV=production
      - ../../superset:/home/superset/superset

volumes:
  postgres:
    external: false
  redis:
    external: false

Any suggestion how to fix it?
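
For reference, a small sketch of how the UPLOAD_FOLDER expression in the config above resolves once the file is mounted at the path given in the compose file. It assumes the config is loaded from /home/superset/superset/superset_config.py (the mount target shown above) and uses posixpath so the result is the same on any host. The helper name upload_folder_for is hypothetical.

```python
import posixpath


def upload_folder_for(config_file):
    """Mirror the UPLOAD_FOLDER expression from the config for a given file path."""
    return posixpath.abspath(posixpath.dirname(config_file)) + '/app/static/uploads/'


resolved = upload_folder_for('/home/superset/superset/superset_config.py')
print(resolved)
```

The resolved folder sits inside /home/superset/superset, which is also the target of the ../../superset development volume, so its ownership and permissions come from the host checkout rather than from the image. Whether that is what triggers the error here is an assumption.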

@trmanish

I ran this command and it worked
sudo docker exec -u 0 -it superset chown superset /usr/local/lib/python3.5/dist-packages/superset

@robertozerbini Where do you find this folder on an Ubuntu server? I installed Superset on an Ubuntu server, but I don't have a superset folder at this path. Any idea where it could be installed, so that I can run a similar docker exec command?

@ghphb

ghphb commented Jul 13, 2019

In docker-compose.yml I also commented out the line
- ../../superset:/home/superset/superset
and it worked.
