Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bugfix] Fix case sensitivity of date column for postgres #5962

Conversation

victornoel
Copy link
Contributor

Closes #5886

After discussing with @villebro in #5886, we arrived to the conclusion that we shouldn't fix this bug globally by using sqlalchemy to generate the query but only at the level of postgresql.

The rationale is to avoid impacting other db since it seems this may only be touching postgres.

@codecov-io
Copy link

codecov-io commented Sep 24, 2018

Codecov Report

Merging #5962 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #5962   +/-   ##
=======================================
  Coverage   63.56%   63.56%           
=======================================
  Files         393      393           
  Lines       23667    23667           
  Branches     2638     2638           
=======================================
  Hits        15044    15044           
  Misses       8610     8610           
  Partials       13       13
Impacted Files Coverage Δ
superset/db_engine_specs.py 55.7% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6b89b7f...b988748. Read the comment docs.

@victornoel
Copy link
Contributor Author

@mistercrunch could it be possible to get a review and/or maybe get this merged? :)

@rumbin
Copy link
Contributor

rumbin commented Sep 28, 2018

I am concerned that this PR will again revive #3844. From the issues and PRs within this very issue you can see that this topic is very controversial.

That time we came to these conclusions:

  • Quoting the time column name breaks calculated time columns.
  • A workaround for Case-sensitive time columns may be to rename them by creating a calculated column out of them.

@victornoel
Copy link
Contributor Author

@rumbin good point, I didn't know the col in this expression could have been more complex things.

And indeed a workaround is to exploit that situation to make a calculated column with a proper name.

Isn't the actual real solution to the problem to stop trying to manipulate SQL by hand and use the proper tools so that this situation does not arise? According to @villebro, using sqlalchemy to generate those queries would be "not too readable" (cf #5886 (comment)) but at least it would be correct in all cases, right?

I will close this for now, but I would really appreciate if you, @rumbin, could chip in the discussion in #5886 so that we find a solution to this situation?

@victornoel victornoel closed this Sep 29, 2018
@victornoel victornoel deleted the 5886-postgres-date-column-case-sensitivity branch September 29, 2018 06:36
@villebro
Copy link
Member

Same here: as we haven't used calculated time columns, this problem didn't come to mind. Thanks for chiming in @rumbin . @victornoel: Luckily time grains can be modified locally by adding the following in your superset_config.py file:

TIME_GRAIN_ADDON_FUNCTIONS = {
    'postgres': {
        None: '"{col}"',
        'PT1S': "DATE_TRUNC('second', \"{col}\") AT TIME ZONE 'UTC'",
        'PT1M': "DATE_TRUNC('minute', \"{col}\") AT TIME ZONE 'UTC'",
        'PT1H': "DATE_TRUNC('hour', \"{col}\") AT TIME ZONE 'UTC'",
        'P1D': "DATE_TRUNC('day', \"{col}\") AT TIME ZONE 'UTC'",
        'P1W': "DATE_TRUNC('week', \"{col}\") AT TIME ZONE 'UTC'",
        'P1M': "DATE_TRUNC('month', \"{col}\") AT TIME ZONE 'UTC'",
        'P0.25Y': "DATE_TRUNC('quarter', \"{col}\") AT TIME ZONE 'UTC'",
        'P1Y': "DATE_TRUNC('year', \"{col}\") AT TIME ZONE 'UTC'",
    }
}

This will replace the built-in time grains with the ones defined above. Perhaps this is the best solution for now.

@victornoel
Copy link
Contributor Author

@villebro thanks for this supplementary workaround.
That being said, I would love to help fixing this more seriously for future users and I can contribute if needed: let's continue this discussion in the original issue #5886.

@prettymuchbryce
Copy link

In case anyone else comes across the workaround posted by @villebro keep in mind that the database name recognized by SQLAlchemy seems to be postgresql not postgres. I tried both and only postgresql worked for me.

e.g.

TIME_GRAIN_ADDON_FUNCTIONS = {
    'postgresql': {
        None: '"{col}"',
        'PT1S': "DATE_TRUNC('second', \"{col}\") AT TIME ZONE 'UTC'",
        'PT1M': "DATE_TRUNC('minute', \"{col}\") AT TIME ZONE 'UTC'",
        'PT1H': "DATE_TRUNC('hour', \"{col}\") AT TIME ZONE 'UTC'",
        'P1D': "DATE_TRUNC('day', \"{col}\") AT TIME ZONE 'UTC'",
        'P1W': "DATE_TRUNC('week', \"{col}\") AT TIME ZONE 'UTC'",
        'P1M': "DATE_TRUNC('month', \"{col}\") AT TIME ZONE 'UTC'",
        'P0.25Y': "DATE_TRUNC('quarter', \"{col}\") AT TIME ZONE 'UTC'",
        'P1Y': "DATE_TRUNC('year', \"{col}\") AT TIME ZONE 'UTC'",
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Superset ignore case-sensitivity of time column names with Postgresql
5 participants