Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Severe regression: Slices with calculated time(?) columns stopped working, due to quoting of the column expression #3844

Closed
3 tasks done
rumbin opened this issue Nov 12, 2017 · 9 comments
Labels
inactive Inactive for >= 30 days

Comments

@rumbin
Copy link
Contributor

rumbin commented Nov 12, 2017

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.20.5, regression from 0.20.4

Expected results

All slices can be loaded.

Actual results

Slices that contain calculated columns (defined via the Expression field of Add Column) fail to load.
I observed this issue for a time column (Is temporal = [x]) and by have not checked if other calculated/virtual columns are affected, too. I can do that, if it would be helpful for debugging.

Steps to reproduce

I hope you don't mind the German column names. If they distury you, I can translate them…

The calculated column auftrag_ende_projiziert in my case is defined as

auftrag_ende + (date_trunc('year',current_date) - date_trunc('year', auftrag_ende))

In superset 0.20.4 the resulting query looks somewhat like this

SELECT to_char(auftrag_ende::date, 'IYYY') AS auftrag_ende_jahr,
       DATE_TRUNC('week', auftrag_ende + (date_trunc('year',current_date) - date_trunc('year', auftrag_ende))) AS __timestamp,
       sum(anzahl) AS sum__anzahl
FROM …

In superset 0.20.5 the expression of the calculated column now is quoted, which obviously causes the query to fail:

SELECT to_char(auftrag_ende::date, 'IYYY') AS auftrag_ende_jahr,
       DATE_TRUNC('week', "auftrag_ende + (date_trunc('year',current_date) - date_trunc('year', auftrag_ende))") AS __timestamp,
       sum(anzahl) AS sum__anzahl
FROM …

Unfortunately I don't know which PR introduced the quoting, as I don't know which PRs to take into account, as I suffer from the same ignorance as #3806

@rumbin rumbin changed the title Severe regression: Slices with calculated (time?) columns stopped working, due to quoting of the column expression Severe regression: Slices with calculated time(?) columns stopped working, due to quoting of the column expression Nov 12, 2017
@xrmx
Copy link
Contributor

xrmx commented Nov 12, 2017

@rumbin All you need is git log and optionally git blame. Even the github ui will do as the changelog is not curated but generated from git commits.

@xrmx
Copy link
Contributor

xrmx commented Nov 12, 2017

Assuming you are using postgres this should be it
814b70f

@rumbin
Copy link
Contributor Author

rumbin commented Nov 12, 2017

Thanks @xrmx for pointing me to this PR.
Bugfix #3736 definitely is the culprit of this issue.

I'm not sure how to handle this situation, as simply reverting #3736 would of course bring back #3208.
I guess, the cleanest approach would be to distinguish if the time column provided is an existing table column, or if there is an Expression behind it. But where would this be done best?

Ideas would be welcome; @ryanthegiantlion, what do you think about this?

@rumbin
Copy link
Contributor Author

rumbin commented Nov 12, 2017

Providing a MWE:

Take an arbitrary table and add a calculated column with, e.g., the following Expression

now()::timestamp

and make it a temporal one.

Then try to create a time series line chart with time grain of, e.g. week, using this colum as time column.

@ryanthegiantlion
Copy link
Contributor

@rumbin It seems a problem that at query time we cannot tell if a column is indeed a database column or a database expression. Unfortunately I am not familiar with sql alchemy to know how to handle this offhand. There is a meta data prop on the baseclass: https://github.com/apache/incubator-superset/blob/500455fc72dc58f9c0bd32d2b3b4da9a915bfb8c/superset/db_engine_specs.py#L72, so perhaps that can be used.

I will have a go at creating a PR that can handle both cases this week. If a fix cannot be provided it probably makes sense to revert the change. There is a workaround for #3208 - just create a view with properties cased correctly. It is a tedious workaround though, especially if every timestamp in the DB is cased 'createdAt', so it would be nice to keep the fix . . .

@mistercrunch
Copy link
Member

Let me know if you think I should revert + release quickly

@rumbin
Copy link
Contributor Author

rumbin commented Nov 13, 2017

I'd welcome if you reverted #3736, as ot breaks existing slices.

Users who suffer from #3208 may — until a permanent solution has been found — be advised to work around the issue by introducing a calculated temporal column with lowercased name, that simply contains the originally capitalized column name in (")-quotes as Expression.

@ryanthegiantlion
Copy link
Contributor

ryanthegiantlion commented Nov 13, 2017

Good point @rumbin. That sounds like a better workaround than doing it through a view.

@stale
Copy link

stale bot commented Apr 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 11, 2019
@stale stale bot closed this as completed Apr 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Inactive for >= 30 days
Projects
None yet
Development

No branches or pull requests

4 participants