Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: convert column type to str during dual read/write #19701

Closed
wants to merge 2 commits into from

Conversation

hughhhh
Copy link
Member

@hughhhh hughhhh commented Apr 13, 2022

SUMMARY

Users are blocked when trying to save a dataset on a query/table that has an ARRAY column in it. Due the python_type returning a list instead of a str for the type. To fix this i've added a util function to check if the type is a list and convert it to a string like this ARRAY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@hughhhh hughhhh changed the title fix: convert column type in function fix: convert column type to str during dual read/write Apr 13, 2022
@eschutho eschutho requested review from ktmud and betodealmeida April 13, 2022 19:50
@eschutho
Copy link
Member

This is great @hughhhh. Can you write a quick unit test for it?

@codecov
Copy link

codecov bot commented Apr 13, 2022

Codecov Report

Merging #19701 (8269591) into master (c8304a2) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #19701      +/-   ##
==========================================
- Coverage   66.51%   66.51%   -0.01%     
==========================================
  Files        1684     1684              
  Lines       64559    64584      +25     
  Branches     6626     6626              
==========================================
+ Hits        42941    42955      +14     
- Misses      19923    19934      +11     
  Partials     1695     1695              
Flag Coverage Δ
hive 52.67% <16.66%> (-0.02%) ⬇️
javascript 51.15% <ø> (ø)
mysql 81.93% <100.00%> (-0.03%) ⬇️
postgres 81.97% <100.00%> (-0.03%) ⬇️
presto 52.52% <16.66%> (-0.02%) ⬇️
python 82.41% <100.00%> (-0.03%) ⬇️
sqlite 81.75% <100.00%> (-0.03%) ⬇️
unit 47.73% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/connectors/sqla/utils.py 91.34% <100.00%> (+0.52%) ⬆️
superset/sql_lab.py 78.90% <0.00%> (-2.74%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c8304a2...8269591. Read the comment docs.

@ktmud
Copy link
Member

ktmud commented Apr 13, 2022

Should we save the original database type here? The type field specifically used Text as storage so the originally intention is probably to allow storing the full schema of complex types such as ARRAY and STRUCT?

What does the old column model save?

@hughhhh
Copy link
Member Author

hughhhh commented Apr 13, 2022

Should we save the original database type here? The type field specifically used Text as storage so the originally intention is probably to allow storing the full schema of complex types such as ARRAY and STRUCT?

What does the old column model save?

I'm pretty sure it was still ARRAY, but this is only a patch i'm pretty sure there other types we'd have to map to strings later

@hughhhh hughhhh force-pushed the fix-ds-save-array branch from bb732bc to f7c1bb7 Compare April 14, 2022 01:23
if column_type.python_type == list:
return "ARRAY"
if column_type.python_type == dict:
return "JSON"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it also be a STRUCT? I guess it probably depends on the database.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i'm following the pattern from the sqlalchemy.sql.sqltypes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we save the original database type here? The type field specifically used Text as storage so the originally intention is probably to allow storing the full schema of complex types such as ARRAY and STRUCT?

I agree, using the original native type is the ideal solution here.

@villebro
Copy link
Member

@hughhhh @ktmud I've opened up an alternative solution for this issue that reuses existing code that's used when creating physical datasets: #19714

@hughhhh hughhhh closed this Apr 14, 2022
@sadpandajoe
Copy link
Member

🏷️ preset:2022.15

@mistercrunch mistercrunch deleted the fix-ds-save-array branch March 26, 2024 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants