fix(sqllab): clean comments within quotes #23908

Merged

Conversation

justinpark

SUMMARY

Fixes a minor regression from #23378.

The regex that strips commented-out blocks also strips text inside quotes that merely looks like a comment.
As a result, the following query threw a syntax error because the quoted value was removed.

SELECT * FROM table WHERE column = '--unknown--';
==> 
SELECT * FROM table WHERE column = '

This commit adds a condition so that quoted blocks are skipped.
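
A minimal sketch of the idea, for illustration only. `NAIVE_COMMENT` is the pre-fix pattern quoted in the review thread of this PR; `QUOTED_OR_COMMENT`, `stripCommentsNaive`, and `stripCommentsQuoteAware` are hypothetical names, and the quote-aware pattern is an assumption about one way to skip quoted blocks, not necessarily the code that was merged.

```typescript
// Pre-fix pattern from the review thread: it has no notion of quotes.
const NAIVE_COMMENT = /(--.*?$|\/\*[\s\S]*?\*\/)\n?/gm;

// Hypothetical quote-aware variant (an assumption, not the merged code).
// Group 1 captures a single- or double-quoted literal so the callback can
// return it unchanged; anything else that matches is a comment.
const QUOTED_OR_COMMENT =
  /('(?:[^'\\]|\\[\s\S]|'')*'|"(?:[^"\\]|\\[\s\S]|"")*")|(?:--.*?$|\/\*[\s\S]*?\*\/)\n?/gm;

function stripCommentsNaive(sql: string): string {
  return sql.replace(NAIVE_COMMENT, '\n');
}

function stripCommentsQuoteAware(sql: string): string {
  return sql.replace(QUOTED_OR_COMMENT, (_match: string, quoted?: string) =>
    // Keep quoted literals as they are; replace comments with a newline.
    quoted !== undefined ? quoted : '\n',
  );
}

const query = "SELECT * FROM table WHERE column = '--unknown--';";

// The naive pattern treats everything from "--" to the end of the line as a comment.
console.log(JSON.stringify(stripCommentsNaive(query)));
// The quote-aware pattern leaves the query untouched.
console.log(JSON.stringify(stripCommentsQuoteAware(query)));
```

Because the quoted-literal alternative is tried first at each position, a `--` or `/*` that appears inside a string is consumed as part of the literal before the comment alternatives get a chance to match.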

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

After:

SELECT * FROM table WHERE column = '--unknown--';
==> 
SELECT * FROM table WHERE column = '--unknown--';

Before:

SELECT * FROM table WHERE column = '--unknown--';
==> 
SELECT * FROM table WHERE column = '

TESTING INSTRUCTIONS

Added tests for edge cases

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov

codecov bot commented May 2, 2023

Codecov Report

Merging #23908 (f9f1118) into master (594d3e0) will increase coverage by 0.00%.
The diff coverage is 79.04%.

❗ Current head f9f1118 differs from pull request most recent head 0d1cc15. Consider uploading reports for the commit 0d1cc15 to get more accurate results

@@           Coverage Diff           @@
##           master   #23908   +/-   ##
=======================================
  Coverage   68.11%   68.12%           
=======================================
  Files        1938     1940    +2     
  Lines       74972    75048   +76     
  Branches     8141     8155   +14     
=======================================
+ Hits        51067    51124   +57     
- Misses      21826    21841   +15     
- Partials     2079     2083    +4     
Flag Coverage Δ
javascript 54.49% <69.56%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...s/legacy-plugin-chart-calendar/src/controlPanel.ts 50.00% <ø> (ø)
...rts/src/BigNumber/BigNumberTotal/transformProps.ts 0.00% <0.00%> (ø)
...lugin-chart-echarts/src/BigNumber/BigNumberViz.tsx 0.00% <0.00%> (ø)
...set-frontend/src/components/Select/AsyncSelect.tsx 88.46% <ø> (ø)
superset-frontend/src/components/Select/Select.tsx 90.41% <ø> (ø)
...nd/src/explore/components/RunQueryButton/index.tsx 100.00% <ø> (ø)
...onalFormattingControl/FormattingPopoverContent.tsx 51.35% <ø> (ø)
superset-frontend/src/preamble.ts 0.00% <0.00%> (ø)
superset-frontend/src/setup/setupFormatters.ts 0.00% <ø> (ø)
superset-frontend/src/types/bootstrapTypes.ts 100.00% <ø> (ø)
... and 19 more

... and 1 file with indirect coverage changes

@justinpark justinpark force-pushed the fix--clean-sql-comments-within-quotes branch from 66791a0 to 944d22b Compare May 2, 2023 22:59
@pull-request-size pull-request-size bot added size/M and removed size/S labels May 2, 2023
@justinpark justinpark force-pushed the fix--clean-sql-comments-within-quotes branch from 70f21e3 to 167d6a2 Compare May 3, 2023 02:46

@michael-s-molina michael-s-molina left a comment

LGTM but it would be nice to get @villebro and @betodealmeida approvals too.


@ktmud ktmud left a comment

Interesting approach

)
.join('')
// Clean out the commented-out blocks
.replace(/(--.*?$|\/\*[\s\S]*?\*\/)\n?/gm, '\n')

Suggested change
- .replace(/(--.*?$|\/\*[\s\S]*?\*\/)\n?/gm, '\n')
+ .replace(/(--.*$|\/\*[\s\S]*\*\/)\n?/gm, '\n')

I don't think these question marks are necessary


@justinpark justinpark May 5, 2023

The ? makes each match lazy, which covers the case with multiple comment groups,

e.g.

/*
comments
*/
no-comments
/*
comments
*/

.replace(/(--.*?$|\/\*[\s\S]*?\*\/)\n?/gm, '\n') => no-comments
.replace(/(--.*$|\/\*[\s\S]*\*\/)\n?/gm, '\n') => '\n'
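
The difference can be checked with a short script. This is only a sketch that runs the two patterns from the suggestion against the input above; the constant names `LAZY` and `GREEDY` are made up for the example.

```typescript
// Two comment blocks with one line of real SQL text between them.
const sql = ['/*', 'comments', '*/', 'no-comments', '/*', 'comments', '*/'].join('\n');

// Lazy quantifiers: each block comment stops at the first closing "*/".
const LAZY = /(--.*?$|\/\*[\s\S]*?\*\/)\n?/gm;
// Greedy quantifiers: the block comment runs to the last closing "*/".
const GREEDY = /(--.*$|\/\*[\s\S]*\*\/)\n?/gm;

const lazyResult = sql.replace(LAZY, '\n');
const greedyResult = sql.replace(GREEDY, '\n');

console.log(JSON.stringify(lazyResult)); // "\nno-comments\n\n"
console.log(JSON.stringify(greedyResult)); // "\n"
```

With the greedy form, a single match spans from the first `/*` to the last `*/`, so the uncommented line in between is dropped along with both comments.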

@villebro

villebro commented May 3, 2023

@justinpark @michael-s-molina @ktmud we have a fair deal of logic for this in the backend - is there some reason we want to parse for this kind of stuff in the frontend? It feels like we should centralize this in one place.

@michael-s-molina

> @justinpark @michael-s-molina @ktmud we have a fair deal of logic for this in the backend - is there some reason we want to parse for this kind of stuff in the frontend? It feels like we should centralize this in one place.

I didn't know about that @villebro. Could you point to the code? I agree that if there's no reason for doing it in the frontend, we should centralize it in the backend.

@ktmud

ktmud commented May 3, 2023

I agree this logic probably belongs in the backend; it's also more useful to record the original SQL users wrote.

Not sure how comments are removed in the backend but my guess is it happens after we render the Jinja template (which is why we saw the error in #23378 in the first place).

Maybe we can change it to remove comments before rendering, but how would you know the comments are safe to remove without rendering the template? Also, if we are removing comments using a SQL parser, the parser is likely to fail before you render the template. Not sure what's the best way forward, but since we are already removing comments in the frontend, it's probably still worth merging this PR first.
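
One way to picture the ordering concern is the toy sketch below. Everything in it is an assumption made for illustration: `renderTemplate` is a tiny stand-in that throws on unknown names, not Jinja; `stripLineComments` is deliberately naive; and none of it is taken from the Superset code base or from #23378.

```typescript
// Toy template renderer: replaces {{ name }} with a value and throws on unknown names.
function renderTemplate(sql: string, params: Record<string, string>): string {
  return sql.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    if (!(name in params)) {
      throw new Error(`undefined template variable: ${name}`);
    }
    return params[name];
  });
}

// Deliberately naive: drops everything from "--" to the end of each line.
function stripLineComments(sql: string): string {
  return sql.replace(/--.*?$/gm, '');
}

// A query where the only template expression sits in a commented-out line.
const editorSql = "SELECT * FROM t\n-- WHERE ds = '{{ ds }}'";

// Render first: the renderer still sees the expression inside the comment.
let renderFirstError = '';
try {
  renderTemplate(editorSql, {});
} catch (err) {
  renderFirstError = String(err);
}

// Strip first: the expression is gone before rendering, so nothing fails.
const stripFirstResult = renderTemplate(stripLineComments(editorSql), {});

console.log(renderFirstError);
console.log(JSON.stringify(stripFirstResult));
```

The sketch only shows that the two orderings can behave differently; it does not settle which side, frontend or backend, should own the stripping.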

@justinpark

> Maybe we can change it to remove comments before rendering, but how would you know the comments are safe to remove without rendering the template? Also, if we are removing comments using a SQL parser, the parser is likely to fail before you render the template. Not sure what's the best way forward, but since we are already removing comments in the frontend, it's probably still worth merging this PR first.

I agree with ktmud's idea


@villebro villebro left a comment

@ktmud @justinpark I agree, this is an improvement to the status quo; we can follow up later by moving this into the backend. @michael-s-molina you can check superset/sql_parse.py which we use for parsing queries in the backend, and I seem to recall having added tests that ensure '--comment' aren't stripped out, which is very similar to what's being checked here.

@justinpark

> I agree, this is an improvement to the status quo; we can follow up later by moving this into the backend.

Sounds good. Yeah, the backend is definitely a better place. Until then, we can use this solution.

@justinpark justinpark merged commit 841726d into apache:master May 5, 2023
justinpark added a commit to airbnb/superset-fork that referenced this pull request May 5, 2023
justinpark added a commit to justinpark/superset that referenced this pull request May 10, 2023
justinpark added a commit to airbnb/superset-fork that referenced this pull request May 10, 2023
john-bodley pushed a commit to airbnb/superset-fork that referenced this pull request May 10, 2023
john-bodley added a commit to airbnb/superset-fork that referenced this pull request May 10, 2023
@mistercrunch mistercrunch added the 🏷️ bot A label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels label Mar 13, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels size/M 🚢 3.0.0

5 participants