adding parametrized queries #93
Comments
I like this. |
you mean something like this?
limit = 10
corpus_name = "something"
|
Exactly.
from string import Template
template = Template("""
SELECT *
FROM my_data
LIMIT $limit
""")
limit_one = template.substitute(limit=1)
limit_two = template.substitute(limit=2)
%sql $limit_one |
cool, I'd say there should be two modes: one where the user explicitly says "go grab my global scope" and another where they pass parameters explicitly.
Here, we grab the existing limit and corpus_name from the existing scope (shorter version:
%%sql --use-globals
SELECT word, SUM(word_count) as count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = {{corpus_name}}
GROUP BY word
ORDER BY count DESC
LIMIT {{limit}}
here, we pass values explicitly:
%%sql -p limit 1 -p corpus_name stuff
SELECT word, SUM(word_count) as count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = {{corpus_name}}
GROUP BY word
ORDER BY count DESC
LIMIT {{limit}}
Mixing them will use the existing scope and override any passed values:
%%sql -p limit 1 --use-globals
SELECT word, SUM(word_count) as count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = {{corpus_name}}
GROUP BY word
ORDER BY count DESC
LIMIT {{limit}}
thoughts? |
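To make the two modes concrete, here is a minimal sketch of how values could be resolved and substituted. It is not the project's implementation: `resolve_params` and `render` are hypothetical names, and the regex is only a stdlib stand-in for jinja2's `{{name}}` syntax (no filters, no logic). The precedence rule is an assumption too, since the sentence about mixing can be read either way: here, explicitly passed values win over same-named globals.

```python
import re

# Stand-in for jinja2: replaces {{ name }} with str(value), nothing more.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_]\w*)\s*\}\}")


def resolve_params(explicit, user_ns, use_globals):
    """Merge the two proposed sources of values (hypothetical helper).

    Assumption: explicitly passed values override same-named globals.
    """
    params = dict(user_ns) if use_globals else {}
    params.update(explicit)
    return params


def render(query, params):
    """Substitute every {{name}} in the query, failing on unknown names."""

    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"undefined parameter: {name}")
        return str(params[name])

    return PLACEHOLDER.sub(replace, query)


# Simulated notebook namespace; the string value carries its own SQL quotes.
user_ns = {"limit": 10, "corpus_name": "'hamlet'"}
query = "SELECT word FROM t WHERE corpus = {{corpus_name}} LIMIT {{limit}}"

globals_only = render(query, resolve_params({}, user_ns, use_globals=True))
mixed = render(query, resolve_params({"limit": 1}, user_ns, use_globals=True))
```

Values are pasted in as plain text, so quoting string literals (and guarding against injection) stays the caller's responsibility in this sketch.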
We can do this mix, I don't see why we need |
following the zen of python 😁: Explicit is better than implicit. |
lol sure. |
and, we could even combine it with the
%%sql --save some_name --no-execute
SELECT word, SUM(word_count) as count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = {{corpus_name}}
GROUP BY word
ORDER BY count DESC
LIMIT {{limit}}
Then:
|
That could be one example for the docs, |
agree
unless you use |
For the
Any other better naming suggestion? |
let's use capital p: -P |
Let's think about if we want to keep this |
I looked at the existing behavior: it needs to be more consistent (see here) and support some new features. So let's go ahead and implement it with jinja2.Template. I think the current implementation is happening here: Line 393 in 27823e6
so we need to:
what's interesting about the existing implementation is that the original author decided to pass the user's namespace implicitly, so I think let's do it that way; once that's in, we can decide if we add the example of the new api:
parameter = "value"
%%sql
select * from table where parameter = '{{parameter}}'
should resolve to:
select * from table where parameter = 'value'
deprecation
I think we should keep the existing behavior for a bit to allow users to migrate. I think we can detect queries with the existing behavior with some regex that looks for :something, $something, or {something}, and if so, show a warning so they use {{something}} (and show our slack link so they can ask for help) |
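The detection idea could look roughly like the sketch below. The pattern and the function names are hypothetical, not taken from the codebase, and the pattern is deliberately simple: it flags `:name`, `$name`, and `{name}` while skipping `{{name}}` and PostgreSQL-style `::type` casts, but it does not try to ignore placeholders inside string literals or comments.

```python
import re
import warnings

# Hypothetical pattern for the legacy variable syntaxes.
LEGACY_PATTERN = re.compile(
    r"""
    (?<![:\w]):[A-Za-z_]\w*            # :name, but not ::cast or a:b
    | \$[A-Za-z_]\w*                   # $name
    | (?<!\{)\{[A-Za-z_]\w*\}(?!\})    # {name}, excluding {{name}}
    """,
    re.VERBOSE,
)


def find_legacy_placeholders(query):
    """Return every legacy-style placeholder found in the query text."""
    return LEGACY_PATTERN.findall(query)


def warn_on_legacy_syntax(query):
    """Emit a warning if the query still uses the old variable syntax."""
    found = find_legacy_placeholders(query)
    if found:
        warnings.warn(
            f"Variable syntax {found} is deprecated; use {{{{name}}}} instead",
            FutureWarning,
        )
    return found
```

A real warning message would presumably also carry the Slack link mentioned above; it is left out here because the link itself is not part of this thread.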
I think the original variable parsing happens earlier, before our execution function. Given this example:
In the execution function, where our main sql execute handler
After some research, I think IPython has already parsed the line string with
Deprecation
We probably need to find a way to control that parsing step (either by showing a warning message, or by disabling the
Add double curly braces parsing support
In the current codebase, if we execute something like:
It will become
We can still parse the
Need more research on this... |
Update 2/22/23: Design Proposal
There is a decorator called @no_var_expand which can be attached to
It's probably better to turn off the default variable parsing, since we want full control over the original cmd. A case like:
would be odd & hard to handle after
After the variable parsing is turned off on our side, we can write our own custom parsers to handle the variables ourselves. There will be two cases:
Custom parser A: only handle the
Finally, I think when the user provides something like
@edublancas Kinda complicated... but lmk what you think |
For both 1 & 2, we can add the feature, support it for the required minimum version and deprecate it in the next major, as per our deprecation policy. I think if there are no major design issues/concerns we should move to talk over an actual PR. |
Alright, so I think let's break this into a few steps:
questions @tonykploomber ? |
@edublancas |
many people are already familiar with parametrized queries since they're available in dbt and airflow, so we should add that. possibly with jinja?
This is how BigQuery does it (but with a different syntax).
I'm unsure if
--params {"corpus_name": "hamlet", "limit": 10}
is the best choice. perhaps something simpler like
-p corpus_name hamlet -p limit 10
? similar to what we do in ploomber-engine
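For comparison, both argument styles can be parsed with the standard library. This is only a sketch under assumptions: `build_parser` and `parse_params` are hypothetical names, the `%%sql` line is split shell-style with `shlex`, and the JSON form therefore has to be wrapped in single quotes, which is one practical argument for the simpler pair form.

```python
import argparse
import json
import shlex


def build_parser():
    """Parser accepting both proposed styles (illustrative only)."""
    parser = argparse.ArgumentParser(prog="%%sql", add_help=False)
    # JSON form: --params '{"corpus_name": "hamlet", "limit": 10}'
    parser.add_argument("--params", type=json.loads, default={})
    # Pair form: -p corpus_name hamlet -p limit 10
    parser.add_argument(
        "-p",
        dest="pairs",
        nargs=2,
        action="append",
        default=[],
        metavar=("NAME", "VALUE"),
    )
    return parser


def parse_params(line):
    """Turn the magic's argument line into a dict of parameters."""
    args = build_parser().parse_args(shlex.split(line))
    params = dict(args.params)
    # Pair values always arrive as strings; JSON values keep their types.
    params.update({name: value for name, value in args.pairs})
    return params
```

Note the trade-off: the pair form is easier to type but yields `"10"` rather than `10`, so any type conversion would have to happen later.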