Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Whitespace handling #513

Closed
courentin opened this issue Mar 9, 2022 · 5 comments
Closed

Whitespace handling #513

courentin opened this issue Mar 9, 2022 · 5 comments
Labels
enhancement New feature or request good first issue

Comments

@courentin
Copy link
Contributor

Describe the bug

When using https://github.com/sqlfluff/sqlfluff with this package, the rule L007 breaks.

Steps to reproduce

So when used by dbt_date with a comparator, it get's compiled from:

WHERE answer.day >= {{ dbt_date.n_months_ago(3) }}

to:

WHERE answer.day >= 
    date_trunc('month', 
	...

This is due to some dbt_utils macros don't taking care of whitespaces. For example this one:

{% macro default__date_trunc(datepart, date) %}
date_trunc('{{datepart}}', {{date}})
{% endmacro %}

Expected results

Expected compilation:

WHERE answer.day >= date_trunc('month', 
	...

Are you interested in contributing the fix?

Yes, I was about to make a PR to add whitespace control to all macros. But before making such a cumbersome work, I prefer to ask if it's a good idea.
So: should we make all macros use the whitespace control? Should we make sure all macros respect this convention in the future?

@courentin courentin added bug Something isn't working triage labels Mar 9, 2022
@joellabes
Copy link
Contributor

joellabes commented Mar 10, 2022

Hi @courentin! Conceptually, I'm super on board with improving the way that code is rendered and reducing unneeded whitespace.

If you wanted to open a PR (or multiple PRs chipping away over time!) I'd absolutely welcome that.

With that said: over the last couple of months I've spent more hours than I care to admit trying to get the whitespace just so on the dbt_metrics package, and I've generally found that I can get it about 90% right without having to degrade the readability of the raw macro's code, or being stuck with a newline somewhere I don't want it.

I don't know how the SQLFluff team are thinking about this, but my order of priorities for well-linted/readable code is:

1: Raw dbt-sql:

where answer.day >= {{ dbt_date.n_months_ago(3) }}

is preferable to

where answer.day >= {{dbt_date.n_months_ago(3)}} -- no whitespace around braces

or

where           answer.day >=
{{ dbt_date.n_months_ago(3) }}

2: A macro's definition:

        {%- set ander = joiner(" and ") -%}
        {%- set aggregate_string -%}
            {%- if start_date or end_date -%}
                case when 
                    {%- if start_date %}
                        {{- ander() }} spine.date_day >= cast('{{ start_date }}' as date)
                    {%- endif %}
                    {%- if end_date %}
                        {{- ander() }} spine.date_day <= cast('{{ end_date }}' as date)
                    {% endif -%}
                then source_query.property_to_aggregate end
            {%- else -%}
                source_query.property_to_aggregate
            {%- endif -%}
        {%- endset -%}

is preferable to

{%- set ander = joiner(" and ") -%}
case when {{ ander() }} {% if start_date %} spine.date_day >= cast('{{ start_date }} as date) {{ ander() }} {% if end_date %} spine.date_day <= cast('{{ end_date }} as date) then source_query.property_to_aggregate end

even though it doesn't render perfectly in the final template. (Note: those macros aren't identical and I don't think that specific code would actually belong on one line, it's just an example)

3: The final compiled code

Once both of those are done, then the prettier the final compiled code can be the better.

where answer.day >= date_trunc('month', getdate())

is totally better than

where answer.day >= 
date_trunc('month', getdate())

and it's reasonably easy to do, so go for it!

But, as a harder example, I think it'd be basically impossible to get a complex macro like pivot() to generate beautiful code without making the macro itself harder to read. I'm OK with that tradeoff.

Wrap up

That was a lot of writing to basically say "yes please!"

If you want to open a PR once you've done a couple, I'd be happy to approve and merge these in phases.

Also keep in mind that for macros that have a lot of conditional logic, you'll need to check that all variations still work - you don't want to accidentally get rid of so much whitespace that you wind up with

where {{ dbt_utils.length('some_column') }} < 20 --contrived example that wouldn't actually break

turning into

wherelen(some_column) < 20 

Finally, a heads up that the cross-db macros, as they're moving into dbt-core imminently: dbt-labs/dbt-core#4813. It'd be worth checking on that ticket before doing any significant work around those; you don't want to open a PR and discover that they've already been copied as-is and have to apply the change somewhere else.

@joellabes joellabes added enhancement New feature or request good first issue and removed bug Something isn't working triage labels Mar 10, 2022
@courentin
Copy link
Contributor Author

Thank you very much for these advices and providing the context, it is super useful!
It is definitely a bit more work than I expected. But I think I can start with the "simple" macros!

@SunriseLong
Copy link
Contributor

@joellabes I'd be interested to update some of the macros that are affecting my team regarding linting. I see dbt-labs/dbt-core#4813 has been moved to a v1.2 milestone. In the meantime, is it a good idea for me to open a PR here?

@joellabes joellabes mentioned this issue Mar 25, 2022
15 tasks
@joellabes
Copy link
Contributor

@SunriseLong yes I think you'd be OK to start opening PRs to address this!

@leahwicz can you pop an update on this thread once the Core team is ready to start moving things over so we don't step on each other's toes?

@dbeatty10
Copy link
Contributor

@courentin did #529 resolve the whitespace control issues you were experiencing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue
Projects
None yet
Development

No branches or pull requests

4 participants