Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use tmp table in static insert overwrite #630

Merged
merged 4 commits into from
Jul 14, 2023

Conversation

github-christophe-oudar
Copy link
Contributor

@github-christophe-oudar github-christophe-oudar commented Mar 25, 2023

Resolves #427
Resolves #556

Description

As described in #427, static insert overwrite, when involving on_schema_change != 'ignore' would create a temp table to check for a schema change anyway.
It fixes as well:

  • In case, on_schema_change = 'ignore' with dynamic insert overwrite where the SQL header would not be added
  • In case, we would have static insert overwrite with on_schema_change != 'ignore' where the temp table would not be deleted

Checklist

@github-christophe-oudar github-christophe-oudar requested a review from a team as a code owner March 25, 2023 19:54
@cla-bot cla-bot bot added the cla:yes label Mar 25, 2023
@Kayrnt Kayrnt force-pushed the tmp_table_insert_overwrite branch from 59cf7e6 to 9a9eada Compare March 25, 2023 19:54
@Fleid Fleid added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Apr 4, 2023
@Fleid
Copy link
Contributor

Fleid commented Apr 4, 2023

Lining this up for 1.6.0, sadly I don't think we'll have time to review it before 1.5.0 ships.

@@ -1,5 +1,5 @@
{% macro bq_generate_incremental_insert_overwrite_build_sql(
tmp_relation, target_relation, sql, unique_key, partition_by, partitions, dest_columns, on_schema_change, copy_partitions
tmp_relation, target_relation, sql, unique_key, partition_by, partitions, dest_columns, tmp_relation_exists, copy_partitions
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-christophe-oudar
Copy link
Contributor Author

I think it should work as intended, should we move forward on that one for 1.6 release?

@Fleid
Copy link
Contributor

Fleid commented Jun 5, 2023

I'm raising priority on our side, but there is still a ton of other tickets to review higher on the list :/
I'm sorry @github-christophe-oudar

@mikealfare
Copy link
Contributor

@Kayrnt We just merged in a bunch of smaller dependabot type changes. Now that that's all done, do you mind updating your branch?

@Kayrnt Kayrnt force-pushed the tmp_table_insert_overwrite branch from f8993ae to aa0c030 Compare June 24, 2023 20:18
@Kayrnt
Copy link
Contributor

Kayrnt commented Jun 24, 2023

I just rebased, let's see if it passes the tests

@colin-rogers-dbt
Copy link
Contributor

Local testing successful and it looks like we're now using the temp table as expected.

@colin-rogers-dbt
Copy link
Contributor

colin-rogers-dbt commented Jul 13, 2023

Last request: can we add a test fixture that covers this to test_incremental_strategies.py

Suggested fixture:

{% set partitions_to_replace = [
  "date_sub(current_date, interval 1 day)",
  "date_sub(current_date, interval 2 day)"
] %}

{{
    config(
        materialized="incremental",
        incremental_strategy="insert_overwrite",
        cluster_by="id",
        partition_by={
            "field": "date_time",
            "data_type": "datetime",
            "granularity": "day"
        },
        partitions=partitions_to_replace,
        on_schema_change="sync_all_columns"
    )
}}


with data as (

    {% if not is_incremental() %}

        select 1 as id, cast('2020-01-01' as datetime) as date_time union all
        select 2 as id, cast('2020-01-01' as datetime) as date_time union all
        select 3 as id, cast('2020-01-01' as datetime) as date_time union all
        select 4 as id, cast('2020-01-01' as datetime) as date_time

    {% else %}

        -- we want to overwrite the 4 records in the 2020-01-01 partition
        -- with the 2 records below, but add two more in the 2020-01-02 partition
        select 10 as id, cast('2020-01-01' as datetime) as date_time union all
        select 20 as id, cast('2020-01-01' as datetime) as date_time union all
        select 30 as id, cast('2020-01-02' as datetime) as date_time union all
        select 40 as id, cast('2020-01-02' as datetime) as date_time

    {% endif %}

)

select * from data

@Kayrnt
Copy link
Contributor

Kayrnt commented Jul 13, 2023

@colin-rogers-dbt I added the test case, we should be fine 🚀

@dbeatty10
Copy link
Contributor

@github-christophe-oudar Could you bring it up-to-date with the base branch?

image

We might need to keep asking you to do an update like that until this PR is merged 😅

Alternatively, if you allow edits from maintainers (recommended!), then we can handle that type of thing as-needed.

@Kayrnt Kayrnt force-pushed the tmp_table_insert_overwrite branch from 025a08f to 7d8df06 Compare July 13, 2023 16:31
@Kayrnt
Copy link
Contributor

Kayrnt commented Jul 13, 2023

You're merging too many things 😅
I tried to find the option but couldn't... I wonder if it's because I created the PR from another account than the one owning the fork

@colin-rogers-dbt
Copy link
Contributor

if we can do one more update I can merge this

@cla-bot
Copy link

cla-bot bot commented Jul 13, 2023

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @github-christophe-oudar

@cla-bot cla-bot bot removed the cla:yes label Jul 13, 2023
@github-christophe-oudar
Copy link
Contributor Author

Done 👍

@McKnight-42
Copy link
Contributor

@cla-bot check

@cla-bot
Copy link

cla-bot bot commented Jul 13, 2023

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @github-christophe-oudar

@cla-bot cla-bot bot removed the cla:yes label Jul 13, 2023
@cla-bot
Copy link

cla-bot bot commented Jul 13, 2023

The cla-bot has been summoned, and re-checked this pull request!

@Kayrnt Kayrnt force-pushed the tmp_table_insert_overwrite branch from 6f3370b to 124023f Compare July 13, 2023 21:57
@cla-bot cla-bot bot added the cla:yes label Jul 13, 2023
@Kayrnt
Copy link
Contributor

Kayrnt commented Jul 13, 2023

@McKnight-42 @colin-rogers-dbt I rebased with my "cla:yes" account (my personal one) so that it should be OK.
I initially rebased earlier today because with my other account as I was logged in on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ok to test ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering test all
Projects
None yet
8 participants