normalization: bigquery partition pruning optimization #14485
Thank you @brunofaustino for opening this contribution! @edgao may I ask for a review? It looks like an important optimization considering the discussion on #14070.
/test connector=bases/base-normalization
Build Passed
nice catch!
@brunofaustino it looks like there are some conflicts that need to be resolved before this can be merged. Do you have some time to take a look at them?
@natalyjazzviolin Done! :)
/publish connector=bases/base-normalization
If you have connectors that successfully published but failed definition generation, follow step 4 here
@natalyjazzviolin FYI, I'm running /publish, since the version numbers are somewhat conflict-prone (the only merge conflicts were in the version number + changelog). I will merge once it finishes.
@edgao do we need to bump NORMALIZATION_VERSION in NormalizationRunnerFactory as part of this ticket?
* bigquery partition pruning optimization
* bump version and add changelog
@sashaNeshcheret I think the version bump already happened? https://github.com/airbytehq/airbyte/pull/14485/files#diff-968c6c1b743a3ad7d51f13669b2f2fae9b86d32d7698c5bc0eddf7613476be03 Unless there's something else you're referring to.
What
Change the where clause of the incremental filter to use a constant expression instead of a dynamic subselect query. With a dynamic subselect in the filter, BigQuery cannot prune partitions and scans all of them; with a constant expression, it scans only the partitions that match.
This is a documented BigQuery limitation: https://cloud.google.com/bigquery/docs/querying-partitioned-tables#best_practices_for_partition_pruning
This resolves issue #14070.
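The difference between the two filter shapes can be sketched in a few lines of Python. This is only an illustration, not the code in this PR: the function names, the `_airbyte_emitted_at` column, and the `my_dataset.my_table` table are assumptions chosen for the example. The first function builds the subselect form, whose value is only known while the query runs. The second inlines a value that was fetched beforehand, so the filter is a literal by the time BigQuery plans the query.

```python
from datetime import datetime, timezone
from typing import Optional


def dynamic_clause(column: str, table: str) -> str:
    """Subselect form: the lower bound is computed while the query runs."""
    return f"{column} >= (select max({column}) from {table})"


def constant_clause(column: str, max_cursor: Optional[datetime]) -> str:
    """Constant form: the lower bound is inlined as a timestamp literal.

    `max_cursor` is assumed to be a timezone-aware datetime that a
    separate, earlier query already fetched.
    """
    if max_cursor is None:
        # Assumed fallback when nothing has been normalized yet: keep every row.
        return "1=1"
    literal = max_cursor.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f")
    return f"{column} >= timestamp('{literal}+00')"


if __name__ == "__main__":
    # Hypothetical column, table, and cursor value, for illustration only.
    cursor = datetime(2022, 7, 7, 12, 0, 0, tzinfo=timezone.utc)
    print(dynamic_clause("_airbyte_emitted_at", "my_dataset.my_table"))
    print(constant_clause("_airbyte_emitted_at", cursor))
```

The trade-off is one extra small query to fetch the maximum value up front, in exchange for a filter that BigQuery can evaluate at planning time.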
How
Add a dbt macro called bigquery__incremental_clause. The existing incremental_clause macro dispatches to it dynamically when the target is BigQuery.
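As a rough mental model of that dispatch, the lookup prefers an adapter-prefixed implementation and falls back to a default one. The Python sketch below imitates this with a plain dictionary. It is not dbt's implementation: the `MACROS` registry, the `dispatch` helper, the `default__incremental_clause` name, and the placeholder literal are all assumptions made for illustration.

```python
from typing import Callable, Dict


def default__incremental_clause(column: str, table: str) -> str:
    # Assumed default: the subselect form used for other destinations.
    return f"{column} >= (select max({column}) from {table})"


def bigquery__incremental_clause(column: str, table: str) -> str:
    # Placeholder literal; a real macro would inline the fetched maximum here.
    return f"{column} >= timestamp('<max {column} fetched from {table}>')"


# Hypothetical registry standing in for the macros visible to the project.
MACROS: Dict[str, Callable[[str, str], str]] = {
    "default__incremental_clause": default__incremental_clause,
    "bigquery__incremental_clause": bigquery__incremental_clause,
}


def dispatch(name: str, adapter: str) -> Callable[[str, str], str]:
    """Return the adapter-specific macro if one exists, else the default."""
    return MACROS.get(f"{adapter}__{name}", MACROS[f"default__{name}"])


if __name__ == "__main__":
    for adapter in ("bigquery", "postgres"):
        macro = dispatch("incremental_clause", adapter)
        print(adapter, "->", macro.__name__)
```

Because only the BigQuery-prefixed name is added, other destinations keep resolving to whatever implementation they used before.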
🚨 User Impact 🚨
BigQuery will be able to limit the partitions that are scanned in a query, optimizing cost and performance.