BigQuery Denormalized : Cover arrays only if they are nested #14023

Merged
DoNotPanicUA merged 38 commits into master from aleonets/11109-bigq-denorm-no-array-sur
Sep 8, 2022

@DoNotPanicUA DoNotPanicUA commented Jun 22, 2022

What

  • 11109 Stop wrapping every array in an object. Wrap only arrays of arrays.
  • 14058 Fix processing of arrays of DateTime values.
  • 13634 Enable new DAT tests.
  • 5912 Reopened.
  • 14668 Cover the BigQuery inheritance limitation with tests.
  • 15363 Workaround: add backward compatibility for existing connections.
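
To illustrate the first item, here is a minimal sketch of wrapping only nested arrays. BigQuery does not allow an array whose elements are themselves arrays, so an inner array has to be placed inside a record, while a plain top-level array can stay as it is. The class name `NestedArrayWrapper`, the method `wrapNestedArrays`, and the wrapper field name `value` are hypothetical and chosen for illustration; this is not the connector's actual formatter code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedArrayWrapper {

  // Hypothetical name of the record field that holds an inner array.
  public static final String WRAPPER_FIELD = "value";

  /**
   * Recursively walks maps and lists. A list is left as a list; only a list
   * that sits directly inside another list is wrapped in a single-field record.
   */
  public static Object wrapNestedArrays(Object node) {
    if (node instanceof Map) {
      Map<String, Object> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
        out.put(String.valueOf(entry.getKey()), wrapNestedArrays(entry.getValue()));
      }
      return out;
    }
    if (node instanceof List) {
      List<Object> out = new ArrayList<>();
      for (Object item : (List<?>) node) {
        Object converted = wrapNestedArrays(item);
        if (item instanceof List) {
          // Array directly inside an array: wrap it in a record.
          Map<String, Object> wrapper = new LinkedHashMap<>();
          wrapper.put(WRAPPER_FIELD, converted);
          out.add(wrapper);
        } else {
          out.add(converted);
        }
      }
      return out;
    }
    // Scalars (strings, numbers, booleans, null) pass through unchanged.
    return node;
  }

  public static void main(String[] args) {
    // A flat array is not wrapped.
    System.out.println(wrapNestedArrays(Map.of("tags", List.of("a", "b"))));
    // prints {tags=[a, b]}

    // Each inner array is wrapped in a record.
    System.out.println(wrapNestedArrays(Map.of("matrix", List.of(List.of(1, 2), List.of(3)))));
    // prints {matrix=[{value=[1, 2]}, {value=[3]}]}
  }
}
```

The sketch works on plain `Map` and `List` values so that it runs without a JSON library; a real formatter would apply the same rule to both the schema and the record data so that the two stay consistent.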

Recommended reading order

  1. DefaultBigQueryDenormalizedRecordFormatter.java
  2. BigQueryUtils.java
  3. BigQueryDenormalizedDestinationTest.java
  4. BigQueryDenormalizedDestinationAcceptanceTest.java

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Jun 24, 2022
@DoNotPanicUA DoNotPanicUA marked this pull request as ready for review June 24, 2022 08:17

github-actions bot commented Sep 8, 2022

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to run corresponding integration tests:

  • destination-bigquery
  • destination-bigquery-denormalized


DoNotPanicUA commented Sep 8, 2022

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3017128603
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3017128603
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     166      8    95%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_config/transform.py                         191     49    74%
normalization/transform_catalog/utils.py                             51     14    73%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 589    394    33%
-------------------------------------------------------------------------------------
TOTAL                                                              1437    624    57%

Build Passed

Test summary info:

All Passed

DoNotPanicUA commented Sep 8, 2022

/test connector=connectors/destination-bigquery-denormalized

🕑 connectors/destination-bigquery-denormalized https://github.com/airbytehq/airbyte/actions/runs/3017129268
✅ connectors/destination-bigquery-denormalized https://github.com/airbytehq/airbyte/actions/runs/3017129268
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     166      8    95%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_config/transform.py                         191     49    74%
normalization/transform_catalog/utils.py                             51     14    73%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 589    394    33%
-------------------------------------------------------------------------------------
TOTAL                                                              1437    624    57%

Build Passed

Test summary info:

All Passed

}

/**
* Compare field modes. Field can have on of three modes: NULLABLE, REQUIRED, REPEATED, null. Only

@grishick grishick Sep 8, 2022

typo: "one of three modes"

also, "NULLABLE", "REQUIRED", and "REPEATED" are the modes, but then there is also "null" at the end of the sentence.

@DoNotPanicUA (Contributor Author)

Fixed
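
The Javadoc under review lists `null` next to the three named modes because a field's mode may simply be unset. Below is a minimal sketch of a null-tolerant mode comparison, assuming an unset mode should be treated like NULLABLE (the documented BigQuery default for a column mode). The enum `Mode` and the class `FieldModeComparator` are hypothetical stand-ins so the example runs without the BigQuery client library; the connector's real comparison may differ.

```java
public class FieldModeComparator {

  // Hypothetical stand-in for the three named BigQuery field modes.
  public enum Mode { NULLABLE, REQUIRED, REPEATED }

  // Assumption: a missing (null) mode is equivalent to NULLABLE.
  public static Mode normalize(Mode mode) {
    return mode == null ? Mode.NULLABLE : mode;
  }

  // Two modes match when they are equal after normalizing null to NULLABLE.
  public static boolean sameMode(Mode expected, Mode existing) {
    return normalize(expected) == normalize(existing);
  }

  public static void main(String[] args) {
    System.out.println(sameMode(null, Mode.NULLABLE));     // prints true
    System.out.println(sameMode(Mode.REPEATED, null));     // prints false
    System.out.println(sameMode(Mode.REQUIRED, Mode.REQUIRED)); // prints true
  }
}
```

Normalizing first keeps the comparison itself a plain equality check and avoids a `NullPointerException` when either side has no mode set.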

@DoNotPanicUA DoNotPanicUA merged commit 58f18c4 into master Sep 8, 2022
@DoNotPanicUA DoNotPanicUA deleted the aleonets/11109-bigq-denorm-no-array-sur branch September 8, 2022 22:02
DoNotPanicUA added a commit that referenced this pull request Sep 8, 2022
DoNotPanicUA added a commit that referenced this pull request Sep 8, 2022
grishick pushed a commit that referenced this pull request Sep 9, 2022
* Increase version for BQ PR #14023

* auto-bump connector version [ci skip]

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
DoNotPanicUA commented

@grishick Follow-up ticket for the tests: airbytehq/airbyte-internal-issues#874

robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
…hq#14023)

* stop covering any array. cover only if we have array of arrays (restriction of BigQuery)

* add test with nested arrays and update existing tests

* [14058] fix datetime arrays

* [11109] cover only array of arrays by object instead of any array

* [14058] fix datetime format fail when we have an array of objects with datetime

* enable Array and Array+Object DATs

* reopen Issue airbytehq#11166 and disable functionality

* Improve the tests by moving common part to Utils

* Add tests to check `Array of arrays` cases

* Increase version

* Doc

* format

* review update:
- update comment about reopen issue
- added test case with multiple array sub values
- fix nested arrays with datetime
- add test case for nested arrays with datetime

* fix date formatting

* disable testAnyOf test and upd comments

* remove some code duplication in the tests

* [14668] cover by tests the BigQuery inheritance limitation

* Make GCS implementation running same tests as standard impl

* Make common format for returning date values to cover DateTime and Timestamp columns by one test

* [15363] add backward compatibility for existing connections.

* Populate stream config and messages by tablespace. Now it's required inside processing.

* Compare only fields from the stream config

* Rework BigQueryUploaderFactory and UploaderConfig to make it possible to decide on the array formatter before we create the temporary table

* Compare fields

* remove extra logging

* fix project:dataset format of the datasetId

* missing import

* remove debug logging

* fix log messages

* format

* 4 > 3
robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
…rbytehq#16494)

* Increase version for BQ PR airbytehq#14023

* auto-bump connector version [ci skip]

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
…hq#14023)

jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
…rbytehq#16494)
