🐛 Destination S3: Fix Parquet LZO compression #26284

Merged
merged 10 commits into from
May 22, 2023

edgao
Contributor

@edgao edgao commented May 18, 2023

Reverts #21085 and adds a test case that actually runs a sync with LZO compression. The failure was seen in https://github.com/airbytehq/oncall/issues/2028

The test fails without the revert and passes with it.

LZO is still broken in destination-gcs, but I don't want to install Java 8 + the LZO libraries there. Tracking issue for that: https://github.com/airbytehq/airbyte-internal-issues/issues/1774
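
For readers unfamiliar with this failure mode, here is a minimal sketch of why a dependency that is only on the test classpath can break a real sync. It assumes Parquet resolves its LZO codec by the class name `com.hadoop.compression.lzo.LzoCodec`, which hadoop-lzo provides. `LzoClasspathCheck` and `isOnClasspath` are hypothetical names used for illustration and are not part of the connector.

```java
/** Hypothetical helper (not from the connector): checks whether a class is loadable at runtime. */
public final class LzoClasspathCheck {

  // Assumption: the codec class Parquet looks up for LZO compression.
  // It is expected to come from hadoop-lzo rather than from Parquet itself.
  public static final String LZO_CODEC_CLASS = "com.hadoop.compression.lzo.LzoCodec";

  private LzoClasspathCheck() {}

  /** Returns true if the named class can be located on the runtime classpath. */
  public static boolean isOnClasspath(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers.
      Class.forName(className, false, LzoClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // Missing jar (ClassNotFoundException) or a broken class definition (LinkageError).
      return false;
    }
  }

  public static void main(String[] args) {
    // If the jar is declared only as a test dependency, this would be expected to be
    // true under the test runner and false inside the published connector image.
    System.out.println("LZO codec on classpath: " + isOnClasspath(LZO_CODEC_CLASS));
  }
}
```

A check like this only covers the Java class. LZO also depends on native libraries being installed in the image, which is why the PR adds a test that runs an actual sync.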

@edgao edgao requested a review from a team as a code owner May 18, 2023 22:35
@github-actions
Contributor

github-actions bot commented May 18, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan and you've followed all steps in the Breaking Changes Checklist
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • You, or an Airbyter, have run /test successfully on this PR - or on a non-forked branch
  • You, or an Airbyter, have run /publish successfully on this PR - or on a non-forked branch
  • You've updated the connector's metadata.yaml file

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@edgao
Contributor Author

edgao commented May 18, 2023

/test connector=connectors/destination-s3

🕑 connectors/destination-s3 https://github.com/airbytehq/airbyte/actions/runs/5018877381
✅ connectors/destination-s3 https://github.com/airbytehq/airbyte/actions/runs/5018877381
No Python unittests run

Build Passed

Test summary info:

All Passed

@edgao edgao changed the title Destination S3: Fix Parquet LZO compression 🐛 Destination S3: Fix Parquet LZO compression May 18, 2023
@github-actions
Contributor

github-actions bot commented May 18, 2023

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector or module version
  • Add changelog
  • Publish the new version

✅ Sources (0)

Connector Version Changelog Publish
  • See "Actionable Items" below for how to resolve warnings and errors.

❌ Destinations (10)

Connector Version Changelog Publish
destination-bigquery 1.4.3
destination-bigquery-denormalized 1.4.1
destination-databricks 1.0.2
destination-gcs 0.3.0
destination-r2 0.1.0
destination-redshift 0.4.7
destination-s3 0.4.1 (diff seed version)
destination-s3-glue 0.1.7
destination-snowflake 1.0.4
destination-starburst-galaxy 0.0.1
  • See "Actionable Items" below for how to resolve warnings and errors.

✅ Other Modules (0)

Actionable Items


Category: Version
  • mismatch: The version of the connector is different from its normal variant. Please bump the version of the connector.
  • doc not found: The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.

Category: Changelog
  • doc not found: The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
  • changelog missing: There is no changelog for the current version of the connector. If you are the author of the current version, please add a changelog.

Category: Publish
  • not in seed: The connector is not in the cloud or oss registry, so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that you have added a metadata.yaml file and the expected registries are enabled.

@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label May 18, 2023
@edgao
Contributor Author

edgao commented May 18, 2023

hm, test is passing but I'm still seeing errors when I try running a sync in oss :/

@edgao edgao marked this pull request as draft May 18, 2023 23:17
@edgao
Contributor Author

edgao commented May 18, 2023

nope, I'm just dumb and built the image from the wrong branch 🤦 @airbytehq/destinations ptal :D

@edgao edgao marked this pull request as ready for review May 18, 2023 23:25
@edgao
Contributor Author

edgao commented May 19, 2023

/test connector=connectors/destination-gcs

🕑 connectors/destination-gcs https://github.com/airbytehq/airbyte/actions/runs/5025855492
✅ connectors/destination-gcs https://github.com/airbytehq/airbyte/actions/runs/5025855492
No Python unittests run

Build Passed

Test summary info:

All Passed

* Only verifies that it runs successfully, which is sufficient to catch any issues with installing the lzo libraries.
*/
@Test
public void testLzoCompression() throws Exception {
Contributor

LGTM
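
The "only verifies that it runs successfully" approach in the snippet above can be sketched as a plain smoke check. This is an illustration under assumptions, not the test from this PR: `SmokeCheck` and `runsSuccessfully` are hypothetical names, and the real test would run a sync through the connector's own test harness. The point it shows is that missing native libraries usually surface as `Error` subclasses such as `UnsatisfiedLinkError` or `NoClassDefFoundError`, so a check that catches only `Exception` would let them escape.

```java
import java.util.concurrent.Callable;

/** Hypothetical smoke-check helper: passes if the action completes without throwing. */
public final class SmokeCheck {

  private SmokeCheck() {}

  /**
   * Runs the action and reports whether it completed normally.
   * Catches Throwable on purpose: missing classes or native libraries are
   * reported as LinkageError subclasses, which are Errors, not Exceptions.
   */
  public static boolean runsSuccessfully(Callable<?> action) {
    try {
      action.call();
      return true;
    } catch (Throwable t) {
      System.err.println("Smoke check failed: " + t);
      return false;
    }
  }

  public static void main(String[] args) {
    // Stand-in for "run a sync with LZO compression"; here it simulates a missing native library.
    boolean ok = runsSuccessfully(() -> {
      throw new UnsatisfiedLinkError("simulated missing native LZO library");
    });
    System.out.println("Sync ran successfully: " + ok);
  }
}
```

In a JUnit test the same effect comes for free, since any throwable escaping the test method fails the test, which matches the `throws Exception` signature in the snippet.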

@edgao
Contributor Author

edgao commented May 22, 2023

/test connector=connectors/destination-s3

🕑 connectors/destination-s3 https://github.com/airbytehq/airbyte/actions/runs/5048179444
✅ connectors/destination-s3 https://github.com/airbytehq/airbyte/actions/runs/5048179444
No Python unittests run

Build Passed

Test summary info:

All Passed

@edgao
Contributor Author

edgao commented May 22, 2023

merging without publish because we have publish on merge!

@edgao edgao enabled auto-merge (squash) May 22, 2023 16:46
@edgao edgao merged commit 67f3cdb into master May 22, 2023
@edgao edgao deleted the edgao/dest_s3_lzo branch May 22, 2023 16:49
@evantahler
Contributor

evantahler commented May 22, 2023

Screenshot 2023-05-22 at 10 01 48 AM

Link

╭────────────────────── DESTINATION-S3 - PUBLISH RESULTS ──────────────────────╮
│                                Steps results                                 │
│ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ │
│ ┃ Step                                       ┃ Result     ┃ Finished after ┃ │
│ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │
│ │ Validate                                   │ Successful │ 319s           │ │
│ │ airbyte-integrations/connectors/destinati… │            │                │ │
│ │ Check if the connector docker image does   │ Successful │ 316s           │ │
│ │ not exist on the registry.                 │            │                │ │
│ │ Build connector for publish                │ Successful │ 90s            │ │
│ │ Push connector image to registry           │ Successful │ 51s            │ │
│ │ Pull connector image from registry         │ Successful │ 47s            │ │
│ │ Upload connector spec to spec cache bucket │ Successful │ 4s             │ │
│ │ Upload                                     │ Successful │ 0s             │ │
│ │ airbyte-integrations/connectors/destinati… │            │                │ │
│ └────────────────────────────────────────────┴────────────┴────────────────┘ │
╰───────── ⏲️  Total pipeline duration for destination-s3: 336 seconds ─────────╯

@edgao
Contributor Author

edgao commented May 22, 2023

but there's (presumably?) a diff between the dockerfile and how dagger is configured :(

lzo sync still failing on cloud https://cloud.airbyte.com/workspaces/46b4844a-94a3-496f-af53-2050bf4141af/connections/79b4d19f-2077-461a-9d4b-cfaa6c9538dc/job-history#2292715::0

nguyenaiden pushed a commit that referenced this pull request May 25, 2023
* Revert "Move hadoop-lzo to test dependency (#21085)"

This reverts commit 1241569.

* add basic test

* Automated Change

* version bumps, changelog

* Automated Change

* unused import

* Ran ./gradlew :spotlessJavaApply to trigger GitHub build

* regenerate registry

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: ryankfu <ryan.fu@airbyte.io>
marcosmarxm pushed a commit to natalia-miinto/airbyte that referenced this pull request Jun 8, 2023
5 participants