🎉 New Destination: Weaviate #20094

Merged
merged 64 commits into airbytehq:master from add-weaviate-destination
Jan 12, 2023

Conversation

samos123
Contributor

@samos123 samos123 commented Dec 5, 2022

What

Add a destination connector that writes data to Weaviate. The source connector will follow in a separate PR.

How

Use the Weaviate Python client to batch AirbyteRecords and flush them whenever a state message arrives or the batch size is reached.
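
A minimal sketch of the batch-and-flush flow described above, not the connector's actual code. The `BufferedWriter` class, the `write` generator, and the `send_batch` callback are hypothetical names used for illustration; messages are modeled as plain dicts rather than Airbyte protocol objects, and the default of 100 is taken from the "Make batch_size configurable with 100 as default" commit.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Message = Dict[str, Any]


class BufferedWriter:
    """Collects records and hands them to `send_batch` in groups (hypothetical helper)."""

    def __init__(self, send_batch: Callable[[List[Message]], None], batch_size: int = 100):
        self.send_batch = send_batch  # stand-in for the real Weaviate batch call
        self.batch_size = batch_size
        self.buffer: List[Message] = []

    def add(self, record: Message) -> None:
        self.buffer.append(record)
        # Flush as soon as the buffer reaches the configured batch size.
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []


def write(messages: Iterable[Message], writer: BufferedWriter) -> Iterator[Message]:
    """Buffer RECORD messages; flush on STATE and again at end of input."""
    for message in messages:
        if message["type"] == "RECORD":
            writer.add(message["record"])
        elif message["type"] == "STATE":
            # Flush first, so the emitted state only covers delivered records.
            writer.flush()
            yield message
    writer.flush()
```

Flushing before yielding the state message means a checkpoint is only reported once every record received before it has been sent.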

🚨 User Impact 🚨

Yes 🚨, end-users will be able to utilize this new connector after this gets merged. So this should probably be highlighted in change notes somewhere.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For Java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For Java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Connector Generator

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================== 5 passed, 5 warnings in 9.88s =========================

> Task :airbyte-integrations:connectors:destination-weaviate:customIntegrationTests
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
destination_weaviate/__init__.py          2      0   100%
destination_weaviate/destination.py      39      2    95%
destination_weaviate/client.py           44      6    86%
---------------------------------------------------------
TOTAL                                    85      8    91%

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/7.6/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 38s

Acceptance

Put your acceptance tests output here.

@CLAassistant

CLAassistant commented Dec 5, 2022

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ samos123
✅ itaseskii
❌ itaseski


itaseski does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@samos123
Contributor Author

samos123 commented Dec 5, 2022

/test connector=destination-weaviate

@samos123 samos123 changed the title Add Weaviate Destination #20012 🎉 New Destination: Weaviate Dec 5, 2022
@itaseskii itaseskii self-assigned this Dec 5, 2022
@itaseskii itaseskii self-requested a review December 5, 2022 22:34
@sajarin sajarin added the bounty-XL Maintainer program: claimable extra large bounty PR label Dec 6, 2022
@samos123 samos123 force-pushed the add-weaviate-destination branch from 801b846 to 9e56857 Compare December 6, 2022 22:07
@samos123
Contributor Author

samos123 commented Dec 9, 2022

Tested it with an RSS feed as source and it worked as expected:
image

@octavia-squidington-iv octavia-squidington-iv added the area/documentation Improvements or additions to documentation label Dec 9, 2022
@itaseskii
Contributor

itaseskii commented Dec 12, 2022

/test connector=connectors/destination-weaviate

🕑 connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3674928650
❌ connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3674928650
🐛

@itaseskii
Contributor

itaseskii commented Dec 12, 2022

/test connector=connectors/destination-weaviate

🕑 connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3676540542
❌ connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3676540542
🐛

@itaseskii
Contributor

itaseskii commented Jan 9, 2023

/test connector=connectors/destination-weaviate

🕑 connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3876781998
❌ connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3876781998
🐛 https://gradle.com/s/s4dosklwa3upw

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 FAILED integration_tests/integration_test.py::test_line_break_characters - At...
	 FAILED integration_tests/integration_test.py::test_write_id - AttributeError:...
	 FAILED integration_tests/integration_test.py::test_write_pokemon_source_pikachu
	 ================== 3 failed, 12 passed, 10 warnings in 17.89s ==================

@samos123
Contributor Author

samos123 commented Jan 9, 2023

/test connector=connectors/destination-weaviate

@itaseskii
Contributor

itaseskii commented Jan 9, 2023

/test connector=connectors/destination-weaviate

🕑 connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3878393797
✅ connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3878393797
Python tests coverage:

Name                                  Stmts   Miss  Cover
---------------------------------------------------------
destination_weaviate/utils.py            52      0   100%
destination_weaviate/__init__.py          2      0   100%
destination_weaviate/client.py           97      5    95%
destination_weaviate/destination.py      35      2    94%
---------------------------------------------------------
TOTAL                                   186      7    96%

Build Passed

Test summary info:

All Passed

@samos123 samos123 force-pushed the add-weaviate-destination branch from d7989df to 19541e9 Compare January 9, 2023 22:51
@samos123 samos123 requested a review from itaseskii January 9, 2023 22:53
@itaseskii
Contributor

/test connector=connectors/destination-weaviate

@itaseskii
Contributor

itaseskii commented Jan 11, 2023

/test connector=connectors/destination-weaviate

🕑 connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3896955514
✅ connectors/destination-weaviate https://github.com/airbytehq/airbyte/actions/runs/3896955514
Python tests coverage:

Name                                  Stmts   Miss  Cover
---------------------------------------------------------
destination_weaviate/utils.py            52      0   100%
destination_weaviate/__init__.py          2      0   100%
destination_weaviate/client.py           97      5    95%
destination_weaviate/destination.py      35      2    94%
---------------------------------------------------------
TOTAL                                   186      7    96%

Build Passed

Test summary info:

All Passed

@itaseskii
Contributor

itaseskii commented Jan 11, 2023

/publish connector=connectors/destination-weaviate

🕑 Publishing the following connectors:
connectors/destination-weaviate
https://github.com/airbytehq/airbyte/actions/runs/3897120236


Connector Did it publish? Were definitions generated?
connectors/destination-weaviate

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@itaseskii
Contributor

itaseskii commented Jan 11, 2023

/publish connector=connectors/destination-weaviate

🕑 Publishing the following connectors:
connectors/destination-weaviate
https://github.com/airbytehq/airbyte/actions/runs/3897349217


Connector Did it publish? Were definitions generated?
connectors/destination-weaviate

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@natalyjazzviolin
Contributor

@samos123 please make sure you follow step no.4 of this document to fix the definition generation failure:
https://docs.airbyte.com/connector-development/#publishing-a-connector

Then when the publish command is successful and all CI/CD checks have passed, we can merge this PR. Thanks for your contribution!

@itaseskii
Contributor

itaseskii commented Jan 12, 2023

@samos123 please make sure you follow step no.4 of this document to fix the definition generation failure: https://docs.airbyte.com/connector-development/#publishing-a-connector

Then when the publish command is successful and all CI/CD checks have passed, we can merge this PR. Thanks for your contribution!

The definitions were already generated in a subsequent commit after the publish failure. Following the steps in order, the connector must first be published and only then can the definitions be generated, which guarantees that definition generation fails on publish for PRs that add new connectors.

@sajarin sajarin merged commit 4778615 into airbytehq:master Jan 12, 2023
@sajarin
Contributor

sajarin commented Jan 12, 2023

Thanks for the PR @samos123 and thanks for the review @itaseskii

jbfbell pushed a commit that referenced this pull request Jan 13, 2023
* Add Weaviate Destination #20012

* Fix formatting and standards

* Fix flake issue

* Fix unused client variable

* Add support for int based ID fields

* Ensure stream name meets Weaviate class reqs

* add integration test for using pokemon as source

* handle nested objects by converting to json string

* create schema for transforming data to weaviate

* Add docs for weaviate destination

* Remove pokemon-schema external dependency

* Remove pikachu integration test external dep

* Add large batch test case

* add test for second sync

* Fix issue with fields starting with uppercase

* add more checks to line_break test

* Update README for Weaviate

* Make batch_size configurable with 100 as default

* Add support for providing vectors

* Update docs

* Add test for existing Weaviate class

* Add trying to create schema in check connection

* Add support for mongodb _id fields

* Add support for providing custom ID

* remove unused file

* fix flow of is_ready() check

* Move standalone functions to utils.py

* Support overwrite mode

* Add regex based stream_name_class_name conversion

* remove unneeded print statement

* Add "airbyte_secret" : true to password config

* add support for array of arrays

* remove unneeded variable declaration

* change to MutableMapping since we use del

* change name from queued_write to buffered_write

* add retry on partial batch error

* Fix partial batch retry and add tests

* fix ID generation

* Clean up recursive retry logic

* fix flake tests

* ran flake reformat

* add definitions
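
Several commits in the list above concern name and value conversion ("Ensure stream name meets Weaviate class reqs", "Add regex based stream_name_class_name conversion", "Fix issue with fields starting with uppercase", "handle nested objects by converting to json string", "add support for array of arrays"). The following is a rough sketch of what such conversions could look like, not the code from `utils.py`. The function names are hypothetical, and the naming rules (class names start with an uppercase letter, property names with a lowercase one, and both use only letters, digits, and underscores) are assumptions that should be checked against the Weaviate schema documentation.

```python
import json
import re
from typing import Any, Dict


def stream_to_class_name(stream_name: str) -> str:
    """Turn an Airbyte stream name into a candidate Weaviate class name (hypothetical)."""
    # Replace every character outside letters, digits, and underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", stream_name)
    # Drop leading characters that are not letters, so the name can start uppercase.
    cleaned = re.sub(r"^[^A-Za-z]+", "", cleaned)
    if not cleaned:
        raise ValueError(f"cannot derive a class name from {stream_name!r}")
    return cleaned[0].upper() + cleaned[1:]


def prepare_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Lowercase the first letter of each field and stringify nested values (hypothetical)."""
    prepared: Dict[str, Any] = {}
    for key, value in record.items():
        name = key[:1].lower() + key[1:]
        is_nested_list = isinstance(value, list) and any(
            isinstance(item, (list, dict)) for item in value
        )
        if isinstance(value, dict) or is_nested_list:
            # Nested objects and arrays of arrays are stored as JSON strings.
            value = json.dumps(value)
        prepared[name] = value
    return prepared
```

For example, under these assumptions `stream_to_class_name("my-stream 1")` gives `"My_stream_1"`, and a field such as `{"stats": {"hp": 35}}` is written as the string `'{"hp": 35}'`.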

Co-authored-by: Ivica Taseski <ivica.taseski94@gmail.com>
Co-authored-by: itaseski <itaseski@debian-BULLSEYE-live-builder-AMD64>
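
The "add retry on partial batch error" and "Clean up recursive retry logic" commits above deal with batches in which only some records fail. One way to structure such a retry is sketched below, as an iterative loop rather than the connector's own implementation. `flush_with_retry` and the `send` callback (assumed to return the indexes of the records that failed) are hypothetical, and the retry limit of 3 is an arbitrary choice for illustration.

```python
from typing import Callable, Dict, List, Set

Record = Dict[str, object]


def flush_with_retry(
    batch: List[Record],
    send: Callable[[List[Record]], Set[int]],
    max_retries: int = 3,
) -> None:
    """Send `batch`, resending only the records that `send` reports as failed."""
    pending = batch
    # One initial attempt plus up to `max_retries` retries.
    for _ in range(max_retries + 1):
        failed_indexes = send(pending)
        if not failed_indexes:
            return
        # Keep only the failed records for the next attempt.
        pending = [pending[i] for i in sorted(failed_indexes)]
    raise RuntimeError(f"{len(pending)} record(s) still failing after {max_retries} retries")
```

Resending only the failed subset avoids writing already-accepted records twice, and the bounded loop ensures a persistent error surfaces instead of retrying forever.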
Labels
area/connectors (Connector related issues), area/documentation (Improvements or additions to documentation), bounty, bounty-XL (Maintainer program: claimable extra large bounty PR), community, connectors/destination/weaviate

7 participants