Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tutorial and documentation for config-based connectors #15027

Merged
merged 99 commits into from
Aug 12, 2022

Conversation

girarda
Copy link
Contributor

@girarda girarda commented Jul 26, 2022

What

Add a tutorial and some documentation to help developers implement config-based connectors

How

  • 6-step tutorial implementing a source connector for exchange api
  • High level documentation describing the components that can be used (with some examples)
  • Everything is bundled in docs/connector-development/config-based/ since I'm not sure we want everyone to start using config-based right away. Happy to move the files to wherever makes the most sense though.

Recommended reading order

  1. docs/connector-development/config-based/overview.md
  2. docs/connector-development/config-based/tutorial/
  3. the other files

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here
Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here
Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Jul 26, 2022
Let's start by cloning the Airbyte repository

```
git clone git@github.com:airbytehq/airbyte.git
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the instructions depend on merging #14552 and #14923

@github-actions github-actions bot added the CDK Connector Development Kit label Aug 1, 2022
python -m pytest integration_tests -p integration_tests.acceptance
```

1 test should be failing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are currently 2 tests failing. the second one will be fixed by #15067

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm I have the following two failures:

 - airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:183 TestConnection.test_check[inputs1]
- airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_incremental.py:207 TestIncremental.test_state_with_abnormally_large

1. `streams`: list of streams that are part of the source
2. `check`: component describing how to check the connection.

The configuration will be validated against this JSON Schema, which defines the set of valid properties.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

soon (tm).
we don't have the json schema yet

@girarda girarda changed the title [low-code connectors] tutorial Tutorial and documentation for config-based connectors Aug 2, 2022
Copy link
Contributor

@brianjlai brianjlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few suggestions for grammar fixes but looks good to me!

4. [Data retriever](overview.md#data-retriever): Describes how to retrieve the data from the API
5. [Cursor field](../cdk-python/incremental-stream.md) (Optional): Field to use used as stream cursor. Can either be a string, or a list of strings if the cursor is a nested field.
6. [Transformations](./record-selector.md#transformations) (Optional): A set of transformations to be applied on the records read from the source before emitting them to the destination
7. [Checkpoint interval](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/#state--checkpointing) (Optional): Defines the interval at which incremental syncs should be checkpointed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth providing an example of what the integer indicates. For example, 5 could mean a lot of different things, items, hours, days, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. done

docs/connector-development/config-based/overview.md Outdated Show resolved Hide resolved
docs/connector-development/config-based/overview.md Outdated Show resolved Hide resolved
docs/connector-development/config-based/overview.md Outdated Show resolved Hide resolved
docs/connector-development/config-based/overview.md Outdated Show resolved Hide resolved

This cursor value can be used to request the next page of record.

In this example, the next page of record is defined by setting the `from` request parameter to the id of the last record read:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be worth adding a sub header to break apart the two approaches to pagination called "Cursor Pagination in Body" and "Cursor Pagination in Headers", just to differentiate them from a reading perspective

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

docs/connector-development/config-based/record-selector.md Outdated Show resolved Hide resolved

- stream_slice
- stream_state
- next_page_token
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to think what the current use case for using these in the language is now, that the preferred method is to let other components handle things more declaratively. My worry is that we highlight this instead of the preferred way. But happy to leave it in if it provides needed flexibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea. let's leave this as an undocumented behavior for now. we can always add it back with some examples if needed


will create one slice per day for the interval `2021-02-01` - `2021-03-01`.

The `DatetimeStreamSlicer` also supports an optional lookback window, specifying how many days before the start_datetime to read data for.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per slack discussion as something we should fix, we should also include that in the case of incremental syncs, the lookback window is also applied for the cursor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created an issue #15628

girarda and others added 8 commits August 12, 2022 14:47
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Copy link
Contributor

@supertopher supertopher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks sane. I would give a heads up to @Amruta-Ranade especially as this will probably induce unexpected behavior for what docs show up for her and her team

@girarda girarda dismissed alafanechere’s stale review August 12, 2022 22:29

the docs have been updated and reviewed since your last pass

@girarda girarda merged commit 288c3ca into master Aug 12, 2022
@girarda girarda deleted the alex/lowcodeTutorial branch August 12, 2022 22:50
rodireich pushed a commit that referenced this pull request Aug 20, 2022
* 5-step tutorial

* move

* tiny bit of editing

* Update tutorial

* update docs

* reset

* move files

* record selector, request options, and more links

* update

* update

* connector definition

* link

* links

* update example

* footnote

* typo

* document string interpolation

* note on string interpolation

* update

* fix code sample

* fix

* update sample

* fix

* use the actual config

* Update as per comments

* write as yaml

* typo

* Clarify options overloading

* clarify that docker must be running

* remove extra footnote

* use venv directly

* Apply suggestions from code review

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* signup instructions

* update

* clarify that both dot and bracket notations are interchangeable

* Clarify how check works

* create spec and config before updating connector definition

* clarify what now_local() is

* rename to yaml structure

* Go through tutorial and update end of section code samples

* fix link

* update

* update code samples

* Update code samples

* Update to bracket notation

* remove superfluous comments

* Update docs/connector-development/config-based/tutorial/2-install-dependencies.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/4-reading-data.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* fix path

* update

* motivation blurp

* warning

* warning

* fix code block

* update code samples

* update code sample

* update code samples

* small updates

* update yaml structure

* custom class example

* language annotations

* update warning

* Update tutorial to use dpath extractor

* Update record selector docs

* unit test

* link to contributing

* tiny update

* $ in front of commands

* $ in front of commands

* More readings

* link to existing config-based connectors

* index

* update

* delete broken link

* supported features

* update

* Add some links

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/record-selector.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* mention the unit

* headers

* remove mentions of interpolating on stream slice, etc.

* update

* exclude config-based docs

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Augustin <augustin.lafanechere@gmail.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/documentation Improvements or additions to documentation CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants