Tutorial and documentation for config-based connectors #15027

Merged
merged 99 commits into from
Aug 12, 2022
Changes from 88 commits
Commits
4855a72
5-step tutorial
girarda Jul 25, 2022
138bd52
move
girarda Jul 26, 2022
637c2a7
tiny bit of editing
girarda Jul 26, 2022
9fabad8
Merge branch 'master' into alex/lowcodeTutorial
girarda Jul 28, 2022
ff775e3
Update tutorial
girarda Jul 28, 2022
6ebee74
update docs
girarda Aug 1, 2022
ff2b602
reset
girarda Aug 1, 2022
906f915
move files
girarda Aug 1, 2022
a64c758
record selector, request options, and more links
girarda Aug 1, 2022
2099b24
update
girarda Aug 1, 2022
8bab845
update
girarda Aug 1, 2022
a03e6f3
connector definition
girarda Aug 1, 2022
d444d74
link
girarda Aug 1, 2022
71e0a5b
links
girarda Aug 2, 2022
a9512ab
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
7b36ca6
update example
girarda Aug 2, 2022
218bfd9
footnote
girarda Aug 2, 2022
78599e3
typo
girarda Aug 2, 2022
c9bfb99
document string interpolation
girarda Aug 2, 2022
58567c5
note on string interpolation
girarda Aug 2, 2022
76a95ae
update
girarda Aug 2, 2022
8feb1a6
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
ecf9b34
fix code sample
girarda Aug 2, 2022
990d44a
fix
girarda Aug 2, 2022
f9b1b68
update sample
girarda Aug 2, 2022
945cc3e
fix
girarda Aug 2, 2022
a3349df
use the actual config
girarda Aug 2, 2022
318e613
Update as per comments
girarda Aug 7, 2022
c54b0c4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 8, 2022
9cc1e4b
write as yaml
girarda Aug 8, 2022
f096296
typo
girarda Aug 8, 2022
8bd35b4
Clarify options overloading
girarda Aug 8, 2022
cfb4528
clarify that docker must be running
girarda Aug 8, 2022
85d5afb
remove extra footnote
girarda Aug 8, 2022
61a75b5
use venv directly
girarda Aug 8, 2022
7e1dc95
Apply suggestions from code review
girarda Aug 8, 2022
3df5071
signup instructions
girarda Aug 8, 2022
b074832
update
girarda Aug 8, 2022
672eb16
clarify that both dot and bracket notations are interchangeable
girarda Aug 8, 2022
6575c9d
Clarify how check works
girarda Aug 8, 2022
e747b4a
create spec and config before updating connector definition
girarda Aug 8, 2022
d5ac31d
clarify what now_local() is
girarda Aug 8, 2022
fdce2c6
rename to yaml structure
girarda Aug 8, 2022
198b421
Go through tutorial and update end of section code samples
girarda Aug 9, 2022
18bc40f
fix link
girarda Aug 9, 2022
f4e5ed4
update
girarda Aug 9, 2022
83e3845
update code samples
girarda Aug 9, 2022
bab017b
Update code samples
girarda Aug 9, 2022
37d1fde
Update to bracket notation
girarda Aug 9, 2022
1fc83fe
remove superfluous comments
girarda Aug 9, 2022
6944916
Update docs/connector-development/config-based/tutorial/2-install-dep…
girarda Aug 9, 2022
c317a49
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
096a370
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
ff804be
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
49be031
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
6790422
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
34214b4
Update docs/connector-development/config-based/tutorial/4-reading-dat…
girarda Aug 9, 2022
bf9a205
fix path
girarda Aug 9, 2022
74e4de8
update
girarda Aug 9, 2022
ca0f93c
motivation blurp
girarda Aug 9, 2022
46b2ee4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 10, 2022
9cfd223
warning
girarda Aug 10, 2022
65a966c
warning
girarda Aug 10, 2022
dd4437c
fix code block
girarda Aug 10, 2022
365c0dc
update code samples
girarda Aug 10, 2022
ebaa701
update code sample
girarda Aug 10, 2022
aacc30a
update code samples
girarda Aug 10, 2022
3b1e85f
small updates
girarda Aug 10, 2022
b4498f3
update yaml structure
girarda Aug 10, 2022
306e9e5
custom class example
girarda Aug 10, 2022
c2d9b86
language annotations
girarda Aug 10, 2022
562844b
update warning
girarda Aug 11, 2022
faada9a
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 11, 2022
08487f7
Update tutorial to use dpath extractor
girarda Aug 11, 2022
30d25c0
Update record selector docs
girarda Aug 11, 2022
63a295c
unit test
girarda Aug 11, 2022
019cc0a
link to contributing
girarda Aug 12, 2022
117ee2f
tiny update
girarda Aug 12, 2022
3a00dac
$ in front of commands
girarda Aug 12, 2022
b2040fc
$ in front of commands
girarda Aug 12, 2022
db243a8
More readings
girarda Aug 12, 2022
cc0d76c
link to existing config-based connectors
girarda Aug 12, 2022
6cbdaa0
index
girarda Aug 12, 2022
619bf37
update
girarda Aug 12, 2022
9a4f1c9
delete broken link
girarda Aug 12, 2022
5337868
supported features
girarda Aug 12, 2022
e4919d5
update
girarda Aug 12, 2022
e27fade
Add some links
girarda Aug 12, 2022
048bddb
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
019aad2
Update docs/connector-development/config-based/record-selector.md
girarda Aug 12, 2022
cc308f2
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
e3cffc8
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
785a3e4
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
2db7694
mention the unit
girarda Aug 12, 2022
2a7d5fc
headers
girarda Aug 12, 2022
eba0322
remove mentions of interpolating on stream slice, etc.
girarda Aug 12, 2022
e7f0023
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 12, 2022
8353121
update
girarda Aug 12, 2022
e6637d3
exclude config-based docs
girarda Aug 12, 2022
@@ -83,7 +83,7 @@ def token(self) -> str:
@dataclass
class BasicHttpAuthenticator(AbstractHeaderAuthenticator):
"""
Builds auth based off the basic authentication scheme as defined by RFC 7617, which transmits credentials as USER ID/password pairs, encoded using bas64
Builds auth based off the basic authentication scheme as defined by RFC 7617, which transmits credentials as USER ID/password pairs, encoded using base64
https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme

The header is of the form
@@ -4,6 +4,7 @@

from typing import Mapping, Type

from airbyte_cdk.sources.declarative.auth.oauth import DeclarativeOauth2Authenticator
from airbyte_cdk.sources.declarative.auth.token import ApiKeyAuthenticator, BasicHttpAuthenticator, BearerAuthenticator
from airbyte_cdk.sources.declarative.datetime.min_max_datetime import MinMaxDatetime
from airbyte_cdk.sources.declarative.declarative_stream import DeclarativeStream
@@ -56,6 +57,7 @@
"ListStreamSlicer": ListStreamSlicer,
"MinMaxDatetime": MinMaxDatetime,
"NoPagination": NoPagination,
"OAuthAuthenticator": DeclarativeOauth2Authenticator,
"OffsetIncrement": OffsetIncrement,
"RecordSelector": RecordSelector,
"RemoveFields": RemoveFields,
@@ -52,7 +52,7 @@ class DeclarativeComponentFactory:
If the component definition is a mapping with neither a "class_name" nor a "type" field,
the factory will do a best-effort attempt at inferring the component type by looking up the parent object's constructor type hints.
If the type hint is an interface present in `DEFAULT_IMPLEMENTATIONS_REGISTRY`,
then the factory will create an object of it's default implementation.
then the factory will create an object of its default implementation.

If the component definition is a list, then the factory will iterate over the elements of the list,
instantiate its subcomponents, and return a list of instantiated objects.
@@ -15,7 +15,7 @@ class YamlParser(ConnectionDefinitionParser):
"""
Parses a Yaml string to a ConnectionDefinition

In addition to standard Yaml parsing, the input_string can contain refererences to values previously defined.
In addition to standard Yaml parsing, the input_string can contain references to values previously defined.
This parser will dereference these values to produce a complete ConnectionDefinition.

References can be defined using a *ref(<arg>) string.
@@ -20,6 +20,7 @@
[
("test_extract_from_array", ["data"], {"data": [{"id": 1}, {"id": 2}]}, [{"id": 1}, {"id": 2}]),
("test_extract_single_record", ["data"], {"data": {"id": 1}}, [{"id": 1}]),
("test_extract_single_record_from_root", [], {"id": 1}, [{"id": 1}]),
("test_extract_from_root_array", [], [{"id": 1}, {"id": 2}], [{"id": 1}, {"id": 2}]),
("test_nested_field", ["data", "records"], {"data": {"records": [{"id": 1}, {"id": 2}]}}, [{"id": 1}, {"id": 2}]),
("test_field_in_config", ["{{ config['field'] }}"], {"record_array": [{"id": 1}, {"id": 2}]}, [{"id": 1}, {"id": 2}]),
73 changes: 73 additions & 0 deletions docs/connector-development/config-based/authentication.md
@@ -0,0 +1,73 @@
# Authentication

The `Authenticator` defines how outgoing HTTP requests are configured to authenticate with the API source.

## Authenticators

### ApiKeyAuthenticator

The `ApiKeyAuthenticator` sets an HTTP header on outgoing requests.
The following definition will set the header "Authorization" with a value "Bearer hello":

```yaml
authenticator:
  type: "ApiKeyAuthenticator"
  header: "Authorization"
  token: "Bearer hello"
```

### BearerAuthenticator

The `BearerAuthenticator` is a specialized `ApiKeyAuthenticator` that always sets the header "Authorization" with the value "Bearer {token}".
The following definition will set the header "Authorization" with a value "Bearer hello":

```yaml
authenticator:
  type: "BearerAuthenticator"
  token: "hello"
```
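
As a rough illustration of the two definitions above, the following Python sketch builds the header each one describes. The function names `api_key_header` and `bearer_header` are hypothetical and are not part of the CDK; the sketch only mirrors the behavior stated in this page.

```python
from typing import Dict


def api_key_header(header: str, token: str) -> Dict[str, str]:
    """Return the single header an API-key style authenticator would set."""
    return {header: token}


def bearer_header(token: str) -> Dict[str, str]:
    """Bearer auth is the API-key case with a fixed header name and prefix."""
    return api_key_header("Authorization", f"Bearer {token}")


# Both definitions from this page describe the same outgoing header.
same_header = api_key_header("Authorization", "Bearer hello") == bearer_header("hello")
```
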

More information on bearer authentication can be found [here](https://swagger.io/docs/specification/authentication/bearer-authentication/).

### BasicHttpAuthenticator

The `BasicHttpAuthenticator` sets the "Authorization" header with a (USER ID/password) pair, encoded using base64 as per [RFC 7617](https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme).
The following definition will set the header "Authorization" with a value "Basic <encoded credentials>":

```yaml
authenticator:
  type: "BasicHttpAuthenticator"
  username: "hello"
  password: "world"
```

The password is optional. Authenticating with APIs using Basic HTTP and a single API key can be done as:

```yaml
authenticator:
  type: "BasicHttpAuthenticator"
  username: "hello"
```
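
To make "<encoded credentials>" concrete, here is a minimal Python sketch of the RFC 7617 encoding. The helper name `basic_auth_header` is hypothetical, and treating a missing password as an empty string (so the credentials become `username:`) is an assumption based on the RFC's `user-id ":" password` form rather than a statement about the CDK's internals.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str = "") -> Dict[str, str]:
    """Build a Basic auth header: base64 of 'username:password'."""
    # Assumption: an omitted password is encoded as an empty string.
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


with_password = basic_auth_header("hello", "world")
# with_password == {"Authorization": "Basic aGVsbG86d29ybGQ="}
username_only = basic_auth_header("hello")
# username_only == {"Authorization": "Basic aGVsbG86"}
```
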

### OAuth

OAuth authentication is supported through the `OAuthAuthenticator`, which requires the following parameters:

- token_refresh_endpoint: The endpoint to refresh the access token
- client_id: The client id
- client_secret: The client secret
- refresh_token: The token used to refresh the access token
- scopes (Optional): The scopes to request. Default: Empty list
- token_expiry_date (Optional): The access token expiration date formatted as RFC-3339 ("%Y-%m-%dT%H:%M:%S.%f%z")
- access_token_name (Optional): The field to extract access token from in the response. Default: "access_token".
- expires_in_name (Optional): The field to extract expires_in from in the response. Default: "expires_in"
- refresh_request_body (Optional): The request body to send in the refresh request. Default: None

```yaml
authenticator:
  type: "OAuthAuthenticator"
  token_refresh_endpoint: "https://api.searchmetrics.com/v4/token"
  client_id: "{{ config['api_key'] }}"
  client_secret: "{{ config['client_secret'] }}"
  refresh_token: ""
```
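
The parameters above map naturally onto a standard OAuth 2.0 refresh-token exchange. The sketch below shows one way the request body could be assembled and the response read; it makes no network call. The function names, the `grant_type` field, and the way `scopes` and `refresh_request_body` are merged are assumptions for illustration, not the CDK's confirmed behavior.

```python
from typing import Any, Dict, Mapping, Optional, Sequence, Tuple


def build_refresh_request_body(
    client_id: str,
    client_secret: str,
    refresh_token: str,
    scopes: Optional[Sequence[str]] = None,
    refresh_request_body: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the body sent to token_refresh_endpoint (hypothetical layout)."""
    payload: Dict[str, Any] = {
        "grant_type": "refresh_token",  # assumption: standard OAuth 2.0 refresh grant
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    if scopes:
        payload["scopes"] = list(scopes)
    # Assumption: extra body fields never override the core fields.
    for key, value in (refresh_request_body or {}).items():
        payload.setdefault(key, value)
    return payload


def parse_refresh_response(
    response_json: Mapping[str, Any],
    access_token_name: str = "access_token",
    expires_in_name: str = "expires_in",
) -> Tuple[str, int]:
    """Read the token and its lifetime using the configurable field names."""
    return response_json[access_token_name], int(response_json[expires_in_name])
```

The `access_token_name` and `expires_in_name` arguments exist because some APIs return these values under different keys, which is the case the optional parameters above are meant to cover.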
177 changes: 177 additions & 0 deletions docs/connector-development/config-based/error-handling.md
@@ -0,0 +1,177 @@
# Error handling

By default, only server errors (HTTP 5XX) and too-many-requests errors (HTTP 429) are retried, up to 5 times with exponential backoff.
Other HTTP errors will result in a failed read.

Other behaviors can be configured through the `Requester`'s `error_handler` field.
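
The default rule described above can be summarized in a few lines of Python. This is only a sketch of the stated behavior; the names `DEFAULT_MAX_RETRIES` and `should_retry_by_default` are hypothetical.

```python
DEFAULT_MAX_RETRIES = 5  # "up to 5 times", as stated above


def should_retry_by_default(status_code: int, attempts_so_far: int = 0) -> bool:
    """Retry only 429 and 5XX responses, and only while attempts remain."""
    if attempts_so_far >= DEFAULT_MAX_RETRIES:
        return False
    return status_code == 429 or 500 <= status_code < 600
```
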

## Defining errors

### From status code

Response filters can be used to define how to handle requests resulting in responses with a specific HTTP status code.
For instance, this example will configure the handler to also retry responses with a 404 error:

```yaml
requester:
  <...>
  error_handler:
    response_filters:
      - http_codes: [ 404 ]
        action: RETRY
```

Response filters can be used to specify HTTP errors to ignore.
For instance, this example will configure the handler to ignore responses with a 404 error:

```yaml
requester:
  <...>
  error_handler:
    response_filters:
      - http_codes: [ 404 ]
        action: IGNORE
```

### From error message

Errors can also be defined by parsing the error message.
For instance, this error handler ignores responses if the error message contains the string "ignorethisresponse":

```yaml
requester:
  <...>
  error_handler:
    response_filters:
      - error_message_contain: "ignorethisresponse"
        action: IGNORE
```

This can also be done through a more generic string interpolation strategy with the following parameters:

- response: the decoded response

This example ignores errors where the response contains a "code" field:

```yaml
requester:
  <...>
  error_handler:
    response_filters:
      - predicate: "{{ 'code' in response }}"
        action: IGNORE
```

The error handler can have multiple response filters.
The following example is configured to ignore 404 errors, and retry 429 errors:

```yaml
requester:
  <...>
  error_handler:
    response_filters:
      - http_codes: [ 404 ]
        action: IGNORE
      - http_codes: [ 429 ]
        action: RETRY
```
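
The three kinds of filter criteria shown in this section can be modeled as follows. This is a hedged sketch: the `ResponseFilter` dataclass and `resolve_action` function are hypothetical stand-ins, the interpolated `predicate` string is replaced by a plain Python callable, and returning the action of the first matching filter is an assumption about evaluation order.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Mapping, Optional, Sequence, Set


@dataclass
class ResponseFilter:
    action: str  # "RETRY" or "IGNORE"
    http_codes: Set[int] = field(default_factory=set)
    error_message_contain: Optional[str] = None
    # Stand-in for the interpolated predicate; receives the decoded response.
    predicate: Optional[Callable[[Mapping[str, Any]], bool]] = None

    def matches(self, status_code: int, response: Mapping[str, Any], error_message: str = "") -> bool:
        if status_code in self.http_codes:
            return True
        if self.error_message_contain and self.error_message_contain in error_message:
            return True
        return self.predicate is not None and self.predicate(response)


def resolve_action(
    filters: Sequence[ResponseFilter],
    status_code: int,
    response: Mapping[str, Any],
    error_message: str = "",
) -> Optional[str]:
    """Return the action of the first matching filter, or None if none match."""
    for response_filter in filters:
        if response_filter.matches(status_code, response, error_message):
            return response_filter.action
    return None


filters = [
    ResponseFilter(action="IGNORE", http_codes={404}),
    ResponseFilter(action="RETRY", http_codes={429}),
]
```
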

## Backoff Strategies

The error handler supports a few backoff strategies, which are described in the following sections.

### Exponential backoff

This is the default backoff strategy. The requester will back off with an exponentially increasing interval between retries.
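
As an illustration of what an exponential interval means, the sketch below doubles the wait on every attempt. The function name, the `base_seconds` parameter, and the exact formula are assumptions; the page does not state which formula or factor is used.

```python
def exponential_backoff_seconds(attempt: int, base_seconds: float = 1.0) -> float:
    """Wait base_seconds * 2**attempt, where attempt starts at 0 (assumed formula)."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return base_seconds * (2 ** attempt)


# Successive waits for the first four attempts.
waits = [exponential_backoff_seconds(attempt) for attempt in range(4)]
# waits == [1.0, 2.0, 4.0, 8.0]
```
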

### Constant Backoff

When using the `ConstantBackoffStrategy`, the requester will back off with a constant interval.

### Wait time defined in header

When using the `WaitTimeFromHeaderBackoffStrategy`, the requester will back off by an interval specified in the response header.
In this example, the requester will back off by the response's "wait_time" header value:

```yaml
requester:
  <...>
  error_handler:
    <...>
    backoff_strategies:
      - type: "WaitTimeFromHeaderBackoffStrategy"
        header: "wait_time"
```

Optionally, a regular expression can be configured to extract the wait time from the header value.

```yaml
requester:
  <...>
  error_handler:
    <...>
    backoff_strategies:
      - type: "WaitTimeFromHeaderBackoffStrategy"
        header: "wait_time"
        # Single quotes: "\d" is not a valid escape in a double-quoted YAML string.
        regex: '[-+]?\d+'
```
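
The extraction step can be sketched in Python as below. The function name `wait_time_from_header` is hypothetical, and returning `None` when the header is missing or unparsable is an assumption that models "the strategy could not be evaluated".

```python
import re
from typing import Mapping, Optional


def wait_time_from_header(
    headers: Mapping[str, str], header: str, regex: Optional[str] = None
) -> Optional[float]:
    """Return the wait time in seconds read from a response header, if any."""
    raw = headers.get(header)
    if raw is None:
        return None
    if regex is not None:
        match = re.search(regex, raw)
        if match is None:
            return None
        raw = match.group()
    try:
        return float(raw)
    except ValueError:
        return None


# A header value with surrounding text needs the regex to isolate the number.
extracted = wait_time_from_header(
    {"wait_time": "retry in 30 seconds"}, "wait_time", regex=r"[-+]?\d+"
)
# extracted == 30.0
```
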

### Wait until time defined in header

When using the `WaitUntilTimeFromHeaderBackoffStrategy`, the requester will back off until the time specified in the response header.
In this example, the requester will wait until the time specified in the "wait_until" header value:

```yaml
requester:
  <...>
  error_handler:
    <...>
    backoff_strategies:
      - type: "WaitUntilTimeFromHeaderBackoffStrategy"
        header: "wait_until"
        # Single quotes: "\d" is not a valid escape in a double-quoted YAML string.
        regex: '[-+]?\d+'
        min_wait: 5
```

The strategy accepts an optional regular expression to extract the time from the header value, and a minimum time to wait.
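
A minimal sketch of the "wait until" computation follows. It assumes the header carries a Unix timestamp in seconds and that `min_wait` is a lower bound on the resulting wait; both are assumptions, as are the function name `seconds_until` and the explicit `now` argument (passed in so the example is deterministic).

```python
import re
from typing import Mapping, Optional


def seconds_until(
    headers: Mapping[str, str],
    header: str,
    now: float,
    regex: Optional[str] = None,
    min_wait: Optional[float] = None,
) -> Optional[float]:
    """Return how long to wait until the timestamp found in the header."""
    raw = headers.get(header)
    if raw is None:
        return None
    if regex is not None:
        match = re.search(regex, raw)
        if match is None:
            return None
        raw = match.group()
    try:
        wait_until = float(raw)  # assumption: Unix timestamp in seconds
    except ValueError:
        return None
    wait = max(wait_until - now, 0.0)
    if min_wait is not None:
        wait = max(wait, min_wait)
    return wait
```

With `now=990.0` and a header value of `"1000"`, the computed wait is 10 seconds; with `now=998.0` and `min_wait=5`, the 2-second difference is raised to the 5-second minimum.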

## Advanced error handling

The error handler can have multiple backoff strategies, allowing it to fall back to the next strategy if one cannot be evaluated.
For instance, the following defines an error handler that will read the backoff time from a header, and default to a constant backoff if the wait time could not be extracted from the response:

```yaml
requester:
  <...>
  error_handler:
    <...>
    backoff_strategies:
      - type: "WaitTimeFromHeaderBackoffStrategy"
        header: "wait_time"
      - type: "ConstantBackoffStrategy"
        backoff_time_in_seconds: 5
```
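
The fallback behavior can be pictured as trying each strategy in order and using the first one that yields a value. In this sketch each strategy is a plain function returning a wait time or `None`; the names `header_strategy`, `constant_strategy`, and `first_backoff` are hypothetical and only illustrate the idea.

```python
from typing import Callable, Mapping, Optional, Sequence

BackoffStrategy = Callable[[Mapping[str, str]], Optional[float]]


def header_strategy(header: str) -> BackoffStrategy:
    """Read the wait time from a header; yield None if missing or invalid."""
    def strategy(headers: Mapping[str, str]) -> Optional[float]:
        value = headers.get(header)
        if value is None:
            return None
        try:
            return float(value)
        except ValueError:
            return None
    return strategy


def constant_strategy(seconds: float) -> BackoffStrategy:
    """Always yield the same wait time."""
    return lambda headers: seconds


def first_backoff(
    strategies: Sequence[BackoffStrategy], headers: Mapping[str, str]
) -> Optional[float]:
    """Use the first strategy that can be evaluated for this response."""
    for strategy in strategies:
        wait = strategy(headers)
        if wait is not None:
            return wait
    return None


strategies = [header_strategy("wait_time"), constant_strategy(5.0)]
```
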

The `requester` can be configured to use a `CompositeErrorHandler`, which sequentially iterates over a list of error handlers, enabling different retry mechanisms for different types of errors.

In this example, a constant backoff of 5 seconds will be applied if the response contains a "code" field, and an exponential backoff will be applied if the error code is 403:

```yaml
requester:
  <...>
  error_handler:
    type: "CompositeErrorHandler"
    error_handlers:
      - response_filters:
          - predicate: "{{ 'code' in response }}"
            action: RETRY
        backoff_strategies:
          - type: "ConstantBackoffStrategy"
            backoff_time_in_seconds: 5
      - response_filters:
          - http_codes: [ 403 ]
            action: RETRY
        backoff_strategies:
          - type: "ExponentialBackoffStrategy"
```
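
The sequential iteration can be sketched as a list of (matcher, backoff) pairs, where the first handler whose filter matches decides the wait. This is an illustrative model only: the type aliases, the `composite_backoff` function, and the `2 ** attempt` exponential formula are assumptions, and the interpolated predicate is replaced by a Python lambda.

```python
from typing import Any, Callable, Mapping, Optional, Sequence, Tuple

Matcher = Callable[[int, Mapping[str, Any]], bool]
Backoff = Callable[[int], float]
Handler = Tuple[Matcher, Backoff]


def composite_backoff(
    handlers: Sequence[Handler],
    status_code: int,
    response: Mapping[str, Any],
    attempt: int,
) -> Optional[float]:
    """Return the wait chosen by the first handler that matches, else None."""
    for matches, backoff in handlers:
        if matches(status_code, response):
            return backoff(attempt)
    return None


handlers = [
    # Constant 5-second backoff when the response contains a "code" field.
    (lambda status, response: "code" in response, lambda attempt: 5.0),
    # Exponential backoff (assumed formula) for HTTP 403.
    (lambda status, response: status == 403, lambda attempt: float(2 ** attempt)),
]
```
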
26 changes: 26 additions & 0 deletions docs/connector-development/config-based/index.md
@@ -0,0 +1,26 @@
# Index

## From scratch

- [Overview](overview.md)
- [Yaml structure](overview.md)
- [Reference docs](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.html)

## Concepts

- [Authentication](authentication.md)
- [Error handling](error-handling.md)
- [Pagination](pagination.md)
- [Record selection](record-selector.md)
- [Request options](request-options.md)
- [Stream slicers](stream-slicers.md)

## Tutorial

0. [Getting started](tutorial/0-getting-started.md)
1. [Creating a source](tutorial/1-create-source.md)
2. [Installing dependencies](tutorial/2-install-dependencies.md)
3. [Connecting to the API](tutorial/3-connecting-to-the-API-source.md)
4. [Reading data](tutorial/4-reading-data.md)
5. [Incremental reads](tutorial/5-incremental-reads.md)
6. [Testing](tutorial/6-testing.md)