Tutorial and documentation for config-based connectors #15027

Merged
merged 99 commits into from
Aug 12, 2022
Changes from 27 commits
4855a72
5-step tutorial
girarda Jul 25, 2022
138bd52
move
girarda Jul 26, 2022
637c2a7
tiny bit of editing
girarda Jul 26, 2022
9fabad8
Merge branch 'master' into alex/lowcodeTutorial
girarda Jul 28, 2022
ff775e3
Update tutorial
girarda Jul 28, 2022
6ebee74
update docs
girarda Aug 1, 2022
ff2b602
reset
girarda Aug 1, 2022
906f915
move files
girarda Aug 1, 2022
a64c758
record selector, request options, and more links
girarda Aug 1, 2022
2099b24
update
girarda Aug 1, 2022
8bab845
update
girarda Aug 1, 2022
a03e6f3
connector definition
girarda Aug 1, 2022
d444d74
link
girarda Aug 1, 2022
71e0a5b
links
girarda Aug 2, 2022
a9512ab
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
7b36ca6
update example
girarda Aug 2, 2022
218bfd9
footnote
girarda Aug 2, 2022
78599e3
typo
girarda Aug 2, 2022
c9bfb99
document string interpolation
girarda Aug 2, 2022
58567c5
note on string interpolation
girarda Aug 2, 2022
76a95ae
update
girarda Aug 2, 2022
8feb1a6
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 2, 2022
ecf9b34
fix code sample
girarda Aug 2, 2022
990d44a
fix
girarda Aug 2, 2022
f9b1b68
update sample
girarda Aug 2, 2022
945cc3e
fix
girarda Aug 2, 2022
a3349df
use the actual config
girarda Aug 2, 2022
318e613
Update as per comments
girarda Aug 7, 2022
c54b0c4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 8, 2022
9cc1e4b
write as yaml
girarda Aug 8, 2022
f096296
typo
girarda Aug 8, 2022
8bd35b4
Clarify options overloading
girarda Aug 8, 2022
cfb4528
clarify that docker must be running
girarda Aug 8, 2022
85d5afb
remove extra footnote
girarda Aug 8, 2022
61a75b5
use venv directly
girarda Aug 8, 2022
7e1dc95
Apply suggestions from code review
girarda Aug 8, 2022
3df5071
signup instructions
girarda Aug 8, 2022
b074832
update
girarda Aug 8, 2022
672eb16
clarify that both dot and bracket notations are interchangeable
girarda Aug 8, 2022
6575c9d
Clarify how check works
girarda Aug 8, 2022
e747b4a
create spec and config before updating connector definition
girarda Aug 8, 2022
d5ac31d
clarify what now_local() is
girarda Aug 8, 2022
fdce2c6
rename to yaml structure
girarda Aug 8, 2022
198b421
Go through tutorial and update end of section code samples
girarda Aug 9, 2022
18bc40f
fix link
girarda Aug 9, 2022
f4e5ed4
update
girarda Aug 9, 2022
83e3845
update code samples
girarda Aug 9, 2022
bab017b
Update code samples
girarda Aug 9, 2022
37d1fde
Update to bracket notation
girarda Aug 9, 2022
1fc83fe
remove superfluous comments
girarda Aug 9, 2022
6944916
Update docs/connector-development/config-based/tutorial/2-install-dep…
girarda Aug 9, 2022
c317a49
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
096a370
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
ff804be
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
49be031
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
6790422
Update docs/connector-development/config-based/tutorial/3-connecting-…
girarda Aug 9, 2022
34214b4
Update docs/connector-development/config-based/tutorial/4-reading-dat…
girarda Aug 9, 2022
bf9a205
fix path
girarda Aug 9, 2022
74e4de8
update
girarda Aug 9, 2022
ca0f93c
motivation blurp
girarda Aug 9, 2022
46b2ee4
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 10, 2022
9cfd223
warning
girarda Aug 10, 2022
65a966c
warning
girarda Aug 10, 2022
dd4437c
fix code block
girarda Aug 10, 2022
365c0dc
update code samples
girarda Aug 10, 2022
ebaa701
update code sample
girarda Aug 10, 2022
aacc30a
update code samples
girarda Aug 10, 2022
3b1e85f
small updates
girarda Aug 10, 2022
b4498f3
update yaml structure
girarda Aug 10, 2022
306e9e5
custom class example
girarda Aug 10, 2022
c2d9b86
language annotations
girarda Aug 10, 2022
562844b
update warning
girarda Aug 11, 2022
faada9a
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 11, 2022
08487f7
Update tutorial to use dpath extractor
girarda Aug 11, 2022
30d25c0
Update record selector docs
girarda Aug 11, 2022
63a295c
unit test
girarda Aug 11, 2022
019cc0a
link to contributing
girarda Aug 12, 2022
117ee2f
tiny update
girarda Aug 12, 2022
3a00dac
$ in front of commands
girarda Aug 12, 2022
b2040fc
$ in front of commands
girarda Aug 12, 2022
db243a8
More readings
girarda Aug 12, 2022
cc0d76c
link to existing config-based connectors
girarda Aug 12, 2022
6cbdaa0
index
girarda Aug 12, 2022
619bf37
update
girarda Aug 12, 2022
9a4f1c9
delete broken link
girarda Aug 12, 2022
5337868
supported features
girarda Aug 12, 2022
e4919d5
update
girarda Aug 12, 2022
e27fade
Add some links
girarda Aug 12, 2022
048bddb
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
019aad2
Update docs/connector-development/config-based/record-selector.md
girarda Aug 12, 2022
cc308f2
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
e3cffc8
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
785a3e4
Update docs/connector-development/config-based/overview.md
girarda Aug 12, 2022
2db7694
mention the unit
girarda Aug 12, 2022
2a7d5fc
headers
girarda Aug 12, 2022
eba0322
remove mentions of interpolating on stream slice, etc.
girarda Aug 12, 2022
e7f0023
Merge branch 'master' into alex/lowcodeTutorial
girarda Aug 12, 2022
8353121
update
girarda Aug 12, 2022
e6637d3
exclude config-based docs
girarda Aug 12, 2022
Expand Up @@ -78,7 +78,7 @@ def token(self) -> str:

class BasicHttpAuthenticator(AbstractHeaderAuthenticator):
"""
Builds auth based off the basic authentication scheme as defined by RFC 7617, which transmits credentials as USER ID/password pairs, encoded using bas64
Builds auth based off the basic authentication scheme as defined by RFC 7617, which transmits credentials as USER ID/password pairs, encoded using base64
https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme

The header is of the form
Expand Up @@ -4,6 +4,7 @@

from typing import Mapping, Type

from airbyte_cdk.sources.declarative.auth.oauth import DeclarativeOauth2Authenticator
from airbyte_cdk.sources.declarative.auth.token import ApiKeyAuthenticator, BasicHttpAuthenticator, BearerAuthenticator
from airbyte_cdk.sources.declarative.datetime.min_max_datetime import MinMaxDatetime
from airbyte_cdk.sources.declarative.declarative_stream import DeclarativeStream
Expand Down Expand Up @@ -56,6 +57,7 @@
"ListStreamSlicer": ListStreamSlicer,
"MinMaxDatetime": MinMaxDatetime,
"NoPagination": NoPagination,
"OAuthAuthenticator": DeclarativeOauth2Authenticator,
"OffsetIncrement": OffsetIncrement,
"RecordSelector": RecordSelector,
"RemoveFields": RemoveFields,
Expand Up @@ -15,7 +15,7 @@ class YamlParser(ConnectionDefinitionParser):
"""
Parses a Yaml string to a ConnectionDefinition
In addition to standard Yaml parsing, the input_string can contain refererences to values previously defined.
In addition to standard Yaml parsing, the input_string can contain references to values previously defined.
This parser will dereference these values to produce a complete ConnectionDefinition.
References can be defined using a *ref(<arg>) string.
Expand Up @@ -24,7 +24,7 @@ class LimitPaginator(Paginator):
* updates the request path with "{{ response._metadata.next }}"
paginator:
type: "LimitPaginator"
limit_value: 10
page_size: 10
limit_option:
option_type: request_parameter
field_name: page_size
Expand All @@ -41,7 +41,7 @@ class LimitPaginator(Paginator):
`
paginator:
type: "LimitPaginator"
limit_value: 5
page_size: 5
limit_option:
option_type: header
field_name: page_size
Expand All @@ -58,7 +58,7 @@ class LimitPaginator(Paginator):
`
paginator:
type: "LimitPaginator"
limit_value: 5
page_size: 5
limit_option:
option_type: request_parameter
field_name: page_size
70 changes: 70 additions & 0 deletions docs/connector-development/config-based/authentication.md
@@ -0,0 +1,70 @@
# Authentication

The `Authenticator` defines how to configure outgoing HTTP requests to authenticate on the API source.

## Authenticators

### ApiKeyAuthenticator

The `ApiKeyAuthenticator` sets an HTTP header on outgoing requests.
The following definition will set the header "Authorization" with a value "Bearer hello":

```
authenticator:
  type: "ApiKeyAuthenticator"
  header: "Authorization"
  token: "Bearer hello"
```

### BearerAuthenticator

The `BearerAuthenticator` is a specialized `ApiKeyAuthenticator` that always sets the header "Authorization" with the value "Bearer {token}".
The following definition will set the header "Authorization" with a value "Bearer hello":

```
authenticator:
  type: "BearerAuthenticator"
  token: "hello"
```

More information on bearer authentication can be found [here](https://swagger.io/docs/specification/authentication/bearer-authentication/)
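
The relationship between the two authenticators above can be sketched in a few lines of Python. This is only an illustration of the headers described in this section; `api_key_header` and `bearer_header` are hypothetical helper names, not functions from the CDK.

```python
from typing import Dict


def api_key_header(header: str, token: str) -> Dict[str, str]:
    """Return the single header an API-key style authenticator would attach."""
    return {header: token}


def bearer_header(token: str) -> Dict[str, str]:
    """Bearer auth: the API-key case with a fixed header name and a "Bearer " prefix."""
    return api_key_header("Authorization", f"Bearer {token}")


# Both definitions in this section end up producing the same header.
print(api_key_header("Authorization", "Bearer hello"))
print(bearer_header("hello"))
```

Expressing the bearer case in terms of the API-key case mirrors the description of `BearerAuthenticator` as a specialized `ApiKeyAuthenticator`.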

### BasicHttpAuthenticator

The `BasicHttpAuthenticator` sets the "Authorization" header with a (USER ID/password) pair, encoded using base64 as per [RFC 7617](https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme).
The definition below will set the header "Authorization" with a value "Basic <encoded credentials>".

The encoding scheme is:

1. Concatenate the username and the password with `":"` in between.
2. Encode the resulting string in base64.
3. Decode the result as UTF-8.

```
authenticator:
  type: "BasicHttpAuthenticator"
  username: "hello"
  password: "world"
```

The password is optional. Authenticating with APIs using Basic HTTP and a single API key can be done as:

```
authenticator:
type: "BasicHttpAuthenticator"
username: "hello"
```
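
The three encoding steps can be sketched with the Python standard library. `basic_auth_header` is a hypothetical helper written for this example; treating a missing password as an empty string is an assumption about how the optional password is handled.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str = "") -> Dict[str, str]:
    """Build a Basic authentication header following the three steps above."""
    # Step 1: concatenate username and password with ":" in between.
    credentials = f"{username}:{password}"
    # Step 2: encode the resulting string in base64 (b64encode works on bytes).
    encoded = base64.b64encode(credentials.encode("utf-8"))
    # Step 3: decode the base64 bytes back into a UTF-8 string for the header.
    return {"Authorization": f"Basic {encoded.decode('utf-8')}"}


print(basic_auth_header("hello", "world"))  # {'Authorization': 'Basic aGVsbG86d29ybGQ='}
print(basic_auth_header("hello"))           # password omitted (assumed to mean "")
```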

### OAuth

OAuth authentication is supported through the `OAuthAuthenticator`, which requires the following parameters:

- token_refresh_endpoint: The endpoint to refresh the access token
- client_id: The client id
- client_secret: The client secret
- refresh_token: The token used to refresh the access token
- scopes: The scopes to request
- token_expiry_date: The access token expiration date
- access_token_name: The field to extract the access token from in the response
- expires_in_name: The field to extract expires_in from in the response
- refresh_request_body: The request body to send in the refresh request
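
To show how these parameters might fit together, here is a rough, offline sketch of a refresh-token exchange. It is not the `OAuthAuthenticator` implementation: the helper names, the `grant_type` and space-separated `scope` fields (taken from the generic OAuth 2.0 refresh grant), and the default field names `access_token` and `expires_in` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Mapping, Optional, Sequence, Tuple


def build_refresh_payload(
    client_id: str,
    client_secret: str,
    refresh_token: str,
    scopes: Optional[Sequence[str]] = None,
    refresh_request_body: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the body that would be sent to token_refresh_endpoint (hypothetical helper)."""
    payload: Dict[str, Any] = {
        "grant_type": "refresh_token",  # assumption: standard OAuth 2.0 refresh grant
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    if scopes:
        payload["scope"] = " ".join(scopes)  # assumption: space-separated scopes
    payload.update(refresh_request_body or {})  # extra fields from refresh_request_body
    return payload


def parse_refresh_response(
    response: Mapping[str, Any],
    access_token_name: str = "access_token",
    expires_in_name: str = "expires_in",
    now: Optional[datetime] = None,
) -> Tuple[str, datetime]:
    """Extract the access token and compute its expiry date from the response."""
    now = now or datetime.now(timezone.utc)
    expiry = now + timedelta(seconds=int(response[expires_in_name]))
    return response[access_token_name], expiry


payload = build_refresh_payload("id", "secret", "rt", scopes=["read", "write"])
print(payload)
print(parse_refresh_response({"access_token": "abc", "expires_in": 3600}))
```

The two `*_name` parameters only matter when an API returns the token or its lifetime under non-standard field names.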
243 changes: 243 additions & 0 deletions docs/connector-development/config-based/connector-definition.md
@@ -0,0 +1,243 @@
# Connector definition
Review comment (Contributor): I would recommend renaming this to YAML structure
Reply (Contributor Author): done


Connectors are defined as a YAML configuration describing the connector's Source.

Two top-level fields are required:

1. `streams`: list of streams that are part of the source
2. `check`: component describing how to check the connection

The configuration will be validated against a JSON Schema, which defines the set of valid properties.
Review comment (Contributor Author): soon (tm). We don't have the JSON schema yet.

We recommend using the `Configuration Based Source` template from the template generator in `airbyte-integrations/connector-templates/generator` to generate the basic file structure.

See the [tutorial for a complete connector definition](tutorial/6-testing.md)

## Object instantiation

This section describes the objects that are instantiated from the YAML definition.

If the component is a literal, then it is returned as is:

```
3
```

will result in

```
3
```

If the component is a mapping with a "class_name" field,
an object of type "class_name" will be instantiated by passing the mapping's other fields to the constructor:

```
{
  "class_name": "fully_qualified.class_name",
  "a_parameter": 3,
  "another_parameter": "hello"
}
```

Review comment (Contributor): should this just be written as YAML to stay consistent?
Reply (Contributor Author): yes, done

will result in

```
fully_qualified.class_name(a_parameter=3, another_parameter="hello")
```
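
The "class_name" rule can be illustrated with `importlib`. This is a minimal sketch rather than the factory's actual code: `instantiate` is a hypothetical helper, and `fractions.Fraction` is used only because it is a standard-library class that accepts keyword arguments.

```python
import importlib
from typing import Any, Mapping


def instantiate(definition: Mapping[str, Any]) -> Any:
    """Import the class named by "class_name" and call it with the remaining fields."""
    kwargs = dict(definition)
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = kwargs.pop("class_name").rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)


# Equivalent to fractions.Fraction(numerator=3, denominator=4)
obj = instantiate({"class_name": "fractions.Fraction", "numerator": 3, "denominator": 4})
print(obj)  # 3/4
```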

If the component definition is a mapping with a "type" field,
the factory will look up the [CLASS_TYPES_REGISTRY](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/class_types_registry.py), replace the "type" field with "class_name" -> CLASS_TYPES_REGISTRY[type],
and instantiate the object from the resulting mapping.

If the component definition is a mapping with neither a "class_name" nor a "type" field,
the factory will make a best-effort attempt at inferring the component type by looking up the parent object's constructor type hints.
If the type hint is an interface present in [DEFAULT_IMPLEMENTATIONS_REGISTRY](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/default_implementation_registry.py),
then the factory will create an object of its default implementation.

If the component definition is a list, then the factory will iterate over the elements of the list,
instantiate its subcomponents, and return a list of instantiated objects.

If the component has subcomponents, the factory will create the subcomponents before instantiating the top level object:

```
{
  "type": "TopLevel",
  "param": {
    "type": "ParamType",
    "k": "v"
  }
}
```

will result in

```
TopLevel(param=ParamType(k="v"))
```

More details on object instantiation can be found [here](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.parsers.html?highlight=factory#airbyte_cdk.sources.declarative.parsers.factory.DeclarativeComponentFactory).
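
The "type" lookup, list handling, and depth-first creation of subcomponents can be sketched as a small recursive function. `REGISTRY`, `create`, and the two dataclasses are hypothetical stand-ins for this example; the real factory also handles "class_name" and type-hint inference, which are left out here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ParamType:
    k: str


@dataclass
class TopLevel:
    param: ParamType


# Hypothetical stand-in for CLASS_TYPES_REGISTRY: maps a "type" name to a class.
REGISTRY = {"TopLevel": TopLevel, "ParamType": ParamType}


def create(definition: Any) -> Any:
    """Instantiate a component definition, building subcomponents first."""
    if isinstance(definition, list):
        return [create(item) for item in definition]
    if isinstance(definition, dict):
        # Subcomponents are created before the object that contains them.
        kwargs = {key: create(value) for key, value in definition.items() if key != "type"}
        if "type" in definition:
            return REGISTRY[definition["type"]](**kwargs)
        return kwargs
    return definition  # literals are returned as is


component = create({"type": "TopLevel", "param": {"type": "ParamType", "k": "v"}})
print(component)  # TopLevel(param=ParamType(k='v'))
```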

### $options

Parameters can be passed down from a parent component to its subcomponents using the $options key.
This can be used to avoid repetitions.

```
outer:
  $options:
    MyKey: MyValue
  inner:
    k2: v2
```

In the example above, if both outer and inner are types with a "MyKey" field, both of them will evaluate to "MyValue".
Review comment (Contributor): what happens if inner.options.MyKey is defined to YourValue?
Reply (Contributor Author): Added clarification with an example


The value can also be used for string interpolation:

```
outer:
  $options:
    MyKey: MyValue
  inner:
    k2: "MyKey is {{ options.MyKey }}"
```

In this example, outer.inner.k2 will evaluate to "MyKey is MyValue".
Review comment (Contributor): Given the interpolation this would evaluate to "MyKey is MyValue" correct?
Reply (Contributor Author): yes, fixed.
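
One way to picture how `$options` flows from a parent to its subcomponents is the sketch below. `propagate_options` is a hypothetical helper, nested lists are ignored for brevity, and the rule that a child's own `$options` take precedence over inherited ones is an assumption of this sketch rather than something this page states.

```python
from typing import Any, Mapping, Optional


def propagate_options(node: Any, inherited: Optional[Mapping[str, Any]] = None) -> Any:
    """Copy each mapping, attaching the options visible at that level."""
    if not isinstance(node, dict):
        return node
    # Assumption: options declared on the node override the inherited ones.
    options = {**(inherited or {}), **node.get("$options", {})}
    resolved = {"$options": options} if options else {}
    for key, value in node.items():
        if key != "$options":
            resolved[key] = propagate_options(value, options)
    return resolved


definition = {"outer": {"$options": {"MyKey": "MyValue"}, "inner": {"k2": "v2"}}}
result = propagate_options(definition)
print(result["outer"]["inner"])  # inner now sees MyKey as well as its own k2
```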


## References

Strings can contain references to previously defined values.
The parser will dereference these values to produce a complete ConnectionDefinition.

References can be defined using a *ref(<arg>) string.

```
key: 1234
reference: "*ref(key)"
```

will produce the following definition:

```
key: 1234
reference: 1234
```

This also works with objects:

```
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs: "*ref(key_value_pairs)"
```

will produce the following definition:

```
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  k1: v1
  k2: v2
```

The $ref keyword can be used to refer to an object and enhance it with additional key-value pairs:

```
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  $ref: "*ref(key_value_pairs)"
  k3: v3
```

will produce the following definition:

```
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  k1: v1
  k2: v2
  k3: v3
```

References can also point to nested values.
Nested references are ambiguous because a key can itself contain a `.`.
In this example, we want to refer to the limit key in the dict object:

```
dict:
  limit: 50
limit_ref: "*ref(dict.limit)"
```

will produce the following definition:

```
dict:
  limit: 50
limit_ref: 50
```

Whereas here, we want to access the top-level `nested.path` value:

```
nested:
  path: "first one"
nested.path: "uh oh"
value: "*ref(nested.path)"
```

will produce the following definition:

```
nested:
  path: "first one"
nested.path: "uh oh"
value: "uh oh"
```

To resolve the ambiguity, we try looking for the reference key at the top level, and then traverse the structs downward
until we find a key with the given path, or until there is nothing to traverse.

More details on referencing values can be found [here](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.parsers.html?highlight=yamlparser#airbyte_cdk.sources.declarative.parsers.yaml_parser.YamlParser).
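
The dereferencing rules in this section can be approximated in plain Python. This is a sketch of the behaviour described here, not the `YamlParser` code: `lookup` and `resolve` are hypothetical helpers that operate on an already-parsed dictionary, and circular references are not handled.

```python
import re
from typing import Any, Mapping

REF_PATTERN = re.compile(r"^\*ref\((.+)\)$")


def lookup(scope: Mapping[str, Any], path: str) -> Any:
    """Find `path`, preferring a literal key at this level before descending."""
    if path in scope:
        return scope[path]
    parts = path.split(".")
    # Try progressively shorter prefixes as a key, then traverse downward.
    for i in range(len(parts) - 1, 0, -1):
        head, tail = ".".join(parts[:i]), ".".join(parts[i:])
        if isinstance(scope.get(head), dict):
            try:
                return lookup(scope[head], tail)
            except KeyError:
                continue
    raise KeyError(path)


def resolve(node: Any, root: Mapping[str, Any]) -> Any:
    """Return a copy of `node` with every *ref(...) string replaced by its target."""
    if isinstance(node, str):
        match = REF_PATTERN.match(node)
        return resolve(lookup(root, match.group(1)), root) if match else node
    if isinstance(node, list):
        return [resolve(item, root) for item in node]
    if isinstance(node, dict):
        resolved = {}
        if "$ref" in node:
            # Start from the referenced mapping, then add the extra key-value pairs.
            resolved.update(resolve(node["$ref"], root))
        for key, value in node.items():
            if key != "$ref":
                resolved[key] = resolve(value, root)
        return resolved
    return node


definition = {
    "nested": {"path": "first one"},
    "nested.path": "uh oh",
    "value": "*ref(nested.path)",
    "dict": {"limit": 50},
    "limit_ref": "*ref(dict.limit)",
}
print(resolve(definition, definition))
```

Checking the full path as a key before splitting on `.` is what makes `*ref(nested.path)` pick the top-level `nested.path` entry in the ambiguous example.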

## String interpolation

String values can be evaluated as Jinja2 templates.

Interpolation strategy using the Jinja2 template engine.
Review comment (Contributor): This reads more like a title header. But also feels redundant, we could probably just remove the sentence
Reply (Contributor Author): deleted


If the input string is a raw string, the interpolated string will be the same.
`"hello world" -> "hello world"`

The engine will evaluate the content passed within `{{...}}`, interpolating the keys from context-specific arguments.
The "options" keyword ([see $options](connector-definition.md#object-instantiation)) can be referenced.

For example, inner_object.key will evaluate to "Hello airbyte" at runtime.

```
some_object:
  $options:
    name: "airbyte"
  inner_object:
    key: "Hello {{ options.name }}"
```
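
To make the evaluation concrete without depending on Jinja2, the sketch below uses a deliberately simplified stand-in that only understands dotted lookups such as `{{ options.name }}`. It is not the Jinja2 engine and supports none of its expressions or macros; `interpolate` is a hypothetical helper.

```python
import re
from typing import Any

# Matches "{{ some.dotted.path }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def interpolate(template: str, **context: Any) -> str:
    """Replace each placeholder with the value found by walking the context."""

    def replace(match: re.Match) -> str:
        value: Any = context
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return PLACEHOLDER.sub(replace, template)


print(interpolate("hello world"))  # a raw string is returned unchanged
print(interpolate("Hello {{ options.name }}", options={"name": "airbyte"}))
```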

Some components also pass in additional arguments to the context.
This is the case for the [record selector](record-selector.md), which passes in an additional `response` argument.

In addition to passing additional values through the kwargs argument, macros can be called from within the string interpolation.
For example,
`"{{ max(2, 3) }}" -> 3`

The macros available can be found [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/macros.py).

Additional information on jinja templating can be found at https://jinja.palletsprojects.com/en/3.1.x/templates/#