Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDK: Support YAML specs #11668

Closed
4 of 6 tasks
sherifnada opened this issue Apr 1, 2022 · 1 comment
Closed
4 of 6 tasks

CDK: Support YAML specs #11668

sherifnada opened this issue Apr 1, 2022 · 1 comment
Assignees
Labels
CDK Connector Development Kit team/extensibility type/enhancement New feature or request

Comments

@sherifnada
Copy link
Contributor

sherifnada commented Apr 1, 2022

Tell us about the problem you're trying to solve

To declare a spec, a connector developer often places a spec.json file in the connector module directory (example). The CDK automatically looks for a spec.json file and uses that as the spec where possible. Specs are most commonly created in JSON (sometimes in Pydantic), which is a bit more cumbersome to work with than YAML.

Describe the solution you’d like

I want the CDK to support declaring specs as YAMLs.

implementation

Code changes

  1. add pyyaml to the CDK's dependencies in setup.py and add import yaml in files you're doing yaml parsing
  2. in connector.py look for a .yaml file, prioritizing it over a .json file. If it is found, read the YAML file. You'll probably also want to log a warning or raise an exception to the developer if you find both a .json and a .yaml, letting them know that YAML takes precedence and that they really shouldn't use both simultaneously.

Testing

The current function (which looks for a .json to read by default) is not unit tested for no good reason. As part of this ticket, we should add unit tests in test_connector.py to verify all these behaviors. It may be a good idea to do this first (a la TDD) before doing the refactor to support YAMLs.

Docs changes

Edit this piece of the docs to indicate that it should be possible to use YAML files. Although, it might be better to move the section regarding spec under the Source section, as that behavior (autoreading json/yaml specs) also applies to the Source class.

Publish the new CDK version

instructions in README or here. Make sure to update the changelog.md file!

Update a single downstream consumer

Make a PR to any connector (let's say source-stripe) bumping its CDK dependency and changing its spec to be a YAML, then publish a new version of that connector

Acceptance criteria:

  1. Create a new convention that in addition to dropping a spec.json file in the connector's module directory, the developer could instead declare a spec.yaml file which works exactly the same way but uses YAML. By default, the CDK should first look for a spec.yaml and if not found, a spec.json.
  2. Update the CDK docs to inform the user about this (i may have missed a couple more references to spec in the docs, make sure to look around to see if we need to update more docs)
  3. create follow-up tickets to change the code generator and tutorials to use YAMLs instead of JSONs

Follow up work

  • Do we want to update all connectors to use YAML instead of JSON? we can do this completely programmatically?
  • airbytehq/airbyte-internal-issues#557

Describe the alternative you’ve considered or used

PyDantic is also an option. We could declare a spec.py file which leverages pydantic, which would also be nice.

@sherifnada sherifnada added type/enhancement New feature or request good first issue CDK Connector Development Kit labels Apr 1, 2022
@noahkawasakigoogle
Copy link
Contributor

noahkawasakigoogle commented Apr 1, 2022

+1 on YAML! Way easier to read and way more conventional for IaC stuff and dbt people, etc. I assumed there was some reason JSON was being used. Would this not be applicable to all connectors, not just CDK?

Dont know much about pydantic but seems like it would be harder for the Java core of airbyte to work with if there was ever a refactor where core needed to pull spec info from these files. Not familiar enough with all the places spec's are used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit team/extensibility type/enhancement New feature or request
Projects
Status: Done
Development

No branches or pull requests

4 participants