Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify oauthSpecification to allow working with oneOfs #6456

Merged
merged 7 commits into from
Sep 27, 2021

Conversation

sherifnada
Copy link
Contributor

@sherifnada sherifnada commented Sep 27, 2021

What

While working on oauth UI/BE interface, we realized the current authSpecification block is insufficiently expressive to encode situations where the oauth credentials live inside a oneOf block. For example, if there is a oneOf block like the following:

credentials: {
oneOf: [
  {
     title: Webflow Oauth
     type: object
     properties:
       client_id:
       client_secret:
       refresh_token:
  },
  {
     title: API Key auth
    type: object
    properties: {
       apiKey
    }
  }
]
}

Then the current authSpecification primitive:

authSpecification:
  type: oauth2.0
  oauth2Specification:
     initParameters: [["credentials", "client_id"], ["credentials", "client_secret"]]

has some problems:

  1. It's inaccurate in that a connector might support multiple auth formats (Oauth2, openID connect, API key, etc...)
  2. it is incapable of expressing differences within a oneOf (or any combination keyword). For example, if there are two slightly different oauth blocks inside a oneOf (e.g: if one can be obtained via the Oauth webflow and the other via an oauth playground that doesn't require a developer app) then there can be name collisions that make it impossible to know when the UI should display the "authorize" button or not.

This leaves us with a few options:

  1. Disable multiple auth options for a connector and if a connector uses oauth, make that the only auth method. This sounds tempting but has some significant drawbacks. For many connectors for an OSS users, oauth2 is strictly worse than API key. This means we need to fork the connectors and have one used in cloud and one used in OSS. But while we were able to fork java connectors a little more easily such as in Create a secure-only Postgres Source #6419, it's more difficult for a python connector in our current monorepo setup for the reasons outlined in Allow using intra-monorepo python dependencies without publishing them to pypi #6455. I would have loved to just go with this limitation for now but unfortunately it is not workable within a reasonable timeline/contains delivery risks.
  2. Handle the oneOf case now via one of the following options
    1. embed the auth specification into nodes in connectionSpecification directly e.g:
    "connectionSpecification": {
        "properties": {
          "credentials": {
            "title": "Authentication mechanism",
            "type": "object",
            "description": "Choose either OAuth2.0 flow or provide your own JWT credentials for service account",
            "oneOf": [
              {
                "type": "object",
                "title": "OAuth2.0 authorization",
                "auth": { <<<<<<<<<< THIS IS THE CUSTOM BLOCK
                   "type": "oauth2.0",
                   "initParams": [["client_id"], ["client_secret"]],
                   "outputParams": [["refresh_token"]]
                },
                "properties": {
                  "option_title": {
                    "type": "string",
                    "const": "Default OAuth2.0 authorization"
                  },
                  "client_id": { "type": "string" },
                  "client_secret": { "type": "string", "airbyte_secret": true },
                  "refresh_token": { "type": "string", "airbyte_secret": true }
                },
                "required": ["client_id", "client_secret", "refresh_token"],
                "additionalProperties": false
              },
              ...
            ]
          }
        }
    }
    
    1. handle the path in the top-level object by adding a property rootObject which tells us which object contains the oauth logic. This way, if the object is ever used inside a oneOf then a UI understands that when this object is selected, the "authorize" button should be shown.
     {
           "connectionSpecification": {
             "properties": {
               "credentials": {
                 "title": "Authentication mechanism",
                 "type": "object",
                 "description": "Choose either OAuth2.0 flow or provide your own JWT credentials for service account",
                 "oneOf": [
                   {
                     "type": "object",
                     "title": "OAuth2.0 authorization",
                     "properties": {
                       "option_title": {
                         "type": "string",
                         "const": "Default OAuth2.0 authorization"
                       },
                       "client_id": { "type": "string" },
                       "client_secret": { "type": "string", "airbyte_secret": true },
                       "refresh_token": { "type": "string", "airbyte_secret": true }
                     },
                     "required": ["client_id", "client_secret", "refresh_token"],
                     "additionalProperties": false
                   },
                   ...
                 ]
               }
           },
           "authSpecification": {
             "auth_type": "oauth2.0",
             "oauth2Specification": {
               "rootObject": ["credentials", 0],
               "oauthFlowInitParameters": [["client_id"], ["client_secret"]]
               "oauthFlowOutputParameters": [["refresh_token"]]
             }
           }
         }
    

The tradeoffs between the approaches are:

  • Approach 2.i (embed/in-line) makes it easier to express multiple possible auth flows. Let’s say you have oneof containing openID connect and oauth2.0. Then auth specification must also have a way of expressing this oneof. The most natural way of doing it is inline. Otherwise you have to follow all this pointer logic because it lives at the top level. drawbacks of that approach is it’s harder to validate and it mixes domain-specific logic inside the connector's spec
  • Approach 2.ii (top-level) keeps the concerns very separated but is probably more difficult for a UI to consume. Plus, it

I'm honestly not sure exactly what the right solution is. I like inlining for the reason mentioned above (more natural way to express things). But
at the same time, inlining doesn't actually give us multiple auth mechanisms (the assumptions baked in the API/UI are that there is only ever one oauth implementation) so it's probably not worth completely changing course to half-implement a feature. Since all of these are two way doors so I'm choosing something and we'll probably have to come back to adjust it in the future.

How

This PR Takes approach 2.ii

Recommended reading order

  1. config.yaml/airbyte_protocol.yaml (they're the same change expressed in both the protocol and the API layer)

@github-actions github-actions bot added area/api Related to the api area/platform issues related to the platform area/documentation Improvements or additions to documentation area/protocol labels Sep 27, 2021
@sherifnada sherifnada temporarily deployed to more-secrets September 27, 2021 08:42 Inactive
@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets September 27, 2021 09:21 Inactive
@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets September 27, 2021 09:46 Inactive
@github-actions github-actions bot added area/connectors Connector related issues CDK Connector Development Kit labels Sep 27, 2021
@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets September 27, 2021 10:01 Inactive
@ChristopheDuong ChristopheDuong deleted the sherif/oauth-improvement branch September 27, 2021 16:00
@avida avida mentioned this pull request Sep 27, 2021
38 tasks
Copy link
Contributor

@cgardens cgardens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada @ChristopheDuong why do we need a rootObject field? why not just use the fully qualified path in oauthFlowInitParameters and oauthFlowOutputParameters. you've structured those fields as arrays of arrays anyway.

@ChristopheDuong
Copy link
Contributor

ChristopheDuong commented Sep 28, 2021

the root object is enforcing that all oauth params have a common parent. Effectively it helps the UI know when to show the oauth button in case auth lives inside a oneOf block

@sherifnada , do we need to publish changes to protocol in some docker images?
Like upgrading the airbyte-protocol's version and also update reference to it such as this one?

FROM airbyte/base-airbyte-protocol-python:0.1.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Related to the api area/documentation Improvements or additions to documentation area/platform issues related to the platform area/protocol CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants