-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Artem1205/lowcode cdk authenticator schemaloader #19718
Artem1205/lowcode cdk authenticator schemaloader #19718
Conversation
…tor-schemaloader' into artem1205/lowcode-cdk-authenticator-schemaloader # Conflicts: # airbyte-cdk/python/airbyte_cdk/sources/declarative/schema/json_file_schema_loader.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/schema/json_file_schema_loader.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/auth/oauth.py
Outdated
Show resolved
Hide resolved
@artem1205 apologies for the delay in review, will review tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Echoing Alex, it would be very helpful in the PR description to more concretely explain what the use case is for the reference resolution in the schema loader. Also had some notes around how the authenticator is being refactored
@@ -29,10 +29,9 @@ def get_auth_header(self) -> Mapping[str, Any]: | |||
def get_access_token(self) -> str: | |||
"""Returns the access token""" | |||
if self.token_has_expired(): | |||
t0 = pendulum.now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one slight drawback to moving the now()
generation into the authenticator set token expiry methods is that we do have a very small corner case where we may attempt a request with a token that has expired.
When we were using pendulum.now()
before refreshing the access token, the corner case meant we might think a token is already expired and refresh it earlier than necessary.
However, now with the reverse of calling pendulum.now()
after already refreshing the token, then at the very end of expiry, we may attempt a request where the token has already expired.
I'd argue that refreshing the token a little more often than necessary is better than attempting requests with an already expired token so just to be on the safe side, I think we should continue to get the current time before calling self.refresh_access_token()
and pass it into self.set_token_expiry_date()
Its a little verbose to pass both params in, but we're already changing the interface regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
airbyte-cdk/python/airbyte_cdk/sources/declarative/schema/json_file_schema_loader.py
Outdated
Show resolved
Hide resolved
return resolved | ||
|
||
|
||
def resolve_ref_links(obj: Any) -> Union[Dict[str, Any], List[Any]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some unit tests to validate some each of the scenarios that can be recursed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The oauth changes look good, but thinking about the references a little more, wanted to get your thoughts on a slight refactor of the file schema loader to reuse some more code.
@@ -63,7 +68,8 @@ def get_json_schema(self) -> Mapping[str, Any]: | |||
raw_schema = json.loads(raw_json_file) | |||
except ValueError as err: | |||
raise RuntimeError(f"Invalid JSON file format for file {json_schema_path}") from err | |||
return raw_schema | |||
self.__resource = resource | |||
return self._resolve_schema_references(raw_schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're mostly copying these two methods from the existing code in schema_helpers.py
, it might be interesting to make the JsonFileSchemaLoader a subclass of ResourceSchemaLoader
. And we can refactor the SchemaLoader
interface to name get_json_schema()
to get_schema()
. And this class json_file_schema_loader
would just be responsible for the overriding get_schema()
and can reuse the dependent methods that were originally duplicated.
It does expand the scope of changes a little bit, but would help us cut down on the code duplication. Does this all make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I've refactored it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, the way the method names are defined, we won't be properly invoking schema retrieval according to the declarative interface SchemaLoader
. I mentioned a few alternatives in the comments
airbyte-cdk/python/airbyte_cdk/sources/declarative/schema/json_file_schema_loader.py
Outdated
Show resolved
Hide resolved
@@ -50,7 +51,7 @@ def __post_init__(self, options: Mapping[str, Any]): | |||
self.file_path = _default_file_path() | |||
self.file_path = InterpolatedString.create(self.file_path, options=options) | |||
|
|||
def get_json_schema(self) -> Mapping[str, Any]: | |||
def get_schema(self) -> Mapping[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are renaming get_json_schema()
to get_schema()
here, we will also need to rename it on the parent interface class SchemaLoader
not just in DefaultSchemaLoader
There are a few places where we use SchemaLoader.get_json_schema()
at runtime so we need to properly override it so that it gets used everywhere.
Alternatively, if we want to simplify this, we could also leave this as get_json_schema()
, so the parent interface doesn't have to change. And from the perspective of the declarative components we just don't reuse the parent class ResourceSchemaLoader.get_schema(self, name: str)
. And this still allows us to reuse the recursive resolver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what do mean about renaming, as I override get_schema
which present in ResourceSchemaLoader
.
@sherifnada , I would kindly ask you to review it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artem1205
JsonFileSchemaLoader
also extends SchemaLoader
so it needs to implement get_json_schema
too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've overwritten get_json_schema
-> get_schema
https://github.com/airbytehq/airbyte/pull/19718/files#diff-eebfee995e9a963a52d517c0ebcab9579872dee179cf961719740e6c3fdddc9fR40.
So we do not really need SchemaLoader
anymore. I can remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be getting rid of SchemaLoader
. This JsonFileSchemaLoader and the DefaultSchemaLoader are intended to be subclasses of the SchemaLoader
. From within parts of the low-code CDK, we invoke this get_json_schema()
at runtime. For example here: https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/declarative/declarative_stream.py#L132
If we rename this method in this class, we should also rename it in SchemaLoader
as well instead of removing the interface altogether. And then we also rename wherever it's being used like in the file I mentioned. Just renaming this method name will probably break out backend because we're using it in a few different places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Refactored, renamed it all back to get_json_schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me and the latest merge of master looks to have fixed some of the unrelated test failures. Let just address the two formatting issues that are failing the CI checks and then I'll approve along with the CDK version bump.
@@ -1,7 +1,6 @@ | |||
# | |||
# Copyright (c) 2022 Airbyte, Inc., all rights reserved. | |||
# | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be added back otherwise our file formatter in CI is going to continue failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -94,8 +96,11 @@ def get_refresh_request_body(self) -> Mapping[str, Any]: | |||
def get_token_expiry_date(self) -> pendulum.DateTime: | |||
return self._token_expiry_date | |||
|
|||
def set_token_expiry_date(self, value: pendulum.DateTime): | |||
self._token_expiry_date = value | |||
def set_token_expiry_date(self, initial_time: pendulum.DateTime, value: Union[str, int]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also an extra space between parameters which is throwing off the formatter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes look good! thanks for incorporating the feedback in
/publish-cdk dry-run=false
|
How's this one going? |
Already Merged and closed. |
What
#19753 - Declarative schema loader $ref
#19752 - Oauth Authenticator expiry date format
DeclarativeOauth2Authenticator
expectsexpires_in
in seconds (in get_access_token method). New feature makes it possible to specify date format (if not seconds)*.json
files inshared
folder, e.g.{"$ref": "file.json"}
now. This PR fixes it.shared
folder to resolve '$ref' links.How
token_expiry_date_format
field to Declarative OAuth Authenticator.Recommended reading order
x.java
y.python
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereUpdating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changesTests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.