-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: add DynamoDB Source #71
Conversation
This reverts commit 45c1628.
use a dict to keep a mapping of schemes to source class, instead of using an else-if chain
It's overridden by the CLI, so it's moot.
@@ -93,6 +93,44 @@ def parse_scheme_from_uri(uri: str) -> str: | |||
class SourceDestinationFactory: | |||
source_scheme: str | |||
destination_scheme: str | |||
sources: Dict[str, Type[SourceProtocol]] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a good example of an implementation that could have been in another PR, since this has nothing to do with the current PR. I'd suggest keeping these out for the future
ingestr/src/sources.py
Outdated
) | ||
|
||
incremental = None | ||
incremental_key = kwargs.get("incremental_key", "").strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might wanna make the default None
here as well, otherwise this could change the behavior
docs/supported-sources/dynamodb.md
Outdated
|
||
To obtain the access keys, use the IAM console on AWS. See [IAM Documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html) for more information. | ||
|
||
### Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please add an incremental example?
# connection pooling causes issues with duckdb, when the connection | ||
# is reused below, so we disable pooling. | ||
dest_engine = sqlalchemy.create_engine(dest_uri, poolclass=NullPool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is pretty clever, I had to do an uglier solution before, didn't know I could do this 👍
ingestr/main_test.py
Outdated
assert rows[i][1] == pendulum.parse(dynamodb.data[i]["updated_at"]) | ||
|
||
# ingest the rest | ||
result = invoke_ingest_command( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not run another attempt once more, similar to the above, to check the data? also, I'd suggest having some tests for other strategies like append or delete insert.
SourceDestinationFactory