Feature: support partitioned datasets #30

Open
12 tasks
shaunc opened this issue Feb 24, 2022 · 0 comments
Labels
backlog (implementation delayed), breakdown (Break down issues w/ >1D expected implementation), feature (tracking issue for feature)

shaunc (Collaborator) commented Feb 24, 2022

Support Kedro's partitioned datasets as DVC URL imports (`dvc import-url`).

For each partitioned dataset, we create a .dvc file in the format that `dvc import-url` produces.

Cases

  • Does the data exist, and is it accessible?
    • Yes
    • ERR: no
    • ERR: credentials wrong
  • .dvc file status
    • does not exist
    • already exists
    • exists, but not configured for a directory
    • exists, with a different remote
  • [ ]
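
A rough sketch of how the .dvc status cases above might be told apart. It assumes the .dvc file has already been parsed into a dict laid out like the output of `dvc import-url` (`deps[0].path` holding the source URL, `outs[0].md5` holding the hash), and that a directory out is recognisable by a hash ending in `.dir`. The names `DvcStatus` and `classify_dvc_status` are hypothetical, not part of Kedro or DVC.

```python
from enum import Enum
from typing import Optional


class DvcStatus(Enum):
    """Hypothetical labels for the .dvc status cases in this issue."""
    MISSING = "does not exist"
    MATCHES = "already exists"
    NOT_DIRECTORY = "exists, but not configured for a directory"
    DIFFERENT_REMOTE = "exists, with a different remote"


def classify_dvc_status(existing: Optional[dict], expected_url: str) -> DvcStatus:
    """Classify a parsed .dvc file (or None if the file is absent)."""
    if existing is None:
        return DvcStatus.MISSING
    # Assumed layout: first dep is the remote URL, first out is the local copy.
    deps = existing.get("deps") or [{}]
    outs = existing.get("outs") or [{}]
    if deps[0].get("path") != expected_url:
        return DvcStatus.DIFFERENT_REMOTE
    # Assumption: directory outs carry a hash that ends in ".dir".
    if not str(outs[0].get("md5", "")).endswith(".dir"):
        return DvcStatus.NOT_DIRECTORY
    return DvcStatus.MATCHES
```

The data-accessibility cases (missing data, wrong credentials) are left out here, since they depend on the storage backend.
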

Scenarios

  • create or update
    • if the .dvc file doesn't exist, create it; if it exists and its config matches what would be generated, leave it alone
    • if the .dvc file exists but its config doesn't match: error (msg: "mismatch: [print path]: remove to regenerate.", with detail)
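
A minimal sketch of the create-or-update decision above, assuming both the existing .dvc content and the config that would be generated are available as plain dicts. `plan_dvc_action` and `DvcConfigMismatch` are hypothetical names, and listing the differing top-level keys is just one possible reading of "with detail".

```python
from typing import Optional


class DvcConfigMismatch(Exception):
    """Raised when an existing .dvc file disagrees with the generated config."""


def plan_dvc_action(path: str, existing: Optional[dict], expected: dict) -> str:
    """Return "create" or "leave alone", or raise on a config mismatch."""
    if existing is None:
        return "create"
    if existing == expected:
        return "leave alone"
    # Detail for the error: top-level keys whose values differ.
    differing = sorted(
        key
        for key in set(existing) | set(expected)
        if existing.get(key) != expected.get(key)
    )
    raise DvcConfigMismatch(
        f"mismatch: {path}: remove to regenerate. "
        f"(differing keys: {', '.join(differing)})"
    )
```

Comparing whole dicts is deliberately strict; a real implementation may want to ignore fields that change on every import, such as hashes.
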
shaunc added the feature, breakdown, and backlog labels Feb 24, 2022