757MB and 25GB. I definitely think we should go with Option 2. Is this needed as part of a test job? If so, we need a way to easily rerun the code and set it up in the Staging workspace.
In some of the vision use cases, the data is available as an archive at a given URL, for example:
To benchmark our components on those, we need not just a dataset registration, but also a step that will untar those archives and potentially split them into train/validation sets.
There are two options here:
What would be the preference?
For option 1, I was thinking about creating CLI jobs that you run once in your workspace to register the datasets and be done with it.
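To make the untar-and-split step more concrete, here is a rough sketch in plain Python (standard library only). It assumes the archive has already been downloaded to a local path and that a simple random file-level split is acceptable. The function names `extract_archive` and `split_train_val`, the 20% validation fraction, and the fixed seed are all illustrative assumptions, not something we have agreed on. Dataset registration itself is left out.

```python
import random
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Untar a local archive into dest_dir and return dest_dir as a Path.

    Assumption: the archive was already downloaded and comes from a trusted source.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters that
            # reject unsafe members (absolute paths, "..", etc.).
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


def split_train_val(files, val_fraction=0.2, seed=0):
    """Shuffle files deterministically and return (train, validation) lists."""
    files = sorted(files)  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(files)
    n_val = int(len(files) * val_fraction)
    return files[n_val:], files[:n_val]


def list_files(root):
    """Return all regular files under root, recursively."""
    return [p for p in Path(root).rglob("*") if p.is_file()]
```

Usage would be something like `train, val = split_train_val(list_files(extract_archive("data.tar.gz", "data")))`, where `data.tar.gz` is a placeholder file name. Because the seed is fixed, rerunning the same code in another workspace (e.g. Staging) should give the same split.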