Run Datacomp at scale #319
@GeorgesLorre do we still need to revert the docker image to `main`, or do they get automatically tagged as `dev` during merge?
int16 maxes out at 32,767, which is why you may be having issues if your image's width was 36,000. However, I think this should be resolved if you move to `int32`.
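To make the range issue concrete, here is a minimal sketch using only the standard library. The `fits` helper and the `width` value are illustrative and not taken from the PR; `ctypes.c_int16` is used only to show what a silent wrap-around would look like.

```python
import ctypes

INT16_MAX = 2**15 - 1  # largest value a signed 16-bit integer can hold
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits(value: int, bits: int) -> bool:
    """Return True if `value` is representable as a signed integer of `bits` bits."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


width = 36_000  # hypothetical image width, as in the comment above

# ctypes does no overflow checking, so the value wraps around silently.
wrapped = ctypes.c_int16(width).value

print(fits(width, 16), fits(width, 32), wrapped)
```

A width of 36,000 does not fit in 16 bits but fits comfortably in 32, which is why switching the column type should be enough.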
I would avoid having the user specify the length of the dataset and instead read it from the HF metadata directly.
This is based on https://stackoverflow.com/questions/47571715/dask-create-strictly-increasing-index
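For readers who do not follow the link, one common way to build a strictly increasing index across partitions is to offset each partition by the number of rows in all earlier ones. The sketch below illustrates that idea with plain Python lists standing in for Dask partitions; it is an assumption about the general approach, not the code from this PR or from the linked answer, and `strictly_increasing_index` is a hypothetical name.

```python
from itertools import accumulate


def strictly_increasing_index(partitions):
    """Pair every row with a global index that increases across partitions."""
    lengths = [len(part) for part in partitions]
    # Offset of a partition = total number of rows in all earlier partitions.
    offsets = [0, *accumulate(lengths)][:-1]
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


# Three partitions of unequal size stand in for a partitioned dataframe.
parts = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = strictly_increasing_index(parts)
print(indexed)
```

In a real Dask dataframe the partition lengths would have to be computed first, which costs an extra pass over the data.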