Fix encode_batch and encode_batch_fast to accept ndarrays again #1679

Merged
merged 2 commits into from
Nov 21, 2024

Contributor

@diliop diliop commented Nov 8, 2024

This is a follow-up to my comment from here. It essentially reverts the PyList change from #1665 and the PySequence change from #1673 for the input argument of encode_batch and encode_batch_fast, restoring it to Vec<..>. This allows passing an ndarray, as well as a list or tuple, as the input. I also re-enabled the tests that had been turned off, so that this change to encode_batch is covered. I will follow up with more tests for encode_batch_fast, but I would prefer to get this out sooner rather than later.
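To illustrate why the argument type matters, here is a minimal pure-Python analogy, not the actual bindings code. It assumes that a PyList argument behaves like a strict `list` check, that a PySequence argument behaves like a `collections.abc.Sequence` check, and that a Vec<..> argument behaves like copying items out of anything sized and indexable. `ArrayLike` and the three helper functions are hypothetical names used only for this sketch; `ArrayLike` stands in for an ndarray so the example needs no third-party packages.

```python
from collections.abc import Sequence


class ArrayLike:
    """Hypothetical stand-in for an ndarray: sized and indexable, but not a list."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


def accepts_list_only(batch):
    # Assumed analogue of a PyList argument: only a concrete list passes.
    return isinstance(batch, list)


def accepts_abc_sequence(batch):
    # Assumed analogue of a PySequence argument: the object must be a
    # subclass of, or registered with, collections.abc.Sequence.
    return isinstance(batch, Sequence)


def extract_items(batch):
    # Assumed analogue of a Vec<..> argument: copy the items out of
    # anything that supports len() and integer indexing.
    return [batch[i] for i in range(len(batch))]


texts = ["hello world", "another sentence"]
inputs = {
    "list": list(texts),
    "tuple": tuple(texts),
    "array_like": ArrayLike(texts),
}

# For each input kind: (passes list check, passes Sequence check, extraction works)
report = {
    name: (
        accepts_list_only(value),
        accepts_abc_sequence(value),
        extract_items(value) == texts,
    )
    for name, value in inputs.items()
}

for name, result in report.items():
    print(name, result)
```

Under these assumptions, only the extraction-style check accepts all three input kinds, which matches the behavior this PR restores.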

Contributor Author

diliop commented Nov 11, 2024

@ArthurZucker would you mind taking a look? I think sooner or later someone is going to complain about encode_batch or encode_batch_fast not accepting ndarray types 😞

Collaborator

@ArthurZucker ArthurZucker left a comment

It would be nice to add a case with ndarrays to the encode formats test 🤗 LGTM otherwise, good catch!

@ArthurZucker
Collaborator

(Sorry about the delay, we were on a company-wide offsite 😅 🌴)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker
Collaborator

Could you update with clippy as well? 🤗

@@ -152,8 +152,6 @@ def test_encode(self):
assert len(output) == 2

def test_encode_formats(self, bert_files):
Contributor Author

@ArthurZucker there is already a set of tests covering np.array. Did you have something else in mind?

Collaborator

Nope, I checked and found them, so this is approved and good to go!

@ArthurZucker ArthurZucker merged commit ac34660 into huggingface:main Nov 21, 2024
28 checks passed
@ArthurZucker
Collaborator

Thanks! 🤗

ArthurZucker pushed a commit that referenced this pull request Nov 26, 2024
* Fix encode_batch and encode_batch_fast to accept ndarrays again

* Fix clippy

---------

Co-authored-by: Dimitris Iliopoulos <diliopoulos@fb.com>