[Clean-up] Planned removal of the max_size argument #35090

Open
wants to merge 12 commits into main

Conversation


@HichTala HichTala commented Dec 4, 2024

What does this PR do?

This PR aims to serve as a progress tracker for updating the size configurations of detection models on the Hugging Face Hub. The updates focus on aligning the configs with the deprecation of the max_size argument while ensuring backward compatibility where needed. Each update will be submitted as an individual PR directly to the model repositories on the Hub, following the script and guidance provided by @qubvel.

All individual pull requests made directly to the model repositories on the Hub will be tracked here via links, providing a clear overview of progress and ensuring comprehensive updates across relevant models.

Here's a brief summary of what's been done so far regarding the planned deletion of the max_size argument:

This pull request includes multiple changes across several image processing files in the transformers library. The primary focus is on deprecating the max_size parameter and replacing it with size['longest_edge']. Additionally, backwards compatibility adjustments have been made to handle the size parameter more flexibly.

Deprecation of max_size parameter:

Backwards compatibility adjustments:
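For illustration, here is roughly how the size-related part of a preprocessor config changes (a sketch only; the values are typical DETR-style defaults, not an exact diff of any checkpoint):

```python
# Sketch only; keys and values vary per checkpoint.
# Old-style config: an integer `size` plus the now-deprecated `max_size`.
old_size_config = {"size": 800, "max_size": 1333}

# New-style config: `max_size` is folded into the `size` dictionary.
new_size_config = {"size": {"shortest_edge": 800, "longest_edge": 1333}}
```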

It follows the issue and pull request linked below:

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@qubvel

…` and triggered unnecessary deprecated warning
…` and triggered unnecessary deprecated warning
Add a test to ensure tests pass and backward compatibility is maintained
Remove `max_size` from test pipelines and replace `size` with a `Dict` with `'shortest_edge'` and `'longest_edge'` as keys
…ze mismatching results with previous processing
@HichTala
Author

HichTala commented Dec 4, 2024

I started by testing on my own repository to make sure everything works well and to avoid spamming all the model repositories.

Here is the link to the commit returned by the push_to_hub method: https://huggingface.co/HichTala/test_pull_requests/commit/bb6b52f2f2039b5f4b1a27c33ebc42af072a9206

The pushed config file adds optional arguments and sometimes rearranges them in a different order from the original file. If this is OK, we can start pushing the first 5 models.
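For reference, the push was along these lines (a minimal sketch, assuming a DETR-style image processor for the test repository):

```python
from transformers import DetrImageProcessor

# Minimal sketch, assuming a DETR-style checkpoint. Loading an old-style config
# converts the deprecated `max_size` into the `size` dict, and pushing back
# re-serializes every default, which is why optional arguments show up and the
# key order differs from the original file.
processor = DetrImageProcessor.from_pretrained("HichTala/test_pull_requests")
processor.push_to_hub("HichTala/test_pull_requests")
```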

@HichTala HichTala changed the title from "Clean max size" to "[Clean-up] Planned removal of the max_size argument" on Dec 4, 2024
@qubvel
Member

qubvel commented Dec 5, 2024

Hi @HichTala! Thanks for continuing with this issue 🤗

It's great that you tried using your own model! I suggest:

  1. Replacing "feature_extractor_type": "DetrFeatureExtractor" with "image_processor_type": "DetrImageProcessor" because the feature extractor is deprecated.
  2. Removing max_size and replacing size with a dictionary.

This way, it will be easier to merge those pull requests, and the diff will be smaller! I suppose it will require modifying the raw JSON file instead of saving with the ImageProcessor.
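Something along these lines should work for editing the raw JSON (a rough sketch; the repo id is the test repository above, and the rewrite rules may need per-model adjustments):

```python
import json

from huggingface_hub import HfApi, hf_hub_download

repo_id = "HichTala/test_pull_requests"  # illustrative repo id
path = hf_hub_download(repo_id, "preprocessor_config.json")

with open(path) as f:
    config = json.load(f)

# Swap the deprecated feature extractor class for the image processor class.
if config.get("feature_extractor_type") == "DetrFeatureExtractor":
    del config["feature_extractor_type"]
    config["image_processor_type"] = "DetrImageProcessor"

# Fold the deprecated `max_size` into the `size` dictionary.
if "max_size" in config:
    config["size"] = {
        "shortest_edge": config["size"],
        "longest_edge": config.pop("max_size"),
    }

HfApi().upload_file(
    path_or_fileobj=json.dumps(config, indent=2).encode(),
    path_in_repo="preprocessor_config.json",
    repo_id=repo_id,
    create_pr=True,
    commit_message="Remove deprecated `max_size` from the preprocessor config",
)
```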

@HichTala
Author

HichTala commented Dec 5, 2024

Hi @qubvel,
I followed your suggestion; here is the PR in my own model. I think we're ready to start PRing the first 5 models!
I'll probably launch it in the next few days.

@qubvel
Member

qubvel commented Dec 6, 2024

Hi @HichTala! Looks good to me, let's proceed with the first five 👍

@HichTala
Author

HichTala commented Dec 6, 2024

Hi @qubvel, here are the first 5!

@qubvel
Member

qubvel commented Dec 6, 2024

Hi @HichTala! Thanks for creating the PRs and providing links. I just verified them, and everything looks perfect. Special thanks for providing a message with the PR. 🤗 I will ask to merge them and then we can proceed with the next batch.

@qubvel
Member

qubvel commented Dec 20, 2024

Hi @HichTala, hope you are doing great! 2 PRs were merged by the owners; however, we can't merge the other PRs on our side, so we'll have to wait a while to see if they get accepted. Let's open more PRs 🤗 I would say the next 20-30 would be enough. And if you can, please provide links to the PRs directly instead of to the commits, thank you!

@HichTala
Author

HichTala commented Dec 20, 2024

Hi @qubvel,
Hope you are doing great as well. Here is the next batch! I opened 10 PRs; the next ones have fewer than 100 downloads, should I include them?

@qubvel
Member

qubvel commented Dec 20, 2024

Thank you, 10 is enough! We also need to check the other tags, e.g. zero-shot-object-detection; there might be a max_size there too.

@HichTala
Author

Hi @qubvel,
That's a good point! I looked at other tags and found that the following tags can have the max_size parameter in the preprocess_config file:

  • object-detection
  • image segmentation
  • text-image

I'll try to look in the code to see exactly where it's used and whether I may need to remove some parts as we did for object detection.
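A rough sketch of how such a check over Hub configs can be done (the tag and limit here are only illustrative, and models without a preprocessor config are simply skipped):

```python
import json

from huggingface_hub import HfApi, hf_hub_download

api = HfApi()
for model in api.list_models(filter="object-detection", sort="downloads", limit=50):
    try:
        path = hf_hub_download(model.id, "preprocessor_config.json")
    except Exception:
        continue  # repository has no preprocessor config
    with open(path) as f:
        config = json.load(f)
    if "max_size" in config:
        print(model.id)
```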

@HichTala
Author

However, I can confirm that none of the zero-shot-object-detection models (or models under the other tags) have the max_size parameter in their configurations, but some of them have size as a number rather than a dict. I'm not sure what we want here: do we want to normalize those configs or not? I may need to investigate the code a little further to see how that case is handled and whether it is fine as is.

@qubvel
Member

qubvel commented Dec 24, 2024

Hi @HichTala! Thanks a lot for investigating it. Let's limit the scope to max_size for now; an integer size is fine 👍
