-
Notifications
You must be signed in to change notification settings - Fork 27.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BLIP #20716
Merged
Merged
Add BLIP #20716
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The documentation is not available anymore as the PR was closed or merged. |
NielsRogge
reviewed
Dec 12, 2022
NielsRogge
reviewed
Dec 12, 2022
NielsRogge
reviewed
Dec 12, 2022
NielsRogge
reviewed
Dec 12, 2022
NielsRogge
reviewed
Dec 12, 2022
NielsRogge
reviewed
Dec 15, 2022
NielsRogge
reviewed
Dec 15, 2022
This was referenced Dec 15, 2022
NielsRogge
reviewed
Dec 15, 2022
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
The failing CI test seems to be related to #20790 |
MKhalusova
pushed a commit
to MKhalusova/transformers
that referenced
this pull request
Dec 28, 2022
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
amyeroberts
pushed a commit
to amyeroberts/transformers
that referenced
this pull request
Jan 4, 2023
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
silverriver
pushed a commit
to silverriver/transformers
that referenced
this pull request
Jan 6, 2023
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
venkat-natchi
pushed a commit
to venkat-natchi/transformers
that referenced
this pull request
Jan 22, 2023
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Awesome contribution! Thank you @younesbelkada. Just noticed Salesforce released BLIP2. Not sure how much work it would be to port to huggingface. |
miyu386
pushed a commit
to miyu386/transformers
that referenced
this pull request
Feb 9, 2023
* add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove `token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What does this PR do?
BLIP is a model from salesforce, capable of performing Visual question answering, image captioning and image-text retrieval. This model has been also used in several Stable-diffusion finetuned variants, such as Pokemon stable diffusion or Naruto Stable diffusion to generate text descriptions from images in order to create text-image paired dataset.
Original repo: https://github.com/salesforce/BLIP
Users would be able to use Blip for three main usecases:
1- Conditional Generation (Image captioning):
1- bis Conditional Generation (Image captioning with no prefix!):
2- Visual question answering
3- Image text retrieval (score matching)
cc @NielsRogge
Fixes salesforce/LAVIS#64