[MM] Add a Filter based on the recall of textual entity grounding #114

Closed
HYLcool opened this issue Dec 4, 2023 · 1 comment · Fixed by #139
Labels: dj:multimodal (issues/PRs about multimodal data processing), enhancement (New feature or request)

Comments

HYLcool (Collaborator) commented Dec 4, 2023

  1. Find the noun phrases in the text (similar to GLIP; see "The noun phrase extraction algorithm", microsoft/GLIP#18).
  2. Try to locate these phrases in the image.
  3. Calculate the grounding recall (the fraction of phrases that were located) and filter out samples with relatively low recall.
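
A minimal sketch of step 3, assuming the phrases from step 1 are already extracted and that step 2 is available as a callable. The names `grounding_recall`, `keep_sample`, `is_located`, and the `min_recall` default of 0.5 are hypothetical and are not taken from this project; treating a sample with no phrases as recall 0.0 is also an assumption.

```python
from typing import Callable, Iterable


def grounding_recall(phrases: Iterable[str],
                     is_located: Callable[[str], bool]) -> float:
    """Return the fraction of phrases that the locator finds in the image."""
    phrases = list(phrases)
    if not phrases:
        # Assumption: with nothing to ground, report a recall of 0.0.
        return 0.0
    hits = sum(1 for phrase in phrases if is_located(phrase))
    return hits / len(phrases)


def keep_sample(phrases: Iterable[str],
                is_located: Callable[[str], bool],
                min_recall: float = 0.5) -> bool:
    """Keep the sample only if its grounding recall reaches min_recall."""
    return grounding_recall(phrases, is_located) >= min_recall


# Stand-in for a real grounding model: pretend these phrases were located.
located = {"dog", "red ball"}
phrases = ["dog", "red ball", "tree"]
recall = grounding_recall(phrases, lambda p: p in located)  # 2 of 3 located
```

In a real filter, `is_located` would wrap a phrase grounding or open-vocabulary detection model bound to the sample's image, and the threshold would be tuned on the data.
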
HYLcool self-assigned this Dec 4, 2023
HYLcool added the enhancement and dj:multimodal labels Dec 4, 2023
HYLcool added this to the Multimodal Support milestone Dec 4, 2023
HYLcool linked a pull request Dec 18, 2023 that will close this issue

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.
