Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vision Categories! #3639

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Vision Categories! #3639

wants to merge 12 commits into from

Conversation

lisadunlap
Copy link
Collaborator

@lisadunlap lisadunlap commented Dec 10, 2024

Why are these changes needed?

We have been letting language have all the fun for long enough. Vision will rise again. Join the movement.

  • updated label.py to include a vision option
  • added vision category prompts
  • added support for anthropic and gemini models
  • added vision to overall leaderboard tab
  • abstracted style control so we don't need to duplicate everything in monitor_md

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

input_data["image_hash"] = input_data.conversation_a.map(
lambda convo: convo[0]["content"][1][0]
)
input_data["image_path"] = input_data.image_hash.map(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we get the image_hash here?

Copy link
Collaborator Author

@lisadunlap lisadunlap Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the conversation, is the format still [{content: [text, [images]]}?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the way i have it now doesn't support multi-image

@BabyChouSr
Copy link
Collaborator

I tested out the PR and the leaderboard works well.

Comment on lines +446 to +448
category_name = key_to_category_name[k.replace("_style_control", "")]
if "_style_control" in k:
category_name = f"{category_name} w/ Style Control"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but could you explain what's going on here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah. Based on the previous code logic the stylecontrol leaderboards have the _style_control in their key. Instead of having duplicates for every category name in monitor_md like we have now we add it in here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants