I have two questions that came up while trying to reproduce the experiments.
Why does the reward curve for prompt-image alignment shown in this repo fluctuate so much, while the reward curve in Figure 5 of the paper is very smooth?
In my experiments, the BERT scores are extremely high. For example, for the noisy image below, LLaVA generated the description: "In the image, there is a colorful, abstract, and blurry background with a mix of colors and patterns. The background is filled with a variety of colors, creating a visually interesting and dynamic scene." The BERT score between this description and the prompt "a pig riding a bike" is about 0.82. I suspect this is because the BERT model I used differs from yours, so I would like to know more details about the BERT model used in your experiments.
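To make the 0.82 figure easier to interpret: raw BERTScore values tend to sit in a narrow high range even for unrelated sentences, and the `bert_score` package offers a `rescale_with_baseline` option that linearly rescales them. I do not know whether the experiments here used it. Below is a minimal sketch of that rescaling, where the baseline value is a made-up number for illustration only (the real one depends on the model and layer).

```python
def rescale_with_baseline(raw_score: float, baseline: float) -> float:
    """Linearly rescale a similarity score so that `baseline` maps to 0
    and a perfect score of 1 stays at 1."""
    if not 0.0 <= baseline < 1.0:
        raise ValueError("baseline must be in [0, 1)")
    return (raw_score - baseline) / (1.0 - baseline)


# ASSUMPTION: 0.80 is a hypothetical baseline chosen for illustration.
# It is not the value for any specific BERT model.
ASSUMED_BASELINE = 0.80

raw = 0.82  # the raw score reported above
rescaled = rescale_with_baseline(raw, ASSUMED_BASELINE)
print(f"raw={raw:.2f} rescaled={rescaled:.2f}")
```

Under that assumed baseline, a raw score of 0.82 would correspond to a rescaled score close to 0.1, so the raw number alone may not indicate a good match.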
Thank you very much for any help you can give :-)
Many thanks for conducting this excellent work!