📝 A curated list of papers on Personalized Multimodal Models and Personalized Representation Learning 📚
Problem setting: given 3-5 images of a novel concept/subject (e.g., a pet named <bo>), can we personalize Large Multimodal Models so that (1) they retain their original capabilities (e.g., "Describe a dog") while (2) gaining tailored capabilities for the novel concept (e.g., "Describe <bo>")?
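To make the setting concrete, here is a minimal, hedged sketch of the common "learn only a new concept embedding, keep the base model frozen" recipe that many of the listed works follow. Everything here is a toy stand-in (the encoder, the images, the update rule are all illustrative assumptions, not any specific paper's method):

```python
# Toy sketch of few-shot concept personalization (all names illustrative):
# given a handful of images of a novel concept "<bo>", learn ONLY a new
# concept embedding while the base model stays frozen, so that general
# capabilities are preserved.
import random

random.seed(0)
DIM = 8

# Frozen "base model": a fixed toy image encoder (stands in for the LMM's vision tower).
base_encoder_weights = [random.gauss(0, 1) for _ in range(DIM)]
frozen_copy = list(base_encoder_weights)  # snapshot, to check nothing was updated

def encode_image(pixels):
    # Toy encoder: elementwise product with the frozen weights (never trained).
    return [w * p for w, p in zip(base_encoder_weights, pixels)]

# 4 "user-provided images" of <bo> (random toy feature vectors).
user_images = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]

# The ONLY trainable parameter: a new soft-token embedding for <bo>.
bo_embedding = [0.0] * DIM

lr = 0.1
for step in range(100):
    for img in user_images:
        feat = encode_image(img)
        # Squared-error gradient step pulling <bo>'s embedding toward the features.
        bo_embedding = [e - lr * 2 * (e - f) for e, f in zip(bo_embedding, feat)]

# After training: bo_embedding has adapted to the user's concept,
# while base_encoder_weights are bit-for-bit unchanged.
```

The point of the sketch is the asymmetry: the update loop only ever writes to `bo_embedding`, which is why the "Describe a dog" capability is untouched while "Describe <bo>" becomes answerable.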
Please feel free to open a pull request or an issue to add or correct anything. Any help or clarification is greatly appreciated!
*🙋‍♀️ Personalization has been extensively explored in AI/ML/CV... It's now time to personalize Large Multimodal Models! 🙋‍♀️*
Over the years, we’ve witnessed the evolution of personalization across various tasks (e.g., object segmentation, image generation). Now, with the rise of Large Multimodal Models (LMMs), we have the opportunity to personalize these generalist, large-scale AI systems. It’s time to take the leap and bring personalization into the realm of Large Multimodal Models, making them not only powerful but also user-specific!
^ The caption above was actually generated by GPT-4o: I fed it the figure and asked it to write a caption, haha!
(This figure was created by me. If anything is incorrect, please feel free to correct me. Thank you!)
Papers
⚠️ Minor note: the works listed below focus on the setting where users provide 3-5 images and the system must learn the depicted concepts. There is also research on related subtopics (e.g., role-playing, personas); for those topics, this repo might provide better coverage.