Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
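As a rough illustration of the kind of workflow such fine-tuning toolkits wrap, here is a minimal sketch of LoRA fine-tuning setup for Qwen2-VL using Hugging Face transformers and peft. The checkpoint name and LoRA target modules are assumptions for illustration, not details taken from the repository above.

```python
# Minimal LoRA fine-tuning setup for Qwen2-VL (sketch; checkpoint name and
# LoRA target modules are assumptions, not from the repository above).
import torch
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2-VL-2B-Instruct"  # assumed checkpoint
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Attach LoRA adapters to the attention projections so only a small
# fraction of the weights is trained.
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# From here, training proceeds with a standard transformers Trainer over
# batches the processor builds from (image, prompt, target-text) examples.
```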
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Azure OpenAI (demos, documentation, accelerators).
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Chat with Phi-3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built on datasets that include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data for both text and vision.
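For context, a minimal inference sketch in the style of the Phi-3.5-vision model card; the image path is a placeholder, and argument details should be checked against the current model card.

```python
# Minimal Phi-3.5-vision inference sketch (image path is a placeholder).
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Phi-3 vision expects <|image_1|>-style placeholders in the user turn.
messages = [{"role": "user", "content": "<|image_1|>\nDescribe this image."}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

image = Image.open("example.jpg")  # placeholder path
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
# Strip the prompt tokens before decoding the generated answer.
answer = processor.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```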
Phi-3-Vision model test - running locally
Microsoft Phi-3 Vision, the first multimodal model by Microsoft: demo with Hugging Face