Multilingual Multimodal Medical VQA with RAG and COT Reasoning

AmuroEita/M3-VQA

M3-VQA

M3-VQA is a pipeline for multilingual and multimodal biomedical visual question answering (VQA). It translates multilingual inputs, grounds answers in retrieved knowledge through retrieval-augmented generation (RAG), and reasons with in-context learning (ICL) combined with Chain-of-Thought (CoT) prompting.
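The three stages can be pictured with a minimal sketch. This is not the repository's code: `Demo`, `translate_to_english`, `retrieve`, and `build_prompt` are hypothetical names, the translation step is a pass-through placeholder, and a toy word-overlap ranking stands in for the FAISS index. The model call itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One in-context example with worked reasoning (hypothetical shape)."""
    question: str
    reasoning: str
    answer: str


def translate_to_english(text: str) -> str:
    # Placeholder: the repo relies on Google Translate for this step.
    # This sketch assumes the input is already English and returns it unchanged.
    return text


def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy retriever: rank passages by how many lowercase tokens they share
    # with the question. It stands in for a FAISS similarity search.
    q_tokens = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(q_tokens & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[str], demos: list[Demo]) -> str:
    # Assemble retrieved context, worked examples, and the new question,
    # ending with a Chain-of-Thought cue for the model to continue.
    context = "\n".join(f"- {p}" for p in passages)
    parts = [f"Context:\n{context}"]
    for d in demos:
        parts.append(f"Q: {d.question}\nReasoning: {d.reasoning}\nA: {d.answer}")
    parts.append(f"Q: {question}\nReasoning: Let's think step by step.")
    return "\n\n".join(parts)
```

Calling `build_prompt(q, retrieve(q, corpus), demos)` with `q = translate_to_english(raw_question)` yields a single prompt string that could then be sent, together with the image, to the vision-language model.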

Getting Started

Prerequisites

  1. Get a free API key for Google Translate and configure it locally; see https://cloud.google.com/translate/docs/reference/rest/
  2. Clone the repo
    git clone https://github.com/AmuroEita/M3-VQA.git && cd M3-VQA
  3. Use Git LFS to fetch the FAISS index files
    git lfs pull
  4. Install required Python packages
    pip install -r requirements.txt
  5. Save your GPT API key in utils/GPT-API.txt (replace the placeholder with your own key)
    echo "YOUR_GPT_API_KEY" > utils/GPT-API.txt
  6. Prepare the datasets
    cd data && python download_data.py
  7. Download the model from Hugging Face
    huggingface-cli login
    huggingface-cli download --resume-download unsloth/Llama-3.2-11B-Vision-Instruct --local-dir Llama-3.2-11B-Vision-Instruct
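Before running anything, it can help to confirm that the setup steps took effect. The following is a small preflight sketch, not part of the repository: `check_setup` is a hypothetical helper, and the model directory is passed in explicitly because where it ends up depends on the directory the download command was run from.

```python
import os
from pathlib import Path


def check_setup(repo_root, env=None, model_dir=None):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    root = Path(repo_root)
    problems = []

    # Step 5: the GPT API key file must exist and be non-empty.
    key_file = root / "utils" / "GPT-API.txt"
    if not key_file.is_file() or not key_file.read_text().strip():
        problems.append(f"missing or empty key file: {key_file}")

    # The usage commands export this variable for Google Translate.
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not creds or not Path(creds).is_file():
        problems.append("GOOGLE_APPLICATION_CREDENTIALS does not point to a file")

    # Step 7: optional check, since the download location is an assumption.
    if model_dir is not None and not Path(model_dir).is_dir():
        problems.append(f"model directory not found: {model_dir}")

    return problems
```

For example, `check_setup(".", model_dir="Llama-3.2-11B-Vision-Instruct")` run from the repository root lists whatever is still missing.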

Usage

Specify a Question for Testing

Use this mode to have M3-VQA answer a single question. The following example tests the question at index 11 of the israel_local_processed.tsv dataset and prints the process and the result to the command line.

export GOOGLE_APPLICATION_CREDENTIALS="/your_path_to/google_translate.json" && python3 demo.py --dataset data/israel_local_processed.tsv --question_idx 11
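To see which record a given index refers to before running the demo, the row can be read directly from the TSV file. This sketch makes two assumptions that the README does not confirm: the file starts with a header line, and indices are 0-based. `read_row` is a hypothetical helper, not a function from the repository.

```python
import csv


def read_row(tsv_path, idx):
    """Return the data row at 0-based position idx as a dict keyed by header name."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quote characters inside fields as literal text.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for i, row in enumerate(reader):
            if i == idx:
                return row
    raise IndexError(f"{tsv_path} has no data row at index {idx}")
```

For instance, `print(read_row("data/israel_local_processed.tsv", 11))` shows the fields of that record.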

Evaluate on a Dataset

Run the pipeline on the entire dataset to compute accuracy. Results are saved in the results folder for further analysis.

export GOOGLE_APPLICATION_CREDENTIALS="/your_path_to/google_translate.json" && python3 inference.py
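The README does not show how inference.py scores answers or how the files in the results folder are laid out. As a rough illustration only, accuracy over a set of answers can be computed as below; the exact-match rule with trimming and lowercasing is an assumption, and `accuracy` is a hypothetical helper.

```python
def accuracy(predictions, references):
    """Fraction of predictions equal to their reference after trimming and lowercasing."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)
```

Once predictions and gold answers have been loaded from the results files, `accuracy(preds, golds)` gives the share of matching answers.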
