# VQA using Differential Attention Models

PyTorch implementation of the papers:

*(Model architecture figure)*
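The README itself does not describe the attention mechanism, so the following is only a minimal, illustrative PyTorch sketch of question-guided attention over image region features, the basic building block that DAN-style VQA models refine. The module name `QuestionGuidedAttention` and the feature dimensions are assumptions for illustration, not the repository's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionGuidedAttention(nn.Module):
    """Illustrative question-guided attention over image regions (not the repo's exact module)."""
    def __init__(self, img_dim=2048, q_dim=1024, hidden_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden_dim)
        self.q_proj = nn.Linear(q_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, img_feats, q_feat):
        # img_feats: (batch, num_regions, img_dim), q_feat: (batch, q_dim)
        joint = torch.tanh(self.img_proj(img_feats) + self.q_proj(q_feat).unsqueeze(1))
        attn = F.softmax(self.score(joint).squeeze(-1), dim=1)   # (batch, num_regions)
        attended = (attn.unsqueeze(-1) * img_feats).sum(dim=1)   # (batch, img_dim)
        return attended, attn

# Example: 36 region features of size 2048, question embedding of size 1024
att = QuestionGuidedAttention()
attended, weights = att(torch.randn(2, 36, 2048), torch.randn(2, 1024))
```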

## Usage

1. Clone the repository.

```
git clone https://github.com/chirag26495/DAN_VQA.git
```

2. Download and unzip the datasets from the official VQA website: https://visualqa.org/download.html.

```
cd basic_vqa/utils
chmod +x download_and_unzip_datasets.csh
./download_and_unzip_datasets.csh
```

3. Preprocess the input data (images, questions, and answers).

```
python resize_images.py --input_dir='../datasets/Images' --output_dir='../datasets/Resized_Images'
python make_vacabs_for_questions_answers.py --input_dir='../datasets'
python build_vqa_inputs.py --input_dir='../datasets' --output_dir='../datasets'
```
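For intuition, the image-resizing step amounts to something like the sketch below; the 224x224 target size and the use of Pillow are assumptions for illustration, so check `resize_images.py` for the exact parameters.

```python
import os
from PIL import Image

def resize_all(input_dir, output_dir, size=(224, 224)):
    """Resize every image in input_dir and save it to output_dir (target size assumed here)."""
    os.makedirs(output_dir, exist_ok=True)
    for name in os.listdir(input_dir):
        if not name.lower().endswith(('.jpg', '.jpeg', '.png')):
            continue
        with Image.open(os.path.join(input_dir, name)) as img:
            img.convert('RGB').resize(size, Image.BILINEAR).save(os.path.join(output_dir, name))
```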

4. Train the model on the VQA task (a rough sketch of one training step follows the commands below).

```
cd ..
python train.py
```
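As a rough mental model of what `train.py` does (not the script's actual code), one training step on a preprocessed batch looks like the toy sketch below; `ToyVQAModel`, the feature sizes, and the random tensors are all stand-ins for illustration only.

```python
import torch
import torch.nn as nn

# Toy stand-in model: question LSTM + pooled image features + answer classifier.
class ToyVQAModel(nn.Module):
    def __init__(self, vocab_size=1000, num_answers=1000, q_dim=300, img_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, q_dim)
        self.lstm = nn.LSTM(q_dim, q_dim, batch_first=True)
        self.classifier = nn.Linear(q_dim + img_dim, num_answers)

    def forward(self, img_feat, question_ids):
        _, (h, _) = self.lstm(self.embed(question_ids))
        return self.classifier(torch.cat([img_feat, h[-1]], dim=1))

model = ToyVQAModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Random tensors standing in for one preprocessed batch.
img_feat = torch.randn(8, 2048)                  # pooled image features
question_ids = torch.randint(0, 1000, (8, 14))   # padded question token ids
answer = torch.randint(0, 1000, (8,))            # answer-class labels

optimizer.zero_grad()
loss = criterion(model(img_feat, question_ids), answer)
loss.backward()
optimizer.step()
```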

## Pretrained Models and Exemplar Mappings (using the VQA 2.0 dataset)

## Results

- Quantitative comparison on the VQA 2.0 validation set:

| Model | Metric | Dataset | Accuracy (%) |
| --- | --- | --- | --- |
| Basic (LQI) | All | VQA v2 | 47.61 |
| Baseline (LQIA) | All | VQA v2 | 53.23 |
| SAN-2 | All | VQA v2 | 55.28 |
| DAN + LQIA | All | VQA v2 | 55.49 |
| DAN-alt. + LQIA | All | VQA v2 | 54.16 |
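The "All" numbers above are VQA-style accuracies, where a predicted answer is scored against the ten human-annotated answers for each question. A minimal sketch of that metric is shown below; it omits the official answer-string normalization and the leave-one-annotator-out averaging used by the full VQA evaluation code.

```python
def vqa_accuracy(predicted, human_answers):
    """Standard VQA accuracy: an answer counts as fully correct
    if at least 3 of the 10 annotators gave the same answer."""
    matches = sum(1 for a in human_answers if a == predicted)
    return min(matches / 3.0, 1.0)

# Example: 4 of 10 annotators answered "2", so the prediction scores 1.0
print(vqa_accuracy("2", ["2", "2", "two", "2", "3", "2", "3", "two", "3", "3"]))
```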