-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Research] Adding the ViGIL experiments tutorials #105
Conversation
Indicate which param to use if one wants to evaluate or finetune a CLEVR-trained model on CoGenT.
Should now be able to handle re-tokenizing questions if the embedding source is different than the dataset variant.
This make sure that the Tester is doing the correct number of episodes according to the Sampler when doing multi tests.
configs/mac/default_clevr.yaml | ||
|
||
# Add the model parameters: | ||
model: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could move model to default_mac.yaml as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that these experiments are on 2 models (MAC & S-MAC), not sure it's very useful
set: 'trainA' # use CoGenT-A as validation set (-> 10% of the true training set) as we will test on CoGenT. | ||
sampler: | ||
name: 'SubsetRandomSampler' | ||
indices: '~/data/CLEVR_CoGenT_v1.0/vigil_cogent_val_set_indices.txt' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this full validation set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is 10% of the original training set
In this PR, I am adding a Reproducible Research section in the documentation. This section only contains for now a note on how to reproduce the experiments described in the ViGIL paper, which can be taken as an example for the future notes in this section.
Main points:
Added the S-MAC model to
miprometheus.models
. It reuses as many units as possible from the MAC model implementation, such that it only implements the units which are different in terms of equations. Its documentation reflects the published paper (bibtex is indicated).Added all the grid config files to run the experiments:
- The initial training on CLEVR & CoGenT-A,
- The finetuning on CoGenT-B of the CLEVR- & CoGenT-A-trained models,
- The finetuning on CoGenT-A of the CLEVR-trained models,
Default configuration files are available for CLEVR / CogenT / MAC on CLEVR / MAC on CoGenT / S_MAC on CLEVR / S-MAC on CoGenT. The grid config files reuse these. The grid config files are pretty long and complex because the configuration for each experiment is different, but everything should be commented.
multi_tests
key (Was able to do so after the merge of [Feature] Initial multi-tests support in the Tester #98 and ParamRegistry/Interface: added facilities to remove entries #100). Adding support for multi-tests in theTester
was long to implement, but is very useful here as we can run "cross-tests" (e.g. train on CoGenT-A and test on CoGenT-B) with the same command as running "regular tests" (e.g. train on CoGenT-A and test on CoGenT-A).{'words': index}
&{'answer': index}
are the same for both CoGenT & CLEVR, but also that the random embedding weights are the same. I formalized that under the form of an additional param for theCLEVR
class (embedding_source
, doc has been updated, and theCLEVR
class throws a warning).Overall, the pipeline is to run
mip-grid-trainer-gpu
,mip-grid-tester-gpu
andmip-grid-analyzer
3 times.This documentation section should be self-consistent, in that all config files (and indices files for the
Sampler
) are linked (I provide ours for transparency). All commands are indicated.note:
I have tested the overall and it should be working. I cannot 100% guarantee that I squashed every possible bug here, but most should be fixed 🙂
default_configs
, the path to these files can be properly constructed from the path of the specified initial config file (with--c
). I gave a quick shot at this, and it may not be as straightforward as initially thought: we need to handle the case when we specify several config files (separated by commas), and also in theGridWorkers
whenNamed Temporary Files
are created and then superseded by theconfig params
. Cf Extract and add absolute path to nested config files #16,Problem
,Model
,SamplerFactory
...) can properly log to the console and the log file (although this is not critical),sleep
time to 60s in this PR.