[Research] Adding the ViGIL experiments tutorials #105

vmarois · 2018-11-28T01:23:45Z

In this PR, I am adding a Reproducible Research section to the documentation. For now, this section only contains a note on how to reproduce the experiments described in the ViGIL paper, which can serve as an example for future notes in this section.

Main points:

Added the S-MAC model to miprometheus.models. It reuses as many units as possible from the MAC model implementation, so that it only implements the units whose equations differ. Its documentation reflects the published paper (the BibTeX entry is included).
Added all the grid config files to run the experiments:
- The initial training on CLEVR & CoGenT-A,
- The finetuning on CoGenT-B of the CLEVR- & CoGenT-A-trained models,
- The finetuning on CoGenT-A of the CLEVR-trained models.

Default configuration files are available for CLEVR / CoGenT / MAC on CLEVR / MAC on CoGenT / S-MAC on CLEVR / S-MAC on CoGenT. The grid config files reuse these. The grid config files are fairly long and complex because the configuration of each experiment is different, but everything should be commented.

All test experiments are indicated in the grid config files with the multi_tests key (I was able to do so after the merge of [Feature] Initial multi-tests support in the Tester #98 and ParamRegistry/Interface: added facilities to remove entries #100). Adding support for multi-tests in the Tester took a while to implement, but it is very useful here, as we can run "cross-tests" (e.g. train on CoGenT-A and test on CoGenT-B) with the same command as "regular tests" (e.g. train on CoGenT-A and test on CoGenT-A).
To properly test a CLEVR-trained model on CoGenT samples, we need to make sure that the dicts {'words': index} & {'answer': index} are the same for both CoGenT & CLEVR, and also that the random embedding weights are the same. I formalized this as an additional param for the CLEVR class (embedding_source; the doc has been updated, and the CLEVR class throws a warning).
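
To illustrate why the dicts and the embedding weights have to be shared, here is a minimal sketch in plain Python. It is not the CLEVR class implementation: the helper names (`build_word_dict`, `tokenize`, `make_embeddings`), the toy questions and the seed are all hypothetical, and the sketch assumes that every token of the target question exists in the source dict.

```python
import random


def build_word_dict(questions):
    """Map each distinct token to an index, in order of first appearance."""
    word_dict = {}
    for question in questions:
        for token in question.lower().rstrip("?").split():
            word_dict.setdefault(token, len(word_dict))
    return word_dict


def tokenize(question, word_dict):
    """Convert a question to indices (raises KeyError on an unknown token)."""
    return [word_dict[token] for token in question.lower().rstrip("?").split()]


def make_embeddings(vocab_size, dim, seed):
    """Random embedding table; the same seed gives the same weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


# Toy stand-ins for the two dataset variants (hypothetical data).
clevr_questions = ["What color is the cube?", "Is the sphere large?"]
cogent_questions = ["Is the cube large?"]

source_dict = build_word_dict(clevr_questions)  # dict of the "embedding source"
own_dict = build_word_dict(cogent_questions)    # dict built from the variant itself

# The same question gets different indices depending on which dict is used,
# so a model trained with source_dict would look up the wrong embedding rows
# if the test questions were tokenized with own_dict.
shared_ids = tokenize(cogent_questions[0], source_dict)
own_ids = tokenize(cogent_questions[0], own_dict)
print(shared_ids, own_ids)
```

The same reasoning applies to the answer dict: the index predicted by the model only maps back to the right answer if both variants use the same mapping.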

Overall, the pipeline is to run mip-grid-trainer-gpu, mip-grid-tester-gpu and mip-grid-analyzer three times, once for each of the three sets of experiments listed above.

This documentation section should be self-contained, in that all config files (and indices files for the Sampler) are linked (I provide ours for transparency). All commands are indicated.

note:

I am not using Exponential Moving Average, as it is not implemented here yet (it was done in the internal repo). I am also not sure that it makes a major difference in performance. We may want to implement this in the future.

I have tested the overall pipeline and it should be working. I cannot 100% guarantee that I squashed every possible bug here, but most should be fixed 🙂

⚠️ A few things do need to be fixed to ensure that this can be reproduced anywhere:

Ensure that when indicating default_configs, the path to these files can be properly constructed from the path of the specified initial config file (with --c). I gave this a quick shot, and it may not be as straightforward as initially thought: we need to handle the case where several config files are specified (separated by commas), and also the case in the GridWorkers where Named Temporary Files are created and then superseded by the config params. Cf. Extract and add absolute path to nested config files #16,
Automate the downloads of the dataset files (cf. Problem initializer #101),
Resolve Not all loggers output are captured in the GridWorkers #104, i.e. ensure that all loggers (Problem, Model, SamplerFactory...) can properly log to the console and the log file (although this is not critical),
Ensure that when starting a grid experiment, each individual experiment is assigned to its own GPU (link: Introduce mutex-based experiment configuration to Grid Workers GPU #37). I have set the sleep time to 60s in this PR.
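
As a starting point for the first item, one possible way to resolve nested config paths could look like the sketch below. It is only an illustration, not the approach retained for #16: both helper names are hypothetical, and the lookup strategy (walking up from the directory of the initial config file until the nested path exists) is an assumption. It does not cover the Named Temporary Files case in the GridWorkers.

```python
import os


def split_config_argument(argument):
    """Split a comma-separated list of config files into absolute paths."""
    return [os.path.abspath(os.path.expanduser(part.strip()))
            for part in argument.split(",") if part.strip()]


def resolve_nested_config(config_file, nested_path):
    """Locate nested_path by walking up from the directory of config_file.

    Assumption: nested paths are written relative to one of the parent
    directories of the config file that references them (e.g. the repo root).
    """
    nested_path = os.path.expanduser(nested_path)
    if os.path.isabs(nested_path):
        return os.path.normpath(nested_path)

    current = os.path.dirname(os.path.abspath(config_file))
    while True:
        candidate = os.path.join(current, nested_path)
        if os.path.isfile(candidate):
            return os.path.normpath(candidate)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            raise FileNotFoundError(
                "Could not locate '{}' from '{}'".format(nested_path, config_file))
        current = parent
```

With this strategy, a file such as configs/mac/mac_clevr.yaml that lists configs/mac/default_clevr.yaml would resolve it from the common parent directory, regardless of the directory the command is launched from.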

tkornuta-ibm · 2018-11-28T17:52:33Z

configs/mac/mac_clevr.yaml

+    configs/mac/default_clevr.yaml
+
+# Add the model parameters:
+model:


Maybe we could move model to default_mac.yaml as well?

Given that these experiments are on 2 models (MAC & S-MAC), not sure it's very useful

tkornuta-ibm · 2018-11-28T17:54:35Z

configs/mac/mac_smac_cogent_a_finetuning.yaml

+        set: 'trainA'  # use CoGenT-A as validation set (-> 10% of the true training set) as we will test on CoGenT.
+    sampler:
+      name: 'SubsetRandomSampler'
+      indices: '~/data/CLEVR_CoGenT_v1.0/vigil_cogent_val_set_indices.txt'


Is this the full validation set?

No, this is 10% of the original training set.
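
For reference, holding out 10% of the training indices as a validation subset can be sketched as follows. This is a hypothetical stand-in and not the tool used to produce the linked indices files: the function name, the seed and the sample count are made up for illustration, and the on-disk format of the indices file is deliberately left out.

```python
import random


def split_indices(num_samples, validation_fraction, seed):
    """Randomly split range(num_samples) into training and validation indices."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    num_validation = int(round(num_samples * validation_fraction))
    validation = sorted(indices[:num_validation])
    training = sorted(indices[num_validation:])
    return training, validation


# Hypothetical dataset size; hold out 10% of the training samples.
train_indices, val_indices = split_indices(1000, 0.1, seed=0)
print(len(train_indices), len(val_indices))
```

A list of indices like `val_indices` is what PyTorch's `torch.utils.data.SubsetRandomSampler` takes as its argument, so that only those samples are drawn from the 'trainA' set.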

vmarois added 30 commits November 7, 2018 09:31

Added S-MAC, polished doc for example (inter-links).

e0a5c42

Add configs file for training experiments.

75ef7c5

Modified path of indices files

8feaf14

Sleep(60) to ensure one experiment per gpu.

d8465ea

Added config file for finetuning on CoGenT-B.

4e0bd67

Started Research page in the documentation.

3e2a2ee

Updated config file to specify trainer.

f1c2d74

Point to grid config in doc.

6926838

Merge branch 'develop' into research/vigil

8a57388

Merge branch 'feat/multi_tests' into research/vigil

48edbfc

Added ViGIL arXiv link,

25bf280

Fix doc warnings.

4646f5e

Allow changing leaf key in the entire 'testing' section.

3e48112

Merge branch 'feat/multi_tests' into research/vigil

f002d22

Small updates to base yaml files.

e54e71e

Added multi_tests information in initial training yaml file.

425c13a

Fix wrong spacing in yaml file.

f59121b

Specify full --tensorboard in command.

cff42ec

Updated yaml file for finetuning on CoGenT-B.

1b47331

Updated yaml file for finetuning on CoGenT-A.

37a4bb2

Added all indices files for the sampler sections.

a2c7c5d

Almost finished the ViGIL doc section.

09b3d65

Added the paper abstract.

e4d0b4f

Precise --o for mip-index-splitter.4

2abd9bd

Updated CLEVR doc.

b3df9fc

Indicate which param to use if one wants to evaluate or finetune a CLEVR-trained model on CoGenT.

Updated CLEVR class.

133024c

Should now be able to handle re-tokenizing questions if the embedding source is different than the dataset variant.

Update the class doc.

b042da6

Updated the CLEVR methods docstrings.

987d1c7

Support embedding_source also for the train samples.

74eec77

Added embedding_source in config files.

41ca425

vmarois added 17 commits November 20, 2018 16:38

Added BibText at the end of section.

2335c36

Cleaned intro page.

08f0156

Added key to multi_tests in initial training config.

08fc37f

Added default param value for embedding_source.

3049a66

Added warnings when loading dicts from file.

4b327be

Correct formatting.

cf07cb8

Corrected str formatting.

6664661

Specified default params.

af4d5ee

Specify embedding source in default cogent config.

2bad0ca

Corrected typo.

80b4bcc

Updated CLEVR doc.

f079e2e

Fix typo.

00deb7c

Fix weird space in yaml file.

a56044a

Added max_test_episodes key in multi_tests.

b17ca17

This make sure that the Tester is doing the correct number of episodes according to the Sampler when doing multi tests.

Typo.

f2df3df

Added bibtex to SMAC doc.

7aed447

Specified experiments repartition.

7ce8ddf

vmarois assigned tkornuta-ibm Nov 28, 2018

vmarois requested a review from tkornuta-ibm November 28, 2018 01:23

vmarois added the enhancement New feature or request label Nov 28, 2018

Fix typo in doc.

db61781

tkornuta-ibm reviewed Nov 28, 2018

tkornuta-ibm approved these changes Nov 28, 2018

tkornuta-ibm merged commit 7f27c11 into develop Nov 28, 2018

vmarois deleted the research/vigil branch November 29, 2018 00:17

vmarois mentioned this pull request Nov 29, 2018

0.3.1 release #109

Merged
