Conversation
allennlp/training/metric_tracker.py
self._epoch_number = state_dict["epoch_number"]
self.best_epoch = state_dict["best_epoch"]
self.best_epoch_metrics = state_dict["best_epoch_metrics"]
What if we try to `--recover` a run originally trained with a previous version? Do we need some kind of check here?
That's not a supported scenario, though in this particular case it would be totally harmless to add a check.
This would be difficult for a user to fix on their own. Even if it only appears in v2.0, we may want backwards compatibility here.
Added in 764ce61.
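For illustration, here is a minimal sketch of the kind of check discussed above. This is not the actual code from 764ce61; the key names come from the diff shown earlier, and everything else (class shape, defaults) is assumed:

```python
from typing import Any, Dict, Optional


class MetricTracker:
    """Hypothetical sketch, not AllenNLP's real MetricTracker."""

    def __init__(self) -> None:
        self._epoch_number = 0
        self.best_epoch: Optional[int] = None
        self.best_epoch_metrics: Dict[str, float] = {}

    def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
        # Checkpoints written by older versions may not contain these keys,
        # so fall back to defaults instead of raising KeyError on --recover.
        self._epoch_number = state_dict.get("epoch_number", 0)
        self.best_epoch = state_dict.get("best_epoch")
        self.best_epoch_metrics = state_dict.get("best_epoch_metrics", {})
```

With `dict.get`, loading an old-style checkpoint that lacks the new keys simply leaves the tracker at its initial values rather than crashing.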
One more question: is the `metric_tracker` loaded when we run `allennlp evaluate`?
No, only the trainer uses it.
LGTM
* An initial VilBERT model for NLVR2 (#4423) * Some initial work; lots left to do * Initial test mostly passing, though things are still a bit of a mess * tests are passing with small fixtures * remove prints * Test more stuff * PathLike * Make vilbert pass tests * PR comments * call float before log * add CI Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* Initializing a VilBERT model from a pre-trained transformer (#4495) * saving state * Code is running, though it is returning zero gradients (but not None) * initial test passing, still working on albert * albert works, but bert-base-uncased still gives zero gradients * Loading of weights should now work * black, flake, mypy * remove drop and mask functionality from reader * make comment better * fix tests * flake Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* new data loading (#4497) * first implementation * update docstrings * fixes * fix sharding logic * clean up DatasetReader * fix samplers * fixes * fixes * patch models for now * more fixes * fix linting error * fix model test case * some fixes * fix linting err * updates * rename dataloader -> data_loader * fixes * more JoinableQueue * set daemon=True * fixes * fix * fixes * fix * update shuffle logic * load instances right away when not lazy * add tqdm when num_workers <= 0 * apply_token_indexers * fix bug causing high mem usage * address some of @dirkgr's comments * fix lazy * use sensible default for max_batches_in_mem * ensure workers terminated on err * fix * start adding some tests * more tests * add some more tests * address most of Matt's comments * update PyTorchDataLoader test * get rid of lazy option * fix linting * update docs, change max_batches_per_epoch to max_instances_per_epcoh * update CHANGELOG * fix drop_last validation * fix py2md test fixture * handle drop_last * update docs * implement sharding for most readers * fix worker init fn * limit tqdm output * fixes
* ensure vision CI runs on each commit (#4582) * ensure vision CI runs on each commit * fix * try fix CHANGELOG check * ensure models check runs on right branch
* Formatting updates for new version of black (#4607)
* reformat for new version of black (#4605) * reformat for new version of black * pin black * reformat for black * fix
* rename 'node_rank' to 'global_rank' in dataset reader 'DistributedInfo' (#4608) * rename 'node_rank' to 'global_rank' * Clarify doc comments * fix line length
* remove duplicate padding calculations in collate fn (#4617)
* fix len calculation for new data loader (#4618) * fix len calculation for new data loader * add test Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* make existing readers work with multi-process loading (#4597) * make existing readers work with multi-process loading * add 'overrides' decorator * call apply_token_indexers in predictor * clean up * fix tests
* Add MultiTaskModel (#4601) * Initial design of the multi-task model * PR comments, more implementation * changelog and docs fix * More tests, and fixes for those tests * mypy and make test less flaky * Update allennlp/models/multitask.py * Update allennlp/models/multitask.py Co-authored-by: Dirk Groeneveld <groeneveld@gmail.com> * Update allennlp/models/multitask.py Co-authored-by: James Barry <james.barry26@mail.dcu.ie> * respect active heads in get_metrics * Clean up changelog * black (apparently github UI doesn't add newlines?) Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> Co-authored-by: Dirk Groeneveld <groeneveld@gmail.com> Co-authored-by: James Barry <james.barry26@mail.dcu.ie>
* Detectron NLVR2 (#4481) * Passes a batch of detectron images to the model in the correct format * Loads a model and runs inference on it * Some initial work; lots left to do * Initial test mostly passing, though things are still a bit of a mess * tests are passing with small fixtures * remove prints * More configurable reader * add image_root and feature extraction to detectron model * Use general detectron cfg functions * Adds TensorField * Fix detectron dependency * Adds a detectron processor that we can use in dataset readers * Test more stuff * PathLike * Make vilbert pass tests * PR comments * call float before log * add CI * PathLike * Adds another NLVR2 reader * add region feature and grid feature configuration json and attrtibute to cfg file * change detectron_utils based on https://github.com/vedanuj/grid-feats-vqa/blob/master/extract_feature.py * add bottom up and top down roi head into detectron2 based on allennlp/models/detectron.py * Fix padding in TensorField * Fix field construction * Adds ability to read an arbitrary file * More type annotations * Remove old reader, add test for new one * Use the right kind of field * Run Jiasen's configs as tests * We don't need this field * Removes detectron reader * Remove detectron reader and field * Unify ArrayField and TensorField * Making sure that no merge will go cleanly from now on * Clean up the new output from the detectron processor a bit * Fix Detectron2 version as v0.2 * saving state * Code is running, though it is returning zero gradients (but not None) * initial test passing, still working on albert * albert works, but bert-base-uncased still gives zero gradients * Note * Formatting * Adds Registrable base classes for image operations * Adds a real example of a image2image module * Run the new code (without implementation) in the nlvr2 reader * Solve some issue involving circular imports * add new modules for vilbert * add parameters for detectron image loader. * push current code on implementing proposal generator. * push current progress on proposal generator * Update FasterRCNNProposalGenerator & Merge Detectron2 config * Loading of weights should now work * black, flake, mypy * Run detectron pipeline pieces one at a time This is unfinished and will not run this way. * Fix the data format for the backbone * Handle image sizes separately * remove drop and mask functionality from reader * make comment better * remove proposal_embedder, and finish proposal generator * working on grid embedder * added simple test for resnet backbone, which passes * Got proposal generator test passing * Change default number of detections per image: 100 => 36 * Fix detectron config hierarchy: test_detectron_per_image * Make number of detections configurable & Add test * rename ProposalGenerator to RegionDetector * try to fix makefile * another attempt at makefile * quotes in the pip command... * added a simple test for the dataset reader, made it pass * add feature caching to the dataset reader * another try with the makefile * a better temporary fix for installing detectron * writing files before committing is good... * fix tests * fix (at least part of) the vilbert tests * ok, this makefile change should actually work * add torchvision, try to remove eager import of detectron code * flake * cleanup * more cleanup * mypy, flake * add back code I shouldn't have removed * black * test and flake fixes * fix region_detector for multiple images and add feature and coords padding * fix imports * restore null grid embedder * add back (todo) null region detector * Bring back import changes, to fix circular imports caused by NLVR2 reader * region detector test passing * model test finally passing * update torchvision version * add vqav2 dataset * add gpu support for detectron feature extraction * add lmdbCache to cache feature into lmdb database * fix typo * update vqa jsonnet * fix url adding by cat * Fixes type annotation * Fixes borked error message * New feature cache * Formatting * Fix the tensor cache * Be explicit about our dependencies * Use the new tensor cache * Adds a test using the tensor cache * Run NLVR dataprep on GPU * Tqdm when finding images * Fixes padding in array field * Adjust max_length when truncating in PretrainedTransformerTokenizer * Fewer print statements * remove VQA from this branch and copy default vilbert parameters.
* Sanjay's vision features cache script (#4633) * Use LMDB cache in NLVR2 dataset reader; fix a few typos * Standalone script for caching image features * Removing reference to LMDB cache in NLVR2 dataset reader * Adding back asterisk in nlvr2 dataset reader * Fixing one variable name mistake * Decreasing batch size and making a few cuda-related changes * Loading images in batches to avoid GPU OOM error * Pedantic changes for consistency * Run the pre-processing with the models and not the data loading * Filter out paths of images already cached * Add image extensions other than png * Fixes import error * Makes the vision features script work alongside other scripts or training runs Co-authored-by: sanjays <sanjays@ip-10-0-0-157.us-west-2.compute.internal> Co-authored-by: sanjays <sanjays@ip-10-1-10-157.us-west-2.compute.internal> Co-authored-by: Sanjay Subramanian <sanjays@allennlp-server1.corp.ai2> Co-authored-by: Sanjay Subramanian <sanjays_ssubramanian@hotmail.com> * Adds missing imports * Makes TensorCache into a real MutableMapping * Formatting * Changelog * Fix typecheck * Makes the NLVR2 reader work with Pete's new code * Fix type annotation * Formatting * Backwards compatibility * Fix tests * Fix broken config * Update grid embedder test * Fix vilbert_from_huggingface configuration * Don't run the vilbert_from_huggingface test anymore * Remove unused test fixtures * Fix the region detector test * Fix vilbert-from-huggingface and bring it back * Fuck the linter * Run the region detector test on GPU * Run more stuff on GPU The CPU test runner doesn't have enough memory. * Depend on newer version of Detectron * Reinstall Detectron before running tests * Just force CUDA to be on, instead of reinstalling Detecton2 * Detectron needs CUDA_HOME to be set during install At least this thing fails quickly. * Try a different way of wrangling the detectron installer * Bring back amp * Trying to make tests faster, and passing * use two regions, to make tests pass * black * Documentation for TensorCache * Documentation for the NLVR2 dataset reader * Rename ArrayField to TensorField Co-authored-by: Matt Gardner <mattg@allenai.org> Co-authored-by: jiasenlu <jiasenlu@gatech.edu> Co-authored-by: Jaemin Cho <heythisischo@gmail.com> Co-authored-by: jiasenlu <echosenm@gmail.com> Co-authored-by: sanjays <sanjays@ip-10-0-0-157.us-west-2.compute.internal> Co-authored-by: sanjays <sanjays@ip-10-1-10-157.us-west-2.compute.internal> Co-authored-by: Sanjay Subramanian <sanjays@allennlp-server1.corp.ai2> Co-authored-by: Sanjay Subramanian <sanjays_ssubramanian@hotmail.com> * This should have been part of the previously merged PR
* Transformer toolkit (#4577) * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * Attention scoring functions * merging output and self output * utility to replicate layers, further cleanup * adding sinusoidal positional encoding * adding activation layer * adding base class for generic loading of pretrained weights * further generalizing, adding tests * updates * adding bimodal encoder, kwargs in from_pretrained_module * vilbert using transformer toolkit * fixing test function * changing to torch.allclose * fixing attention score api * bug fix in bimodal output * changing to older attention modules * _construct_default_mapping returns mapping * adding kwargs to _get_input_arguments, adding examples * using cached_transformers * making transformer_encoder more general * added get_relevant_module, loading by name * fixing constructor name * undoing failure after merge * misc minor changes Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* Transformer toolkit: BiModalEncoder now has separate `num_attention_heads` for both modalities (#4728) * separate num_attention_heads for both modalities, default arguments * adding tests for toolkit examples * debug statements for failing test * removing debug statements, reordering * Let's be more tolerant * removing commented code Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* separating TransformerPooler as a new module (#4730) * separating TransformerPooler as a new module * adding size check * fix failing tests
* Generalizing self attention (#4756) * generalizing SelfAttention * typecheck changes * adding shape information to docstring Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* Multitask data loading and scheduling (#4625) * Some initial work, still a bunch left to do * Adds a utility function that can shuffle iterables * remove shuffle * Getting close; saving state before fixing lint and adding tests * mypy and flake * put in some initial schedulers and samplers; just need to write tests * added some tests * changelog * add more-itertools to setup.py * finish docstring * some PR comments addressed * mypy * use homogeneous scheduler by default, not the non-homogeneous one * add option to not shuffle * normalize dataset proportions * Update allennlp/data/data_loaders/multitask_data_loader.py Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* improve independence of vision components (#4793) * improve independence of vision components * fix install * fix failing test * haha, actually fix * include torchvision exception too * fix torchvision install * remove vision push trigger
* VQAv2 (#4639) * albert works, but bert-base-uncased still gives zero gradients * Note * Formatting * Adds Registrable base classes for image operations * Adds a real example of a image2image module * Run the new code (without implementation) in the nlvr2 reader * Solve some issue involving circular imports * add new modules for vilbert * add parameters for detectron image loader. * push current code on implementing proposal generator. * push current progress on proposal generator * Update FasterRCNNProposalGenerator & Merge Detectron2 config * Loading of weights should now work * black, flake, mypy * Run detectron pipeline pieces one at a time This is unfinished and will not run this way. * Fix the data format for the backbone * Handle image sizes separately * remove drop and mask functionality from reader * make comment better * remove proposal_embedder, and finish proposal generator * working on grid embedder * added simple test for resnet backbone, which passes * Got proposal generator test passing * Change default number of detections per image: 100 => 36 * Fix detectron config hierarchy: test_detectron_per_image * Make number of detections configurable & Add test * rename ProposalGenerator to RegionDetector * try to fix makefile * another attempt at makefile * quotes in the pip command... * added a simple test for the dataset reader, made it pass * add feature caching to the dataset reader * another try with the makefile * a better temporary fix for installing detectron * writing files before committing is good... * fix tests * fix (at least part of) the vilbert tests * ok, this makefile change should actually work * add torchvision, try to remove eager import of detectron code * flake * cleanup * more cleanup * mypy, flake * add back code I shouldn't have removed * black * test and flake fixes * fix region_detector for multiple images and add feature and coords padding * fix imports * restore null grid embedder * add back (todo) null region detector * Bring back import changes, to fix circular imports caused by NLVR2 reader * region detector test passing * model test finally passing * update torchvision version * add vqav2 dataset * add gpu support for detectron feature extraction * add lmdbCache to cache feature into lmdb database * fix typo * update vqa jsonnet * fix url adding by cat * Fixes type annotation * Fixes borked error message * New feature cache * Formatting * Fix the tensor cache * Be explicit about our dependencies * Use the new tensor cache * Adds a test using the tensor cache * Run NLVR dataprep on GPU * Tqdm when finding images * Fixes padding in array field * Adjust max_length when truncating in PretrainedTransformerTokenizer * Fewer print statements * remove VQA from this branch and copy default vilbert parameters. * add VQAv2 dataset * Added dataset reader and model tests, which are now passing
* Sanjay's vision features cache script (#4633) * Use LMDB cache in NLVR2 dataset reader; fix a few typos * Standalone script for caching image features * Removing reference to LMDB cache in NLVR2 dataset reader * Adding back asterisk in nlvr2 dataset reader * Fixing one variable name mistake * Decreasing batch size and making a few cuda-related changes * Loading images in batches to avoid GPU OOM error * Pedantic changes for consistency * Run the pre-processing with the models and not the data loading * Filter out paths of images already cached * Add image extensions other than png * Fixes import error * Makes the vision features script work alongside other scripts or training runs Co-authored-by: sanjays <sanjays@ip-10-0-0-157.us-west-2.compute.internal> Co-authored-by: sanjays <sanjays@ip-10-1-10-157.us-west-2.compute.internal> Co-authored-by: Sanjay Subramanian <sanjays@allennlp-server1.corp.ai2> Co-authored-by: Sanjay Subramanian <sanjays_ssubramanian@hotmail.com> * Adds missing imports * Makes TensorCache into a real MutableMapping * Formatting * Changelog * Fix typecheck * Makes the NLVR2 reader work with Pete's new code * Fix type annotation * Formatting * Backwards compatibility * Restore NLVR to former glory * Types and multi-process reading for VQAv2 * Formatting * Fix tests * Fix broken config * Update grid embedder test * Fix vilbert_from_huggingface configuration * Don't run the vilbert_from_huggingface test anymore * Remove unused test fixtures * Fix the region detector test * Fix vilbert-from-huggingface and bring it back * Fuck the linter * Fix for VQA test * Why was this metric disabled? * Black and flake * Re-add VQA reader * Image featurizers now need to be called with sizes * Run the region detector test on GPU * Run more stuff on GPU The CPU test runner doesn't have enough memory. * Depend on newer version of Detectron * Reinstall Detectron before running tests * Just force CUDA to be on, instead of reinstalling Detecton2 * Fixes VQA2 DatasetReader * Fix documentation * Detectron needs CUDA_HOME to be set during install At least this thing fails quickly. * Try a different way of wrangling the detectron installer * Try a different way of wrangling the detectron installer * Bring back amp * Refactored VQA reader * More training paths * Remove debug code * Don't check in debug code * Auto-detect GPU to use * Apply indexers later * Fix typo * Register the model * Fields live on CPU. Only batches get GPUs. * black * black, flake * mypy * more flake * More realistic training config * Adds a basic Predictor for VQAv2 * Make vilbert output human-readable * Forgot to enumerate * Use the right namspace * Trying to make tests faster, and passing * add image prefix when loading coco image * fix vqav2 dataset reader and config file * use two regions, to make tests pass * black * Output probabilities in addition to logits * Make it possible to turn off the cache * Turn off the cache in the predictor * Fix the VQA predictor * change the experiment to the defualt vilbert hyperparams. * add default experiment_from_huggingface.json * fix typos in vqa reader * Proper probabilities * Formatting * Remove unused variable * Make mypy happy * Fixed loss function, metric, and got tests to pass * Updates the big training config * Put real settings into the vilbert_vqa config * Strings are lists in Python * Make mypy happy * Formatting * Unsatisfying mypy * Config changes to make this run * Fix dimensionality of embeddings * clean the code and add the image_num_heads and combine_num_heads * fix answer vocab and add save and load from pre-extracted vocab * fix loss and update save_answer_vocab script * Typo * Fixed fusion method * Tweaking the VQA config some more * Moved the from_huggingface config * 20 epochs * Set up the learning rate properly * Simplify * Hardcoded answer vocab * Don't be lazy * Steps per epoch cannot be None * Let's chase the right score * Fixing some parameter names * Fields are stored on CPUs * Bigger batch size, easier distributed training * Don't run the debug code by default
* VQA with the Transformer Toolkit (#4729) * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * Attention scoring functions * merging output and self output * utility to replicate layers, further cleanup * adding sinusoidal positional encoding * adding activation layer * adding base class for generic loading of pretrained weights * further generalizing, adding tests * updates * adding bimodal encoder, kwargs in from_pretrained_module * vilbert using transformer toolkit * fixing test function * changing to torch.allclose * fixing attention score api * bug fix in bimodal output * changing to older attention modules * _construct_default_mapping returns mapping * adding kwargs to _get_input_arguments, adding examples * using cached_transformers * making transformer_encoder more general * added get_relevant_module, loading by name * fixing constructor name * undoing failure after merge * misc minor changes
* Transformer toolkit (#4577) * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * transformer toolkit: BertEmbeddings * transformer toolkit: BertSelfAttention * transformer toolkit: BertSelfOutput * transformer toolkit: BertAttention * transformer toolkit: BertIntermediate * transformer toolkit: BertOutput * transformer toolkit: BertLayer * transformer toolkit: BertBiAttention * Attention scoring functions * merging output and self output * utility to replicate layers, further cleanup * adding sinusoidal positional encoding * adding activation layer * adding base class for generic loading of pretrained weights * further generalizing, adding tests * updates * adding bimodal encoder, kwargs in from_pretrained_module * vilbert using transformer toolkit * fixing test function * changing to torch.allclose * fixing attention score api * bug fix in bimodal output * changing to older attention modules * _construct_default_mapping returns mapping * adding kwargs to _get_input_arguments, adding examples * using cached_transformers * making transformer_encoder more general * added get_relevant_module, loading by name * fixing constructor name * undoing failure after merge * misc minor changes Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * separate num_attention_heads for both modalities, default arguments * adding tests for toolkit examples * debug statements for failing test * removing debug statements, reordering * Typo * Some compatibility with the transformer toolkit * Reorganize the image inputs * More transformer toolkit compatibility * Debug settings * Let's be more tolerant * Fix how VilBERT runs Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> * Make the region detector and region embedder lazy * Fix references to the model * Make various automated tests pass * Formatting * More logging * One more logging statement * Read answer vocab from vocab file instead of determining it automatically * Don't keep the files open so long * Use most of the validation set for training as well * Get ready to be lazy * Upgrade paths * Be lazy * Keep unanswerable questions only during test time * Fix the from_huggingface config * Fixes the VQA score * VQA specific metric * Fixes some tests * Tests pass! * Formatting * Use the correct directory * Use the region detector that's meant for testing * Read the test split properly * Be a little more verbose while discovering images * Modernize Vilbert VQA * Update NLVR, but it still doesn't run * Formatting * Remove NLVR * Fix the last test * Formatting * Conditionally export the VilbertVqaPredictor * ModuleNotFoundError is a type of ImportError * Fix test-install * Try the broken test with a fixed seed * Try a bunch of seeds * Smaller model to get bigger magnitudes * Now that the test works, we don't need to specify the seeds anymore Co-authored-by: Matt Gardner <mattg@allenai.org> Co-authored-by: jiasenlu <jiasenlu@gatech.edu> Co-authored-by: Jaemin Cho <heythisischo@gmail.com> Co-authored-by: jiasenlu <echosenm@gmail.com> Co-authored-by: sanjays <sanjays@ip-10-0-0-157.us-west-2.compute.internal> Co-authored-by: sanjays <sanjays@ip-10-1-10-157.us-west-2.compute.internal> Co-authored-by: Sanjay Subramanian <sanjays@allennlp-server1.corp.ai2> Co-authored-by: Sanjay Subramanian <sanjays_ssubramanian@hotmail.com> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com>
* SNLI_VE dataset reader (#4799) * adding VE reader * removing jsonlines * blackify * intial VE model * adding VisionReader for common vision components * fix test file * fix doc * temporarily removing VE model * bug fix * cleanup * removing unnecessary check * simplify
* Visual entailment model code (#4822) * VE model code * adding VE model * misc minor updates * update changelog
* Added GQA reader (#4832) * Adds reader for GQA dataset. Will download questions from https://cs.stanford.edu/people/dorarad/gqa/download.html. * Cleaned up GQA reader tests
* Other VQA datasets (#4834) * Make the VQA reader work for the other datasets * Also find pngs * Really support pngs * Remove debug code * More logging * Unexpected formatting * Respect the device * This is how your replace things in named tuples. * Remove unused import * This is how you override a method properly. * This is how you set parameters in detectron. * Also set the device for the region detector * Training configs for all three datasets contained in VQA * Bigger batches * Bigger batches for image processing * Fix vilbert-from-huggingface config * Make the config switch modes for constructing vocab * More vocab, more docs, better way of deriving vocab * Modernize the from_huggingface config * More updates to the from_huggingface config * Better hyperparameters stolen from another project * Fix for inverted parameter * Formatting * Throw a meaningful error message when we don't have images * Add a warning that includes instructions for how to fix things * Remove unused script * Merge issue
* adding multilabel option (#4843)
* Generalizing transformer layers (#4776) * adding HF tests, docstrings for AttentionLayer, TransformerLayer, TransformerBlock * temp change to check if tests pass * undoing temp change * ci update * more ci updates * changing test run * update makefile * temp change * isolating failing case * further debugging * fail check * reverting to older CI * test with reduced batch size * cleanup * more cleanup * oops, fix
* gqa reader fixes during vilbert training (#4851) * Refactored shared code * typecheck fix * rebase * Refactored shared code * typecheck fix * rebase * Cleaned up GQA reader tests * Modify instance format for vilbert-vqa model * update for vision branch bump Co-authored-by: Jackson Stokes <jacksons@Jacksons-MacBook-Pro.local> Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
* Toolkit: Adding documentation and small changes for `BiModalAttention` (#4859) * adding documentation for bimodal attn, small fixes * changing the way mask is applied * using large value rather than inf * Update comment Co-authored-by: Dirk Groeneveld <groeneveld@gmail.com> * moving apply_mask to util Co-authored-by: Dirk Groeneveld <groeneveld@gmail.com>
* Make tests work again (#4865) * New import paths * Duplicate entries * Dataset readers can't be lazy anymore
* Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821) * implement TorchImageLoader * implement ResnetBackbone * add resize + normalize to image loader * finalize FasterRcnnRegionDetector * pin torchvision * fix VQAv2Reader * add box mask field * dataset reader fixes * fix model tests * doc fixes * add threshold parameters to FasterRcnnRegionDetector * address @dirkgr comments * mask fixes * shape comments * add some more comments * cache answers_by_question_id * implement LocalCacheResource * fix * add read-only option to cache * fix * simplify data loader * make featurizer and detector optional in readers * Cache in memory * back pressure is important I guess * merge * Updated configs * Fixes the way we apply masks * Use more of Jiasen's real settings * Upgrade the from_huggingface config * Switch back to the images on corpnet * Fix random seeds * Bigger model needs smaller batch size * Adds ability to selectively ignore one input * address some comments * format + lint * fixes * Bring back bert-base configs * fix error handling * fix test * fix typo * use lock when possible Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * doc fixes
* Only cache, no featurizing (#4870) * implement TorchImageLoader * implement ResnetBackbone * add resize + normalize to image loader * finalize FasterRcnnRegionDetector * pin torchvision * fix VQAv2Reader * add box mask field * dataset reader fixes * fix model tests * doc fixes * add threshold parameters to FasterRcnnRegionDetector * address @dirkgr comments * mask fixes * shape comments * add some more comments * cache answers_by_question_id * implement LocalCacheResource * fix * add read-only option to cache * fix * simplify data loader * make featurizer and detector optional in readers * Cache in memory * back pressure is important I guess * merge * Updated configs * Fixes the way we apply masks * Use more of Jiasen's real settings * Upgrade the from_huggingface config * Switch back to the images on corpnet * Fix random seeds * Bigger model needs smaller batch size * Adds ability to selectively ignore one input * address some comments * format + lint * fixes * Bring back bert-base configs * fix error handling * fix test * Adds the ability to read from a feature cache, but not run any featurization * Update tests * Let's stick with "feature_cache" As long as we're consistent ... * More epochs, more random * Use the new parameters * Fix initialization * Make tests work, add some documentation * Remove the read_from_cache parameter * Cleanup of training configs * Typecheck * Building docs right * Better settings for VQA * Leave the image_feature_dim at 1024 Co-authored-by: epwalsh <epwalsh10@gmail.com>
* Make images easier to find for Visual Entailment (#4878) * implement TorchImageLoader * implement ResnetBackbone * add resize + normalize to image loader * finalize FasterRcnnRegionDetector * pin torchvision * fix VQAv2Reader * add box mask field * dataset reader fixes * fix model tests * doc fixes * add threshold parameters to FasterRcnnRegionDetector * address @dirkgr comments * mask fixes * shape comments * add some more comments * cache answers_by_question_id * implement LocalCacheResource * fix * add read-only option to cache * fix * simplify data loader * make featurizer and detector optional in readers * Cache in memory * back pressure is important I guess * merge * Updated configs * Fixes the way we apply masks * Use more of Jiasen's real settings * Upgrade the from_huggingface config * Switch back to the images on corpnet * Fix random seeds * Bigger model needs smaller batch size * Adds ability to selectively ignore one input * address some comments * format + lint * fixes * Bring back bert-base configs * fix error handling * fix test * Adds the ability to read from a feature cache, but not run any featurization * Update tests * Let's stick with "feature_cache" As
long as we're consistent ... * More epochs, more random * Use the new parameters * Fix initialization * Make tests work, add some documentation * Remove the read_from_cache parameter * Cleanup of training configs * Typecheck * Building docs right * Better settings for VQA * Open cached paths when reading json lines * By default, autodetect GPUs when training * Switch to torchvision * Download training data from the web * This needs to stay at 1024 until we get the new featurization model * Have a more descriptive error message when images are missing * Update vilbert_ve_from_huggingface.jsonnet Co-authored-by: epwalsh <epwalsh10@gmail.com> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> * Adding f1 score (#4890) * adding f1 score * updated config * import MultiTaskDataLoader to data_loaders/__init__.py (#4885) * Make GQA work (#4884) * Refactored shared code * typecheck fix * rebase * Refactored shared code * typecheck fix * rebase * Cleaned up GQA reader tests * Modify instance format for vilbert-vqa model * update for vision branch bump * Adding training config for GQA * Unnamed variable * Various GQA fixes * Temporary extra configs needed to make vocab * Remove unused file * Optimize VQA score instead of F-Score * Use our newly created vocab * Remove temporary configs * Don't fail when we don't need to create a directory * Make a config that works on the servers as well * Update comment * A new command to count instances * Temporary config to count instances * Undo temporary changes * Put in the correct number of steps per epoch * Remove this number from the config because it's almost certainly wrong * Don't put Fields in Tuples * Formatting * More informative error message when batches are heterogeneous * Formatting * Not my type * Generate the fields properly when answers are missing * Properly discard instances with missing answers * Changelog * Update number of steps per epoch * Adds a config for balanced GQA * fix file_utils extract with 
directory * fix Batch._check_types * Fill in URL Co-authored-by: Jackson Stokes <jacksons@Jacksons-MacBook-Pro.local> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Toolkit: Cleaning up TransformerEmbeddings (#4900) * fixing issue of non-deterministic dropout * updating TransformerEmbeddings * ImageFeatureEmbeddings is now a subclass of Embeddings * allowing for no token type embeddings * fixing kwargs for loading pretrained module * Data loading cuda device (#4879) * add test with tensor fields * improve nn.util.move_to_device * ensure start_method is 'spawn' when using lazy and mem pin * add 'non_blocking' arg to 'move_to_device' * fix fake test tensor * fix sampler test * lint * fix 'move_to_device' * fix condition check * add device to data loader * clean up doc string * rename 'device' arg to 'cuda_device' * pinning is very slow, revert * DataLoaders load to CUDA device * fix evaluate test * rename 'multi_process_*' -> 'multiprocess' for consistency (#4906) * MultiProcessDataLoader takes PathLike data_path (#4908) * remove PyTorchDataLoader, add SimpleDataLoader for testing (#4907) * remove PyTorchDataLoader, add SimpleDataLoader for testing * fix test * comments * improve data loading docs (#4909) * improve data loading docs * document best practices, add 'get_batch_size' method to samplers * try fix annoying unrelated test * revert that * clarify handling of 'max_instances_in_memory' * fix imports in file_utils * rename 'master' -> 'primary' for distributed training (#4910) * improve worker error handling in MultiProcessDataLoader (#4912) * improve worker error handling * rename test file * Toolkit decoder (#4914) * adding cross_attention, renaming block -> stack * stack can be initialized with layer too Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * resolve _read type (#4916) * resolve _read type * fix sharded reader * fix data loader arg * Multitask example (#4898) * Make the VQA 
reader work for the other datasets * Also find pngs * Really support pngs * Remove debug code * More logging * Unexpected formatting * Respect the device * This is how your replace things in named tuples. * Remove unused import * This is how you override a method properly. * This is how you set parameters in detectron. * Also set the device for the region detector * Training configs for all three datasets contained in VQA * Bigger batches * Bigger batches for image processing * Fix vilbert-from-huggingface config * Make the config switch modes for constructing vocab * More vocab, more docs, better way of deriving vocab * Modernize the from_huggingface config * More updates to the from_huggingface config * Better hyperparameters stolen from another project * Fix for inverted parameter * Formatting * Throw a meaningful error message when we don't have images * Add a warning that includes instructions for how to fix things * Remove unused script * Merge issue * Adds named splits to the SNLI-VE reader * Make the multitask data loader discoverable * Formatting * More flexible inputs to the dataset readers * Prototype config for the multitask training job * json_lines_from_file() already calls cached_path() * Visual entailment should track accuracy * Switching to torch * Fixing VE image paths * Formatting * Experimentally use threaded_generator to read instances from readers simultaneously * Vilbert backbone * Fixed paths * Formatting * Adds heads * Revert "Experimentally use threaded_generator to read instances from readers simultaneously" This reverts commit a633e67. * Multitask trains now! 
* Remove useless parameter from GQA reader * Updated multitask config * Schedulers produce batches, not instances * Track multiple metrics * Make mypy happy * Formatting * Keep better track of which heads have been called * Fix the merge * We have more than strings for input * Remove unused imports * -1 is CPU * Go back to tracking instances per epoch so that the samplers can work * Better error message * A useful sampler to have * We haven't indexed until we've indexed * Makes tests pass * Formatting * Fine-tuning the metric tracker * Update model configs for my changes * Fixing model configs for Akshita's changes * Implement VisionTextModel in terms of VilbertBackbone * Formatting * Fix stale comment * Use the server paths by default, not Dirk's desktop * Fix tests * Formatting again * Removed data loader parameters that don't exist anymore * Clarified comment Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Moves vision models to allennlp-models (#4918) * Moves vision models to allennlp-models * Also move test fixtures * Don't return so many instances if we're cutting them out later anyways * We actually need this image * Formatting * Fixing more paths * Prepare for release v2.0.0rc1 * Make releasing work with the renamed master branch, and with the vision branch * Debugging the release process in the slowest way possible * Another attempt at fixing the release process * Generic Callbacks (#4917) * Better Callbacks * Reformatting * Fixes * Tests for updated TrainerCallback * Formatting and Type-Checking fixes * Consistent metric tracker (#4928) * Makes the metric tracker more consistent * Turns out we need best_epoch_metrics after all. 
* Backwards compatibility * Formatting * Remove old script * Changes CI since we won't have a `vision` branch anymore * fix up CHANGELOG Co-authored-by: Matt Gardner <mattg@allenai.org> Co-authored-by: epwalsh <epwalsh10@gmail.com> Co-authored-by: James Barry <james.barry26@mail.dcu.ie> Co-authored-by: jiasenlu <jiasenlu@gatech.edu> Co-authored-by: Jaemin Cho <heythisischo@gmail.com> Co-authored-by: jiasenlu <echosenm@gmail.com> Co-authored-by: sanjays <sanjays@ip-10-0-0-157.us-west-2.compute.internal> Co-authored-by: sanjays <sanjays@ip-10-1-10-157.us-west-2.compute.internal> Co-authored-by: Sanjay Subramanian <sanjays@allennlp-server1.corp.ai2> Co-authored-by: Sanjay Subramanian <sanjays_ssubramanian@hotmail.com> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: jvstokes <40584422+jvstokes@users.noreply.github.com> Co-authored-by: Jackson Stokes <jacksons@Jacksons-MacBook-Pro.local> Co-authored-by: Karen Hambardzumyan <mahnerak@gmail.com>
Makes the metric tracker's `state_dict` handling consistent. Neither `patience` nor `tracked_metrics` should be in the `state_dict`, because those come from the constructor. Also, `best_epoch_metrics` should be saved and restored.

@mahnerak
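The idea in the description can be sketched as follows. This is a hypothetical, simplified tracker, not AllenNLP's actual `MetricTracker`: constructor arguments (here `metric_name` and `patience`) are excluded from the serialized state, `best_epoch_metrics` is round-tripped, and `load_state_dict` uses a `.get()` default so checkpoints written before `best_epoch_metrics` existed can still be loaded with `--recover` (the backwards-compatibility concern raised in the review above).

```python
from typing import Any, Dict, Optional


class MetricTracker:
    """Hypothetical sketch of a validation-metric tracker.

    `metric_name` and `patience` come from the constructor, so they are
    deliberately NOT part of the serialized state.
    """

    def __init__(self, metric_name: str, patience: Optional[int] = None) -> None:
        self.metric_name = metric_name
        self.patience = patience
        self.best_metric: Optional[float] = None
        self.best_epoch: Optional[int] = None
        self.best_epoch_metrics: Dict[str, float] = {}
        self._epochs_with_no_improvement = 0

    def state_dict(self) -> Dict[str, Any]:
        # Serialize only state that changes during training; constructor
        # arguments are reconstructed from the config on --recover.
        return {
            "best_metric": self.best_metric,
            "best_epoch": self.best_epoch,
            "best_epoch_metrics": self.best_epoch_metrics,
            "epochs_with_no_improvement": self._epochs_with_no_improvement,
        }

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        self.best_metric = state["best_metric"]
        self.best_epoch = state["best_epoch"]
        # .get() with a default keeps older checkpoints (which predate
        # this key) loadable instead of raising a KeyError.
        self.best_epoch_metrics = state.get("best_epoch_metrics", {})
        self._epochs_with_no_improvement = state["epochs_with_no_improvement"]
```

The `.get()` fallback is the small check discussed in the review thread: harmless for new checkpoints, and it turns a hard failure into a silently-empty `best_epoch_metrics` for old ones.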