This page contains instructions for running MaxP baselines on the MS MARCO passage ranking task using Capreolus. If you are a Compute Canada user, first follow this guide to set up the environment on CC, then continue with this page.
Once the environment is set up, you can verify the installation with these instructions.
This requires GPU(s) with 48GB of memory (e.g. three V100s or an RTX 8000) or a TPU.
- Make sure you are in the top-level `capreolus` directory;
- Use the following script to run a "mini" version of the MS MARCO fine-tuning to test that everything is working.
python -m capreolus.run rerank.train with file=docs/reproduction/config_msmarco.txt
This trains monoBERT for only 3k steps with a batch size of 4, then reranks the top 100 documents per query. The script should take no more than 24 hours to finish and fits on a single `v100l`. At the end of execution, it displays a set of metrics, where `MRR@10` should be around `0.295`.
- Once the above is done, we can fine-tune the full version on MS MARCO Passage using the following script:
niters=10
batch_size=16
validatefreq=$niters  # to ensure the validation is run only at the end of training
decayiters=$niters  # either the same as $niters or 0
threshold=1000  # the top-k documents to rerank
python -m capreolus.run rerank.train with \
    file=docs/reproduction/config_msmarco.txt \
    threshold=$threshold \
    reranker.trainer.niters=$niters \
    reranker.trainer.batch=$batch_size \
    reranker.trainer.decayiters=$decayiters \
    reranker.trainer.validatefreq=$validatefreq \
    fold=s1
Data preparation time can vary considerably across machines. Once the data is prepared, training takes 4~6 hours and inference takes 6~10 hours on 4 V100s for BERT-base. This should achieve `MRR@10=0.35+`.
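Both runs above are judged by MRR@10. As a minimal sketch of what that number measures (not the evaluation code Capreolus itself uses), the function below averages the reciprocal rank of the first relevant passage within the top 10 for each query. The function name `mrr_at_k`, the dictionary layout, and the toy query and passage ids are assumptions made for illustration; the sketch also assumes every evaluated query appears in `rankings`.

```python
def mrr_at_k(rankings, relevant, k=10):
    """Mean reciprocal rank at cutoff k.

    rankings: dict mapping query id -> list of passage ids, best first.
    relevant: dict mapping query id -> set of relevant passage ids.
    Both structures are hypothetical layouts chosen for this sketch.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked in rankings.items():
        judged = relevant.get(qid, set())
        # Only the first relevant hit inside the top k counts.
        for rank, pid in enumerate(ranked[:k], start=1):
            if pid in judged:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data, made up for illustration only.
toy_rankings = {
    "q1": ["p1", "p5", "p6"],        # relevant passage at rank 1
    "q2": ["p9", "p8", "p7", "p2"],  # relevant passage at rank 4
    "q3": ["p3", "p4"],              # no relevant passage retrieved
}
toy_relevant = {"q1": {"p1"}, "q2": {"p2"}, "q3": {"p0"}}

score = mrr_at_k(toy_rankings, toy_relevant)
print(f"MRR@10 = {score:.3f}")
```

With these toy inputs, `q1` contributes 1, `q2` contributes 1/4, and `q3` contributes 0, so the mean is 1.25/3, roughly 0.417.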
If you are new to Slurm, a sample Slurm script for the full fine-tuning run can be found at `docs/reproduction/sample_slurm_script.sh`.
It should work on `cedar` directly via `sbatch sample_slurm_script.sh`.
To adapt it to the `mini` version, simply change the number of GPUs and the requested time to:
#SBATCH --gres=gpu:v100l:1
#SBATCH --time=24:00:00
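If you would rather not edit the file by hand, the same two changes can be scripted. The following is a rough sketch, assuming the script sets its GPU request and time limit with `#SBATCH --gres=...` and `#SBATCH --time=...` lines; the helper name `adapt_to_mini` and the sample input are hypothetical and not taken from `sample_slurm_script.sh` itself.

```python
import re

# Directives for the mini run, as listed above.
MINI_DIRECTIVES = {
    "--gres": "#SBATCH --gres=gpu:v100l:1",
    "--time": "#SBATCH --time=24:00:00",
}


def adapt_to_mini(script_text):
    """Return a copy of script_text with the GPU and time directives replaced."""
    out = []
    for line in script_text.splitlines():
        for flag, new_line in MINI_DIRECTIVES.items():
            # Match e.g. "#SBATCH --gres=..." regardless of its current value.
            if re.match(rf"#SBATCH\s+{flag}\b", line.strip()):
                line = new_line
                break
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical script contents, for illustration only.
sample = (
    "#!/bin/bash\n"
    "#SBATCH --gres=gpu:v100l:4\n"
    "#SBATCH --time=48:00:00\n"
    "python -m capreolus.run rerank.train with file=docs/reproduction/config_msmarco.txt\n"
)
mini = adapt_to_mini(sample)
print(mini)
```

Lines that are not `--gres` or `--time` directives pass through unchanged, so the rest of the job script is left alone.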
- Results (with hyperparameter-0) replicated by @crystina-z on 2020-12-06 (commit `6c3759f`) (Tesla V100 on Compute Canada)
- Results (with hyperparameter-6) replicated by @Dahlia-Chehata on 2021-03-29 (commit `7915aad`) (Tesla V100 on Compute Canada)
- Results (with hyperparameter-7) replicated by @larryli1999 on 2021-05-16 (commit `6d1aed2`) (Tesla V100 on Compute Canada)
- Results (MRR@10=0.356) replicated by @andrewyguo on 2021-05-29 (commit `1ce71d9`) (Tesla V100 on Compute Canada)