Main, INFO: Raw paper_dev set is 14475 and paper_test set is 14150.
Main, INFO: Paper_dev and paper_test splits dont have a common context/claim.
Main, INFO: After filtering paper_dev set is 6510 and paper_test set is 6576.
Main, INFO: Read dataset of size 13086 of which the first 6510 examples are from the validation set and the remaining 6576 from the test split.
Main, INFO: ==================================================
Main, INFO: Created a new Experiment. Model GPTJ
Main, INFO: ==================================================
Main, INFO: >>>> Command line argument rate => 8.0
Main, INFO: >>>> Command line argument dtpts => 22000
Main, INFO: >>>> Command line argument batch_size => 256
Main, INFO: >>>> Command line argument max_len => 1
Main, INFO: >>>> Command line argument k => 10
Main, INFO: >>>> Command line argument intervention => rank-reduction
Main, INFO: >>>> Command line argument lname => fc_in
Main, INFO: >>>> Command line argument lnum => 24
Main, INFO: >>>> Command line argument home_dir => ./results/fever/gptj_results
Main, INFO: ==================================================
Main, INFO: Starting a new intervention with rate 8.0. Dataset size 13086. Batch size 256
Main, INFO: Editing a GPTJForCausalLM Model
Main, INFO: Updating Layer: transformer.h.24.mlp.fc_in.weight
Main, INFO: Total number of parameters updated is 1
Main, INFO: Edited and put model on cuda:0 in time 5 second
Main, INFO: After 501 0-1 Correctness is 55.68862275449102 percentage, Mean F1 score is None, Mean Log Prob is -0.8777261558407081, top-1 accuracy is 56.48702594810379, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 501 examples. Total time taken is 21 second. Avg time per example is 0 second.
Main, INFO: Remaining 12585 examples. Expected Total time taken to complete is 8 minutes.
Main, INFO: After 1001 0-1 Correctness is 53.74625374625375 percentage, Mean F1 score is None, Mean Log Prob is -0.8865181695628952, top-1 accuracy is 54.74525474525475, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 1001 examples. Total time taken is 41 second. Avg time per example is 0 second.
Main, INFO: Remaining 12085 examples. Expected Total time taken to complete is 8 minutes.
Main, INFO: After 1501 0-1 Correctness is 54.363757495003334 percentage, Mean F1 score is None, Mean Log Prob is -0.8823678130153654, top-1 accuracy is 55.16322451698868, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 1501 examples. Total time taken is 1 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 11585 examples. Expected Total time taken to complete is 8 minutes.
Main, INFO: After 2001 0-1 Correctness is 54.622688655672164 percentage, Mean F1 score is None, Mean Log Prob is -0.8802108573562083, top-1 accuracy is 55.27236381809095, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 2001 examples. Total time taken is 1 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 11085 examples. Expected Total time taken to complete is 7 minutes.
Main, INFO: After 2501 0-1 Correctness is 55.217912834866056 percentage, Mean F1 score is None, Mean Log Prob is -0.879376347841429, top-1 accuracy is 55.77768892443023, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 2501 examples. Total time taken is 1 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 10585 examples. Expected Total time taken to complete is 7 minutes.
Main, INFO: After 3001 0-1 Correctness is 55.0483172275908 percentage, Mean F1 score is None, Mean Log Prob is -0.8808683485477934, top-1 accuracy is 55.68143952015995, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 3001 examples. Total time taken is 2 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 10085 examples. Expected Total time taken to complete is 6 minutes.
Main, INFO: After 3501 0-1 Correctness is 55.15566980862611 percentage, Mean F1 score is None, Mean Log Prob is -0.8802204466536875, top-1 accuracy is 55.955441302485006, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 3501 examples. Total time taken is 2 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 9585 examples. Expected Total time taken to complete is 6 minutes.
Main, INFO: After 4001 0-1 Correctness is 55.13621594601349 percentage, Mean F1 score is None, Mean Log Prob is -0.8808563318335393, top-1 accuracy is 55.88602849287678, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 4001 examples. Total time taken is 2 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 9085 examples. Expected Total time taken to complete is 6 minutes.
Main, INFO: After 4501 0-1 Correctness is 55.498778049322375 percentage, Mean F1 score is None, Mean Log Prob is -0.8769427750195908, top-1 accuracy is 56.27638302599422, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 4501 examples. Total time taken is 3 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 8585 examples. Expected Total time taken to complete is 5 minutes.
Main, INFO: After 5001 0-1 Correctness is 55.628874225154966 percentage, Mean F1 score is None, Mean Log Prob is -0.8764573928297246, top-1 accuracy is 56.38872225554889, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 5001 examples. Total time taken is 3 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 8085 examples. Expected Total time taken to complete is 5 minutes.
Main, INFO: After 5501 0-1 Correctness is 55.89892746773314 percentage, Mean F1 score is None, Mean Log Prob is -0.8758501490231666, top-1 accuracy is 56.662425013633886, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 5501 examples. Total time taken is 3 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 7585 examples. Expected Total time taken to complete is 5 minutes.
Main, INFO: After 6001 0-1 Correctness is 55.84069321779703 percentage, Mean F1 score is None, Mean Log Prob is -0.8763388195269863, top-1 accuracy is 56.57390434927512, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 6001 examples. Total time taken is 4 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 7085 examples. Expected Total time taken to complete is 4 minutes.
Main, INFO: After 6501 0-1 Correctness is 55.66835871404399 percentage, Mean F1 score is None, Mean Log Prob is -0.8778308347186168, top-1 accuracy is 56.39132441162898, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 6501 examples. Total time taken is 4 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 6585 examples. Expected Total time taken to complete is 4 minutes.
Main, INFO: After 7001 0-1 Correctness is 55.54920725610627 percentage, Mean F1 score is None, Mean Log Prob is -0.8780605666783176, top-1 accuracy is 56.27767461791173, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 7001 examples. Total time taken is 4 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 6085 examples. Expected Total time taken to complete is 4 minutes.
Main, INFO: After 7501 0-1 Correctness is 55.48593520863885 percentage, Mean F1 score is None, Mean Log Prob is -0.8786302934589584, top-1 accuracy is 56.23250233302226, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 7501 examples. Total time taken is 5 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 5585 examples. Expected Total time taken to complete is 3 minutes.
Main, INFO: After 8001 0-1 Correctness is 55.1931008623922 percentage, Mean F1 score is None, Mean Log Prob is -0.879389672469473, top-1 accuracy is 55.9305086864142, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 8001 examples. Total time taken is 5 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 5085 examples. Expected Total time taken to complete is 3 minutes.
Main, INFO: After 8501 0-1 Correctness is 55.41700976355723 percentage, Mean F1 score is None, Mean Log Prob is -0.8782374418149231, top-1 accuracy is 56.13457240324668, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 8501 examples. Total time taken is 5 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 4585 examples. Expected Total time taken to complete is 3 minutes.
Main, INFO: After 9001 0-1 Correctness is 55.5271636484835 percentage, Mean F1 score is None, Mean Log Prob is -0.878332602710886, top-1 accuracy is 56.22708587934674, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 9001 examples. Total time taken is 6 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 4085 examples. Expected Total time taken to complete is 2 minutes.
Main, INFO: After 9501 0-1 Correctness is 55.415219450584146 percentage, Mean F1 score is None, Mean Log Prob is -0.8789082533647457, top-1 accuracy is 56.12040837806547, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 9501 examples. Total time taken is 6 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 3585 examples. Expected Total time taken to complete is 2 minutes.
Main, INFO: After 10001 0-1 Correctness is 55.424457554244576 percentage, Mean F1 score is None, Mean Log Prob is -0.8791329635046635, top-1 accuracy is 56.12438756124388, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 10001 examples. Total time taken is 6 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 3085 examples. Expected Total time taken to complete is 2 minutes.
Main, INFO: After 10501 0-1 Correctness is 55.2614036758404 percentage, Mean F1 score is None, Mean Log Prob is -0.8796300606158401, top-1 accuracy is 55.94705266165127, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 10501 examples. Total time taken is 7 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 2585 examples. Expected Total time taken to complete is 1 minutes.
Main, INFO: After 11001 0-1 Correctness is 55.48586492137078 percentage, Mean F1 score is None, Mean Log Prob is -0.8788046109688281, top-1 accuracy is 56.16762112535224, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 11001 examples. Total time taken is 7 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 2085 examples. Expected Total time taken to complete is 1 minutes.
Main, INFO: After 11501 0-1 Correctness is 55.6212503260586 percentage, Mean F1 score is None, Mean Log Prob is -0.8783763817197063, top-1 accuracy is 56.29075732544996, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 11501 examples. Total time taken is 7 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 1585 examples. Expected Total time taken to complete is 1 minutes.
Main, INFO: After 12001 0-1 Correctness is 55.37871844012999 percentage, Mean F1 score is None, Mean Log Prob is -0.8794828187241255, top-1 accuracy is 56.045329555870346, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 12001 examples. Total time taken is 8 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 1085 examples. Expected Total time taken to complete is 45 second.
Main, INFO: After 12501 0-1 Correctness is 55.38756899448044 percentage, Mean F1 score is None, Mean Log Prob is -0.8792307459371642, top-1 accuracy is 56.0595152387809, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 12501 examples. Total time taken is 8 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 585 examples. Expected Total time taken to complete is 24 second.
Main, INFO: After 13001 0-1 Correctness is 55.31112991308361 percentage, Mean F1 score is None, Mean Log Prob is -0.8792232991952033, top-1 accuracy is 55.972617490962236, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Processed 13001 examples. Total time taken is 9 minutes. Avg time per example is 0 second.
Main, INFO: Remaining 85 examples. Expected Total time taken to complete is 3 second.
Main, INFO: Saving results. Final Performance is given below:
Main, INFO: Final Performance: Dataset size 13086 0-1 Correctness is 55.29573590096286 percentage, Mean F1 score is None, Mean Log Prob is -0.8793689514949165, top-1 accuracy is 55.952926791991445, top-10 accuracy is 100.0, top-5 accuracy is 100.0.
Main, INFO: Time taken to store all results 0 second
Main, INFO: Experimented Completed.