# Activity · VectorInstitute/vectorlm

Public repository owned by the VectorInstitute organization; default branch `master`; created 2024-01-10. The thirty most recent events are listed below, newest first (all times UTC).

- **2024-06-24 18:51** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-trice`
  - lora-trice: Implemented LoRA-TRICE. Tested on two NVIDIA GPUs.
- **2024-06-18 14:41** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping: Moved Sampler import into conditional block to avoid importing vLLM when not required. Ruff formatting fixes.
- **2024-06-18 00:51** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping: Refactoring and cleanup.
- **2024-06-07 22:43** · jacobthebanana created branch `jjt/lora-vllm-trice`
  - trice [WIP]: implemented baseline trice, basic grad estimate. No control variate; tested on one GPU only.
- **2024-05-28 15:40** · adil-a deleted branch `jjt/lora-benchmarking-revisions`
- **2024-05-28 15:40** · adil-a merged a pull request into `master`
  - Add revised benchmarking logic and results (#9)
    - Revised estimation of batch count, directly retrieving from `len(train_dataloader)`. Deleted unused `timer_handle` argument in Trainer. Revised handling of `"max_seq_len"` override in benchmarking. Added support for automatic switching between lora and full-rank sharding scheme in benchmarking.
    - Revised handling of unspecified `max_seq_length`. Added llama-3 to benchmark `model_list`.
    - Benchmarking: Revised benchmark script to ensure consistent per-device train batch size.
    - Benchmarking: replaced `trainer.step` with `trainer.train_step` to avoid eval overhead in benchmarking. Revised benchmark parsing logic; display optimal batch size for each context width value.
    - Benchmarking: Updated reference throughput based on updated logic.
    - Benchmarking: Updated reference throughput descriptions.
- **2024-05-24 14:10** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: added example gemma sampling config.
- **2024-05-24 13:22** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: cleaned up changes in llama_example.py.
- **2024-05-24 13:17** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: cleaned up documentation related to multiprocess_wrap.
- **2024-05-24 13:06** · jacobthebanana pushed 2 commits to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: documentation fixes and cleanup.
- **2024-05-24 00:57** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: Refactored vLLM integration interface to minimize changes required in llama_example.py.
- **2024-05-23 18:20** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm hotswapping [WIP]: Reduced area of vLLM integration interface. Cleanup is required.
- **2024-05-23 18:14** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - [WIP] vllm hotswapping: Reduced area of vLLM integration interface. Cleanup is required.
- **2024-05-23 17:19** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - [WIP] vllm hotswapping: Implement minimum-viable wrapper for vllm/main. Cleanup is required.
- **2024-05-10 00:45** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm integration: Added documentation on sampling engine.
- **2024-05-10 00:34** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm integration: Added documentation on sampling engine.
- **2024-05-09 20:14** · jacobthebanana pushed 4 commits to `jjt/lora-vllm-hotswap`
  - Merge remote-tracking branch 'origin/master' into jjt/lora-vllm-hotswap
- **2024-05-09 19:30** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm integration [WIP]: Moved sampler-related logic into Trainer.
- **2024-05-09 15:12** · jacobthebanana pushed 1 commit to `jjt/lora-benchmarking-revisions`
  - Benchmarking: Updated reference throughput descriptions.
- **2024-05-09 15:03** · jacobthebanana pushed 2 commits to `jjt/lora-benchmarking-revisions`
  - Benchmarking: Updated reference throughput based on updated logic.
- **2024-05-08 12:46** · jacobthebanana pushed 1 commit to `jjt/lora-benchmarking-revisions`
  - Benchmarking: Revised benchmark script to ensure consistent per-device train batch size.
- **2024-05-07 15:49** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - vllm integration [WIP]: Implemented lora hotswap. Still need to move barrier logic into `_VLLMCallbackWrapper`.
- **2024-05-06 18:28** · jacobthebanana pushed 2 commits to `jjt/lora-vllm-hotswap`
  - vllm integration [WIP]: Implemented inference during training.
- **2024-05-06 14:13** · jacobthebanana pushed 2 commits to `jjt/lora-vllm-hotswap`
  - vllm integration: Eliminated duplicate vllm ResultHandler.
- **2024-04-30 01:53** · jacobthebanana pushed 1 commit to `jjt/lora-benchmarking-revisions`
  - Revised handling of unspecified `max_seq_length`. Added llama-3 to benchmark `model_list`.
- **2024-04-30 01:12** · jacobthebanana created branch `jjt/lora-benchmarking-revisions`
  - Revised estimation of batch count, directly retrieving from `len(train_dataloader)`. Deleted unused `timer_handle` argument in Trainer. Revised handling of `"max_seq_len"` override in benchmarking. Added support for automatic switching between lora and full-rank sharding scheme in benchmarking.
- **2024-04-26 15:20** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - Added `train_parameters.get("sampler")`.
- **2024-04-26 07:47** · adil-a deleted branch `jjt/lora-baseline`
- **2024-04-26 07:46** · adil-a merged a pull request into `master`
  - Implemented baseline LoRA peft with FSDP integration, tested on one node. (#5)
    - Implemented baseline LoRA peft for one Nvidia GPU.
    - Added support for saving lora adapters. Added support for non-fsdp models.
    - save_utils: added support for non-FSDP optimizers. trainer: replaced `clip_grad_norm_` with `nn.utils.clip_grad_norm_` for lora compatibility.
    - example_lora: highlighted current lora (non-fsdp) limitations.
    - Added instructions on LoRA on one GPU.
    - Added example script for launching lora.
    - Revised instructions on LoRA on one GPU.
    - Implemented LoRA FSDP. Also see https://github.com/facebookresearch/llama-recipes/blob/674b37ee66f59a7845cbc3868948f4d7fa69c679/src/llama_recipes/utils/fsdp_utils.py#L9
    - Reverted automatic formatter changes in README.md
    - Eliminated non-FSDP logic from save_utils. Set model path to local copy of llama-2-7b in example config.
    - Moved lora config out of example config.yaml.
    - Implemented LoRA benchmarking logic for worker.
    - model_utils: Refactored `get_lora_model` to reduce interface width (this method no longer wraps `load_model_and_tokenizer`). test_modelling: revised base model fixture scope since torch FSDP wrap is in-place. launch_benchmark: added confirmation before launching.
    - test_modelling: moved text output to data/.
    - added example yaml config for lora benchmarking.
    - launch_benchmark: marked qos flag as optional.
    - launch_benchmark: added option to limit number of jobs launched.
    - launch_benchmark: implemented torch profiler integration.
    - Merged changes from low CPU memory usage feature (#6) into jjt/lora-benchmarking
    - added changes to implement low cpu mem usage feature
    - implemented new ruff linting changes and ran a fix across files
    - Revised launch_benchmark.py to use new profiling path.
    - Enabled automatic creation of data/trace folder.
    - Added instructions for profiling tools.
    - Cleaned up duplicate imports from merge.
    - Cleaned up duplicate imports from merge.
    - Cleaned up parse_benchmark.py
    - Integrated LoRA logic into llama_example.py.
    - Moved lora_configs into train_parameters in config yaml. Adjusted docs/config.md accordingly.
    - Revised handling of nproc-per-node in benchmark script.
    - Included parameter_count info in benchmark output.
    - Implemented basic util for parsing benchmarking output.
    - model_utils: Enabled `low_cpu_mem_usage` in auto model `from_pretrained` by default.
    - launch_lora_benchmark.sh: implemented automatic identification of num_gpus. lora-benchmark: switched parse_benchmark: implemented option to specify benchmark artifact folder to load.
    - requirements.txt: included accelerate to support low_cpu_mem loading.
    - benchmark.py: adjusted BenchmarkingDataset to avoid StopIteration exception.
    - benchmark.py: added env var flag to toggle export_trace
    - parse_benchmark: included profiler table in output file. launch_benchmark: automated folder creation. launch_lora_benchmark: included model info in slurm output.
    - get_lora_model_from_base_model: enabled peft for models loaded via low_cpu_mem. More investigation might be needed.
    - model_utils: revised dtype handling for peft-wrapped models.
    - parse_benchmark: implemented sorting of profiler table output. launch_benchmark: revised default run time limit.
    - Merged example_lora into examples/llama_example.pu
    - Added instructions related to parse_benchmark
    - parse_benchmark: implemented aggregation across repeated metrics.
    - Implemented non-LoRA profiling and benchmarking.
    - Various static typechecking and formatting fixes.
    - Implemented restoring LoRA train state from filesystem. During training the adapter weights are saved to and loaded from the filesystem; the base model weights are loaded separately. Revised reference to `optim_state_dict_to_load` in `load_optimizer`.
    - Included train step number in LoRA adapter output path.
    - Added reference throughput table to documentation.
    - Added unit description to reference throughput table. Applied markdown formatting via prettier.
    - Added unit description to reference throughput table. Applied markdown formatting via prettier.
    - Benchmark: added option to override max_length of pre-trained model.
    - Deleted unused `accelerate` dependency from requirements.txt
    - Benchmark: added comment on max_length.
    - Benchmark: added comment on batch size.
    - Benchmark: added option to override batch size.
    - Benchmark throughput documentation: revised word choices.
    - Moved profiling-tracking logic out of Trainer.
    - Eliminated hasattr check related to `no_sync` since FSDP is always enabled.
    - Replaced peft `fsdp_auto_wrap_policy` to eliminate implicit `accelerate` dependency. Eliminated redundant bfloat16 type conversion. Fixed scope of placeholder for `is_peft_adapter_restored`.
    - Configured LoRA auto-wrap policy as off by default; enable the policy only when LoRA is required.
    - Revised punctuation in `lora_requires_grad_policy_fn`.
    - Renamed declarative `enable_lora` with descriptive `is_lora_enabled`.
    - Replaced `optimizer.load_state_dict` with `load_sharded_optimizer_state_dict` for PEFT optimizer. Added LoRA/PEFT documentations.
    - benchmarking: deleted unused TypeVar in parse_benchmark.py
    - Replaced config getattr and hasattr with dict methods.
    - Deleted redundant lora-specific launch scripts.
    - Added launch_benchmark.sh for throughput benchmarks.
    - Benchmark: run `makedirs` only `if __name__ == "__main__"`.
    - Replaced peft class attributes in Trainer with instance attributes. Added information about benchmarking environment. Additional formatting fixes.
    - Co-authored-by: Adil <47084919+adil-a@users.noreply.github.com>
- **2024-04-25 20:33** · jacobthebanana pushed 1 commit to `jjt/lora-vllm-hotswap`
  - Added reference sampling steps to llama_example. Added example sampling configs and documentations.

Older activity continues on the next page.
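A recurring hygiene pattern in this log is the 2024-06-18 commit that "Moved Sampler import into conditional block to avoid importing vLLM when not required": a heavy optional dependency is imported inside the branch that actually uses it, so modules that never sample pay no import cost. A minimal runnable sketch of that pattern follows; the lightweight stdlib module `colorsys` stands in for the optional vLLM `Sampler` dependency, and `make_sampler` is a hypothetical name, not a function from this repository.

```python
import sys


def make_sampler(use_vllm: bool):
    """Return a sampling backend, importing the heavy dependency lazily.

    Sketch only: ``colorsys`` plays the role of the optional vLLM import
    so the pattern can run anywhere, with or without vLLM installed.
    """
    if not use_vllm:
        # Fast path: the optional module is never imported at all.
        return None
    import colorsys  # deferred import, executed only when sampling is enabled

    return colorsys


print(make_sampler(False))          # None, and colorsys stays unloaded
print(make_sampler(True).__name__)  # 'colorsys', now present in sys.modules
```

The same shape applies to the real dependency: callers that disable sampling can run on machines where vLLM is not installed, since the `import` statement is only reached on the opt-in branch.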