
Pass deepspeed and fsdp as None explicitly when merging adapters to allow custom device_map #1575

Merged 1 commit into main on May 7, 2024

Conversation

@chiragjn (Contributor) commented Apr 29, 2024

I wanted to merge my adapters using multiple GPUs in a scenario with limited main memory, so I tried passing device_map=auto.

But if deepspeed or fsdp is set in the config YAML, axolotl sets ACCELERATE_USE_* in the environment:
https://github.com/OpenAccess-AI-Collective/axolotl/blob/5294653a2d353066600cbc66bb06f7c63c87147b/src/axolotl/utils/trainer.py#L428-L432

which further down resets device_map back to None:
https://github.com/OpenAccess-AI-Collective/axolotl/blob/5294653a2d353066600cbc66bb06f7c63c87147b/src/axolotl/utils/config/__init__.py#L46-L48

This is kind of self-sabotaging, as it forces me to merge adapters on CPU.

I am not adding a default device_map, so people can choose what they want; I am just making sure it will be respected.
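
Roughly, the interaction looks like this (an illustrative, runnable sketch of the behavior described above, not the exact axolotl source):

```python
import os
from types import SimpleNamespace

# stand-in for the parsed config YAML (values here are hypothetical)
cfg = SimpleNamespace(deepspeed="ds_config.json", fsdp=None, device_map="auto")

# 1) because deepspeed/fsdp is set in the config, the accelerate flags get exported
if cfg.deepspeed:
    os.environ["ACCELERATE_USE_DEEPSPEED"] = "true"
if cfg.fsdp:
    os.environ["ACCELERATE_USE_FSDP"] = "true"

# 2) further down, config normalization sees those flags and discards the
#    user-supplied device_map, which is what forces the merge onto CPU
if (
    os.environ.get("ACCELERATE_USE_DEEPSPEED") == "true"
    or os.environ.get("ACCELERATE_USE_FSDP") == "true"
):
    cfg.device_map = None

print(cfg.device_map)  # None -> the custom placement was silently dropped
```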

@NanoCode012 (Collaborator)

Hey, this looks okay to me. Were you able to confirm this merges successfully on GPU? Did you merge on a single GPU or multiple GPUs?

@chiragjn (Contributor, Author) commented Apr 30, 2024

Yes, I was able to confirm and merge on both single and multiple GPUs when passing device_map=auto / --device_map auto separately via **kwargs.


One caveat: if you end up running the merge in the same script as training and it is launched with accelerate launch --use_deepspeed ..., then ACCELERATE_USE_DEEPSPEED will be set by the accelerate library itself rather than by axolotl, and the fix in this PR will not apply.
So make sure to run the merge script separately, without accelerate launch, or forcefully unset those environment variables. A sketch of what I mean is below.
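
Concretely, something along these lines (a sketch; the config path and LoRA dir are placeholders, and --device_map is simply forwarded through the CLI's **kwargs):

```python
import os
import subprocess

# run the merge in a fresh process, outside `accelerate launch`,
# with the accelerate flags cleared so the custom device_map is respected
env = os.environ.copy()
env.pop("ACCELERATE_USE_DEEPSPEED", None)
env.pop("ACCELERATE_USE_FSDP", None)

subprocess.run(
    [
        "python", "-m", "axolotl.cli.merge_lora",
        "config.yaml",                  # placeholder config path
        "--lora_model_dir=./lora-out",  # placeholder adapter dir
        "--device_map=auto",
    ],
    check=True,
    env=env,
)
```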

@NanoCode012 (Collaborator)

Was merging on GPUs much faster than in RAM? On that note, how about single vs. multiple GPUs?

I wonder if the README should also be updated to include info on passing device_map.

@chiragjn (Contributor, Author) commented May 1, 2024

Was merging on gpus much faster than RAM?

Yes, much faster, but more importantly for me the constraint was limited main memory.
Imagine a g5.xlarge on AWS: an A10 with 24 GB of GPU memory and 16 GB of main memory. Llama 3 8B at bf16 is roughly 16 GB of weights, so it can't comfortably fit in main memory but it can on the GPU.

On this note, how about single vs multiple?

Both work well, but the main point is that the number of GPUs required differs based on how many GPUs the model needs to fit in half precision.
Now people can pass their own custom placement via device_map, for example as sketched below.
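
Some device_map values one might pass (illustrative only; the module names in the last example are hypothetical and depend on the model):

```python
# spread layers automatically across all visible GPUs (and CPU if needed)
device_map = "auto"

# pin the whole model to a single GPU
device_map = {"": 0}

# or hand-place specific modules across two GPUs
device_map = {
    "model.embed_tokens": 0,
    "model.layers": 0,
    "model.norm": 1,
    "lm_head": 1,
}
```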

@chiragjn (Contributor, Author) commented May 7, 2024

@NanoCode012 @winglian bump on this

@NanoCode012 merged commit 9e1480e into axolotl-ai-cloud:main on May 7, 2024
7 checks passed
@NanoCode012 (Collaborator)

Thanks! Sorry I missed this.
