Running efficiency while setting requires_grad=False for both input and output embeddings of LLM #374

VERSPD0 · 2023-08-16T07:54:34Z

Hi, thanks for your work!

While going through the code, I noticed that the LLM is frozen by setting requires_grad=False on both the input and output embeddings. I understand that this stops those embedding parameters from being updated, but I am still concerned that it slows down training: gradients for any parameters in the LLM that still have requires_grad=True (the layers between the input and output embeddings) may still be computed during the backward pass.
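One way to check this concern is a small, self-contained experiment. The sketch below assumes PyTorch and uses a hypothetical toy model (`TinyLM`, with `embed`, `block`, and `head` as stand-in names); it is not the repository's model or its freezing code. It freezes only the two embedding layers, runs a backward pass, and records which parameters received gradients.

```python
import torch
from torch import nn

torch.manual_seed(0)


# Hypothetical stand-in for an LLM: input embeddings -> middle layer -> output head.
class TinyLM(nn.Module):
    def __init__(self, vocab=50, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)  # input embeddings
        self.block = nn.Linear(dim, dim)       # stands in for the transformer layers
        self.head = nn.Linear(dim, vocab)      # output embeddings / LM head

    def forward(self, ids):
        return self.head(torch.tanh(self.block(self.embed(ids))))


model = TinyLM()

# Freeze only the input and output embeddings, as described in the question.
for p in model.embed.parameters():
    p.requires_grad = False
for p in model.head.parameters():
    p.requires_grad = False

ids = torch.randint(0, 50, (4, 8))
model(ids).sum().backward()

# Which parameters actually received a gradient during backward?
grad_status = {name: p.grad is not None for name, p in model.named_parameters()}
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(grad_status)
print("still trainable:", trainable)

# Freezing every parameter removes the backward graph entirely:
# the output no longer requires grad, so there is nothing to backpropagate.
model.requires_grad_(False)
frozen_out_requires_grad = model(ids).requires_grad
print("output requires grad when fully frozen:", frozen_out_requires_grad)
```

In this toy setup the middle layer still receives gradients while both embedding layers do not, which matches the worry above: if the goal is to skip backward computation through the LLM, every parameter in it would need `requires_grad=False`, not just the embeddings. Whether that applies to the actual code depends on how the remaining LLM parameters are configured there.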
