Commit
FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (huggingface#2103)

Previously, the weight matrix was converted to float32 without considering the need for transposition. This update ensures that the weight matrix is transposed when the fan_in_fan_out condition is met, resolving dimension mismatch issues during GPT-2 training.
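
To illustrate the layout issue the message describes, here is a minimal, dependency-free sketch. It is not the code from this commit: `orient_weight` and `shape` are hypothetical helpers, and plain nested lists stand in for tensors. It assumes the usual convention that a layer flagged with `fan_in_fan_out` (such as the `Conv1D` layers in GPT-2) stores its weight as (in_features, out_features), while a linear layer stores it as (out_features, in_features).

```python
def shape(matrix):
    """Return (rows, cols) of a rectangular list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


def orient_weight(weight, fan_in_fan_out):
    """Return the weight in (out_features, in_features) layout, as floats.

    Hypothetical helper: when fan_in_fan_out is True, the stored layout is
    assumed to be (in_features, out_features), so it is transposed before
    the float conversion.
    """
    if fan_in_fan_out:
        # zip(*weight) yields the columns of the matrix, i.e. its transpose.
        weight = [list(column) for column in zip(*weight)]
    return [[float(value) for value in row] for row in weight]


# Conv1D-style weight: in_features=3, out_features=2.
conv1d_style = [[1, 2], [3, 4], [5, 6]]
# Linear-style weight for the same mapping: (out_features, in_features).
linear_style = [[1, 3, 5], [2, 4, 6]]

oriented_conv = orient_weight(conv1d_style, fan_in_fan_out=True)
oriented_linear = orient_weight(linear_style, fan_in_fan_out=False)
```

With the flag applied, both weights end up in the same layout, so any factorization computed from them yields factors with matching dimensions. Skipping the transpose for the `Conv1D`-style weight would leave the two axes swapped, which is the kind of dimension mismatch the commit message refers to.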