From 6868f6ff30898989e4aa5890973911b2edc5e8d8 Mon Sep 17 00:00:00 2001
From: Phil Wang
Date: Thu, 22 Dec 2022 12:06:27 -0800
Subject: [PATCH] caveat

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1f7c83f..367e49b 100644
--- a/README.md
+++ b/README.md
@@ -73,7 +73,9 @@ k = apply_rotary_emb(freqs, k)
 
 ## Length Extrapolatable Rotary Embeddings
 
-In this paper, they were able to fix length extrapolation issue with rotary embeddings by giving it a decay similar to ALiBi. They named this technique XPos, and you can use it by setting `use_xpos = True` on initialization
+In this paper, they were able to fix length extrapolation issue with rotary embeddings by giving it a decay similar to ALiBi. They named this technique XPos, and you can use it by setting `use_xpos = True` on initialization.
+
+This can only be used for autoregressive transformers
 
 ```python
 import torch