Questions about the data lengths for Figure 2 #27

Open
chen-yingfa opened this issue Apr 23, 2024 · 0 comments

Hi, thank you very much for this work!

Regarding Figure 2, the arXiv paper says that the models are trained on sequences of 256 tokens and evaluated on sequences of 1024 tokens. In the code, however, the training data appears to consist of sequences of length 256 as well as shorter sequences (64 and 128), and the evaluation data similarly consists of sequences of different lengths up to 1024. Could you please confirm whether this data mixture is what was used to produce Figure 2?

Thanks in advance.
