Replicating Andrej Karpathy's Transformer Lecture #3179

Answered by cgarciae
felixmin asked this question in Q&A

Hey @felixmin, in case it helps, here is my port of nanoGPT, which is very similar to the lecture: https://github.com/cgarciae/nanoGPT-jax/blob/master/model.py

And is there a way to use nn.Sequential providing the deterministic variable to each block in GPTLanguageModel?

Currently no. There is a proposal (#2131) that would allow layers like Dropout to receive some of their arguments via flags. Although it has been implemented for internal use, flags are not yet exposed to users.

Answer selected by felixmin