TODO

Objectives:

- use tokun as the tokenizer
- reduce the embedding dimension to 256
- transfer learn from the original llama3
- rewrite llama3 in tensorflow:
    - naive line-by-line port => separate repo (llama3-tensorflow)
    - improved version: keras layers + my take
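One way the "reduce the embedding dimension to 256" and "transfer learn" objectives could combine is to project the pretrained llama3 embedding table down to 256 dimensions and use the result to initialize the smaller model. The sketch below is a minimal illustration of that idea with a truncated SVD, not the project's actual method: the vocabulary size is a toy stand-in, the weights are random placeholders for the real llama3 table, and only the original embedding width of 4096 (llama3-8B) is taken from the real model.

```python
import numpy as np

# Toy stand-in for the pretrained llama3 embedding table.
# Real llama3-8B embeddings are (vocab_size, 4096); here the vocab
# is shrunk to 512 rows and the weights are random, for illustration.
rng = np.random.default_rng(0)
vocab, d_orig, d_new = 512, 4096, 256

pretrained = rng.standard_normal((vocab, d_orig))

# Economy SVD: keep the 256 strongest directions of the table,
# scaled by their singular values, as the reduced embedding matrix.
u, s, _ = np.linalg.svd(pretrained, full_matrices=False)
reduced = u[:, :d_new] * s[:d_new]

print(reduced.shape)  # (512, 256) initialization for the smaller model
```

This keeps the relative geometry of the token embeddings as well as a rank-256 approximation allows, which gives fine-tuning a better starting point than a random initialization.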