Test different input sequence lengths for Llama #1070
base: main
Conversation
    ],
)
@pytest.mark.parametrize("seq_len", [1, 2, 4, 7, 8, 16, 28, 32, 63, 64, 99, 117, 128, 256, 341, 512, 1024, 1790, 2048])
@pytest.mark.skip(reason="No need to run in CI as it takes a long time to run.")
My recommendation is to choose which of these will be part of the training focus, instead of skipping the test entirely.
E.g. if we're going to focus on training a 2048-seq-len model, let's fully compile and run that variant alone as part of push CI. A sketch of what that could look like is below.
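A minimal sketch of narrowing the test to a single training-focus variant, assuming a hypothetical `push` marker is used to select tests for push CI (the marker name and test name are illustrative, not from this PR):

```python
import pytest

# Sketch: run only the sequence length chosen for training instead of
# skipping the whole test. The "push" marker name is an assumption.
@pytest.mark.push
@pytest.mark.parametrize("seq_len", [2048])
def test_llama_training_seq_len(seq_len):
    # Compile and run the fwd pass for the single seq len we plan to train with.
    ...
```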
That's right - understanding which sequence lengths are relevant for Llama finetuning is one of the training team's tasks.
Once we establish which set of seq lengths is needed, we will continue with PCC tests and run them as part of CI.
Agreed as well; I will update the seq_len parameters with the ones required for training once we choose them (we will run some experiments separately).
This is updated to use only the dim sizes we care about. Additionally, I set up only one hidden layer for the test to speed it up (I also ran the full-model test locally to make sure it passes). A sketch of the single-layer setup is below.
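A sketch of the single-hidden-layer setup, assuming the Hugging Face `transformers` Llama classes (the checkpoint name is illustrative, not necessarily the one used in the test):

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Load the full config, shrink the decoder depth, then build the model from it.
config = LlamaConfig.from_pretrained("openlm-research/open_llama_3b")  # illustrative checkpoint
config.num_hidden_layers = 1  # one decoder layer is enough for a compile/fwd smoke test
framework_model = LlamaForCausalLM(config)  # full-depth run is still done locally
```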
    input_ids = tokenizer(prompt, padding="max_length", truncation=True, return_tensors="pt").input_ids

    # Compile the model and run fwd pass
    compiled_model = forge.compile(framework_model, input_ids)
Do we want to test the bwd compile/run as well?
One general question: is there a clean way to test the backward part of a graph in isolation? For example, our compile should return a compiled context that contains information about each compiled component (e.g. fwd, bwd, loss, etc.).
Therefore, is there a clean way to call just the bwd part of the graph with random inputs, without needing to run the forward part or initialize the loss and optimizer parts of the training workflow? A rough sketch of the idea is below.
Note: this is not a requirement for this PR, just a general question that can be useful here as well. I.e. can we have granular tests that target specific functionality (only the bwd part of the model) rather than the whole workflow? I see this as especially useful for the bwd generality push in the future. cc @vladimirjovanovicTT
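To make the question concrete, a granular bwd-only test could look roughly like the sketch below; `bwd_input_shapes` and `run_backward` are hypothetical names for what the compiled context might expose, not existing forge API:

```python
import torch
import forge

# Sketch only: every attribute/method name below is hypothetical and just
# illustrates driving the compiled bwd graph directly with random gradients.
compiled = forge.compile(framework_model, input_ids)  # as in the test above

# Hypothetical accessors on the compiled context: query the bwd component's
# expected input shapes, feed it random incoming gradients, and run it without
# a fwd pass, loss computation, or optimizer setup.
incoming_grads = [torch.randn(shape) for shape in compiled.bwd_input_shapes]
grads = compiled.run_backward(incoming_grads)
```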
I think this is a must-have functionality as part of our training generality/BFS effort.
Let's discuss the implementation details offline.
Force-pushed from 5b45490 to 710afb4
Add a test to make sure Llama compiles and runs the fwd pass with different input sequence lengths, since we will have inputs of various lengths during training.
Close #1071
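For reference, the added test follows roughly this shape (a simplified sketch; the checkpoint name, the seq_len subset, and the direct call on the compiled model are assumptions, not the exact test code):

```python
import pytest
import forge
from transformers import LlamaConfig, LlamaForCausalLM, LlamaTokenizer


@pytest.mark.parametrize("seq_len", [128, 512, 2048])  # illustrative subset
def test_llama_input_seq_lengths(seq_len):
    model_path = "openlm-research/open_llama_3b"  # illustrative checkpoint
    config = LlamaConfig.from_pretrained(model_path)
    config.num_hidden_layers = 1  # single decoder layer keeps CI runtime low
    framework_model = LlamaForCausalLM(config)

    tokenizer = LlamaTokenizer.from_pretrained(model_path)
    tokenizer.pad_token = tokenizer.eos_token

    prompt = "Q: What is the tallest mountain on Earth?\nA:"
    input_ids = tokenizer(
        prompt,
        padding="max_length",
        max_length=seq_len,
        truncation=True,
        return_tensors="pt",
    ).input_ids

    # Compile the model and run the fwd pass for this sequence length.
    compiled_model = forge.compile(framework_model, input_ids)
    compiled_model(input_ids)  # assumes the compiled model is directly callable
```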