ND behavior in `test_matmul.py::test_sd_matmul` (#7126)
@TT-billteng That just looks like a PCC thresholding assertion failure, maybe due to different seeds across runs?
Ah yes, that's true; running again with a lower threshold. I feel like we should use a "standard", globally agreed-upon PCC threshold for testing, or is this too much of an ask?
I don't think that's feasible, because there could be variability based on a number of factors.
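For reference, the kind of PCC (Pearson correlation coefficient) thresholding assertion discussed here can be sketched as follows. This is a minimal illustration: the name `comp_pcc` and the 0.999 default mirror the style of tt-metal's test helpers, but the exact signature here is an assumption, not the repo's actual helper.

```python
import numpy as np

def comp_pcc(expected, actual, threshold=0.999):
    """Return (passes, pcc): Pearson correlation between flattened tensors."""
    expected = np.asarray(expected, dtype=np.float64).ravel()
    actual = np.asarray(actual, dtype=np.float64).ravel()
    pcc = np.corrcoef(expected, actual)[0, 1]
    return pcc >= threshold, pcc
```

Because the measured PCC depends on the randomly generated inputs, a test whose true PCC sits right at the threshold can pass or fail depending on the seed, which is one plausible source of the ND behavior reported here.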
Can we disable this test? I still see it failing on main.
I would rather not disable this test. @TT-BrianLiu, do we need to lower the PCC for this matmul test when running on WH, or is this something bigger?
Was it not failing before? Otherwise, 0.999 is a pretty reasonable PCC threshold for a matmul.
Having said that, the PCC values in test_matmul.py should be updated so that anything above 0.999, or with too many digits, gets changed.
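The cleanup suggested above could be sketched with a small helper. This is purely illustrative: the 0.999 cap comes from this thread, while the helper name and the three-digit rounding are assumptions.

```python
def normalize_pcc_threshold(pcc, cap=0.999, digits=3):
    # Cap overly strict thresholds at `cap` and trim excess precision,
    # so e.g. 0.9999 becomes 0.999 and 0.9912345 becomes 0.991.
    return min(round(pcc, digits), cap)
```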
@TT-billteng @bbradelTT Can we close this issue?
@prajaramanTT We can't.
@bbradelTT Do we have any updates on this?
I just looked into this. On WH N150 and BH the tests pass. They are skipped on N300 since the grid is too small. I'll create a PR to re-enable the tests.
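A grid-size guard like the N300 skip mentioned above could look roughly like this. The real test would query the device's compute grid via the ttnn/tt-metal device API; the 8x8 minimum and the helper name here are assumptions for illustration only.

```python
# Hypothetical sketch of a grid-size guard; the real test queries the
# device's compute grid, and the 8x8 minimum here is an assumption.
MIN_GRID = (8, 8)

def grid_too_small(grid_size, min_grid=MIN_GRID):
    x, y = grid_size
    return x < min_grid[0] or y < min_grid[1]

# In the test this would gate a skip, along the lines of:
#   if grid_too_small(device.compute_with_storage_grid_size()):
#       pytest.skip("grid too small for test_sd_matmul")
```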
### Ticket
Link to Github Issue #7126

### Problem description
A test was failing and was skipped.

### What's changed
After various issues were fixed over time, the test now passes. Therefore, enable it again.

### Checklist
- [x] Post commit CI passes: https://github.com/tenstorrent/tt-metal/actions/runs/12776412082
- [ ] Blackhole Post commit (if applicable): too many issues, but the test passed locally.
- [ ] Model regression CI testing passes (if applicable): N/A
- [ ] Device performance regression CI testing passes (if applicable): N/A
- [ ] **(For models and ops writers)** Full [new models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) tests pass
- [ ] New/Existing tests provide coverage for changes
All post commit passed in main after the merge. Checked subsequent runs as well, and there are no failures in all post commit related to GS. Closing.
`pytest tests/ttnn/unit_tests/operations/test_matmul.py::test_sd_matmul`
This appears only on N150, but I haven't been able to repro it locally yet.
This is not failing on a specific VM or BM (bare metal) machine.
Some failing runs:
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8558424886/job/23453740581
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8558853752/job/23454745853
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8545275089/job/23414372279
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8542369244/job/23403999132
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8559093722/job/23455622019