ttnn.l1_loss fails with low PCC on both GS and WH [Bug Report][GS][WH] #6390

Closed
Tracked by #6445
npetrovic-tenstorrent opened this issue Mar 14, 2024 · 1 comment

npetrovic-tenstorrent commented Mar 14, 2024

The ttnn.l1_loss operation fails with low PCC for the reduction options mean and sum. The issue occurs on both Grayskull and Wormhole devices.
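For reference, the two reduction modes mentioned above can be sketched in plain Python. This is only an illustration of the expected math (the sum or mean of the absolute differences), assuming the operation is meant to follow the usual L1 loss definition; the function name `l1_loss` and its flat-list interface are hypothetical and are not the ttnn API.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float], reduction: str = "mean") -> float:
    """Reference L1 loss over flat sequences (hypothetical helper, not ttnn)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Element-wise absolute differences |pred_i - target_i|.
    abs_diffs = [abs(p - t) for p, t in zip(pred, target)]
    total = sum(abs_diffs)
    if reduction == "sum":
        return total
    if reduction == "mean":
        return total / len(abs_diffs)
    raise ValueError(f"unsupported reduction: {reduction!r}")


pred = [1.0, 2.0, 4.0]
target = [1.5, 2.0, 2.0]
print(l1_loss(pred, target, "sum"))   # 2.5
print(l1_loss(pred, target, "mean"))  # 2.5 / 3, about 0.833
```

Either way the reduction collapses the whole input to one number, which matters when that number is later compared against a device output.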

To Reproduce
Steps to reproduce the behavior:
Check out the main branch, then run the unit test test_l1_loss_sum.py (or others) using this command pattern:

pytest tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_l1_loss_sum.py

Expected behavior
The unit test tests/tt_eager/python_api_testing/non_working_unit_tests/wormhole/test_l1_loss_sum.py contains a few test cases, all of which are expected to fail with a low PCC error (close to 0.00). E.g.:

Max ATOL Delta: 1908736.0, Max RTOL Delta: inf, PCC: 0.0, PCC check failed

The same is expected for test_l1_loss_mean as well.
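To make the three numbers in that message easier to read, here is a rough sketch of how such metrics are commonly computed. It assumes PCC means the Pearson correlation coefficient and that the relative delta divides by the compared (second) value; the helper names are hypothetical and the input values are illustrative, not taken from the failing test.

```python
import math
from typing import Optional, Sequence


def max_abs_delta(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Largest absolute difference between paired elements."""
    return max(abs(e - a) for e, a in zip(expected, actual))


def max_rel_delta(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Largest |e - a| / |a|; infinite when a differing pair has a == 0 (assumed denominator)."""
    worst = 0.0
    for e, a in zip(expected, actual):
        diff = abs(e - a)
        if diff == 0.0:
            continue  # identical values contribute no relative error
        if a == 0.0:
            return math.inf  # nonzero difference divided by zero
        worst = max(worst, diff / abs(a))
    return worst


def pearson(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Pearson correlation; None when either side has zero variance (undefined)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Illustrative only: a scalar repeated on one side, a mostly-zero buffer on the other.
expected = [5.0, 5.0, 5.0, 5.0]
actual = [5.0, 0.0, 0.0, 0.0]
print(max_abs_delta(expected, actual))  # 5.0
print(max_rel_delta(expected, actual))  # inf
print(pearson(expected, actual))        # None
```

The point of the sketch is that a correlation-based check carries little information when one side is effectively a single repeated value, and that a relative delta of inf can come from zeros in the denominator rather than from a large error.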

Getting additional info for the operation under test and its behavior
To get additional information and results for the different combinations of input shapes, types, layouts and memory configs with which this operation was tested, you can also run the sweep test locally:

tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/wormhole/ttnn_l1_loss_sum_test.yaml

To do this you should:

  1. Follow the Getting Started page to set up the repo, environment variables and python-env
  2. Activate the environment: source build/python_env/bin/activate
  3. Run the sweeps with python tests/tt_eager/python_api_testing/sweep_tests/run_pytorch_test.py -i tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/wormhole/ttnn_l1_loss_sum_test.yaml -o ./result-sweeps
  4. After the run completes, all sweep results should be available in the specified output directory (in this case ./result-sweeps). There you will find a .csv file listing all executed sweeps, including the failing ones that the unit test recreates; you can locate those by searching for their unique data_seed field.
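Searching the results by data_seed could look roughly like the sketch below. It assumes the output directory directly contains comma-separated .csv files with a header row that includes a data_seed column; the helper name `find_rows_by_seed` and every other column name are hypothetical, since the exact CSV layout is not described here.

```python
import csv
from pathlib import Path
from typing import Dict, List


def find_rows_by_seed(results_dir: str, data_seed: str) -> List[Dict[str, str]]:
    """Collect rows whose data_seed column matches, from every .csv in results_dir."""
    matches: List[Dict[str, str]] = []
    for csv_path in sorted(Path(results_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            # DictReader maps each row to {column name: value} using the header row.
            for row in csv.DictReader(handle):
                if row.get("data_seed") == str(data_seed):
                    matches.append(row)
    return matches
```

Calling `find_rows_by_seed("./result-sweeps", seed)` with the seed used in the unit test should then return the matching sweep rows, if the assumptions about the file layout hold.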
npetrovic-tenstorrent added the bug, GS, and WH labels on Mar 14, 2024
npetrovic-tenstorrent changed the title from "ttnn.l1_loss fails with low PCC on both GS and WH #6349[Bug Report] [GS][WH]" to "ttnn.l1_loss fails with low PCC on both GS and WH [Bug Report] [GS][WH]" on Mar 14, 2024
npetrovic-tenstorrent changed the title from "ttnn.l1_loss fails with low PCC on both GS and WH [Bug Report] [GS][WH]" to "ttnn.l1_loss fails with low PCC on both GS and WH [Bug Report][GS][WH]" on Mar 14, 2024
jliangTT added the P2 label on Mar 15, 2024
umadevimcw commented

@npetrovic-tenstorrent

To fix the low PCC issue:

  • In torch, l1_loss returns a single value, whereas on TT the single-value output is stored in a 32x32 tensor. When comparing the results, we therefore need to fetch the data at the 0th index.
  • Because the computation is done in bfloat16, the values will differ slightly, so comp_allclose is used for the comparison instead of comp_pcc.
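The two points above can be illustrated with a small, self-contained sketch. It is not the code from the PR: the helper names, the zero filler in the tile, the truncation-based bfloat16 emulation, and the tolerance values are all assumptions chosen for illustration, and plain Python is used so it runs without torch or ttnn.

```python
import math
import struct
from typing import List

TILE = 32  # tile height and width, per the 32x32 tensor mentioned above


def to_bfloat16(value: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Assumption: truncation is used here for simplicity; real hardware may round.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def first_element(tile: List[List[float]]) -> float:
    """Fetch the entry at index 0, where the scalar result is assumed to live."""
    return tile[0][0]


def scalar_matches(expected: float, tile: List[List[float]],
                   rtol: float = 1e-2, atol: float = 1e-2) -> bool:
    """Tolerance-based comparison of the reference scalar with the tile's first entry."""
    return math.isclose(expected, first_element(tile), rel_tol=rtol, abs_tol=atol)


reference = 2.5 / 3  # stand-in for the scalar a reference implementation returns
device_tile = [[0.0] * TILE for _ in range(TILE)]  # filler values are hypothetical
device_tile[0][0] = to_bfloat16(reference)

print(device_tile[0][0])                       # 0.83203125
print(reference == device_tile[0][0])          # False
print(scalar_matches(reference, device_tile))  # True
```

Exact equality fails because bfloat16 keeps only about eight bits of mantissa precision, while a tolerance-based check on the single meaningful element passes. The tolerances shown are placeholders and would need to be chosen for the actual test.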

PR: #6492
