#0: update all-gather tests to remove all_devices test fixture
The all_devices test fixture does not reliably produce the chip ordering that certain CCL operations, such as ring all-gathers, require. Without this change, the chip order can vary from machine to machine and differ from the order all-gather expects. To correct the chip order, we use fixtures that account for chip ordering.

In particular, 4-chip ring tests that are expected to run on the inner 4-chip ring of the t3000 were updated to use the pcie_mesh_device fixture to ensure that the first 4 chips in the list are those devices in the inner ring.
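
To illustrate why ordering matters for a ring all-gather, here is a minimal, hardware-free sketch. It assumes a hypothetical 4-chip ring with made-up chip IDs and links (not the actual t3000 topology), and the helper name `is_valid_ring_order` is illustrative only, not part of ttnn. It checks that every device in a list is physically linked to the next one, wrapping around at the end:

```python
from typing import Iterable, Sequence, Tuple


def is_valid_ring_order(device_ids: Sequence[int], links: Iterable[Tuple[int, int]]) -> bool:
    """Return True if each device is linked to its successor, wrapping around."""
    n = len(device_ids)
    if n < 2:
        return False
    # Links are undirected, so store each one as an unordered pair.
    link_set = {frozenset(link) for link in links}
    return all(
        frozenset((device_ids[i], device_ids[(i + 1) % n])) in link_set
        for i in range(n)
    )


# Hypothetical physical ring 0-1-2-3-0 (an assumption for illustration).
HYPOTHETICAL_RING_LINKS = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(is_valid_ring_order([0, 1, 2, 3], HYPOTHETICAL_RING_LINKS))  # ordered along the ring
print(is_valid_ring_order([0, 2, 1, 3], HYPOTHETICAL_RING_LINKS))  # chips 0 and 2 are not neighbors
```

A fixture that returns devices in an arbitrary order can produce the second case, which is the failure mode this change is meant to avoid.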

Additional changes along with the above:
- Removed some redundant `run_all_gather` type test functions and merged them into a single test function
- Added additional n300 all-gather test cases
- Improved error messaging
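
The new n300 fixture skips the test when fewer than two devices are available. As a rough sketch of that guard, generalized to an arbitrary mesh shape: the helper names `devices_required` and `should_skip` are hypothetical and do not exist in conftest.py, which hard-codes the value 2 instead.

```python
from typing import Tuple


def devices_required(mesh_shape: Tuple[int, int]) -> int:
    """Number of devices a (rows, cols) mesh needs."""
    rows, cols = mesh_shape
    return rows * cols


def should_skip(available_devices: int, mesh_shape: Tuple[int, int]) -> bool:
    """True when the machine has too few devices to open the requested mesh."""
    return available_devices < devices_required(mesh_shape)


# An n300 mesh is 1 x 2, so it needs two devices.
N300_MESH_SHAPE = (1, 2)
print(should_skip(1, N300_MESH_SHAPE))
print(should_skip(2, N300_MESH_SHAPE))
```

In the actual fixture the equivalent check is followed by `pytest.skip()`, so machines without enough devices report the test as skipped rather than failed.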
SeanNijjar committed Sep 30, 2024
1 parent d8706ff commit 203ce06
Showing 5 changed files with 402 additions and 319 deletions.
23 changes: 23 additions & 0 deletions conftest.py
@@ -248,6 +248,29 @@ def pcie_mesh_device(request, silicon_arch_name, silicon_arch_wormhole_b0, devic
    del mesh_device


@pytest.fixture(scope="function")
def n300_mesh_device(request, silicon_arch_name, silicon_arch_wormhole_b0, device_params):
    import ttnn

    if ttnn.get_num_devices() < 2:
        pytest.skip()

    mesh_device = ttnn.open_mesh_device(
        ttnn.MeshShape(1, 2),
        dispatch_core_type=get_dispatch_core_type(),
        **device_params,
    )

    logger.debug(f"multidevice with {mesh_device.get_num_devices()} devices is created")
    yield mesh_device

    for device in mesh_device.get_devices():
        ttnn.DumpDeviceProfiler(device)

    ttnn.close_mesh_device(mesh_device)
    del mesh_device


@pytest.fixture(scope="function")
def t3k_mesh_device(request, silicon_arch_name, silicon_arch_wormhole_b0, device_params):
    import ttnn