Changing how we setup simulator. (#16375)
Previously, `tt_SimulationDevice` had to be set up with two arguments:
1. A constructor argument with the SoC descriptor path.
2. An environment variable that points to the simulator run script.

`tt_SimulationDevice` now requires only one argument: the path to the
directory that contains all files the simulator needs to run. That
directory contains `soc_descriptor.yaml`, the `run.sh` script, and the
simulator application.

This PR also removes the need for two environment variables
(`TT_METAL_SIMULATOR_EN` and `TT_REMOTE_EXE`) and replaces them with a
single `TT_METAL_SIMULATOR` variable that contains the path to the
simulator build directory.

In the future, when tt-metal is built only once for all architectures,
one would set the `TT_METAL_SIMULATOR` environment variable to point to
a simulator directory and run against the architecture that is compiled
in that directory.

Related PRs:
- [UMD simulators](https://yyz-gitlab.local.tenstorrent.com/tenstorrent/tt-metal/-/merge_requests/15)
- [UMD](tenstorrent/tt-umd#449)
tt-vjovanovic authored Jan 21, 2025
1 parent 6b4659f commit d0b0f9b
Showing 9 changed files with 44 additions and 22 deletions.
6 changes: 4 additions & 2 deletions tests/tt_metal/test_utils/env_vars.hpp
@@ -7,6 +7,7 @@

#include "umd/device/device_api_metal.h"
#include "umd/device/tt_cluster_descriptor.h"
+#include "umd/device/tt_simulation_device.h"

#include <string>

@@ -40,8 +41,9 @@ inline std::string get_env_arch_name() {

inline std::string get_umd_arch_name() {

-if(std::getenv("TT_METAL_SIMULATOR_EN")) {
-return get_env_arch_name();
+if(std::getenv("TT_METAL_SIMULATOR")) {
+tt_SimulationDeviceInit init(std::getenv("TT_METAL_SIMULATOR"));
+return tt::arch_to_str(init.get_arch_name());
}

auto cluster_desc = tt_ClusterDescriptor::create();

4 changes: 2 additions & 2 deletions tt_metal/api/tt-metalium/core_descriptor.hpp
@@ -36,13 +36,13 @@ inline std::string get_core_descriptor_file(
}
core_desc_dir += "tt_metal/core_descriptors/";

-bool targeting_sim = std::getenv("TT_METAL_SIMULATOR_EN") != nullptr;
+bool targeting_sim = std::getenv("TT_METAL_SIMULATOR") != nullptr;
if (targeting_sim) {
switch (arch) {
case tt::ARCH::Invalid:
throw std::runtime_error(
"Invalid arch not supported"); // will be overwritten in tt_global_state constructor
-case tt::ARCH::GRAYSKULL: throw std::runtime_error("GRAYSKULL arch not supported for simulator");
+case tt::ARCH::GRAYSKULL: return core_desc_dir + "grayskull_versim_1x1_arch.yaml";
case tt::ARCH::WORMHOLE_B0: return core_desc_dir + "wormhole_b0_versim_1x1_arch.yaml";
case tt::ARCH::BLACKHOLE: return core_desc_dir + "blackhole_simulation_1x2_arch.yaml";
default: throw std::runtime_error("Unsupported device arch");

17 changes: 6 additions & 11 deletions tt_metal/api/tt-metalium/get_platform_architecture.hpp
@@ -10,6 +10,7 @@
#include "assert.hpp"
#include "umd/device/pci_device.hpp"
#include "umd/device/tt_soc_descriptor.h"
+#include "umd/device/tt_simulation_device.h"

namespace tt::tt_metal {

@@ -18,21 +19,17 @@ namespace tt::tt_metal {
*
* This function determines the platform architecture by inspecting the environment
* variables or available physical devices. If the environment variable
-* `TT_METAL_SIMULATOR_EN` is set, the architecture is retrieved from the
-* `ARCH_NAME` environment variable. Otherwise, the architecture is deduced
-* by detecting available physical devices.
+* `TT_METAL_SIMULATOR` is set, the architecture is retrieved from simulator.
+* Otherwise, the architecture is deduced by detecting available physical devices.
*
* @return tt::ARCH The detected platform architecture. Returns tt::ARCH::Invalid
* if no valid architecture could be detected.
*
* @note
-* - If the system is in simulation mode (`TT_METAL_SIMULATOR_EN` is set),
-* the `ARCH_NAME` environment variable must be defined.
* - A fatal error occurs if multiple devices are detected with conflicting
* architectures.
*
* @exception std::runtime_error Throws a fatal error if:
-* - `ARCH_NAME` is not set when `TT_METAL_SIMULATOR_EN` is enabled.
* - Multiple devices with inconsistent architectures are detected.
*
* Example usage:
@@ -52,12 +49,10 @@ namespace tt::tt_metal {
*/
inline tt::ARCH get_platform_architecture() {
auto arch = tt::ARCH::Invalid;
-if (std::getenv("TT_METAL_SIMULATOR_EN")) {
-auto arch_env = std::getenv("ARCH_NAME");
-TT_FATAL(arch_env, "ARCH_NAME env var needed for VCS");
-arch = tt::get_arch_from_string(arch_env);
+if (std::getenv("TT_METAL_SIMULATOR")) {
+tt_SimulationDeviceInit init(std::getenv("TT_METAL_SIMULATOR"));
+arch = init.get_arch_name();
} else {
-
// Issue tt_umd#361: tt_ClusterDescriptor::create() won't work here.
// This map holds PCI info for each mmio chip.
auto devices_info = PCIDevice::enumerate_devices_info();
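
The control flow of `get_platform_architecture()` after this change can be sketched in isolation as below. This is an illustration under stated assumptions, not the tt-metal implementation: `Arch` is a stand-in for `tt::ARCH`, `query_simulator` stands in for constructing `tt_SimulationDeviceInit` and calling `get_arch_name()`, and `query_hardware` stands in for the PCI enumeration path. All three names are hypothetical.

```cpp
#include <functional>
#include <string>

// Stand-in for tt::ARCH (assumption: the real enum lives in UMD).
enum class Arch { Invalid, Grayskull, WormholeB0, Blackhole };

// Hypothetical sketch: when a simulator directory is configured, the
// architecture comes from the simulator itself; otherwise it is deduced
// from physical devices. ARCH_NAME is no longer consulted.
inline Arch detect_architecture(
    const char* simulator_dir,  // value of std::getenv("TT_METAL_SIMULATOR")
    const std::function<Arch(const std::string&)>& query_simulator,
    const std::function<Arch()>& query_hardware) {
    if (simulator_dir != nullptr) {
        return query_simulator(std::string(simulator_dir));
    }
    return query_hardware();
}
```

Passing the two lookups as callables keeps the sketch testable without a simulator build or a PCI device present.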

2 changes: 1 addition & 1 deletion tt_metal/common/core_descriptor.cpp
@@ -126,7 +126,7 @@ const core_descriptor_t& get_core_descriptor_config(
}
dispatch_cores.push_back(coord);
}
-TT_ASSERT(dispatch_cores.size() || std::getenv("TT_METAL_SIMULATOR_EN"), "Dispatch cores size must be positive");
+TT_ASSERT(dispatch_cores.size() || std::getenv("TT_METAL_SIMULATOR"), "Dispatch cores size must be positive");

std::vector<CoreCoord> logical_compute_cores;
logical_compute_cores.reserve(compute_cores.size());

16 changes: 16 additions & 0 deletions tt_metal/core_descriptors/grayskull_versim_1x1_arch.yaml
@@ -17,3 +17,19 @@ E150:

dispatch_cores:
[]

+dispatch_core_type:
+"tensix"
+2:
+compute_with_storage_grid_range:
+start: [0, 0]
+end: [0, 0]
+
+storage_cores:
+[]
+
+dispatch_cores:
+[]
+
+dispatch_core_type:
+"tensix"

2 changes: 2 additions & 0 deletions tt_metal/llrt/hal.cpp
@@ -22,6 +22,8 @@ Hal::Hal() : arch_(get_platform_architecture()) {

case tt::ARCH::BLACKHOLE: initialize_bh(); break;

+case tt::ARCH::QUASAR: TT_THROW("HAL doesn't support Quasar"); break;
+
case tt::ARCH::Invalid: /*TT_THROW("Unsupported arch for HAL")*/; break;
}
}

9 changes: 8 additions & 1 deletion tt_metal/llrt/llrt.cpp
@@ -219,7 +219,9 @@ void wait_until_cores_done(
// poll the cores until the set of not done cores is empty
int loop_count = 1;
auto start = std::chrono::high_resolution_clock::now();
-if (std::getenv("TT_METAL_SIMULATOR_EN")) timeout_ms = 0;
+bool is_simulator = std::getenv("TT_METAL_SIMULATOR") != nullptr;
+
+if (is_simulator) timeout_ms = 0;
while (!not_done_phys_cores.empty()) {
if (timeout_ms > 0) {
auto now = std::chrono::high_resolution_clock::now();
@@ -253,6 +255,11 @@ void wait_until_cores_done(
}
}
loop_count++;
+
+// Continuously polling cores on simulator can cause it to run much slower than real hardware.
+if (is_simulator)
+std::this_thread::sleep_for(std::chrono::milliseconds(100));
+
// Continuously polling cores here can cause other host-driven noc transactions (dprint, watcher) to drastically
// slow down for remote devices. So when debugging with these features, add a small delay to allow other
// host-driven transactions through.
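
The throttled polling added in this hunk can be illustrated with a standalone sketch. The names `poll_interval_for` and `poll_until_done` are hypothetical and do not exist in tt-metal; only the 100 ms simulator delay is taken from the diff above. The sketch assumes a caller-supplied `is_done` predicate in place of the real per-core status reads.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: 100 ms between polls on the simulator (value taken
// from the diff), no delay on silicon.
inline std::chrono::milliseconds poll_interval_for(bool is_simulator) {
    return std::chrono::milliseconds(is_simulator ? 100 : 0);
}

// Hypothetical helper: calls `is_done` until it returns true, sleeping
// `interval` between unsuccessful polls. Returns the number of polls made.
inline int poll_until_done(const std::function<bool()>& is_done, std::chrono::milliseconds interval) {
    int loop_count = 0;
    while (true) {
        ++loop_count;
        if (is_done()) {
            return loop_count;
        }
        if (interval.count() > 0) {
            std::this_thread::sleep_for(interval);
        }
    }
}
```

Sleeping between polls trades completion latency (up to one interval) for far fewer host reads, which is the concern the added comment raises for simulator runs.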

8 changes: 4 additions & 4 deletions tt_metal/llrt/tt_cluster.cpp
@@ -120,7 +120,7 @@ Cluster::Cluster() {

void Cluster::detect_arch_and_target() {

-this->target_type_ = (std::getenv("TT_METAL_SIMULATOR_EN")) ? TargetDevice::Simulator : TargetDevice::Silicon;
+this->target_type_ = (std::getenv("TT_METAL_SIMULATOR")) ? TargetDevice::Simulator : TargetDevice::Silicon;

this->arch_ = tt_metal::get_platform_architecture();

@@ -237,10 +237,9 @@ const std::unordered_map<CoreCoord, int32_t>& Cluster::get_virtual_routing_to_pr
}

void Cluster::open_driver(const bool &skip_driver_allocs) {
-const std::string sdesc_path = get_soc_description_file(this->arch_, this->target_type_);
-
std::unique_ptr<tt_device> device_driver;
if (this->target_type_ == TargetDevice::Silicon) {
+const std::string sdesc_path = get_soc_description_file(this->arch_, this->target_type_);
std::unordered_set<chip_id_t> all_chips = this->cluster_desc_->get_all_chips();
std::set<chip_id_t> all_chips_set(all_chips.begin(), all_chips.end());
// This is the target/desired number of mem channels per arch/device.
@@ -268,7 +267,8 @@ void Cluster::open_driver(const bool &skip_driver_allocs) {
// that is later expected to be populated by unrelated APIs
// TT_FATAL(device_driver->get_target_mmio_device_ids().size() == 1, "Only one target mmio device id allowed.");
} else if (this->target_type_ == TargetDevice::Simulator) {
-device_driver = std::make_unique<tt_SimulationDevice>(sdesc_path);
+auto simulator_directory = std::getenv("TT_METAL_SIMULATOR");
+device_driver = std::make_unique<tt_SimulationDevice>(simulator_directory);
}

barrier_address_params barrier_params;
2 changes: 1 addition & 1 deletion tt_metal/third_party/umd
Submodule umd updated 37 files
+1 −0 device/CMakeLists.txt
+83 −0 device/api/umd/device/blackhole_arc_message_queue.h
+4 −2 device/api/umd/device/blackhole_coordinate_manager.h
+14 −0 device/api/umd/device/blackhole_implementation.h
+13 −0 device/api/umd/device/chip/chip.h
+21 −10 device/api/umd/device/cluster.h
+25 −6 device/api/umd/device/coordinate_manager.h
+4 −0 device/api/umd/device/grayskull_coordinate_manager.h
+9 −5 device/api/umd/device/semver.hpp
+20 −2 device/api/umd/device/tt_simulation_device.h
+5 −1 device/api/umd/device/tt_soc_descriptor.h
+5 −0 device/api/umd/device/types/arch.h
+59 −0 device/api/umd/device/types/blackhole_arc.h
+1 −0 device/api/umd/device/types/cluster_types.h
+4 −0 device/api/umd/device/wormhole_coordinate_manager.h
+136 −0 device/blackhole/blackhole_arc_message_queue.cpp
+22 −3 device/blackhole/blackhole_coordinate_manager.cpp
+31 −1 device/chip/chip.cpp
+95 −72 device/cluster.cpp
+42 −10 device/coordinate_manager.cpp
+20 −12 device/grayskull/grayskull_coordinate_manager.cpp
+1 −1 device/mockup/tt_mockup_device.hpp
+12 −6 device/simulation/tt_simulation_device.cpp
+3 −0 device/tt_cluster_descriptor.cpp
+10 −2 device/tt_soc_descriptor.cpp
+17 −0 device/wormhole/wormhole_coordinate_manager.cpp
+21 −20 tests/api/test_core_coord_translation_bh.cpp
+14 −14 tests/api/test_core_coord_translation_gs.cpp
+16 −15 tests/api/test_core_coord_translation_wh.cpp
+44 −19 tests/api/test_soc_descriptor.cpp
+2 −1 tests/api/test_tlb_manager.cpp
+4 −1 tests/blackhole/CMakeLists.txt
+36 −0 tests/blackhole/test_arc_messages_bh.cpp
+5 −5 tests/galaxy/test_umd_remote_api.cpp
+77 −103 tests/grayskull/test_cluster_gs.cpp
+8 −1 tests/simulation/device_fixture.hpp
+12 −0 tests/test_utils/device_test_utils.hpp
