Skip to content

Commit

Permalink
[SYCL] Make device ids unique per backend (#4247)
Browse files Browse the repository at this point in the history
* [SYCL] Make device ids unique per backend

We decided to make device id numbers unique per backend.
Also, by adding the device_type into each device prefix listing in sycl-ls,
the user can easily set SYCL_DEVICE_FILTER correctly.
Future work: refactor devices and platforms cache to optimize the device retrieval.

Signed-off-by: Byoungro So <byoungro.so@intel.com>
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Co-authored-by: Romanov Vlad <vlad.romanov@intel.com>
  • Loading branch information
3 people authored Sep 24, 2021
1 parent 987427a commit 7aa5be0
Show file tree
Hide file tree
Showing 7 changed files with 200 additions and 101 deletions.
28 changes: 14 additions & 14 deletions sycl/doc/EnvironmentVariables.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,25 +57,25 @@ subject to change. Do not rely on these variables in production code.

This environment variable limits the SYCL RT to use only a subset of the system's devices. Setting this environment variable affects all of the device query functions (`platform::get_devices()` and `platform::get_platforms()`) and all of the device selectors.

The value of this environment variable is a comma separated list of filters, where each filter is a triple of the form "`backend:device_type:device_num`" (without the quotes). Each element of the triple is optional, but each filter must have at least one value. Possible values of "backend" are:
- host
The value of this environment variable is a comma separated list of filters, where each filter is a triple of the form "`backend`:`device_type`:`device_num`" (without the quotes). Each element of the triple is optional, but each filter must have at least one value. Possible values of `backend` are:
- `host`
- `level_zero`
- opencl
- cuda
- \*
- `opencl`
- `cuda`
- `*`

Possible values of "`device_type`" are:
- host
- cpu
- gpu
- acc
- \*
Possible values of `device_type` are:
- `host`
- `cpu`
- `gpu`
- `acc`
- `*`

`Device_num` is an integer that indexes the enumeration of devices from the sycl-ls utility tool, where the first device in that enumeration has index zero in each backend. For example, `SYCL_DEVICE_FILTER`=2 will return all devices with index '2' from all different backends. If multiple devices satisfy this device number (e.g., GPU and CPU devices can be assigned device number '2'), then default_selector will choose the device with the highest heuristic point.
`device_num` is an integer that indexes the enumeration of devices from the sycl-ls utility tool, where the first device in that enumeration has index zero in each backend. For example, `SYCL_DEVICE_FILTER=2` will return all devices with index '2' from all different backends. If multiple devices satisfy this device number (e.g., GPU and CPU devices can be assigned device number '2'), then default_selector will choose the device with the highest heuristic point. When `SYCL_DEVICE_ALLOWLIST` is set, it is applied before enumerating devices and affects `device_num` values.

Assuming a filter has all three elements of the triple, it selects only those devices that come from the given backend, have the specified device type, AND have the given device index. If more than one filter is specified, the RT is restricted to the union of devices selected by all filters. The RT does not include the "host" backend and the host device automatically unless one of the filters explicitly specifies the "host" device type. Therefore, `SYCL_DEVICE_FILTER`=host should be set to enforce SYCL to use the host device only.
Assuming a filter has all three elements of the triple, it selects only those devices that come from the given backend, have the specified device type, AND have the given device index. If more than one filter is specified, the RT is restricted to the union of devices selected by all filters. The RT does not include the `host` backend and the `host` device automatically unless one of the filters explicitly specifies the `host` device type. Therefore, `SYCL_DEVICE_FILTER=host` should be set to enforce SYCL to use the `host` device only.

Note that all device selectors will throw an exception if the filtered list of devices does not include a device that satisfies the selector. For instance, `SYCL_DEVICE_FILTER`=cpu,level_zero will cause host_selector() to throw an exception. `SYCL_DEVICE_FILTER` also limits loading only specified plugins into the SYCL RT. In particular, `SYCL_DEVICE_FILTER`=level_zero will cause the cpu_selector to throw an exception since SYCL RT will only load the level_zero backend which does not support any CPU devices at this time. When multiple devices satisfy the filter (e..g, `SYCL_DEVICE_FILTER`=gpu), only one of them will be selected.
Note that all device selectors will throw an exception if the filtered list of devices does not include a device that satisfies the selector. For instance, `SYCL_DEVICE_FILTER=cpu,level_zero` will cause `host_selector()` to throw an exception. `SYCL_DEVICE_FILTER` also limits loading only specified plugins into the SYCL RT. In particular, `SYCL_DEVICE_FILTER=level_zero` will cause the `cpu_selector` to throw an exception since SYCL RT will only load the `level_zero` backend which does not support any CPU devices at this time. When multiple devices satisfy the filter (e..g, `SYCL_DEVICE_FILTER=gpu`), only one of them will be selected.

### `SYCL_PRINT_EXECUTION_GRAPH` Options

Expand Down
2 changes: 1 addition & 1 deletion sycl/include/CL/sycl/detail/pi.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ template <class To, class From> To cast(From value);
extern std::shared_ptr<plugin> GlobalPlugin;

// Performs PI one-time initialization.
const std::vector<plugin> &initialize();
std::vector<plugin> &initialize();

// Get the plugin serving given backend.
template <backend BE> __SYCL_EXPORT const plugin &getPlugin();
Expand Down
61 changes: 38 additions & 23 deletions sycl/source/detail/device_filter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,65 +12,80 @@
#include <detail/device_impl.hpp>

#include <cstring>
#include <string_view>

__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
namespace detail {

std::vector<std::string_view> tokenize(const std::string &Filter,
const std::string &Delim) {
std::vector<std::string_view> Tokens;
size_t Pos = 0;
size_t LastPos = 0;

while ((Pos = Filter.find(Delim, LastPos)) != std::string::npos) {
std::string_view Tok(Filter.data() + LastPos, (Pos - LastPos));

if (!Tok.empty()) {
Tokens.push_back(Tok);
}
// move the search starting index
LastPos = Pos + 1;
}

// Add remainder if any
if (LastPos < Filter.size()) {
std::string_view Tok(Filter.data() + LastPos, Filter.size() - LastPos);
Tokens.push_back(Tok);
}
return Tokens;
}

device_filter::device_filter(const std::string &FilterString) {
size_t Cursor = 0;
size_t ColonPos = 0;
auto findElement = [&](auto Element) {
size_t Found = FilterString.find(Element.first, Cursor);
if (Found == std::string::npos)
return false;
Cursor = Found;
return true;
std::vector<std::string_view> Tokens = tokenize(FilterString, ":");
size_t TripleValueID = 0;

auto FindElement = [&](auto Element) {
return std::string::npos != Tokens[TripleValueID].find(Element.first);
};

// Handle the optional 1st field of the filter, backend
// Check if the first entry matches with a known backend type
auto It = std::find_if(std::begin(getSyclBeMap()), std::end(getSyclBeMap()),
findElement);
FindElement);
// If no match is found, set the backend type backend::all
// which actually means 'any backend' will be a match.
if (It == getSyclBeMap().end())
Backend = backend::all;
else {
Backend = It->second;
ColonPos = FilterString.find(":", Cursor);
if (ColonPos != std::string::npos)
Cursor = ColonPos + 1;
else
Cursor = Cursor + It->first.size();
TripleValueID++;
}

// Handle the optional 2nd field of the filter - device type.
// Check if the 2nd entry matches with any known device type.
if (Cursor >= FilterString.size()) {
if (TripleValueID >= Tokens.size()) {
DeviceType = info::device_type::all;
} else {
auto Iter = std::find_if(std::begin(getSyclDeviceTypeMap()),
std::end(getSyclDeviceTypeMap()), findElement);
std::end(getSyclDeviceTypeMap()), FindElement);
// If no match is found, set device_type 'all',
// which actually means 'any device_type' will be a match.
if (Iter == getSyclDeviceTypeMap().end())
DeviceType = info::device_type::all;
else {
DeviceType = Iter->second;
ColonPos = FilterString.find(":", Cursor);
if (ColonPos != std::string::npos)
Cursor = ColonPos + 1;
else
Cursor = Cursor + Iter->first.size();
TripleValueID++;
}
}

// Handle the optional 3rd field of the filter, device number
// Try to convert the remaining string to an integer.
// If succeessful, the converted integer is the desired device num.
if (Cursor < FilterString.size()) {
if (TripleValueID < Tokens.size()) {
try {
DeviceNum = stoi(FilterString.substr(Cursor));
DeviceNum = std::stoi(Tokens[TripleValueID].data());
HasDeviceNum = true;
} catch (...) {
std::string Message =
Expand Down
16 changes: 8 additions & 8 deletions sycl/source/detail/pi.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ getPluginOpaqueData<cl::sycl::backend::esimd_cpu>(void *);

namespace pi {

static void initializePlugins(std::vector<plugin> *Plugins);
static void initializePlugins(std::vector<plugin> &Plugins);

bool XPTIInitDone = false;

Expand Down Expand Up @@ -369,17 +369,17 @@ bool trace(TraceLevel Level) {
}

// Initializes all available Plugins.
const std::vector<plugin> &initialize() {
std::vector<plugin> &initialize() {
static std::once_flag PluginsInitDone;

std::call_once(PluginsInitDone, []() {
initializePlugins(&GlobalHandler::instance().getPlugins());
// std::call_once is blocking all other threads if a thread is already
// creating a vector of plugins. So, no additional lock is needed.
std::call_once(PluginsInitDone, [&]() {
initializePlugins(GlobalHandler::instance().getPlugins());
});

return GlobalHandler::instance().getPlugins();
}

static void initializePlugins(std::vector<plugin> *Plugins) {
static void initializePlugins(std::vector<plugin> &Plugins) {
std::vector<std::pair<std::string, backend>> PluginNames = findPlugins();

if (PluginNames.empty() && trace(PI_TRACE_ALL))
Expand Down Expand Up @@ -438,7 +438,7 @@ static void initializePlugins(std::vector<plugin> *Plugins) {
GlobalPlugin = std::make_shared<plugin>(PluginInformation,
backend::level_zero, Library);
}
Plugins->emplace_back(
Plugins.emplace_back(
plugin(PluginInformation, PluginNames[I].second, Library));
if (trace(TraceLevel::PI_TRACE_BASIC))
std::cerr << "SYCL_PI_TRACE[basic]: "
Expand Down
43 changes: 31 additions & 12 deletions sycl/source/detail/platform_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -95,27 +95,30 @@ static bool IsBannedPlatform(platform Platform) {

std::vector<platform> platform_impl::get_platforms() {
std::vector<platform> Platforms;
const std::vector<plugin> &Plugins = RT::initialize();

std::vector<plugin> &Plugins = RT::initialize();
info::device_type ForcedType = detail::get_forced_type();
for (unsigned int i = 0; i < Plugins.size(); i++) {

for (plugin &Plugin : Plugins) {
pi_uint32 NumPlatforms = 0;
// Move to the next plugin if the plugin fails to initialize.
// This way platforms from other plugins get a chance to be discovered.
if (Plugins[i].call_nocheck<PiApiKind::piPlatformsGet>(
if (Plugin.call_nocheck<PiApiKind::piPlatformsGet>(
0, nullptr, &NumPlatforms) != PI_SUCCESS)
continue;

if (NumPlatforms) {
std::vector<RT::PiPlatform> PiPlatforms(NumPlatforms);
if (Plugins[i].call_nocheck<PiApiKind::piPlatformsGet>(
if (Plugin.call_nocheck<PiApiKind::piPlatformsGet>(
NumPlatforms, PiPlatforms.data(), nullptr) != PI_SUCCESS)
return Platforms;

for (const auto &PiPlatform : PiPlatforms) {
platform Platform = detail::createSyclObjFromImpl<platform>(
getOrMakePlatformImpl(PiPlatform, Plugins[i]));
getOrMakePlatformImpl(PiPlatform, Plugin));
{
std::lock_guard<std::mutex> Guard(*Plugin.getPluginMutex());
// insert PiPlatform into the Plugin
Plugin.getPlatformId(PiPlatform);
}
// Skip platforms which do not contain requested device types
if (!Platform.get_devices(ForcedType).empty() &&
!IsBannedPlatform(Platform))
Expand All @@ -141,14 +144,26 @@ std::vector<platform> platform_impl::get_platforms() {
// This function matches devices in the order of backend, device_type, and
// device_num.
static void filterDeviceFilter(std::vector<RT::PiDevice> &PiDevices,
const plugin &Plugin) {
RT::PiPlatform Platform) {
device_filter_list *FilterList = SYCLConfig<SYCL_DEVICE_FILTER>::get();
if (!FilterList)
return;

std::vector<plugin> &Plugins = RT::initialize();
auto It =
std::find_if(Plugins.begin(), Plugins.end(), [Platform](plugin &Plugin) {
return Plugin.containsPiPlatform(Platform);
});
if (It == Plugins.end())
return;

plugin &Plugin = *It;
backend Backend = Plugin.getBackend();
int InsertIDx = 0;
int DeviceNum = 0;
// DeviceIds should be given consecutive numbers across platforms in the same
// backend
std::lock_guard<std::mutex> Guard(*Plugin.getPluginMutex());
int DeviceNum = Plugin.getStartingDeviceId(Platform);
for (RT::PiDevice Device : PiDevices) {
RT::PiDeviceType PiDevType;
Plugin.call<PiApiKind::piDeviceGetInfo>(Device, PI_DEVICE_INFO_TYPE,
Expand Down Expand Up @@ -181,6 +196,10 @@ static void filterDeviceFilter(std::vector<RT::PiDevice> &PiDevices,
DeviceNum++;
}
PiDevices.resize(InsertIDx);
// remember the last backend that has gone through this filter function
// to assign a unique device id number across platforms that belong to
// the same backend. For example, opencl:cpu:0, opencl:acc:1, opencl:gpu:2
Plugin.setLastDeviceId(Platform, DeviceNum);
}

std::shared_ptr<device_impl> platform_impl::getOrMakeDeviceImpl(
Expand Down Expand Up @@ -237,12 +256,12 @@ platform_impl::get_devices(info::device_type DeviceType) const {

// Filter out devices that are not present in the SYCL_DEVICE_ALLOWLIST
if (SYCLConfig<SYCL_DEVICE_ALLOWLIST>::get())
applyAllowList(PiDevices, MPlatform, this->getPlugin());
applyAllowList(PiDevices, MPlatform, Plugin);

// Filter out devices that are not compatible with SYCL_DEVICE_FILTER
filterDeviceFilter(PiDevices, Plugin);
filterDeviceFilter(PiDevices, MPlatform);

PlatformImplPtr PlatformImpl = getOrMakePlatformImpl(MPlatform, *MPlugin);
PlatformImplPtr PlatformImpl = getOrMakePlatformImpl(MPlatform, Plugin);
std::transform(
PiDevices.begin(), PiDevices.end(), std::back_inserter(Res),
[PlatformImpl](const RT::PiDevice &PiDevice) -> device {
Expand Down
52 changes: 50 additions & 2 deletions sycl/source/detail/plugin.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -89,10 +89,10 @@ auto packCallArguments(ArgsT &&... Args) {
class plugin {
public:
plugin() = delete;

plugin(RT::PiPlugin Plugin, backend UseBackend, void *LibraryHandle)
: MPlugin(Plugin), MBackend(UseBackend), MLibraryHandle(LibraryHandle),
TracingMutex(std::make_shared<std::mutex>()) {}
TracingMutex(std::make_shared<std::mutex>()),
MPluginMutex(std::make_shared<std::mutex>()) {}

plugin &operator=(const plugin &) = default;
plugin(const plugin &) = default;
Expand Down Expand Up @@ -184,11 +184,59 @@ class plugin {
void *getLibraryHandle() { return MLibraryHandle; }
int unload() { return RT::unloadPlugin(MLibraryHandle); }

// return the index of PiPlatforms.
// If not found, add it and return its index.
// The function is expected to be called in a thread safe manner.
int getPlatformId(RT::PiPlatform Platform) {
auto It = std::find(PiPlatforms.begin(), PiPlatforms.end(), Platform);
if (It != PiPlatforms.end())
return It - PiPlatforms.begin();

PiPlatforms.push_back(Platform);
LastDeviceIds.push_back(0);
return PiPlatforms.size() - 1;
}

// Device ids are consecutive across platforms within a plugin.
// We need to return the same starting index for the given platform.
// So, instead of returing the last device id of the given platform,
// return the last device id of the predecessor platform.
// The function is expected to be called in a thread safe manner.
int getStartingDeviceId(RT::PiPlatform Platform) {
int PlatformId = getPlatformId(Platform);
if (PlatformId == 0)
return 0;
return LastDeviceIds[PlatformId - 1];
}

// set the id of the last device for the given platform
// The function is expected to be called in a thread safe manner.
void setLastDeviceId(RT::PiPlatform Platform, int Id) {
int PlatformId = getPlatformId(Platform);
LastDeviceIds[PlatformId] = Id;
}

bool containsPiPlatform(RT::PiPlatform Platform) {
auto It = std::find(PiPlatforms.begin(), PiPlatforms.end(), Platform);
return It != PiPlatforms.end();
}

std::shared_ptr<std::mutex> getPluginMutex() { return MPluginMutex; }

private:
RT::PiPlugin MPlugin;
backend MBackend;
void *MLibraryHandle; // the handle returned from dlopen
std::shared_ptr<std::mutex> TracingMutex;
// Mutex to guard PiPlatforms and LastDeviceIds.
// Note that this is a temporary solution until we implement the global
// Device/Platform cache later.
std::shared_ptr<std::mutex> MPluginMutex;
// vector of PiPlatforms that belong to this plugin
std::vector<RT::PiPlatform> PiPlatforms;
// represents the unique ids of the last device of each platform
// index of this vector corresponds to the index in PiPlatforms vector.
std::vector<int> LastDeviceIds;
}; // class plugin
} // namespace detail
} // namespace sycl
Expand Down
Loading

0 comments on commit 7aa5be0

Please sign in to comment.