-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy initialization of queue inside dpcpp_default policy #1154
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also add a test that fails before this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to add a test that just includes oneDPL but doesn't call anything, and then invoke it with a device filter that disables all devices (e.g., ONEAPI_DEVICE_SELECTOR='!*:*'
). In this case, the old code should throw an exception whereas this PR should not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have manually confirmed that this test works in principle. I created the following test:
#include <oneapi/dpl/execution>
int main()
{
return 0;
}
Running it as ONEAPI_DEVICE_SELECTOR='!*:*' ./build/test/header_inclusion_only.pass
crashes on main but does not crash with this branch.
I feel that there is benefit in adding this test, but it will require some special changes to the CMake to modify the environment specifically for this test. It is further complicated by the fact that the CMake files currently already do something similar with the variable $ONEAPI_DEVICE_SELECTOR
, which is either set to ONEAPI_DEVICE_SELECTOR=gpu/acc/fpga
or SYCL_DEVICE_FILTER=gpu/acc/fpga
depending on the compiler version and device type.
The dynamic selection branch also makes changes to this variable. In all, I believe that there is value for adding this test but the risks are high to unintentionally break something with this being so close to the code freeze.
device_policy& | ||
operator=(const device_policy& other) | ||
{ | ||
q = other.queue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really need to create sycl::queue
instance here in other.queue()
call ?
May be good enough simple copy q
state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be sufficient to just copy q
directly as you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see any serious problems in the current version of the patch. However, the implementations of the policy classes could use some more uglification.
|
||
explicit device_policy(sycl::queue q_) : q(q_) {} | ||
explicit device_policy(sycl::device d_) : q(d_) {} | ||
operator sycl::queue() const { return q; } | ||
explicit device_policy(sycl::device d_) { q.emplace(d_); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you reorder the code so that these existing constructors come before the new copy & move ones, perhaps right after the line 54?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
{ | ||
::std::scoped_lock lock{other.mtx}; | ||
q.swap(other.q); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe better to make simple move assignment to not extend life of our instance sycl::queue
if it's already created earlier:
q = ::std::move(other.q);
Move assignment supported in std::optional
: https://en.cppreference.com/w/cpp/utility/optional/operator%3D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also header file for std::swap
not included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't understand why we do use swap
member function for move and don't use it for copy. I feel like reusing copy/move-assignment from optional
is the right way to go. But if you have arguments why you did swap
please clarify.
My other question is can we make member-init-list there instead of default constructing and then assigning? For example, astatic
function that that locks and returns optional with the correct value category?
Also header file for std::swap not included.
That's something I don't understand at all. we don't use std::swap (at least in this line). For member function including <optional>
is enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean if we take a loot into https://en.cppreference.com/w/cpp/algorithm/swap,
there are 3 headers in which std::swap
declared:
Defined in header <algorithm> (until C++11)
Defined in header <utility> (since C++11)
Defined in header <string_view> (since C++17)
And when I see introduced std::swap
call I'm waiting for the inclusion of one from these headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched the swap with the move constructor and rewrote the copy/move constructor as suggested by @rarutyun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- it was my mistake: sure, you have used
swap
member function, not::std::swap
and no additional headers required in this case.
I am sorry.
operator=(device_policy&& other) | ||
{ | ||
::std::scoped_lock lock{mtx, other.mtx}; | ||
q.swap(other.q); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same:
q = ::std::move(other.q);
|
||
operator sycl::queue() const { return queue(); } | ||
sycl::queue | ||
queue() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct that this function sycl::queue queue()
should be declared as virtual
now in this and base class?
- Let's assume we have created the
policy
instance offpga_policy
After this PR this instance doesn't have createdsycl::queue
instance inside as it's lazy initialized. - In some code we save result of
static_cast<device_policy&>(policy)
intoauto& d_policy
:
fpga_policy policy;
auto& d_policy = static_cast<device_policy&>(policy);
- and after that we call
queue()
function from reference to the base class:
fpga_policy policy;
auto& d_policy = static_cast<device_policy&>(policy);
auto q = d_policy.queue(); // call device_policy::queue() !?
Looks like we will call queue()
method of the base class and creation of sycl::queue
instance will be executed incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we need some another approach to resolve this issue: may be we are able to pack sycl::queue
creation into someone predicates: one predicate for the base class and one for derived class and so on.
May be able to apply something like #1158
but this PR doesn't resolve the problem of broken lazy initialization in constructor of
template <unsigned int other_factor, typename OtherName>
fpga_policy(const fpga_policy<other_factor, OtherName>& other) : base(other.queue()){};
See my comment https://github.com/oneapi-src/oneDPL/pull/1154/files#r1318324154
Need to come up with something more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like that:
- If we do that, this is not the design intent we want
- If we don't do that - this is the API breaking change
{ | ||
} | ||
|
||
fpga_policy() = default; | ||
template <unsigned int other_factor, typename OtherName> | ||
fpga_policy(const fpga_policy<other_factor, OtherName>& other) : base(other.queue()){}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lazy initialization concept would be broken at this place: queue()
call will create sycl::queue
instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be we should pass static_cast<const base&>(other)
into constructor of base class to avoid this broken case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also looks like move constructor of this format and copy/move operaor=
required in this class too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lazy initialization concept would be broken at this place:
queue()
call will createsycl::queue
instance
That's exactly the question what behavior we want for non-default constructors and assignments. I think I want them being initialized (I can change my mind, though), but I also can see the issues with that since we have inheritance
{ | ||
::std::scoped_lock lock{other.mtx}; | ||
q.swap(other.q); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't understand why we do use swap
member function for move and don't use it for copy. I feel like reusing copy/move-assignment from optional
is the right way to go. But if you have arguments why you did swap
please clarify.
My other question is can we make member-init-list there instead of default constructing and then assigning? For example, astatic
function that that locks and returns optional with the correct value category?
Also header file for std::swap not included.
That's something I don't understand at all. we don't use std::swap (at least in this line). For member function including <optional>
is enough
device_policy& | ||
operator=(const device_policy& other) | ||
{ | ||
::std::scoped_lock lock{mtx, other.mtx}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's not done. What if it's self assignment? How does scoped_lock
behave for the same mutex passed twice? My bet it's a deadlock (better to say, lifelock)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, @rarutyun, you are right. Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified the code to now check for self assignment.
device_policy(const device_policy<OtherName>& other) | ||
{ | ||
::std::scoped_lock lock{other.mtx}; | ||
q = other.q; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering what is the correct design? should we call other.queue()
or other.q
? This is relevant for all places. I understand the difference in the behavior and I am trying to understand what is the right one.
Just to state it explicitly. Those are not all comments that I have. I am thinking about Sergey's comment about virtuality and also about additional things for copy/move semantics |
After merging #1652, this patch is outdated. |
There are a few potential issues with oneDPL's global
sycl::queue
, including:To address the issue, this PR lazily initializes the
sycl::queue
inside thedpcpp_default
execution policy on first use.