Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coroutines/c++] Return object is being prematurely converted. #56532

Closed
no-more-secrets opened this issue Jul 14, 2022 · 24 comments
Closed

[coroutines/c++] Return object is being prematurely converted. #56532

no-more-secrets opened this issue Jul 14, 2022 · 24 comments
Assignees
Labels
coroutines C++20 coroutines

Comments

@no-more-secrets
Copy link

using llvm/clang main built on 2022-07-14.

There appears to be a regression in how coroutine return objects are implicitly converted prior to returning them.

First as background, the get_return_object customization point is allowed to return an object that can be implicitly converted to the real return type. This part is still working fine.

The regression concerns when this implicit conversion happens.

Clang traditionally waited to perform this implicit conversion until after the return_value customization point was called. This allowed doing some tricks that are useful in implementing e.g. a std::optional coroutine. So if you were to print out the order of calls, it would looks like this traditionally:

  1. call get_return_object
  2. call return_value
  3. do implicit conversion of return object to the type that is actually needed.

This may have changed at some point in the last few months, but the version of clang that I built today does it in a different order:

  1. call get_return object
  2. do implicit conversion of return object to the type that is actually needed.
  3. call return_value

I'm not sure if the standard guarantees that it will happen in a certain order... but I know that traditionally it used to be done according to the first ordering (which gcc also does), but now that changed, and it is breaking my implementation of the std::optional coroutine.

It seems sensible to wait as long as possible to perform the implicit conversion, since it affords flexibility to the user in that it permits the object that get_return_object returns to remain alive and accessible to return_value (or other customization point methods) until the very end when it is implicitly converted just before ending the coroutine.

@no-more-secrets
Copy link
Author

If someone needs a minimal reproducer I can provide one, but I'm hoping that it won't be necessary.

@yuanfang-chen
Copy link
Collaborator

@ChuanqiXu9 maybe this relates to https://reviews.llvm.org/D117087?

@ChuanqiXu9
Copy link
Member

@ChuanqiXu9 maybe this relates to https://reviews.llvm.org/D117087?

I guess so. But I am not sure if it is incorrect.

@dpacbach I don't understand how this is related to return_value. From my point of view, get_return_object is not related with return_value. A reduced example may be helpful to understand. Thanks

@ChuanqiXu9
Copy link
Member

@tstellar Hi, may you love to create a new label like coroutines? Recently the number of coroutine issues goes high and it is relatively independent with C++20. So it would be helpful to create a new issue label.

@EugeneZelenko EugeneZelenko added coroutines C++20 coroutines and removed new issue labels Jul 15, 2022
@EugeneZelenko
Copy link
Contributor

@ChuanqiXu9: Done, but you could add labels yourself.

@no-more-secrets
Copy link
Author

no-more-secrets commented Jul 15, 2022

@ChuanqiXu9 Here is a demonstration of the change: https://godbolt.org/z/jfrqsKsnb

There you can see clang trunk vs. clang 14 vs gcc 12.1. Clang 14 and gcc 12.1 agree in that they perform the implicit conversion after the final_suspend (which is even later than I thought, but that's fine). However, clang trunk does the implicit conversion just after calling get_return_object, which I think could be considered a regression. Note that MSVC also agrees with clang 14 and gcc (https://godbolt.org/z/Mcbe8Pd7x), so clang trunk is the only one that does this new behavior.

Note that on the cppreference coroutines page it says this:

When a coroutine reaches a suspension point: the return object obtained earlier is returned to the caller/resumer, after implicit conversion to the return type of the coroutine, if necessary.

To me that implies that it should do the implicit conversion only when necessary (as late as possible), though I am not sure what the C++ standard says officially. At the very least it seems a bit suspicious that clang trunk disagrees with the other two major compilers.

@no-more-secrets
Copy link
Author

no-more-secrets commented Jul 15, 2022

@ChuanqiXu9 Here is another library that will be broken by this change, as it relies on the deferred implicit conversion of the get_return_object, as stated in its readme:

https://github.com/toby-allsopp/coroutine_monad

Requirements
The implementation requires ... Clang 5 or later...

In particular, the MSVC implementation converts the object returned by get_return_object to the return type of the coroutine immediately rather than waiting until the coroutine returns to its caller.

@no-more-secrets
Copy link
Author

no-more-secrets commented Jul 15, 2022

Here is another issue from Lewis Baker (who I'd think is a reliable source on this) who seems to imply that the conversion should be deferred:

lewissbaker/cppcoro#43

@ChuanqiXu9
Copy link
Member

I got it. It surprises me but it looks like the intentional behavior indeed. I just failed to find the corresponding wording. I would try to fix it after I understand how to handle it properly.

@ChuanqiXu9 ChuanqiXu9 self-assigned this Jul 18, 2022
@ChuanqiXu9
Copy link
Member

@dpacbach from https://cplusplus.github.io/CWG/issues/2563.html, it looks like the current behavior of clang is fine.

@no-more-secrets
Copy link
Author

@ChuanqiXu9 Although I am not an expert on the standard, I think you may be correct that the current C++ standard does not mandate the deferred implicit conversion (someone more expert than me would need to comment on that). My guess is that people didn't really appreciate the use cases that deferred conversion enables when writing the original C++20 standard, but which have developed since.

Since deferred conversion seems to be essential for a few common programming patterns that have emerged, I suspect someday that people will try to change the wording of the standard to require deferred conversion in the future, especially if the major compilers start to disagree about it. So if clang breaks that now, then it may be harder to fix in the future.

Thanks for taking a look.

@ChuanqiXu9
Copy link
Member

ChuanqiXu9 commented Jul 18, 2022

My feeling is that is an undefined behavior now. So both behavior are fine. It might change according to the issue. The suggested change is not the final decision. Let's wait for the decision made.

@no-more-secrets
Copy link
Author

Question: is it possible to change your implementation of coroutine RVO so that it can defer conversion and still work correctly? I'd think that, even if the return type needs to be implicitly converted, you can still do RVO with the result even if you defer that conversion. If so, then that seems like it would be ideal.

Even if the standard doesn't mandate it now, it does seem a bit worrying that clang now disagrees with the other major compilers (and past clang versions). I think that ignoring this may turn out to be a mistake.

@lewissbaker Given lewissbaker/cppcoro#43 that you raised, would you have any comments or advice here?

@ChuanqiXu9 ChuanqiXu9 reopened this Jul 19, 2022
@ChuanqiXu9
Copy link
Member

Question: is it possible to change your implementation of coroutine RVO so that it can defer conversion and still work correctly? I'd think that, even if the return type needs to be implicitly converted, you can still do RVO with the result even if you defer that conversion. If so, then that seems like it would be ideal.

Even if the standard doesn't mandate it now, it does seem a bit worrying that clang now disagrees with the other major compilers (and past clang versions). I think that ignoring this may turn out to be a mistake.

@lewissbaker Given lewissbaker/cppcoro#43 that you raised, would you have any comments or advice here?

Oh, your words make sense. I close it too hurry. I would check if it is possible to do RVO with the deferred conversion. But it might be a relative long time since there are too many TODOs and the priority of the issue looks not high to me.

@yfeldblum
Copy link

A good option would be for compilers to agree on the behavior and for compilers and the community to standardize the agreed-on behavior.

Up until recently, the major compilers - GCC, MSVC, and Clang - all agreed on the behavior. And the agreed-on behavior prior was the behavior with slightly more flexibility. It would have been convenient to merge a technical paper that resolves a technical ambiguity in the standard and which merely standardizes what all major compilers already do anyway.

Clang regressing the order here creates the potential for the ambiguity to stay in the standard for the long term. Clang fixing the order here makes it easier to resolve the ambiguity in the standard, which seems to me to be the best option overall.

Leaving this ambiguous in the standard and having the compilers choose different orders seems to me to be a bad option because it requires implementations of some coroutines to be sensitive to, and somehow magically to know, what the order is for any given platform - and to have two implementations to handle the two orders. This looks like it would negatively affects coroutine implementations for std::optional and std::expected, as well as the coroutine implementations for folly::Optional and folly::Expected where I have direct experience. Early conversion requires some coroutine implementations to be more complex or even impossible, while deferred conversion allows those coroutine implementations to be less complex - but having both variations be possible makes those coroutine implementations the most complex. (I think the broader class affected is synchronous monadic coroutines.)

In particular, the coroutine implementation for folly::Expected is currently broken under Clang >= 15 - and fixing it requires a special constructor within folly::Expected which assigns another pointer to this in a way which still permits syntactic copy elision, and a branch in the coroutine implementation which chooses a path based on the version of Clang. Currently, cppcoro cannot implement a coroutine for std::optional under Clang 15 because such a coroutine would need a similar special constructor - and no implementation of std::optional supplies it. If Clang were to fix the regression, then cppcoro could implement a coroutine for std::optional.

Beyond the technical points, Clang having divergent behavior is a new feeling. I am used to MSVC having divergent behavior and Clang having non-divergent behavior - that is, I am used to having custom patches to handle ways in which MSVC has missing or surprising behavior that is unlike the behavior of other compilers, but I am not used to having to do that for Clang.

Fixing this regression in Clang and fixing the ambiguity in the standard strikes me as a good option because it allows implementations of all coroutines to be able to work with only one order - the order which leads to the least complexity in relevant coroutine implementations.

@lewissbaker
Copy link

While we're talking about implementation divergence of get_return_object(), there is also the issue that Clang/MSVC and GCC don't agree in semantics when get_return_object() returns a reference.

With Clang/MSVC, it constructs the return value directly from the reference returned from get_return_object().
With GCC, it first decay-copies the returned reference and then constructs the coroutine's return value from the copy instead of from the original reference.

For example: https://godbolt.org/z/vbcbofxvs

@no-more-secrets
Copy link
Author

@lewissbaker Thanks, though perhaps that should be raised as a new issue. I don't recall that ever causing trouble for the std::optional coroutine implementations which this issue is mainly concerned with, but it does seem like something that could cause trouble in other contexts.

@ChuanqiXu9
Copy link
Member

@yfeldblum thanks for the written-up. But I just saw @lewissbaker kicked off a discussion in wg21. So it may be better to wait for the result from the committee.

@bcardosolopes
Copy link
Member

Hey folks, the discussion in EWG reached consensus that the
the current community expectation should actually match what clang used to do before https://reviews.llvm.org/D117087.
As @GorNishanov nicely put in his cwg2562 response:

Q: whether the prvalue result object may be initialized later (e.g. before after the first actual suspension),
A: Yes it can! This is the intent when get_return_object type != return type of the coroutine
Necessary for supporting std::expected coroutine use-case.
Q: if the initialization does occur later, by what mechanism the prvalue result of get_return_object is forwarded to that initialization.
A: (see below)
Case 1/2. Same type of get_return_object and coroutine return type.
Constructed in the return slot.
Case 3. Different types:
(1) Constructed temporary object prior to initial suspend initialized with a call to get_return_object()
(2) when coroutine needs to to return to the caller and needs to construct return value for the coroutine it is initialized with expiring value of the temporary obtained in step 1.

@ChuanqiXu9 do you have plans to work on this soon? If you're not already planning to work on this I'm happy to do the leg work - we've been reverting D117087 internally for almost an year now, probably a good moment to improve this. From a first look at D117087, it's not clear to me yet:

  1. When implementing (2) above, will this require us re-introducing GetReturnObjectManager?
  2. If so, should we revert D117087 in ToT for now until RVO can be better implemented?
  3. Any other suggestions?

Thanks!

@ChuanqiXu9
Copy link
Member

@bcardosolopes Hi, it is great to hear that you want to work on this. Although I planned to work it myself, I don't have a lot of time though.

  1. When implementing (2) above, will this require us re-introducing GetReturnObjectManager?

Yes. Since now the GRO may be local variable inside the function (coroutine), we need to handle the cleanup of the GRO in case of exceptions. So we need to re-introduce GetReturnObjectManager.

  1. If so, should we revert D117087 in ToT for now until RVO can be better implemented?

I think the best approach is to revert D117087 and implement RVO for the case get_return_object type == return in one shot. The next release branch date will be the end of July. So it looks like we have enough time. But it is also acceptable if you feel you don't have enough time. After all, the correctness is more important than the optimization.

Thanks!

@bcardosolopes
Copy link
Member

Patches are up:

  1. Fix premature initialization (restoring previous behavior): https://reviews.llvm.org/D145639
  2. (Depends on 1) Implement type matching rules that enable RVO: https://reviews.llvm.org/D145641

I've separated this into two to give our internal CIs time to digest this in pieces. We're current facing some crashes related to RVO with the upstream implementation, which (2) might solve. But by independently fixing (1) we detach correctness from optimization.

@no-more-secrets
Copy link
Author

Thanks @bcardosolopes, I tested the fix with my std::optional coroutine and it indeed fixes the problem and restores previous behavior.

CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Mar 17, 2023
Fix llvm/llvm-project#56532

Effectively, this reverts behavior introduced in https://reviews.llvm.org/D117087,
which did two things:

1. Change delayed to early conversion of return object.
2. Introduced RVO possibilities because of early conversion.

This patches fixes (1) and removes (2). I already worked on a follow up for (2)
in a separated patch. I believe it's important to split these two because if the RVO
causes any problems we can explore reverting (2) while maintaining (1).

Notes on some testcase changes:
- `pr59221.cpp` changed to `-O1` so we can check that the front-end honors
  the value checked for. Sounds like `-O3` without RVO is more likely
  to work with LLVM optimizations...
- Comment out delete members `coroutine-no-move-ctor.cpp` since behavior
  now requires copies again.

Differential Revision: https://reviews.llvm.org/D145639
@bcardosolopes
Copy link
Member

Note that https://reviews.llvm.org/D145639 just got reverted (see comments in the diff), will re-land it again next week, altogether with https://reviews.llvm.org/D145641.

@bcardosolopes bcardosolopes reopened this Mar 17, 2023
bcardosolopes added a commit that referenced this issue Mar 22, 2023
…tain conditions

- The cwg2563 issue is fixed by delaying GRO initialization only when the types
  mismatch between GRO and function return.
- When the types match directly initialize, which indirectly enables RVO to
  kick in, partially restores behavior introduced in
  https://reviews.llvm.org/D117087.
- Add entry to release notes.

Background:
#56532
https://cplusplus.github.io/CWG/issues/2563.html
cplusplus/papers#1414

Differential Revision: https://reviews.llvm.org/D145641
@no-more-secrets
Copy link
Author

@bcardosolopes Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
coroutines C++20 coroutines
Projects
Status: No status
Development

No branches or pull requests

7 participants