-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sometimes SetFailed() is called with errno=0 in InputMessenger::OnNewMessages() #1860
Comments
Some further investigation shows that there might be another scenario --- readv() returned Another piece of debugging log supports this scenario. |
Could you show this log? |
This issue may be related to #1693. Although we have removed the |
BTW-1: I also observed once that before BTW-2: During my investigation, I also tried using Clang-15 (instead of Clang-12) with exactly the same building configurations, then I observed a different coredump saying something like "bthread sched_to itself" or so, so it seems LTO may really break something in BRPC code base. BTW-3: About half a year ago, when we are using 0.9.7, I roughly tested LTO capabilities once and BRPC wasn't a blocker at that time (the service behaves correctly online at a benefited performance). Currently I have switched to a building mode that everything but BRPC in the entire dependency graph is built with ThinLTO, and the binary can be started without an immediate coredump (whether it behaves correctly may need a longer verification procedure). Let me know if you need any further context from my scenario. |
I encountered this error before, that is related to tls cache. Compiler options: -fno-gcse, -fno-cse-follow-jumps, -fno-move-loop-invariants may fix this. |
Thanks for the information, I'll give it a shoot later this week. |
Unfortunately I could not easily find corresponding switches in Clang, it seems that optimization switches of Gcc and Clang at such a detailed level differ a lot. |
What compiler did you use in this case? |
Clang 8 at that time, with GLIBC 2.27 and GLIBCXX 3.4.26. |
We are using gcc 8 and gcc 10 with lto to compile brpc and everything works well. |
Describe the bug (描述bug)
m->SetFailed()
is called withsaved_errno == 0
inInputMessenger::OnNewMessages()
, which then triggers a hard error atCHECK(false) << "error_code is 0"
.To Reproduce (复现方法)
In our environment we can constantly reproduce this issue in LTO (
-flto=full
) and ThinLTO (-flto=thin
) modes, but everything runs without error in normal (-fno-lto
) mode.I surfed a bit into the source code, and it turns out that
readv()
may return-1
witherrno == 0
inpappend_from_file_descriptor()
, which then causem->DoRead()
returnsnr == -1
witherrno == 0
inInputMessenger::OnNewMessages()
, which then run intoelse if (errno != EAGAIN)
branch witherrno == 0
and triggers theCHECK
failure.By adding several pieces of logging codes with cares taken not to unintentionally change
errno
, I can confirm the behavior described above, see the attached screenshot for reference.A potential fix / workaround is to change
if (errno == EINTR)
toif (error == 0 || errno == EINTR)
, but I'm not familiar with BRPC code (nor did I figure out why link-time optimization triggers such an issue), so I'd prefer BRPC maintainers taking a look at it. Thanks!Expected behavior (期望行为)
Consistent behavior regardless of link-time optimization options.
Versions (各种版本)
OS: Linux Ubuntu 18.04 in Docker
Compiler: Clang 12.0.1
brpc: 1.1.0 Release
protobuf: 21.1
Additional context/screenshots (更多上下文/截图)
The text was updated successfully, but these errors were encountered: