epoll_ctl EPOLL_CTL_MOD called incorrectly #1037

Open
1 of 2 tasks
bigbohne opened this issue Aug 3, 2023 · 2 comments

bigbohne commented Aug 3, 2023

Subject

Running the boost::beast async HTTP client example (https://www.boost.org/doc/libs/1_81_0/libs/beast/example/http/client/async/http_client_async.cpp) with LD_PRELOAD=libvma.so fails.

Issue type

  • Bug report
  • Feature request

Configuration:

  • Product version: VMA_VERSION: 9.7.2-1
  • OS: Rocky Linux 9.2
  • OFED: MLNX_OFED_LINUX-5.8-3.0.7.0
  • Boost: 1.74 and 1.82
  • Hardware: ConnectX-5 (MT27800 Family) in a HP ProLiant Server

Actual behavior:

VMA ERROR: epfd_info:492:mod_fd() failed to modify fd=22 in epoll epfd=20 (errno=2 No such file or directory)

Expected behavior:

"Display of the data received from the HTTP Server"

Steps to reproduce:

  • Compile https://www.boost.org/doc/libs/1_81_0/libs/beast/example/http/client/async/http_client_async.cpp into an executable
  • Call with ./http-async-client <someserver> <someport> <somepath>
  • Works!
  • Call with LD_PRELOAD=libvma.so ./http-async-client <someserver> <someport> <somepath>
  • Fails!

bigbohne commented Aug 3, 2023

Digging into the strace output of both runs (with and without VMA):

Socket fd=22 is the socket in question here

less strace_vma.log | grep "epoll_ctl(20"

epoll_ctl(20, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=3, u64=3}}) = 0
epoll_ctl(20, EPOLL_CTL_ADD, 21, {events=EPOLLIN|EPOLLERR, data={u32=21, u64=21}}) = 0
epoll_ctl(20, EPOLL_CTL_ADD, 22, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLET, data={u32=22, u64=22}}) = 0
epoll_ctl(20, EPOLL_CTL_DEL, 22, NULL) = 0
epoll_ctl(20, EPOLL_CTL_MOD, 22, {events=EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, data={u32=22, u64=22}}) = -1 ENOENT (No such file or directory)

In the strace from the boost example without VMA, the epoll_ctl calls are done correctly (fd=6 is the socket in question here):

less strace.log | grep "epoll_ctl("

epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=3828702728, u64=94823821696520}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=3828702740, u64=94823821696532}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 6, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLET, data={u32=3828704704, u64=94823821698496}}) = 0
epoll_ctl(4, EPOLL_CTL_MOD, 6, {events=EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, data={u32=3828704704, u64=94823821698496}}) = 0
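The only difference between the two traces is the EPOLL_CTL_DEL that precedes the EPOLL_CTL_MOD in the VMA run. As a minimal sketch, the same ADD, DEL, MOD sequence can be replayed against plain kernel epoll without VMA or Boost. The function name `errno_of_mod_after_del` is made up for this illustration, and an eventfd stands in for the socket:

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <cerrno>

// Replays ADD -> DEL -> MOD on one descriptor and reports what MOD does.
// Returns the errno set by EPOLL_CTL_MOD, 0 if MOD unexpectedly succeeds,
// or -1 if the setup (create/ADD/DEL) fails.
int errno_of_mod_after_del() {
    int epfd = epoll_create1(0);
    int fd = eventfd(0, 0);  // stand-in for the socket (fd=22 in the trace)
    int result = -1;

    if (epfd >= 0 && fd >= 0) {
        epoll_event ev{};
        ev.events = EPOLLIN | EPOLLPRI | EPOLLERR | EPOLLHUP | EPOLLET;
        ev.data.fd = fd;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0 &&
            epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr) == 0) {
            // Same request as in the failing trace: add EPOLLOUT via MOD.
            ev.events |= EPOLLOUT;
            result = (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == -1) ? errno : 0;
        }
    }

    if (fd >= 0) close(fd);
    if (epfd >= 0) close(epfd);
    return result;
}
```

On Linux this is expected to return ENOENT, which matches the errno=2 in the VMA error message: EPOLL_CTL_MOD is only valid for a descriptor that is currently registered with the epoll instance.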

igor-ivanov (Collaborator) commented

The suspicious place is probably https://github.com/Mellanox/libvma/blob/master/src/vma/sock/sock-redirect.cpp#L1036-L1045

  1. The socket starts the connection as offloaded.
  2. The connect cannot be done the offloaded way.
  3. The socket is marked as non-offloaded.
  4. Resources for the offloaded socket are closed, which includes removing the fd from the epoll fd (epoll_ctl(20, EPOLL_CTL_DEL, 22, NULL) = 0).
  5. The connect is done as non-offloaded.
  6. The following call then fails: epoll_ctl(20, EPOLL_CTL_MOD, 22, {events=EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, data={u32=22, u64=22}}) = -1 ENOENT (No such file or directory)
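One generic way to tolerate steps 4 to 6 is to fall back to EPOLL_CTL_ADD when EPOLL_CTL_MOD reports ENOENT. The sketch below only illustrates that pattern against plain kernel epoll. It is not libvma's code and not a proposed patch; `mod_or_add` and `replay_with_fallback` are hypothetical names, and an eventfd stands in for the socket:

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper, not taken from libvma or Boost: try MOD first and
// register the descriptor again with ADD if the kernel reports ENOENT.
int mod_or_add(int epfd, int fd, epoll_event* ev) {
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, ev) == 0) return 0;
    if (errno != ENOENT) return -1;  // unrelated failure, errno left as set
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, ev);
}

// Replays steps 4 to 6 with an eventfd standing in for the socket.
// Returns true if the descriptor ends up registered again.
bool replay_with_fallback() {
    int epfd = epoll_create1(0);
    int fd = eventfd(0, 0);
    bool ok = false;

    if (epfd >= 0 && fd >= 0) {
        epoll_event ev{};
        ev.events = EPOLLIN | EPOLLET;
        ev.data.fd = fd;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0 &&
            epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr) == 0) {  // step 4
            ev.events |= EPOLLOUT;
            ok = (mod_or_add(epfd, fd, &ev) == 0);                // step 6
            // A plain MOD succeeds only if the fd is registered again.
            ok = ok && epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0;
        }
    }

    if (fd >= 0) close(fd);
    if (epfd >= 0) close(epfd);
    return ok;
}
```

Whether the right fix is a fallback like this, or keeping the fd registered when the socket switches to the non-offloaded path, depends on libvma internals that this thread does not cover.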
