VMA ERROR: vlist[0x7fbc567fcad0]:302:push_back() Buff is already a member in a list! #974

Open
1 of 2 tasks
syspro4 opened this issue Nov 30, 2021 · 4 comments

syspro4 commented Nov 30, 2021

Subject

VMA ERROR: vlist[0x7fbc567fcad0]:302:push_back() Buff is already a member in a list!

Issue type

  • Bug report
  • Feature request

Configuration:

  • Product version: 9.3.1.1
  • OS: Oracle Linux 8.3
  • OFED: MLNX_OFED_LINUX-5.4-1.0.3.0
  • Hardware: Mellanox Technologies MT27700 Family [ConnectX-4]

Actual behavior:

I set up GlusterFS, created a Gluster volume, and mounted it from a host over the GlusterFS FUSE protocol. When I ran fio with rw=read, I started seeing the following VMA errors.

fio command:
fio --error_dump=1 --direct=1 --verify_dump=1 --ioengine=libaio --size=100G --name=tt --bs=1M --nrfiles=8 --iodepth=8 --directory=/mnt_vol --rw=read --time_based=1 --runtime=120

Note: Same fio command with rw=write worked perfectly fine.

VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 9.3.1-1 Release built on Oct 9 2021 11:01:45
VMA INFO: Cmd Line: /usr/sbin/glusterfsd -s 192.168.2.244 --volfile-id ns1.192.168.2.244.mnt-vol -p /var/run/gluster/vols/ns1/192.168.2.244-mnt-vol.pid -S /var/run/gluster/834a347a5e8d7a50.socket --brick-name /mnt/vol -l /var/log/glusterfs/bricks/mnt-vol.log --xlator-option *-posix.glusterd-uuid=9e6c297d-dc87-4866-89b6-8ada6d5d35eb --process-name brick --brick-port 49152 --xlator-option ns1-server.listen-
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: ---------------------------------------------------------------------------
VMA ERROR: vlist[0x7fbc567fcad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc44c44ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc567fcad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46447ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46447ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc567fcad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46447d70]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46c48ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46447ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdd70]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45c46ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45c46ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45c46ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc37ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45c46ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc37ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445ad0]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc56ffdad0]:302:push_back() Buff is already a member in a list!
^C VMA ERROR: vlist[0x7fbc46447d70]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc46447d70]:302:push_back() Buff is already a member in a list!
VMA ERROR: vlist[0x7fbc45445d70]:302:push_back() Buff is already a member in a list!

Please help me.
Thanks in advance.

syspro4 commented Nov 30, 2021

I managed to get the crash dump while running fio with rw=read.
Am I missing any configuration/parameter setting?
Please help.

[root@ core]# gdb glfs_epoll004.11.core.80614
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfsd -s 192.168.2.241 --volfile-id ns1.192.168.2.241.mnt-G'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 epoll_wait_call::get_current_events (this=this@entry=0x7effaed29e00) at iomux/epoll_wait_call.cpp:149
149 iomux/epoll_wait_call.cpp: No such file or directory.
[Current thread is 1 (Thread 0x7effaed2b700 (LWP 80630))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-151.0.1.el8.x86_64 keyutils-libs-1.5.10-6.el8.x86_64 krb5-libs-1.18.2-8.el8.x86_64 libacl-2.2.53-1.el8.x86_64 libaio-0.3.112-1.el8.x86_64 libattr-2.4.48-3.el8.x86_64 libcom_err-1.45.6-1.el8.x86_64 libgcc-8.4.1-1.0.1.el8.x86_64 libibverbs-32.0-4.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 librdmacm-32.0-4.el8.x86_64 libstdc++-8.4.1-1.0.1.el8.x86_64 libtirpc-1.1.4-4.el8.x86_64 liburing-1.0.7-3.el8.x86_64 libuuid-2.32.1-27.el8.x86_64 libvma-9.3.1-1.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 pcre2-10.32-2.el8.x86_64 sssd-client-2.4.0-9.0.1.el8.x86_64 userspace-rcu-0.11.1-3.fc32.x86_64 zlib-1.2.11-17.el8.x86_64

(gdb) bt
#0 epoll_wait_call::get_current_events (this=this@entry=0x7effaed29e00) at iomux/epoll_wait_call.cpp:149
#1 0x00007f00adc88f1c in epoll_wait_helper (__epfd=<optimized out>, __events=__events@entry=0x7effaed29f94, __maxevents=__maxevents@entry=1, __timeout=__timeout@entry=-1, __sigmask=__sigmask@entry=0x0) at sock/sock-redirect.cpp:2440
#2 0x00007f00adc88fe8 in epoll_wait (__epfd=<optimized out>, __events=__events@entry=0x7effaed29f94, __maxevents=__maxevents@entry=1, __timeout=__timeout@entry=-1) at sock/sock-redirect.cpp:2461
#3 0x00007f00ad904732 in event_dispatch_epoll_worker (data=0x7effb0006560) at event-epoll.c:741
#4 0x00007f00ac42715a in start_thread () from /lib64/libpthread.so.0
#5 0x00007f00abc70dd3 in clone () from /lib64/libc.so.6
(gdb)

The following is the code around line 149:
76 int epoll_wait_call::get_current_events()
77 {
...
138 /*
139 * for checking ring migration we need a socket context.
140 * in epoll we separate the rings from the sockets, so only here we access the sockets.
141 * therefore, it is most convenient to check it here.
142 * we need to move the ring migration to the epfd, going over the registered sockets,
143 * when polling the rings was not fruitful.
144 * this will be more similar to the behavior of select/poll.
145 * see RM task 212058
146 */
147 while (!socket_fd_list.empty()) {
148 socket_fd_api* sockfd = socket_fd_list.get_and_pop_front();
149 sockfd->consider_rings_migration();
150 }

Thanks!

syspro4 commented Jan 5, 2022

Can someone please share an update on this issue?

@igor-ivanov
Collaborator

I would recommend setting VMA_TRACELEVEL=4 and then reviewing or sharing the debug output.
You can also try launching your application with the extra VMA options VMA_RING_MIGRATION_RATIO_TX=-1 VMA_RING_MIGRATION_RATIO_RX=-1.
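
A minimal sketch of the launch suggested above, assuming VMA is loaded through LD_PRELOAD=libvma.so and that the application can be started from a shell. The helper name run_with_vma_debug is hypothetical, and `env` is only a stand-in for the real command line. How glusterfsd inherits environment variables when it is spawned by glusterd is not covered here. As far as I understand the libvma README, a ratio of -1 turns ring migration off, which is why these two options are relevant to a crash in consider_rings_migration().

```shell
#!/bin/sh
# Hypothetical helper (not part of libvma): run a command with the VMA
# trace level raised to 4 and both ring-migration ratios set to -1, as
# suggested in this thread. The variables apply only to the child process.
# Assumption: libvma.so can be found on the default library search path.
run_with_vma_debug() {
    env \
        VMA_TRACELEVEL=4 \
        VMA_RING_MIGRATION_RATIO_TX=-1 \
        VMA_RING_MIGRATION_RATIO_RX=-1 \
        LD_PRELOAD=libvma.so \
        "$@"
}

# Stand-in for the real application: print the VMA_* variables that the
# child process sees. Replace `env` with the actual command line.
run_with_vma_debug env | grep '^VMA_' | sort
```

On glibc systems, if libvma.so cannot be found, the loader prints a warning and the command runs without VMA, so it is worth checking that the VMA INFO banner still appears in the output.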

@igor-ivanov
Collaborator

@syspro4 do you still see the issue with VMA_RING_MIGRATION_RATIO_TX=-1 VMA_RING_MIGRATION_RATIO_RX=-1 set?
