ztest: seg fault __deallocate_stack after pthread_join() #5523

Closed
jsalinasintel opened this issue Dec 22, 2016 · 1 comment
Labels
Component: Test Suite Indicates an issue with the test framework or a test case

Comments

@jsalinasintel

Version Information

Distribution Name      CentOS 7.2
Distribution Version   7.2
Linux Kernel           3.10.0-327.36.3.el7.x86_64
Architecture           x86_64
ZFS Version            master (Dec 20th); RPM/modinfo report 0.7.0-rc2
SPL Version            master (Dec 20th); RPM/modinfo report 0.7.0-rc2
Git Build Data Revision: a3823f4
refs/remotes/origin/master
Built Branches refs/remotes/origin/master: Build #186 of Revision a3823f4 (refs/remotes/origin/master)

Problem

While running zloop, the ztest child died with signal 11 (segmentation fault):
/usr/sbin/ztest[0x40a7cf]
/lib64/libpthread.so.0(+0xf100)[0x7f24ff29d100]
/lib64/libpthread.so.0(+0x6e78)[0x7f24ff294e78]
/lib64/libpthread.so.0(pthread_join+0xe3)[0x7f24ff296f33]
/lib64/libzpool.so.2(zk_thread_join+0x21)[0x7f25004924b1]

Reproduce

/sbin/ztest -VVVVV -m 1 -r 0 -R 1 -v 5 -a 12 -T 4 -P 10 -s 128m -f /var/tmp

Log detail

ztest.out

                            capacity   operations   bandwidth  ---- errors ----
description                used avail  read write  read write  read write cksum
ztest                     2.05M  110M   698     0 4.75M     0     0     0     0
  mirror                  2.05M  110M   698     0 4.75M     0     0     0     0
    /var/tmp/ztest.0a                   698     0 4.75M     0     0     0     0
starting main threads...
Setting dataset ztest/ds_5 to sync always
/usr/sbin/ztest[0x40a7cf]
/lib64/libpthread.so.0(+0xf100)[0x7f24ff29d100]
/lib64/libpthread.so.0(+0x6e78)[0x7f24ff294e78]
/lib64/libpthread.so.0(pthread_join+0xe3)[0x7f24ff296f33]
/lib64/libzpool.so.2(zk_thread_join+0x21)[0x7f25004924b1]
/usr/sbin/ztest[0x407805]
/usr/sbin/ztest[0x4087e4]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f24feeedb15]
/usr/sbin/ztest[0x4088cd]
child died with signal 11

status

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/ztest'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f24ff294e78 in __deallocate_stack () from /lib64/libpthread.so.0
*
* Backtrace 
*
#0  0x00007f24ff294e78 in __deallocate_stack () from /lib64/libpthread.so.0
#1  0x00007f24ff296f33 in pthread_join () from /lib64/libpthread.so.0
#2  0x00007f25004924b1 in zk_thread_join () from /lib64/libzpool.so.2
#3  0x0000000000407805 in ztest_run ()
#4  0x00000000004087e4 in main ()
*

(only thread 1 is doing something)  
@behlendorf behlendorf added the Component: Test Suite Indicates an issue with the test framework or a test case label Dec 22, 2016
@jsalinasintel
Author

Here is another one; I am not sure whether it is related:
starting main threads...
2.42 sec in ztest_scrub
8.40 sec in ztest_ddt_repair
/usr/sbin/ztest[0x40a7cf]
/lib64/libpthread.so.0(+0xf100)[0x7f3a9860e100]
/lib64/libpthread.so.0(pthread_join+0x11)[0x7f3a98607e61]
/lib64/libzpool.so.2(zk_thread_join+0x21)[0x7f3a998034b1]
/usr/sbin/ztest[0x407805]
/usr/sbin/ztest[0x4087e4]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f3a9825eb15]
/usr/sbin/ztest[0x4088cd]
child died with signal 11

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/ztest'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f3a98607e61 in pthread_join () from /lib64/libpthread.so.0
*
* Backtrace
*

#0 0x00007f3a98607e61 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007f3a998034b1 in zk_thread_join () from /lib64/libzpool.so.2
#2 0x0000000000407805 in ztest_run ()
#3 0x00000000004087e4 in main ()
*

Thread 757 (Thread 0x7f3a62187700 (LWP 116952)):
#0 0x00007f3a9860d84d in fsync () from /lib64/libpthread.so.0
#1 0x00007f3a9988124c in vdev_file_io_start () from /lib64/libzpool.so.2
#2 0x00007f3a998fea57 in zio_vdev_io_start () from /lib64/libzpool.so.2
#3 0x00007f3a998fcfc1 in zio_nowait () from /lib64/libzpool.so.2
#4 0x00007f3a998fd0d4 in zio_ioctl () from /lib64/libzpool.so.2
#5 0x00007f3a998fd192 in zio_flush () from /lib64/libzpool.so.2
#6 0x00007f3a998835d3 in vdev_config_sync () from /lib64/libzpool.so.2
#7 0x00007f3a9986a16a in spa_sync () from /lib64/libzpool.so.2
#8 0x00007f3a99879325 in txg_sync_thread () from /lib64/libzpool.so.2
#9 0x00007f3a9980296c in zk_thread_helper () from /lib64/libzpool.so.2
#10 0x00007f3a98606dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f3a98333ced in clone () from /lib64/libc.so.6

@behlendorf behlendorf changed the title ztest seg fault __deallocate_stack after pthread_join ztest: seg fault __deallocate_stack after pthread_join() Jan 24, 2017
FransUrbo pushed a commit to FransUrbo/zfs that referenced this issue Apr 28, 2019
* Simplify threads, mutexs, cvs and rwlocks

* Update the zk_thread_create() function to use the same trick
  as Illumos.  Specifically, cast the new pthread_t to a void
  pointer and return that as the kthread_t *.  This avoids the
  issues associated with managing a wrapper structure and is
  safe as long as the callers never attempt to dereference it.

* Update all function prototypes passed to pthread_create() to
  match the expected prototype.  We were getting away with this
  before since the functions were explicitly cast.

* Replaced direct zk_thread_create() calls with thread_create()
  for code consistency.  All consumers of libzpool now use the
  proper wrappers.

* The mutex_held() calls were converted to MUTEX_HELD().

* Removed all mutex_owner() calls and retired the interface.
  Instead use MUTEX_HELD() which provides the same information
  and allows the implementation details to be hidden.  In this
  case, that is the use of the pthread_equal() function.

* The kthread_t, kmutex_t, krwlock_t, and kcondvar_t types had
  any non-essential fields removed.  In the case of kthread_t
  and kcondvar_t they could be directly typedef'd to pthread_t
  and pthread_cond_t respectively.

* Removed all extra ASSERTS from the thread, mutex, rwlock, and
  cv wrapper functions.  In practice, pthreads already provides
  the vast majority of checks as long as we check the return
  code.  Removing this code from our wrappers helps readability.

* Added a TS_JOINABLE state flag to request a joinable rather
  than detached thread.  This isn't a standard thread_create() state
  but it's the least invasive way to pass this information and is
  only used by ztest.

TEST_ZTEST_TIMEOUT=3600

Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4547 
Closes openzfs#5503 
Closes openzfs#5523 
Closes openzfs#6377 
Closes openzfs#6495
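
To illustrate two of the points in the commit message above (returning the pthread_t itself as an opaque handle, and asking for a joinable thread with a state flag), here is a minimal sketch. It is not the libzpool code: every `sketch_*` name and `SKETCH_TS_JOINABLE` are hypothetical, and it assumes pthread_t is an integer type no wider than a pointer, which holds on Linux/glibc but is not guaranteed by POSIX.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the libzpool definitions. */
typedef void *sketch_thread_t;		/* opaque handle, never dereferenced */
#define	SKETCH_TS_JOINABLE	0x1	/* caller wants to join the thread */

/*
 * Create a thread that is joinable only when SKETCH_TS_JOINABLE is set,
 * otherwise detached.  Returns NULL on failure.
 */
static sketch_thread_t
sketch_thread_create(void *(*func)(void *), void *arg, int state)
{
	pthread_attr_t attr;
	pthread_t tid;
	int detach = (state & SKETCH_TS_JOINABLE) ?
	    PTHREAD_CREATE_JOINABLE : PTHREAD_CREATE_DETACHED;

	if (pthread_attr_init(&attr) != 0)
		return (NULL);
	if (pthread_attr_setdetachstate(&attr, detach) != 0 ||
	    pthread_create(&tid, &attr, func, arg) != 0) {
		(void) pthread_attr_destroy(&attr);
		return (NULL);
	}
	(void) pthread_attr_destroy(&attr);

	/* Assumption: pthread_t fits in a pointer (true on Linux/glibc). */
	return ((sketch_thread_t)(uintptr_t)tid);
}

/* Join a thread that was created with SKETCH_TS_JOINABLE. */
static int
sketch_thread_join(sketch_thread_t t, void **result)
{
	return (pthread_join((pthread_t)(uintptr_t)t, result));
}

/* Worker with the exact prototype pthread_create() expects. */
static void *
sketch_worker(void *arg)
{
	return ((void *)((intptr_t)arg + 1));
}

/* Start a joinable worker, join it, and return its result (-1 on error). */
static long
sketch_join_demo(long input)
{
	void *result = NULL;
	sketch_thread_t t = sketch_thread_create(sketch_worker,
	    (void *)(intptr_t)input, SKETCH_TS_JOINABLE);

	if (t == NULL || sketch_thread_join(t, &result) != 0)
		return (-1);
	return ((long)(intptr_t)result);
}
```

Calling `sketch_join_demo(41)` starts a joinable worker and joins it, returning the worker's result. A thread created without the flag is detached, so the sketch never passes such a handle to `sketch_thread_join()`; POSIX leaves joining a detached thread undefined.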