Potential crash when deleting a replicated backend #6555
Comments
On the main branch the patch triggers (not systematically) an infinite loop at shutdown:

Thread 1 (Thread 0x7f20c99b4700 (LWP 3080510) "ns-slapd"):
#0 __pthread_rwlock_rdlock_full64 (rwlock=0x7f20c66a7400, clockid=0, abstime=0x0) at /usr/src/debug/glibc-2.39-33.fc40.x86_64/nptl/pthread_rwlock_common.c:506
#1 ___pthread_rwlock_rdlock (rwlock=0x7f20c66a7400) at pthread_rwlock_rdlock.c:26
#2 0x00007f20cab3d8ee in slapi_rwlock_rdlock (rwlock=) at ldap/servers/slapd/slapi2runtime.c:288
#3 0x00007f20c5fd2634 in replica_check_validity (replica=) at ldap/servers/plugins/replication/repl5_replica_hash.c:213
#4 0x00007f20c5fab461 in consumer_connection_extension_destructor (ext=, object=, parent=) at ldap/servers/plugins/replication/repl_connext.c:64
#5 0x00007f20caad7b33 in factory_destroy_extension (type=, object=0x7f1bbf000b70, parent=0x0, extension=0x7f1bbf000ca8) at ldap/servers/slapd/factory.c:366
#6 factory_destroy_extension (type=, object=0x7f1bbf000b70, parent=0x0, extension=0x7f1bbf000ca8) at ldap/servers/slapd/factory.c:348
#7 0x000056051977671c in connection_cleanup (conn=conn@entry=0x7f1bbf000b70) at ldap/servers/slapd/connection.c:181
#8 0x000056051977ac43 in connection_done (conn=0x7f1bbf000b70) at ldap/servers/slapd/connection.c:148
#9 connection_table_free (ct=) at ldap/servers/slapd/conntable.c:229
#10 slapd_daemon (ports=0x7ffec39290a0) at ldap/servers/slapd/daemon.c:1287
#11 0x000056051976b7c5 in main (argc=5, argv=0x7ffec39294f8) at ldap/servers/slapd/main.c:1152

I suspect that replica_check_validity should not be called at shutdown, as s_hash (the list of replicas) looks somehow broken.
Are you sure it is an infinite loop? It looks more like a locking issue (a deadlock or a deallocated lock?): we are not yet trying to access the hash table in this stack. It is possible, though, that replica_destroy_name_hash was called before the connection was closed. IMHO replica_destroy_name_hash should set s_hash and s_lock to NULL after releasing them.
Issue Description
When removing a replicated backend (and probably also when disabling replication), a crash may occur because the connection cleanup code attempts to use a freed replica.
This happens in CI with the recently added test test_multi_subsuffix_replication.
Package Version and Platform:
Steps to Reproduce
Steps to reproduce the behavior:
Unzip it and check whether there are cores in the assets/cores directory.
Expected results
No crash should occur.
Additional context
This crash is also described in #6531