Replication issues "LDAP error 51 (Server is busy)" #6551

gudtanha · 2025-01-28T09:24:12Z

Issue Description
For no apparent reason, the supplier sometimes fails to initialize a consumer. The supplier error log shows:

[timestamp] - INFO - NSMMReplicationPlugin - repl5_tot_run - Beginning total update of replica "agmt="consumer node 2" (consumer:636)".
[timestamp] - ERR - NSMMReplicationPlugin - perform_operation - agmt="cn=consumer node 2" (consumer:636): Failed to send extended operation: LDAP error 51 (Server is busy)

No error is logged on the consumer. The consumer's access log shows 250 entries received and a clean connection close.
Sometimes the following errors are logged on the consumer; I couldn't figure out what determines whether they occur or not:

[timestamp] - NOTICE - NSMMReplicationPlugin - multisupplier_be_state_change - Replica dc=gi-de,dc=com is going offline; disabling replication
[timestamp] - INFO - bdb_instance_start - Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[timestamp] - ERR - factory_destructor - ERROR bulk import abandoned
[timestamp] - ERR - bdb_import_run_pass - import userroot: Thread monitoring returned: -23
...
[timestamp] - INFO - bdb_public_bdb_import_main - import userroot: Closing files...
[timestamp] - ERR - bdb_public_bdb_import_main - import userroot: Import failed.
[timestamp] - ERR - process_bulk_import_op - NULL target sdn

Repeating the initialization on the supplier eventually works if you retry often enough.

But the problem persists for regular (non-scheduled) replication sessions.
Again, for no apparent reason, but across all consumers, sending updates frequently fails, which is reflected in the supplier error log like this:

[timestamp] - WARN - send_updates - %s: Failed to send update operation to receiver (uniqueid %s, CSN %s): %s. %s.
 - agmt="cn=anotherconsumer singlenode" (anotherconsumer:636)
[timestamp] - ERR - NSMMReplicationPlugin - perform_operation - agmt="cn=anotherconsumer singlenode" (anotherconsumer:636): Failed to send extended operation: LDAP error 51 (Server is busy)
[timestamp] - ERR - NSMMReplicationPlugin - release_replica - agmt="cn=anotherconsumer singlenode" (anotherconsumer:636): Unable to send endReplication extended operation (Server is busy)

Again, there's no corresponding log entry on the consumer, the machine isn't under load at the time of the event, and the network is fine.
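To quantify how often each agreement is affected, the "Server is busy" failures can be tallied from the supplier error log. The following is a minimal sketch, not part of the original report: the function name `count_ldap_errors` and the regular expression are hypothetical, and the pattern assumes the `perform_operation` line format shown in the excerpts above.

```python
import re
from collections import Counter

# Assumed line shape (taken from the excerpts in this report):
#   ... agmt="<name>" (<host:port>): <message>: LDAP error <code> (<text>)
ERROR_RE = re.compile(
    r'agmt="(?P<agmt>[^"]+)" \((?P<host>[^)]+)\): .*LDAP error (?P<code>\d+)'
)

def count_ldap_errors(lines, code=51):
    """Count lines reporting the given LDAP result code, keyed by (agreement, host)."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match and int(match.group("code")) == code:
            counts[(match.group("agmt"), match.group("host"))] += 1
    return counts

# Sample lines copied from the log excerpts in this report.
SAMPLE = [
    '[timestamp] - ERR - NSMMReplicationPlugin - perform_operation - '
    'agmt="cn=consumer node 2" (consumer:636): Failed to send extended '
    'operation: LDAP error 51 (Server is busy)',
    '[timestamp] - ERR - NSMMReplicationPlugin - release_replica - '
    'agmt="cn=anotherconsumer singlenode" (anotherconsumer:636): Unable to '
    'send endReplication extended operation (Server is busy)',
]

if __name__ == "__main__":
    for (agmt, host), n in count_ldap_errors(SAMPLE).items():
        print(f"{agmt} ({host}): {n}")
```

Only lines that carry an explicit `LDAP error <code>` are counted, so the `release_replica` message is deliberately skipped. To run this against a real log, pass an open file object instead of `SAMPLE`; the log path depends on the instance and is not given in this report.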

Package Version and Platform:

Platform: RHEL9_5
Package and version: 2.5.3.202501211239git86dd51fd1 (supplier) / 2.5.2-2.el9_5 (consumer)
Browser <- is not an adequate management application

Steps to Reproduce
Steps to reproduce the behavior:

Create a supplier/consumer replication agreement according to the docs using 'dsconf'
Watch the supplier's error log

Expected results
Initialization succeeds on the first attempt, every time
No update errors during regular replication

Screenshots
Cockpit Monitoring showing:
Changes Sent: 1:15219845/0 0:18168/0
(The second counter shows up only if replication had issues)
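For tracking these counters over time, the "Changes Sent" value can be split into its parts. This is a rough sketch under an assumption that is not confirmed in this report: each whitespace-separated token is taken to have the form `<replica id>:<sent>/<skipped>`. The names `SentCounter` and `parse_changes_sent` are hypothetical.

```python
from typing import NamedTuple

class SentCounter(NamedTuple):
    # Field meanings are an assumption about the "<id>:<sent>/<skipped>" layout.
    replica_id: int
    sent: int
    skipped: int

def parse_changes_sent(value: str) -> list[SentCounter]:
    """Split a value such as '1:15219845/0 0:18168/0' into one record per token."""
    counters = []
    for token in value.split():
        replica_id, _, rest = token.partition(":")
        sent, _, skipped = rest.partition("/")
        counters.append(SentCounter(int(replica_id), int(sent), int(skipped)))
    return counters

if __name__ == "__main__":
    for counter in parse_changes_sent("1:15219845/0 0:18168/0"):
        print(counter)
```

A value with a single token yields a one-element list, which makes it easy to flag the case described above where a second counter appears.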

Additional context
We've been running 1.3.10 for 3 years without a single issue.
I started by updating to 2.3, where the MemberOf plugin suffers from soft-locks. Version 2.4 has the same plugin problems, additional Cockpit problems and, as far as I remember, introduced those "Server is busy" errors.
After comparing branches, I found that 1.4 builds on top of 1.3 and that 2.5 is closest to 1.4 with regard to the MemberOf plugin.
So once again I updated the complete landscape, this time to 2.5.2 for the consumers (RHEL9_5 repo) and 2.5.3.202501211239git86dd51fd1 for the supplier. Our simple single-supplier replication setup still isn't working error-free, unlike what we had with 1.3.10.
I can't even figure out which branch to choose, or why all these branches exist in the first place, if commits are pulled in untested anyway.

I already checked what was mentioned in this reply, without any effect:
https://www.mail-archive.com/389-users@lists.fedoraproject.org/msg10204.html

Any help highly appreciated


