Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add layer on top of LDAP methods to catch server disconnects and attempt re-connection #4992

Closed
wants to merge 4 commits into from

Conversation

bline
Copy link
Contributor

@bline bline commented May 20, 2017

This pull request adds a wrapper to calls to LDAP.php in order to trap server disconnects and attempt a reconnect. This happens on long running processes such as nextant solr indexer when the AD server times out the connection. This solves nextcloud/fulltextsearch#175 for me.

@mention-bot
Copy link

@bline, thanks for your PR! By analyzing the history of the files in this pull request, we identified @blizzz, @LukasReschke and @MorrisJobke to be potential reviewers.

@LukasReschke LukasReschke requested a review from blizzz May 20, 2017 12:33
@bline
Copy link
Contributor Author

bline commented May 20, 2017

I discovered today this pull request breaks calls to controlPagedResultResponse because $cookie is passed by reference. After some research it seems there is no simple work-around for passing by reference using call_user_func_array(). The only solution I've come up with so far is to make a special case in invokeLDAPMethod for controlPagedResultResponse. Any suggestions?

bline added 2 commits May 20, 2017 17:01
…e was a generic way to pass by reference with call_user_func_array..
@codecov
Copy link

codecov bot commented May 20, 2017

Codecov Report

Merging #4992 into master will decrease coverage by <.01%.
The diff coverage is 41.3%.

@@             Coverage Diff              @@
##             master    #4992      +/-   ##
============================================
- Coverage     54.16%   54.16%   -0.01%     
- Complexity    22276    22281       +5     
============================================
  Files          1379     1379              
  Lines         85301    85321      +20     
  Branches       1322     1322              
============================================
+ Hits          46203    46213      +10     
- Misses        39098    39108      +10
Impacted Files Coverage Δ Complexity Δ
apps/user_ldap/lib/Access.php 17.62% <41.3%> (+0.32%) 300 <12> (+5) ⬆️
lib/private/Files/Cache/Propagator.php 94.93% <0%> (-1.27%) 16% <0%> (ø)
core/js/js.js 61.78% <0%> (+0.56%) 0% <0%> (ø) ⬇️

@bline
Copy link
Contributor Author

bline commented May 21, 2017

OK, this last commit is less invasive, still passing CR around and not changing public method. I've done more testing and it no longer breaks controlPagedResultResponse pass by reference call. Any input welcome. Thanks!

@MorrisJobke MorrisJobke added 3. to review Waiting for reviews and removed 2. developing Work in progress labels May 21, 2017
@rullzer rullzer added this to the Nextcloud 13 milestone May 23, 2017
@blizzz
Copy link
Member

blizzz commented May 24, 2017

@bline thansk for looking into it. Just diving into your code. Up front:

The only solution I've come up with so far is to make a special case in invokeLDAPMethod for controlPagedResultResponse. Any suggestions?

I had the same issue when i wrote the LDAP Wrapper, there is also calls the method directly instead of using the invoker method 🙈 https://github.com/nextcloud/server/blob/master/apps/user_ldap/lib/LDAP.php#L74

@blizzz
Copy link
Member

blizzz commented May 24, 2017

On my local test case it is not working so far. Connection gets re-established, but I am only getting an empty result.

if(!$this->ldap->isResource($cr)) {
// Seems like we didn't find any resource.
\OCP\Util::writeLog('user_ldap', "Could not $command, because resource is missing.", \OCP\Util::DEBUG);
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw $e instead

@blizzz
Copy link
Member

blizzz commented May 24, 2017

It looks like somehow my OpenLDAP instance succeeds with a bind on reconnect, but does not operate. Need to leave now, will debug later.

@blizzz
Copy link
Member

blizzz commented May 24, 2017

So the problem is that methods like executeSearch re-use the old LDAP resource, instead of fetching the newly created one. Now we could change this everywhere so the link is always fetched, but this is error prone. Just waits for a future PR where this is forgotten.

The really right thing to do would be to rip this out of Connection and Access into a Client that is fed with the initial server and bind parameters and otherwise does the talking to LDAP. Partly, the LDAP Wrapper is doing this already. It also could take over the resource-management, and neither Access nor Connection should have anything to do with it. Or we indeed create a Client class that extends the LDAP Wrapper.

This also satisfies my urge to have this handling out of Access (which is already too big anyway).

@blizzz
Copy link
Member

blizzz commented May 24, 2017

Or we indeed create a Client class that extends the LDAP Wrapper.

Extending sucks, as the method signatures don't fit us. All the methods would be doubled and the usages needs to be adjusted either way. Making LDAP also do the connection handling seems better. Although there is also some overhead that needs to go into it (e.g. cloning behaviour).

All in all this is also a hell to backport :(

Best we go for the less invasive but not future-proof solution for <= stable12 and only to the architectural changes for master.

@blizzz
Copy link
Member

blizzz commented May 24, 2017

I created a new PR in a branch where we can work together: #5104 and pushed my changes that fixes the issues I found. Let's follow up there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants