SOLR-17540: Remove Hadoop Auth Module #2835
Conversation
Remove links to old pages that no longer exist, but leave the major changes references alone.
Kerb stuff appears to still work! All tests ran.
Fantastic to see all those files go, and all those external deps removed! Just a few comments..
There is some more Hadoop auth cleanup in the security.policy file.
Thanks for that! I hope I got them all out...
@janhoy I tried running that script, but couldn't quite grok it. Could you give an example of how that script should be used? And let's add an example to the readme or to the script itself!
It's some time ago, and the script was made to make sure we had redirects for the old ref guide structure, and we also added in page removals. The gist is to maintain the CSV file with metadata of all changes, and edit the py script to output the correct htaccess. We likely need to make another CSV section for pages removed in the 10.0 guide, and then generate the correct redirects.. I don't remember where / how the generated htaccess file is checked in though. And I know Antora has some built-in support for generating htaccess as well, should look into it.. To be pragmatic and unblock this, I'd make a Blocker JIRA for 10.0 and maintain a list of the removed pages there, so someone can update htaccess in a proper way in due time.
I wrote that comment to memorialize several hours of digging I did back when I moved startup to a context listener. One of the things I found perplexing about SolrDispatchFilter when I first tried to understand it for that task was the lack of a call to doFilter(req,resp,filterchain) ... note that our custom version with the boolean retry doesn't count, because it doesn't make the normal call to the method specified by javax.servlet.Filter. Normally filter implementations look like:
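For illustration, a generic servlet filter (a minimal sketch, not Solr code, assuming Servlet API 4.0+ where init()/destroy() have default no-ops) typically has this shape:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class TypicalFilter implements Filter {
  // init() and destroy() are omitted; they have default no-op implementations in Servlet 4.0+.
  @Override
  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    // ... work on the way in (before anything downstream runs) ...
    chain.doFilter(request, response); // hand the request to the next filter/servlet in the chain
    // ... work on the way out (after downstream processing returns) ...
  }
}
```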
So it was very weird not to find a call to doFilter in the doFilter method, nor in our custom version of it. EVENTUALLY I figured out that that call is made either in the dispatch method, OR in our auth filter (I haven't tried to prove it can't get called twice, but with just SolrDispatchFilter in play that is not currently going to cause a problem, since chain.doFilter is a no-op for the final filter). One of the long term goals I have is to start pulling stuff that we are doing in this monster filter out into a series of filters, which will make the individual missions easier to understand and put the cleanup code near the instantiation code where, again, it would be much easier to understand (and nesting can be easily seen to be correct). My impulse (not yet informed by actual attempts) is to rework our auth plugins to be auth filters. The other thing I'm pointing out in that comment is that the HadoopAuthFilter is what seems to stand in the way of writing an if block such as:
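As a rough illustration only, with hypothetical method names (authenticateRequest and dispatch are placeholders, not actual SolrDispatchFilter methods), such an if block might look like:

```java
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class DispatchSketch {

  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    // Guard dispatch on the outcome of authentication: only continue when auth succeeds.
    if (authenticateRequest(request, response)) {
      dispatch(request, response, chain);
    }
    // On failure the auth code has already written the error response, so just return.
  }

  // Placeholder for whatever the auth plugin does; true means the request may proceed.
  private boolean authenticateRequest(ServletRequest request, ServletResponse response) {
    return true;
  }

  // Placeholder for Solr's actual request handling, ending in the usual chain hand-off.
  private void dispatch(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    chain.doFilter(request, response);
  }
}
```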
That is of course the first step to breaking auth out into its own filter, where it becomes:
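Again as a hedged sketch with placeholder names rather than actual Solr code, a standalone auth filter along those lines might look like:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class AuthOnlyFilter implements Filter {
  // init() and destroy() omitted (default no-ops in Servlet 4.0+).
  @Override
  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    if (authenticate(request, response)) {
      chain.doFilter(request, response); // continue on to the dispatch filter / servlet
    }
    // On failure the response has already been written; do not continue the chain.
  }

  // Placeholder for the actual authentication logic.
  private boolean authenticate(ServletRequest request, ServletResponse response) {
    return true;
  }
}
```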
The particular issue with the hadoop auth plugin that complicates the transition is that chain.doFilter() comes before a switch statement and other code (solr/modules/hadoop-auth/src/java/org/apache/solr/security/hadoop/HadoopAuthPlugin.java, line 247 at 6f94c50).
At least at the time of that comment it seemed that all the other plugins called chain.doFilter() at the end (or possibly in a shortcut followed by an immediate return statement). Only Hadoop auth seemed to have mandatory actions AFTER doFilter(). If it disappears, we can possibly remove the filterchain argument and make simpler use of the return value from authenticate().
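To make the contrast concrete, here is a schematic sketch of the two shapes described above; the method names are placeholders, not code from any real plugin:

```java
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class PluginShapes {
  // Shape most auth plugins seem to follow: chain.doFilter is the last meaningful call,
  // so a caller could instead just act on a boolean result from authenticate().
  void typicalPlugin(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    if (!checkCredentials(request)) {
      return; // error response already written; nothing else to do
    }
    chain.doFilter(request, response); // hand off at the very end
  }

  // Shape attributed to the Hadoop auth plugin: the hand-off happens early, with mandatory
  // work still to do after the chain returns, which is what blocks the simplification.
  void hadoopStylePlugin(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    chain.doFilter(request, response);      // delegation happens before the remaining logic
    postDelegationBookkeeping(request);     // stand-in for the switch statement and other code
  }

  private boolean checkCredentials(ServletRequest request) { return true; } // placeholder
  private void postDelegationBookkeeping(ServletRequest request) {}         // placeholder
}
```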
I am going to not touch … For CHANGES.txt: "Remove Kerberos authentication support from Solr. This in turn removes the Hadoop Auth module". <-- @dsmiley ???
Cause and effect are inverted. I suggest flipping the wording.
solr/core/src/java/org/apache/solr/security/RuleBasedAuthorizationPlugin.java
solr/modules/hdfs/src/java/org/apache/solr/hdfs/HdfsDirectoryFactory.java
solr/solr-ref-guide/modules/deployment-guide/pages/solr-on-hdfs.adoc
This reverts commit 3ed7ddf.
solr/modules/hdfs/src/java/org/apache/solr/hdfs/HdfsDirectoryFactory.java
I think there might be a few more places to clean up, based on running the following on your branch:
Specifically these findings:
It would be awesome to be able to clean up the security policy files, but I know there is some overlap with the Hadoop HDFS tests too.
Some added context about delegation tokens: these were a Hadoop construct at one point, later expanded elsewhere, meant to avoid hitting the KDC (Kerberos server) too much; after the initial authentication happened, the delegation token was used in its place. Basically it was a secure token passed around instead of doing the whole roundtrip to the KDC for each call. There are some other things the delegation token can do as well (impersonation if needed).

As David said, Hadoop authentication is not just Kerberos; it is a whole framework for authentication. It's similar to how Hadoop filesystem support isn't just HDFS but also S3 and some other backends. Jetty does have Kerberos/SPNEGO support if we want to go down that route later. The Hadoop implementation of Kerberos support was better than most other Java support out there, since historically there haven't been many Kerberos implementations for Java, and there have been lots of bugs across implementations (Active Directory vs Kerby vs others).

I do think it's time to remove this module and make it fully opt-in (via a plugin or separately supported module). I haven't had time to keep up with the Hadoop side development of this and don't use it anymore. As Gus pointed out, there are some interesting hooks to make the Hadoop auth client stuff work, so cleaning all of that up is worth it, as is removing a module that isn't used that widely.
@risdenk I see in |
Okay, I've responded (I think!) to @risdenk's comments. I think this is ready for merging????
https://issues.apache.org/jira/browse/SOLR-17540
Description
Remove Hadoop Auth
Solution
no more Hadoop Auth
Tests
Just removing things
Tasks
solr-tests.policy
useShortName feature, maybe only supported by hadoop-auth?
javax.security.auth.kerberos in package-list file in docs render dir