Shibboleth: restore Institution Log In functionality to dataverse.harvard.edu #2117

Closed
pdurbin opened this issue Apr 29, 2015 · 14 comments

@pdurbin
Member

pdurbin commented Apr 29, 2015

Unfortunately, we saw enough instability (#2013) following the launch of Dataverse 4.0 at https://dataverse.harvard.edu that 14 days later, in the interest of simplifying our setup, we stopped fronting Glassfish with Apache on port 443.

Without Apache (mod_shib) in place, we were forced to disable Shibboleth login. (Because the Shibboleth feature is new, only ~20 accounts were affected, which were converted to builtin/local accounts.)

This ticket is about restoring Shibboleth login which we suspect might require a code change. Various options are being explored in a Google Doc including:

@pdurbin pdurbin added Status: Dev Component: Code Infrastructure formerly "Feature: Code Infrastructure" labels Apr 29, 2015
@pdurbin pdurbin self-assigned this Apr 29, 2015
pdurbin added a commit that referenced this issue Apr 29, 2015
Apache and mod_shib once ran on port 443 but we decided to let Glassfish
serve HTTPS requests there instead. Apache is still serving HTTP on port
80 but is now serving HTTPS on port 8181, so the code and configuration
have been changed to accommodate this.

Configuration file examples are from http://shibtest.dataverse.org with
comments. The idea is that the only time users see port 8181 in their
browsers is when they are on
https://shibtest.dataverse.org:8181/shib.xhtml when creating or
converting an account after having authenticated with their institution.
Behind the scenes, other "plumbing" under "/Shibboleth.sso" is available
for metadata exchange and determining users' affiliation. A todo has
been added to make this port configurable from 8181.

Note that a servlet filter has been added because the same origin policy
was preventing resources from being read, which resulted in an ugly page
with missing images.

Finally, a note has been added for testing with TestShib explaining
that the lack of the "mail" attribute is nothing to worry about.
@pdurbin
Member Author

pdurbin commented Apr 29, 2015

In 5c710d1 we are taking the approach of having Apache serve HTTPS over port 8181 so we can continue to use mod_shib. That port will need to be open in the firewall, of course. Example Apache configuration files can be found here:

https://github.com/IQSS/dataverse/tree/5c710d14c96a20eb027a3948657ef6534c0ae807/scripts/deploy/shibtest.dataverse.org/etc

In Vagrant and other environments, we have used a file at https://github.com/IQSS/dataverse/blob/master/conf/httpd/conf.d/dataverse.conf that includes Shibboleth config and ProxyPass settings but I consider this to be deprecated at this point. I've concentrated all the config at the bottom of /etc/httpd/conf.d/ssl.conf (like others seem to do): https://github.com/IQSS/dataverse/blob/5c710d14c96a20eb027a3948657ef6534c0ae807/scripts/deploy/shibtest.dataverse.org/etc/httpd/conf.d/ssl.conf

This will hopefully give a sense of how ssl.conf has changed:

[root@dvn-vm3 ~]# diff /etc/httpd/conf.d/ssl.conf.orig /etc/httpd/conf.d/ssl.conf
18c18,19
< Listen 443
---
> #Listen 443
> Listen 8181
74c75,76
< <VirtualHost _default_:443>
---
> #<VirtualHost _default_:443>
> <VirtualHost _default_:8181>
78c80,81
< ServerName shibtest.dataverse.org:443
---
> #ServerName shibtest.dataverse.org:443
> ServerName shibtest.dataverse.org:8181
220a224,248
> # We set this header to avoid this:
> # "Failed to download metadata from :8181/Shibboleth.sso/DiscoFeed"
> # Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://shibtest.dataverse.org:8181/Shibboleth.sso/DiscoFeed. This can be fixed by moving the resource to the same domain or enabling CORS.
> Header set Access-Control-Allow-Origin "*"
> 
> # Rewrite all non-Shib traffic to the standard HTTPS port (443)
> # Shib traffic includes:
> # - https://shibtest.dataverse.org:8181/Shibboleth.sso/DiscoFeed
> # - https://shibtest.dataverse.org:8181/Shibboleth.sso/Metadata
> # - https://shibtest.dataverse.org:8181/shib.xhtml (proxied to Glassfish)
> # This will allow users to click "Home" (for example) from
> # shib.xhtml and *not* stay on port 8181 like this:
> # https://shibtest.dataverse.org:8181
> RewriteEngine On
> RewriteCond %{REQUEST_URI} !^/(Shibboleth.sso|shib.xhtml)
> RewriteRule /?(.*) https://%{SERVER_NAME}/$1 [R,L]
> 
> ProxyPass /shib.xhtml ajp://localhost:8009/shib.xhtml
> 
> <Location /shib.xhtml>
>   AuthType shibboleth
>   ShibRequestSetting requireSession 1
>   require valid-user
> </Location>
> 
[root@dvn-vm3 ~]# 
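
The rewrite logic in the diff above can be modeled in a few lines of Python. This is only an illustrative sketch, not part of Dataverse or Apache: `redirect_target` is a hypothetical helper that mimics the `RewriteCond`/`RewriteRule` pair to show which requests stay on port 8181 and which are sent back to the standard HTTPS port.

```python
import re
from typing import Optional

# Patterns copied verbatim from the diff. The dots are unescaped there,
# so they match any character, just as they would in Apache.
SHIB_PATHS = re.compile(r"^/(Shibboleth.sso|shib.xhtml)")
RULE_PATTERN = re.compile(r"/?(.*)")


def redirect_target(server_name: str, request_uri: str) -> Optional[str]:
    """Return the redirect URL for a request on port 8181, or None if the
    request is Shibboleth traffic that should stay on 8181.

    Hypothetical helper for illustration only.
    """
    # RewriteCond %{REQUEST_URI} !^/(Shibboleth.sso|shib.xhtml)
    if SHIB_PATHS.search(request_uri):
        return None
    # RewriteRule /?(.*) https://%{SERVER_NAME}/$1 [R,L]
    match = RULE_PATTERN.search(request_uri)
    return f"https://{server_name}/{match.group(1)}"


if __name__ == "__main__":
    for uri in ("/", "/dataverse.xhtml", "/shib.xhtml", "/Shibboleth.sso/DiscoFeed"):
        print(uri, "->", redirect_target("shibtest.dataverse.org", uri))
```

Because the redirect URL carries no explicit port, the browser lands on the default HTTPS port (443), which is how clicking "Home" from shib.xhtml leaves port 8181.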

For cosmetic reasons a servlet filter has been introduced so that images show up properly. In testing before the servlet filter was added, images were broken like this:

[Screenshot: cors-resources]
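
For readers unfamiliar with the same origin policy: an origin is the combination of scheme, host, and port, so URLs on ports 443 and 8181 count as different origins even when the hostname is identical. A minimal sketch of that comparison follows; `origin` and `same_origin` are hypothetical helper names, and the URLs are the ones mentioned in this issue.

```python
from urllib.parse import urlsplit

# Default ports implied when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    page = "https://shibtest.dataverse.org/"
    feed = "https://shibtest.dataverse.org:8181/Shibboleth.sso/DiscoFeed"
    print(origin(page))
    print(origin(feed))
    print(same_origin(page, feed))
```

A response header such as `Access-Control-Allow-Origin` is the usual way for a server to tell the browser that reads from another origin are permitted.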

Please note that because the port has changed, you'll need to re-upload metadata to http://testshib.org/register.html to test with TestShib. Before you do, it is recommended that you adjust the entityId in /etc/shibboleth/shibboleth2.xml to end with "/sp" per #2104.
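
A quick way to check the entityID before re-uploading metadata might look like the following sketch. It assumes the Shibboleth SP 2.x layout, in which the `entityID` attribute sits on the `ApplicationDefaults` element in the `urn:mace:shibboleth:2.0:native:sp:config` namespace. The inline XML is a trimmed, hypothetical stand-in for a real /etc/shibboleth/shibboleth2.xml, and `read_entity_id` and `entity_id_ends_with_sp` are illustrative helper names.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace used by Shibboleth SP 2.x configuration files.
SP_CONFIG_NS = "urn:mace:shibboleth:2.0:native:sp:config"

# Hypothetical, heavily trimmed sample; a real shibboleth2.xml has far more.
SAMPLE_CONFIG = """\
<SPConfig xmlns="urn:mace:shibboleth:2.0:native:sp:config">
  <ApplicationDefaults entityID="https://shibtest.dataverse.org/sp"/>
</SPConfig>
"""


def read_entity_id(xml_text):
    """Return the entityID attribute of ApplicationDefaults, or None."""
    root = ET.fromstring(xml_text)
    element = root.find(f"{{{SP_CONFIG_NS}}}ApplicationDefaults")
    if element is None:
        return None
    return element.get("entityID")


def entity_id_ends_with_sp(xml_text):
    """True when the configured entityID ends with "/sp"."""
    entity_id = read_entity_id(xml_text)
    return entity_id is not None and entity_id.endswith("/sp")


if __name__ == "__main__":
    print(read_entity_id(SAMPLE_CONFIG))
    print(entity_id_ends_with_sp(SAMPLE_CONFIG))
```

To run it against a real file, read the file's contents and pass the text to `entity_id_ends_with_sp`.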

On any server where we reconfigure Apache to use 8181, Harvard PIN login is expected not to work until we have sent HUIT our updated metadata. I'll handle this in a batch but testing can begin with TestShib per above.

Passing to QA.

@pdurbin pdurbin removed their assignment Apr 29, 2015
@pdurbin
Member Author

pdurbin commented May 7, 2015

After chatting with @scolapasta @landreev and @kcondon today I think we've reached a consensus that we're not at all sure that the solution implemented (jumping between ports 443 and 8181) will work in production due to the fact that we load balance between two web servers.

Here's a screenshot that shows the jumping between ports (which also led to the introduction of a servlet filter):

[Screenshot: saml]

I'm grabbing this out of QA and will add to the Google doc some of the ideas we've been kicking around (ideas from @whorka too, thanks).

@pdurbin pdurbin self-assigned this May 7, 2015
@eaquigley eaquigley modified the milestones: In Design, Dataverse 4.0: Release Patches May 8, 2015
@scolapasta scolapasta modified the milestones: In Design, 4.0.1 May 8, 2015
@eaquigley eaquigley modified the milestones: In Design, 4.0.1 May 8, 2015
pdurbin added a commit that referenced this issue May 14, 2015
This reverts commit 5c710d1.

This solution forces people installing Dataverse to run mod_shib on a
non-standard port (8181 rather than 443). At the very least this should
be configurable.

In addition it introduces a servlet filter (believed to be secure) to
get around the error "Cross-Origin Request Blocked: The Same Origin
Policy disallows reading the remote resource". This servlet filter
affects even installations that don't install Shibboleth and may
introduce a slight performance penalty.

Finally, we have reason to believe this solution will not work in
Harvard's production environment which includes multiple web servers
behind a load balancer ("content switch"). We haven't figured out a
configuration in staging that we believe will work.

All this is to say that this solution is non-standard and untested so it
doesn't belong in code that we soon hope to tag with "4.0", which is why we
are reverting it.
@pdurbin
Member Author

pdurbin commented May 14, 2015

Please see the long commit message in cfb5426 in which I reverted the 8181 experiment. It's non-standard and untested and doesn't belong in code that's about to be tagged as "4.0".

Shibboleth works fine when you run mod_shib on the standard port (443) but we currently have a constraint that we can't do this for https://dataverse.harvard.edu . I started a new Google doc about this constraint and others to guide the development of a solution for restoring Shibboleth support to Harvard's installation of Dataverse: Shibboleth Constraints. @scolapasta if you would please review this I would appreciate it. This new "constraints" document is also linked from the older Restore Shibboleth Support doc.

I'd be very curious to know if @bencomp or @akio-sone have tried running mod_shib on the standard port (443) and gotten the Shibboleth feature working. (It still works fine at https://dataverse-demo.iq.harvard.edu .) There's some documentation at http://guides.dataverse.org/en/latest/installation and I'm happy to answer any questions at https://groups.google.com/forum/#!forum/dataverse-community and http://chat.dataverse.org

@pdurbin pdurbin changed the title Shibboleth: restore Institution Log In functionality Shibboleth: restore Institution Log In functionality to dataverse.harvard.edu May 14, 2015
@pdurbin pdurbin removed their assignment Jul 21, 2015
@kcondon
Contributor

kcondon commented Jul 21, 2015

@pdurbin I did not think reenabling shib was slated for v4.1?

@pdurbin
Member Author

pdurbin commented Jul 21, 2015

@kcondon please feel free to move it to a better milestone. Thanks.

@kcondon kcondon removed this from the 4.1 milestone Jul 21, 2015
@kcondon
Contributor

kcondon commented Jul 21, 2015

@pdurbin OK, leaving it milestone-less until we see how things stand, with the intention that it be reenabled as soon as possible.

@pdurbin
Member Author

pdurbin commented Sep 3, 2015

A few days ago a Harvard user asked if we have Shibboleth working at https://help.hmdc.harvard.edu/Ticket/Display.html?id=226460

@eaquigley
Contributor

Had a conversation with @pdurbin today about bringing Shibboleth back to Harvard Dataverse. According to this ticket and @pdurbin, Glassfish has been updated so the bug that was appearing with Glassfish and Shibboleth should no longer happen. However, Shibboleth has not been turned back on for Harvard Dataverse so there has not been a way to tell if the problem has been solved. Shibboleth is still enabled for dataverse-demo (and users can log in with Harvard credentials), so maybe we can figure out a way to QA there?

Additionally, we will need to come up with a roll out plan for Shibboleth in Harvard Dataverse as we don't want the situation from last time (having everyone who has converted to shibboleth have to convert back and the ensuing password problems). How will this be announced? What do we do to make sure the same situation doesn't happen again?

So, what milestone should this be moved to and the plan defined?

@mcrosas @scolapasta @kcondon

@kcondon
Contributor

kcondon commented Sep 18, 2015

My recollection is that I have talked about this to Phil on a number of occasions as well as Gustavo. We had a plan for rollout: deploy the underlying fix to glassfish, then later reenable Shibboleth after double checking it still worked with the current code base.

Last item was that we were holding off on rollout until we had InCommon membership, which required some work with Marlena. This was discussed with Marlena a few weeks ago and I'm not sure whether we have been approved. We have been busy with other releases and work so this had a lower priority.

@pdurbin
Member Author

pdurbin commented Oct 9, 2015

There's no technical reason why we need to wait for InCommon membership before re-enabling Shibboleth. We launched Dataverse 4.0 with Shibboleth support and were accepting logins for Harvard Shibboleth users by exchanging metadata out-of-band (the process for using TestShib as an IdP is documented at http://guides.dataverse.org/en/4.2/installation/shibboleth.html#testing ). That said, we still very much want http://dataverse.harvard.edu to be part of InCommon! If this issue is truly blocked on InCommon membership, the next step is for @scolapasta to answer some questions from @marlenaerdos so I'll assign this issue to him.

Also, a bit of news is that Glassfish 4.1.1 is out so once we've vetted it in #2628 it should obviate the need to install the Grizzly patch we made to make it safe to run Apache (and mod_shib) in front of Glassfish 4.1.

Mostly I'm warming up this old issue because it was just asked on our mailing list if Shibboleth support is still experimental: https://groups.google.com/d/msg/dataverse-community/mlz7DQgSHak/LaEUeyUfCgAJ

@mercecrosas mercecrosas modified the milestone: In Review Nov 30, 2015
@scolapasta scolapasta removed their assignment Jan 27, 2016
@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
@pdurbin
Member Author

pdurbin commented May 25, 2016

The other day I asked @scolapasta if I should pass this to QA and give it a milestone of 4.4 and he indicated that I should so I'm doing that now.

From my perspective, the Shib code is in much better shape in the "develop" branch than it was in previous releases (v4.3.1 is the latest release as of this writing) since the following pull requests have been merged:

I'd like to note that even without the fixes above https://dataverse.lib.virginia.edu has been running Shibboleth since the end of March: http://news.library.virginia.edu/2016/03/29/uva-library-launches-libra-data-university-of-virginia-dataverse-repository/

In addition, Odum has been helping me test the pre-release code at https://help.hmdc.harvard.edu/Ticket/Display.html?id=233688 and it seems to be working fine.

The most significant loose ends at this point for "phase 1" as defined in the Remote Authentication Business Requirements Document are the following issues:

At some point we should remove "Experimental" from the top of the Shibboleth section of the Installation Guide. The idea has always been that we'll run Shibboleth in production for a while at https://dataverse.harvard.edu (eat our own dogfood) before doing so. :) That's really what this issue is about.

Passing to QA.

@pdurbin pdurbin added this to the 4.4 milestone May 25, 2016
@pdurbin pdurbin assigned kcondon and unassigned pdurbin May 25, 2016
@kcondon
Contributor

kcondon commented May 27, 2016

Closing.

@kcondon kcondon closed this as completed May 27, 2016
pdurbin added a commit that referenced this issue Aug 1, 2016
Also remove "experimental" since #2117 has been closed.