This document should help you get started configuring the OLFS web application component of Hyrax. This software package was developed, compiled, and tested using the Java 1.6.x compiler, the 1.6.x Java Virtual Machine, and Jakarta Tomcat 7.x.x (which also provided the javax.servlet packages).
The OLFS web application is composed of these servlets:
-
Hyrax servlet - The Hyrax servlet provides DAP (and other) services for the Hyrax server. The Hyrax servlet does the majority of the work in the OLFS web application. It does this by providing a flexible "dispatch" mechanism through which incoming requests are evaluated by a series of DispatchHandlers (pieces of software) that can choose to handle or ignore the request. The OLFS ships with a standard set of DispatchHandlers which handle requests for OPeNDAP data products, THREDDS catalogs, and OPeNDAP directories. These default DispatchHandlers can be augmented by adding custom handlers without the need to recompile the software. All of the DispatchHandlers used by the Hyrax servlet are identified in the olfs.xml configuration file.
-
Viewers servlet - The Viewers servlet provides a service for datasets through which both the Web Services and Java WebStart applications that might be used with the dataset are identified. The Viewers servlet is configured via the viewers.xml file.
-
Docs servlet - The Docs servlet provides clients access to a tree of static documents. By default, a minimal set of documents is provided (containing information about Hyrax). These can be replaced by user-supplied documents and images. By changing the images and documents available through the Docs servlet, the data provider can further customize the appearance and layout of the Hyrax server web pages, making them conform better to their parent organization’s visual identity. The Docs servlet has no specific configuration file.
-
Admin Interface Servlet - The Hyrax Administration Interface (HAI) provides server administrators with a GUI for monitoring, controlling, and configuring the server.
-
Gateway Servlet - The Gateway Servlet provides a gateway service that allows Hyrax to be configured to retrieve files (that the server recognizes as data) from the web and then provides DAP services for the retrieved files. The Gateway servlet does not require additional configuration, yet the BES must be correctly configured to perform gateway tasks.
Additionally the OLFS web application relies on one or more instances of the BES to provide it with data access and basic catalog metadata.
The OLFS web application stores its configuration state in a number of files. The server’s configuration is altered by carefully modifying the content of one or more of these files and then restarting the web application (or simply restarting Tomcat).
The remainder of this document is concerned with how to correctly configure the Hyrax and Viewers servlets - the primary components of the OLFS web application.
Beginning with olfs-1.15.0 (part of hyrax-1.13.0) the OLFS will use the following procedure to determine its configuration location:
-
It will first look at the value of the user environment variable OLFS_CONFIG_DIR. If the variable is set and its value is the pathname of an existing directory that is readable and writable by Tomcat, the OLFS will use it. Otherwise,
-
If the directory /etc/olfs exists and is readable and writable by Tomcat, the OLFS will use it. Otherwise,
-
If the directory /usr/share/olfs exists and is readable and writable by Tomcat, then the OLFS will use it (this was added for Hyrax 1.14.1). Otherwise,
-
The OLFS will utilize the default configuration bundled in the web application web archive file (opendap.war).
In this way the OLFS can start without a persistent local configuration.
If the default configuration works for your intended use, then there is no need to create a persistent localized configuration. If changes need to be made to the configuration, then it is strongly recommended that you enable the use of a persistent local configuration. This way, updating the web application won’t destroy your changes. This is easily done by creating an empty directory and identifying it with the OLFS_CONFIG_DIR environment variable. For example:
export OLFS_CONFIG_DIR="/home/tomcat/hyrax"
Alternatively, you can create either the directory /etc/olfs or the directory /usr/share/olfs. Make sure that the directory you create is both readable and writable by Tomcat.
Once the directory is created (and in the first case the environment variable is set) restart the OLFS (Tomcat). This will cause the OLFS to move a copy of its default configuration into the empty directory and then utilize it. You can then edit the local copy.
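The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the documented procedure, not the OLFS implementation (which is written in Java); the function name find_olfs_config_dir is hypothetical, and the "readable and writable by Tomcat" test is approximated by checking the permissions of the current process.

```python
import os
import tempfile


def find_olfs_config_dir(environ, candidates=("/etc/olfs", "/usr/share/olfs")):
    """Return the first usable configuration directory, or None.

    None stands for "fall back to the defaults bundled in opendap.war".
    The function name and signature are hypothetical.
    """
    def usable(path):
        # The directory must exist and be readable and writable by the
        # current process (standing in for the Tomcat user here).
        return (
            bool(path)
            and os.path.isdir(path)
            and os.access(path, os.R_OK | os.W_OK)
        )

    # 1. The OLFS_CONFIG_DIR environment variable takes priority.
    env_dir = environ.get("OLFS_CONFIG_DIR")
    if usable(env_dir):
        return env_dir
    # 2. and 3. Then the well-known directories, in order.
    for path in candidates:
        if usable(path):
            return path
    # 4. Nothing usable: use the bundled default configuration.
    return None


# Usage: a throwaway directory plays the role of a persistent config location.
with tempfile.TemporaryDirectory() as tmp:
    chosen = find_olfs_config_dir({"OLFS_CONFIG_DIR": tmp})
    print(chosen == tmp)
```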
SELinux (which is a special gift you get with CentOS-7) will cause some new challenges for those not familiar with the changes it brings to the system environment. For one, Tomcat runs as a confined user. Here we’ll examine how these changes affect the OLFS.
Recent versions of CentOS-7 are shipped with default SELinux settings that prohibit Tomcat from reading or opening the opendap.war file. This can be addressed by issuing the following two commands:
sudo semanage fcontext -a -t tomcat_var_lib_t /var/lib/tomcat/webapps/opendap.war
sudo restorecon -rv /var/lib/tomcat/webapps/
After this you will need to restart Tomcat:
sudo service tomcat restart
When using a yum-installed Tomcat on CentOS-7.x (or any other Linux environment that is essentially an SELinux variant), neither the /etc/olfs nor the /usr/share/olfs configuration location will work without taking extra steps. You must alter the SELinux access policies to give the Tomcat user permission to read and write to one of these directories.
Here is a script/recipe for configuring the /usr/share/olfs directory for reading and writing by the Tomcat user. There are probably a number of alternative ways to accomplish this, but this one worked for me.
#!/bin/sh
# You must be the super user to do this stuff, so run this script with sudo.

# Create the location for the local configuration
mkdir -p /usr/share/olfs

# Change the group ownership to the tomcat group.
# (SELinux will not allow you to make the owner tomcat.)
chgrp tomcat /usr/share/olfs

# Make it writable by the tomcat group
chmod g+w /usr/share/olfs

# Use semanage to change the context of the target
# directory and any (future) child dirs
semanage fcontext -a -t tomcat_var_lib_t "/usr/share/olfs(/.*)?"

# Use restorecon to commit/do the labeling.
restorecon -rv /usr/share/olfs
There is a lot going on in the above script, and to fully understand it you will need to study the man pages for semanage and restorecon. You may also get some benefit from these articles about SELinux and the permissions issues therein:
In olfs-1.14.1 (part of hyrax-1.12.2) and earlier, the OLFS web application was located in the 'persistent content directory': $CATALINA_HOME/content/opendap. This caused bootstrap problems when the OLFS tried to set itself up on a Linux system in which the Tomcat installation had been done via RPM.
The OLFS web application gets its configuration from five files. In general, all of your configuration needs will be met by making changes to the first two: olfs.xml and catalog.xml.
- olfs.xml
-
Role: Contains the localized OLFS configuration - location of the BES(s), directory view instructions, etc.
Location: In the persistent content directory, which by default is located at $CATALINA_HOME/content/opendap/
- catalog.xml
-
Role: Master (top-level) THREDDS catalog content for static THREDDS catalogs.
Location: In the persistent content directory, which by default is located at $CATALINA_HOME/content/opendap/
- viewers.xml
-
Role: Contains the localized Viewers configuration.
Location: In the persistent content directory, which by default is located at $CATALINA_HOME/content/opendap/
- web.xml
-
Role: Core servlet configuration.
Location: The servlet’s web.xml file located in the WEB-INF directory of the web application "opendap". Typically that means $CATALINA_HOME/webapps/opendap/WEB-INF/web.xml
- log4j.xml
-
Role: Contains the logging configuration for Hyrax.
Location: The default location for the log4j.xml is in the WEB-INF directory of the web application "opendap". Typically that means $CATALINA_HOME/webapps/opendap/WEB-INF/log4j.xml. However, Hyrax can be configured to look in additional places for the log4j.xml file. Read More About It Here.
The Hyrax servlet is the front end (public interface) for Hyrax. It provides DAP services, THREDDS catalogs, directory views, logging, and authentication services. This is accomplished through a collection of software components called DispatchHandlers. At startup the Hyrax servlet reads the olfs.xml file which contains a list of DispatchHandlers and their configurations. DispatchHandlers on the list are loaded, configured/initialized, and then used to provide the aforementioned services.
Request dispatch is the process by which the OLFS determines what actual piece of code is going to respond to a given incoming request. This version of the OLFS handles each incoming request by offering the request to a series of DispatchHandlers. Each DispatchHandler is asked if it can handle the request. The first DispatchHandler to say that it can handle the request is then asked to do so. The OLFS creates an ordered list of DispatchHandler objects in memory by reading the olfs.xml file.
The order of the list is significant. There is no guarantee that only one DispatchHandler will claim a particular request; two (or more) may do so. Since the first DispatchHandler in the list to claim a request gets to service it, changing the order of the DispatchHandlers can change the behavior of the OLFS (and thus of Hyrax). For example, the URL http://localhost:8080/opendap/data/ is recognized by both the DirectoryDispatchHandler and the ThreddsDispatchHandler, each of which can provide a directory view; however, only the DirectoryDispatchHandler can be configured to reject the request and pass it on to another handler, in this case the ThreddsDispatchHandler. The result is that if you put the ThreddsDispatchHandler before the DirectoryDispatchHandler on the list, there will be no way to get an OPeNDAP directory view: the ThreddsDispatchHandler will claim them all.
This dispatching scheme is useful because it creates extensibility. If a third party wishes to add new functionality to Hyrax, one way is to write a DispatchHandler. To incorporate it into Hyrax, they need only add it to the list in the olfs.xml and add the Java classes to the Tomcat lib directory.
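The "first handler to claim the request wins" rule can be illustrated with a small Python sketch. It is not the OLFS code (which is Java); the Handler class, its method names, and the two predicates below are hypothetical stand-ins chosen only to show why list order matters.

```python
class Handler:
    """Hypothetical stand-in for a DispatchHandler."""

    def __init__(self, name, can_handle):
        self.name = name
        self._can_handle = can_handle  # predicate: path -> bool

    def request_can_be_handled(self, path):
        return self._can_handle(path)

    def handle_request(self, path):
        return f"{self.name} handled {path}"


def dispatch(handlers, path):
    """Offer the request to each handler in order; the first claimant wins."""
    for handler in handlers:
        if handler.request_can_be_handled(path):
            return handler.handle_request(path)
    return None  # no handler claimed the request


# Both illustrative handlers claim URLs that end in "/".
directory = Handler(
    "Directory", lambda p: p.endswith("/") or p.endswith("contents.html")
)
thredds = Handler(
    "Thredds", lambda p: p.endswith("/") or p.endswith("catalog.xml")
)

print(dispatch([directory, thredds], "/opendap/data/"))
print(dispatch([thredds, directory], "/opendap/data/"))
```

Swapping the two handlers changes which one services the same URL, which is the ordering effect described above.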
The olfs.xml file contains the core configuration of the Hyrax servlet:
-
It configures the BESManager with at least one BES to be used by the OLFS web application.
-
It identifies all of the DispatchHandlers to be used by the Hyrax servlet.
-
It controls both view and access behaviours of the Hyrax servlet.
The <OLFSConfig> element is the document root. It contains two elements that supply the configuration for the OLFS: <BESManager> and <DispatchHandlers>.
The BESManager element provides configuration for the BESManager class. The BESManager is used whenever the software needs to access BES’s services. This configuration is key to the function of Hyrax, for in it is defined each BES that is connected to a Hyrax installation. The following examples will show a single BES example. For more information on configuring Hyrax to use multiple BES’s look here.
Each BES is identified using a separate <BES> child element inside of the <BESManager> element.
The <BES> element provides the OLFS with connection and control information for a BES. There are 4 child elements in a <BES> element: <prefix>, <host>, <port>, and <ClientPool>.
This child element of the <BES> element contains the URL prefix that the OLFS will associate with this BES. This provides a mapping between this BES and the URI space serviced by the OLFS. The prefix, then, is a token that is placed between the host:port/context/ part of the Hyrax URL and the catalog root. The catalog root is used to designate a particular BES instance in the event that multiple BES’s are available to a single OLFS.
For a single BES (the default configuration) the tag must be designated by "/". This prefix provides a mapping for each BES connected to the OLFS and the URI space serviced by the OLFS.
-
There must be at least one BES element in the BESManager handler configuration whose prefix has a value of "/" (see example 1). There may be more than one <BES>, but only that one is required.
-
For a single BES (the one with "/" as its prefix) no additional effort is required; however, when using multiple BES’s it is necessary that each BES has a mount point exposed as a directory (aka collection) in the URI space where it’s going to appear. See Configuring With Multiple BES’s for more information.
-
The prefix string must always begin with the slash ("/") character. (See example 2.)
Example 1:
<prefix>/</prefix>
Example 2:
<prefix>/data/nc</prefix>
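The two prefix rules stated above (every prefix begins with "/", and one BES must use exactly "/") are easy to check mechanically. The following Python sketch is a hypothetical helper, not part of Hyrax; check_bes_prefixes is an invented name, and it checks only those two rules.

```python
def check_bes_prefixes(prefixes):
    """Return a list of human-readable problems; empty means the rules hold.

    `prefixes` is a list of the <prefix> text values, one per <BES> element.
    """
    problems = []
    for prefix in prefixes:
        # Rule: the prefix string must always begin with "/".
        if not prefix.startswith("/"):
            problems.append(f"prefix {prefix!r} does not begin with '/'")
    # Rule: at least one BES must have the prefix "/".
    if "/" not in prefixes:
        problems.append("no BES has the required prefix '/'")
    return problems


print(check_bes_prefixes(["/", "/data/nc"]))
print(check_bes_prefixes(["data/nc"]))
```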
This child element of the <BES> element contains the host name or IP address of the BES.
Example:
<host>test.opendap.org</host >
This child element of the <BES> element contains the port number on which the BES is listening.
Example:
<port>10022</port >
This child element of the <BES> element contains the timeout time, in seconds, for the OLFS to wait for this BES to respond. Defaults to 300 seconds.
Example:
<timeOut>600</timeOut >
This child element of the <BES> element contains in bytes the maximum response size allowed for this BES. Requests that produce a larger response will receive an error. A value of zero (0) indicates that there is no imposed limit. The default value is 0.
Example:
<maxResponseSize>0</maxResponseSize>
This child element of the <BES> element configures the behavior of the pool of client connections that the OLFS maintains with this particular BES. These connections are pooled for efficiency and speed. Currently, the only configuration item available is the maximum number of concurrent BES client connections that the OLFS can make, which is set with the maximum attribute. The default value of maximum is 200, but the size should be optimized for your installation by empirical testing.
Example:
<ClientPool maximum="17" />
If the <ClientPool> element is missing, the pool size defaults to 200.
This child element of the <BES> element contains the port on the BES system that can be used by the Hyrax Admin Interface to control the BES. The BES must also be configured to open and utilize this admin port.
Example:
<adminPort>11002</adminPort>
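To show how the child elements and their documented defaults fit together, here is a minimal Python sketch that reads one <BES> element with the standard library's xml.etree.ElementTree. It is an illustration only: the OLFS does its own parsing in Java, and the read_bes function and the dictionary keys are hypothetical. The optional adminPort element is left out for brevity.

```python
import xml.etree.ElementTree as ET

# A <BES> element that omits timeOut and maxResponseSize on purpose.
BES_XML = """
<BES>
  <prefix>/</prefix>
  <host>localhost</host>
  <port>10022</port>
  <ClientPool maximum="17" />
</BES>
"""


def read_bes(element):
    """Collect the settings of one <BES> element, applying documented defaults."""
    pool = element.find("ClientPool")
    return {
        "prefix": element.findtext("prefix"),
        "host": element.findtext("host"),
        "port": int(element.findtext("port")),
        # Documented default: 300 seconds.
        "timeOut": int(element.findtext("timeOut", default="300")),
        # Documented default: 0, meaning no imposed limit.
        "maxResponseSize": int(element.findtext("maxResponseSize", default="0")),
        # Documented default: 200 when the element or attribute is missing.
        "clientPoolMax": int(pool.get("maximum", "200")) if pool is not None else 200,
    }


settings = read_bes(ET.fromstring(BES_XML))
print(settings["timeOut"], settings["clientPoolMax"])
```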
The catalog cache element configures the OLFS memory cache of BES catalog responses. This cache can greatly increase server performance for small requests. It is configured by its two child elements, maxEntries and updateIntervalSeconds.
-
The value of maxEntries determines the total number of catalog responses to hold in memory. The default value for maxEntries is 10000.
-
The value of updateIntervalSeconds determines how long the catalog update thread will sleep between updates. This value affects the server’s responsiveness to changes in its holdings. If your server’s contents change frequently, then updateIntervalSeconds should be set to a value that will allow the server to publish new additions/deletions in a timely manner. The default value of updateIntervalSeconds is 10000 seconds (about 2.8 hours).
-
If for some reason you wish to disable the CatalogCache, simply remove (or comment out) the CatalogCache element and its children from the olfs.xml file.
The <DispatchHandlers> element has two child elements: <HttpGetHandlers> and <HttpPostHandlers>. The <HttpGetHandlers> contains an ordered list of the DispatchHandler classes used by the OLFS to handle incoming HTTP GET requests.
The list order is significant, and permuting the order will (probably negatively) change the behavior of the OLFS. Each DispatchHandler on the list will be asked to handle the request. The first DispatchHandler on the list to claim the request will be asked to build the response.
While there is programmatic support for POST request handlers as part of the Hyrax servlet, there are currently no HttpPostHandlers implemented for use with Hyrax. Maybe down the road…
Both the <HttpGetHandlers> and <HttpPostHandlers> contain an ordered list of <Handler> elements. Each <Handler> must have an attribute called className whose value is set to the fully qualified Java class name for the DispatchHandler implementation to be used. For example, <Handler className="opendap.bes.VersionDispatchHandler" /> names the class opendap.bes.VersionDispatchHandler.
Each <Handler> element may contain a collection of child elements that provide configuration information to the DispatchHandler implementation. In this example,
<Handler className="opendap.coreServlet.BotBlocker">
    <IpAddress>44.55.66.77</IpAddress>
</Handler>
the <Handler> element contains a child element (<IpAddress>) that indicates to the BotBlocker class to block requests from the IP address 44.55.66.77.
Hyrax uses the following DispatchHandlers to handle HTTP GET requests:
-
VersionDispatchHandler: Handles the version document requests.
-
BotBlocker: An optional handler that may be used to block individual IP addresses or groups of IP addresses from accessing your server.
-
NcmlDatasetDispatcher: Specialized handler that filters NcML content retrieved from the BES.
-
StaticCatalogDispatch: Provides static THREDDS catalog services for Hyrax.
-
Gateway: For more information, see the documentation for Gateway Service.
-
DapDispatcher: Handles all DAP requests.
-
DirectoryDispatchHandler: Handles the OPeNDAP directory view (contents.html) requests.
-
BESThreddsDispatchHandler: Provides dynamic THREDDS catalogs of all BES holdings.
-
FileDispatchHandler: Handles requests for file level access. (README files etc.)
Handles the version document requests. This DispatchHandler has no configuration elements, so it will always be written like this:
<Handler className="opendap.bes.VersionDispatchHandler" />
This optional handler can be used to block access from specific IP addresses or a range of IP addresses using regular expressions. It turns out that many of the web crawling robots do not respect the robots.txt file when one is provided. Since many sites do not want their data holdings exhaustively queried by automated software, we created a simple robot blocking handler to protect system resources from non-compliant robots.
The text value of this element should be the IP address of a system which you would like to block from accessing your service.
For example, <IpAddress>128.193.64.33</IpAddress> blocks the system located at 128.193.64.33 from accessing your server.
There can be zero or more <IpAddress> elements in the <BotBlocker>.
The text value of this element should be the regular expression that will be used to match the IP addresses of clients attempting to access Hyrax.
For example, <IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch> matches all IP addresses beginning with 65.55 and thus blocks access for clients whose IP addresses lie in that range. There can be zero or more <IpMatch> elements in the Handler configuration for the BotBlocker:
<Handler className="opendap.coreServlet.BotBlocker">
    <IpAddress>127.0.0.1</IpAddress>
    <!-- This matches all IPv4 addresses, work yours out from here.... -->
    <!--<IpMatch>[012]?\d?\d\.[012]?\d?\d\.[012]?\d?\d\.[012]?\d?\d</IpMatch> -->
    <!-- Any IP starting with 65.55 (MSN bots that don't respect robots.txt) -->
    <IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch>
</Handler>
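The combined effect of the IpAddress and IpMatch elements can be tried out with a short Python sketch. This is not the BotBlocker implementation; it assumes that an IpMatch pattern must match the whole client address (Python's fullmatch), which may differ from how the Java handler applies the expression. The names below are hypothetical.

```python
import re

# Values taken from the configuration example above.
BLOCKED_ADDRESSES = {"127.0.0.1"}
BLOCKED_PATTERNS = [re.compile(r"65\.55\.[012]?\d?\d\.[012]?\d?\d")]


def is_blocked(ip):
    """Return True if the client address matches any blocking rule."""
    # <IpAddress> rules: exact string comparison.
    if ip in BLOCKED_ADDRESSES:
        return True
    # <IpMatch> rules: assumed here to match the entire address.
    return any(pattern.fullmatch(ip) for pattern in BLOCKED_PATTERNS)


for address in ("127.0.0.1", "65.55.10.200", "65.56.10.200", "10.0.0.1"):
    print(address, is_blocked(address))
```

Testing a pattern this way before putting it in olfs.xml helps avoid accidentally blocking legitimate clients.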
The Ncml Dataset Dispatcher is a specialized handler that filters NcML content retrieved from the BES so that the path names in the NcML documents returned to clients are consistent with the paths from the external (to the server) perspective:
<Handler className="opendap.ncml.NcmlDatasetDispatcher" />
Serves static THREDDS catalogs (i.e. THREDDS catalog files stored on disk). It provides both a presentation view (HTML) for humans using browsers and direct catalog access (XML).
Defines the path component that comes after the servlet context and before all catalog requests. For example, if the prefix is thredds, then http://localhost:8080/opendap/thredds/ should give you the top-level static catalog (the contents of the file $CATALINA_HOME/content/opendap/catalog.xml).
If the text value of this element is the string 'true', the servlet will ingest all of the static catalog files at startup and hold their contents in memory. See this page for more information about the memory caching operations.
This is a specific development option that allows one to specify the fully qualified path to an XSLT file that will be used to preprocess each THREDDS catalog file read from disk. The default version of this file (found in $CATALINA_HOME/webapps/opendap/xsl/threddsCatalogIngest.xsl) processes the thredds:datasetScan elements in each THREDDS catalog so that they contain specific content for Hyrax:
Note: This is a developer’s option and in general is not recommended for use in an operational server.
Directs requests to the Gateway Service.
Defines the path component that comes after the servlet context and before all gateway requests. For example, if the prefix is gateway, then http://localhost:8080/opendap/gateway/ will give you the gateway access form page.
Handles DAP requests for Hyrax. For example, the DapDispatchHandler will handle requests for all DAP2 and DAP4 products.
The <AllowDirectDataSourceAccess /> element controls the user’s ability to directly access data sources via the web interface. If this element is present (and not commented out, as in the example below) a client can get an entire data source (such as an HDF file) by requesting it through the HTTP URL interface. This is not a good practice and is not recommended. By default, Hyrax ships with this option disabled. We recommend that you leave it unchanged unless you require that users be able to circumvent the OPeNDAP request interface and have direct access to the data products stored on your server.
By default, at least for now, the server will provide the (undefined) DAP2 style response to requests for a dataset resource URL. Commenting out the "UseDAP2ResourceUrlResponse" element will cause the server to return the (well-defined) DAP4 DSR response when a dataset resource URL is requested.
Handles the OPeNDAP directory view (contents.html) requests:
<Handler className="opendap.bes.DirectoryDispatchHandler" />
Provides dynamic THREDDS catalogs of BES data holdings:
<Handler className="opendap.bes.BESThreddsDispatchHandler" />
Handles requests for file level access (README files, etc.). This handler only responds to requests for files that are not considered "data" by the BES. File requests for data files are handled by the opendap.bes.dapResponders.DapDispatcher.
In the following example, the FileDispatchHandler is configured to deny direct access to data sources (note that the <AllowDirectDataSourceAccess /> element is commented out):
<Handler className="opendap.bes.FileDispatchHandler" />
<?xml version="1.0" encoding="UTF-8"?>
<OLFSConfig>
<BESManager>
<BES>
<prefix>/</prefix>
<host>localhost</host>
<port>10022</port>
<timeOut>300</timeOut>
<adminPort>11002</adminPort>
<maxResponseSize>0</maxResponseSize>
<ClientPool maximum="200" maxCmds="2000" />
</BES>
</BESManager>
<DispatchHandlers>
<HttpGetHandlers>
<Handler className="opendap.bes.VersionDispatchHandler" />
<Handler className="opendap.coreServlet.BotBlocker">
<IpMatch>65\.55\.[012]?\d?\d\.[012]?\d?\d</IpMatch>
</Handler>
<Handler className="opendap.ncml.NcmlDatasetDispatcher" />
<Handler className="opendap.threddsHandler.StaticCatalogDispatch">
<prefix>thredds</prefix>
<useMemoryCache>true</useMemoryCache>
</Handler>
<Handler className="opendap.gateway.DispatchHandler">
<prefix>gateway</prefix>
</Handler>
<Handler className="opendap.bes.BesDapDispatcher" >
<!-- AllowDirectDataSourceAccess / -->
<UseDAP2ResourceUrlResponse />
</Handler>
<Handler className="opendap.bes.DirectoryDispatchHandler">
<!--
If your particular authentication scheme (usually brokered by Apache httpd) utilizes
a particular logout or login location you can have Hyrax display links to those locations
as part of the generated web pages by uncommenting the "AuthenticationControls" element and
editing the logout and/or login locations to match your service instance.
-->
<!-- AuthenticationControls>
<login>loginPath?login_param=foo</login>
<logout>logoutPath?logout_param=foo</logout>
</AuthenticationControls -->
</Handler>
<Handler className="opendap.bes.BESThreddsDispatchHandler"/>
<Handler className="opendap.bes.FileDispatchHandler" />
</HttpGetHandlers>
<!--
If you need to accept a constraint expression (ce) that is larger than will fit in a URL query string then you
can configure the server to accept the ce as the body of a POST request referencing the same resource.
If the Content-Encoding of the request is set to "application/x-www-form-urlencoded" then the server
will ingest all of parameter names "ce" and "dap4:ce" to build the DAP constraint expression. Otherwise
the server will treat the entire POST body as a DAP ce.
By default the maximum length of the POST body is limited to 2000000 characters, and may never be
larger than 10000000 characters (if you need more then get in touch with support@opendap.org). You can adjust
the limit in the configuration for the BesDapDispatcher.
Configuration:
Uncomment the HttpPostHandlers element below. Make sure that the body of the BesDapDispatcher Handler element is
IDENTICAL to its sister in the HttpGetHandlers element above.
If you need to change the default value of the maximum POST body length do it by adding a
"PostBodyMaxLength" element to the BesDapDispatcher Handler below:
<PostBodyMaxLength>500</PostBodyMaxLength>
The text content of which must be an integer between 0 and 10000000
-->
<!--
<HttpPostHandlers>
<Handler className="opendap.bes.dapResponders.BesDapDispatcher" >
MAKE SURE THAT THE CONTENT OF THIS ELEMENT IS IDENTICAL TO ITS SISTER IN THE HttpGetHandlers ELEMENT!
(Disregarding the presence of a PostBodyMaxLength element)
</Handler>
</HttpPostHandlers>
-->
</DispatchHandlers>
<!--
This enables or disables the generation of internal timing metrics for the OLFS
If commented out the timing is disabled. If you want timing metrics to be output
to the log then uncomment the Timer and set the enabled attribute's value to "true"
WARNING: There is some performance cost to utilizing the Timer.
-->
<!-- Timer enabled="false" / -->
</OLFSConfig>
We strongly recommend that you do NOT modify the web.xml file at this time. Future versions of the server and the OLFS may have "user configurable" parameters in the web.xml file, but this version does not, and modifying it will almost certainly result in severe problems. That being said, the following are the details regarding the web.xml file.
The OLFS running in the OPeNDAP context area needs an entry in the web.xml file. Multiple instances of a servlet and/or several different servlets can be configured in the one web.xml file. For instance, you could have a DTS and a Hyrax running from the same web.xml and thus under the same servlet context. Running multiple instances of the OLFS in a single web.xml file (aka context) will NOT work.
Each servlet needs a unique name, which is specified inside a <servlet> element in the web.xml file using the <servlet-name> tag. This is a name of convenience; for example, if one is serving data from an ARGOS satellite one might call that servlet argos.
Additionally, each instance of a <servlet> must specify which Java class contains the actual servlet to run. This is done in the <servlet-class> element. For example, the OLFS servlet class name is opendap.coreServlet.DispatchServlet.
<servlet>
    <servlet-name>hyrax</servlet-name>
    <servlet-class>opendap.coreServlet.DispatchServlet</servlet-class>
    . . .
</servlet>
This servlet could then be accessed as http://hostname/opendap/servlet/argos.
You may also add to the end of the web.xml file a set of <servlet-mapping> elements. These allow you to abbreviate the URL or the servlet. By placing the servlet mappings at the end of the web.xml file, our previous example changes its URL to http://hostname/opendap/argos, eliminating the need for the word servlet in the URL:
<servlet-mapping>
    <servlet-name>argos</servlet-name>
    <url-pattern>/argos</url-pattern>
</servlet-mapping>
<servlet-mapping>
    <servlet-name>argos</servlet-name>
    <url-pattern>/argos/*</url-pattern>
</servlet-mapping>
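To make the effect of the two url-pattern values above concrete, here is a simplified Python sketch of how an exact pattern and a path-prefix pattern (one ending in "/*") select request paths within the web application. It is a teaching aid rather than Tomcat's matching code: it deliberately ignores extension patterns such as "*.jsp" and the default "/" mapping, and the function name matches is hypothetical.

```python
def matches(url_pattern, path):
    """Simplified servlet url-pattern test for exact and path-prefix patterns.

    `path` is the request path relative to the web application context.
    """
    if url_pattern.endswith("/*"):
        # Path-prefix mapping: matches the base itself and anything below it.
        base = url_pattern[:-2]
        return path == base or path.startswith(base + "/")
    # Exact mapping: the path must equal the pattern.
    return path == url_pattern


for pattern in ("/argos", "/argos/*"):
    for path in ("/argos", "/argos/data.nc", "/argosy"):
        print(pattern, path, matches(pattern, path))
```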
For more on the <servlet-mapping> element see the Jakarta-Tomcat documentation.
The OLFS uses <init-param> elements inside of each <servlet> element to get specific configuration information.
The <init-param>s common to all OPeNDAP servlets are:
This parameter identifies the name of the XML document file that contains the OLFS configuration. This file must be located in the persistent content directory and is typically called olfs.xml.
For example:
<init-param>
    <param-name>OLFSConfigFileName</param-name>
    <param-value>olfs.xml</param-value>
</init-param>
This controls output to the terminal from which the servlet engine was launched. The value is a list of flags that turn on debugging instrumentation in different parts of the code. Supported values are:
-
probeRequest: Prints a lengthy inspection of the HttpServletRequest object to stdout. Note: Do not leave this on for long or it will clog your Catalina logs.
-
DebugInterface: Enables the server’s debug interface. This interactive interface allows a user to look at (and change) the server state via a web browser. Note: Enable this only for analysis purposes and disable when finished.
For example:
<init-param>
    <param-name>DebugOn</param-name>
    <param-value>probeRequest</param-value>
</init-param>
Default: If this parameter is not set or the value field is empty, then these features will be disabled - which is what you want (unless there is a problem to analyze).
<servlet>
    <servlet-name>hyrax</servlet-name>
    <servlet-class>opendap.coreServlet.DispatchServlet</servlet-class>
    <init-param>
        <param-name>DebugOn</param-name>
        <param-value></param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
    <servlet-name>hyrax</servlet-name>
    <url-pattern>*</url-pattern>
</servlet-mapping>
<servlet-mapping>
    <servlet-name>hyrax</servlet-name>
    <url-pattern>/hyrax</url-pattern>
</servlet-mapping>
<servlet-mapping>
    <servlet-name>hyrax</servlet-name>
    <url-pattern>/hyrax/*</url-pattern>
</servlet-mapping>
The Viewers servlet provides, for each dataset, an HTML page containing links to Java WebStart applications and to WebServices (such as WMS) that can be utilized in conjunction with the dataset. The Viewers servlet is configured via the contents of the viewers.xml file located in the persistent content directory $CATALINA_HOME/content/opendap.
<ViewersConfig>
    <JwsHandler className="opendap.webstart.IdvViewerRequestHandler">
        <JnlpFileName>idv.jnlp</JnlpFileName>
    </JwsHandler>
    <JwsHandler className="opendap.webstart.NetCdfToolsViewerRequestHandler">
        <JnlpFileName>idv.jnlp</JnlpFileName>
    </JwsHandler>
    <JwsHandler className="opendap.webstart.AutoplotRequestHandler" />
    <WebServiceHandler className="opendap.viewers.NcWmsService" serviceId="ncWms" >
        <applicationName>Web Mapping Service</applicationName>
        <NcWmsService href="/ncWMS/wms" base="/ncWMS/wms" ncWmsDynamicServiceId="lds" />
    </WebServiceHandler>
    <WebServiceHandler className="opendap.viewers.GodivaWebService" serviceId="godiva" >
        <applicationName>Godiva WMS GUI</applicationName>
        <NcWmsService href="http://localhost:8080/ncWMS/wms" base="/ncWMS/wms" ncWmsDynamicServiceId="lds"/>
        <Godiva href="/ncWMS/godiva2.html" base="/ncWMS/godiva2.html"/>
    </WebServiceHandler>
</ViewersConfig>
The Docs (or documentation) servlet provides the OLFS web application with the ability to serve a tree of static documentation files. By default, it will serve the files in the documentation tree provided with the OLFS in the Hyrax distribution. This tree is rooted at $CATALINA_HOME/webapps/opendap/docs/ and contains documentation pertaining to the software in the Hyrax distribution: installation and configuration instructions, release notes, Javadocs, etc.
If you wish to replace this information with your own set of web pages, you can remove or replace the files in the default directory; however, installing a new version of Hyrax will overwrite these files (hopefully AFTER you have read and understood the new release documentation).
The Docs servlet provides an alternative to this. If a docs directory is created in the persistent content directory for Hyrax, the Docs servlet will detect it when Tomcat is launched, and it will serve files from there instead of from the default location.
This scheme provides two benefits:
-
It allows localizations of the web documents associated with Hyrax to persist through Hyrax upgrades with no user intervention.
-
It preserves important release documents that ship with the Hyrax software.
In summary, to provide persistent web pages as part of a Hyrax localization, simply create the directory $CATALINA_HOME/content/opendap/docs and place your content in it. If you later wish to view the web-based documentation bundled with Hyrax, simply change the name of the directory from docs to something else and restart Tomcat (or look in the $CATALINA_HOME/webapps/opendap/docs directory).
If a URL ends in a directory name or a "/" in the Docs servlet, then the servlet will attempt to serve the index.html in that directory. In other words, index.html is the default document.
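The lookup behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not the Docs servlet's actual code: the function names docs_root and resolve_doc are hypothetical, and the paths are the defaults named in this section.

```python
import os

def docs_root(catalina_home):
    """Return the directory the Docs servlet would serve from (illustrative model)."""
    persistent = os.path.join(catalina_home, "content", "opendap", "docs")
    default = os.path.join(catalina_home, "webapps", "opendap", "docs")
    # A docs directory in the persistent content directory takes precedence
    # over the tree that ships inside the web application.
    return persistent if os.path.isdir(persistent) else default

def resolve_doc(root, url_path):
    """Map a request path to a file, falling back to index.html for directories."""
    target = os.path.join(root, url_path.lstrip("/"))
    if url_path.endswith("/") or os.path.isdir(target):
        target = os.path.join(target, "index.html")
    return os.path.normpath(target)
```

In this model, renaming the persistent docs directory makes docs_root fall back to the bundled documentation, which mirrors the rename-and-restart procedure described above.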
For information about logging, please check out the Hyrax Logging Configuration Documentation.
The following sub-sections detail authentication and authorization.
If your organization desires secure access and authentication layers for Hyrax, the recommended method is to use Hyrax in conjunction with the Apache Web Server (httpd).
Most organizations that utilize secure access and authentication for their web presence are already doing so via Apache Web Server, and Hyrax can be integrated nicely with this existing infrastructure.
More about integrating Hyrax with Apache Web Server can be found at these pages:
Hyrax may be used with the security features implemented by Tomcat for authentication and authorization services. We recommend that you read carefully and understand the Tomcat security documentation.
For Tomcat 5.x see:
For Tomcat 6.x see:
We also recommend that you read chapter 12 of the Java Servlet Specification 2.4, which describes how to configure security constraints at the web application level.
Tomcat security requires fairly extensive additions to the web.xml file. (It is important to keep in mind that altering the <servlet> definitions may render your Hyrax server inoperable - please see the previous sections that discuss this.)
Examples of security content for the web.xml file can be found in the persistent content directory of the Hyrax server, which by default is located at $CATALINA_HOME/content/opendap/.
Tomcat security officially supports context level authentication. This means that you can restrict access to the collection of servlets running in a single web application (i.e. all of the stuff that is defined in a single web.xml file). You can call out different authentication rules for different <url-pattern>s within the web application, but only clients which do not cache ANY security information will be able to easily access the different areas.
For example, in your web.xml file you might have:
<security-constraint>
    <web-resource-collection>
        <web-resource-name>fnoc1</web-resource-name>
        <url-pattern>/hyrax/nc/fnoc1.txt</url-pattern>
    </web-resource-collection>
    <auth-constraint>
        <role-name>fn1</role-name>
    </auth-constraint>
</security-constraint>
<security-constraint>
    <web-resource-collection>
        <web-resource-name>fnoc2</web-resource-name>
        <url-pattern>/hyrax/nc/fnoc2.txt</url-pattern>
    </web-resource-collection>
    <auth-constraint>
        <role-name>fn2</role-name>
    </auth-constraint>
</security-constraint>
<login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>MyApplicationRealm</realm-name>
</login-config>
Where the security roles fn1 and fn2 (defined in the tomcat-users.xml file) have no common members.
The complete URIs would be:
http://localhost:8080/mycontext/hyrax/nc/fnoc1.txt
http://localhost:8080/mycontext/hyrax/nc/fnoc2.txt
This works for clients that do not cache anything; however, if you were to access these URLs with a typical browser, then once you had authenticated for one URI, you would be locked out of the other one until you successfully "reset" the browser by purging all caches.
This happens because, in the exchange between Tomcat and the client, Tomcat sends the header WWW-Authenticate: Basic realm="MyApplicationRealm", and the client authenticates. When the second URI is accessed, Tomcat sends the same authentication challenge with the same WWW-Authenticate header. The client, having recently authenticated to this realm-name (defined in the <login-config> element in the web.xml file - see above), resends the authentication information, and, since it is not valid for that url pattern, the request is denied.
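The exchange can be illustrated with a toy simulation. This is a sketch under stated assumptions, not Tomcat's implementation: the users alice and bob, their passwords, and the CachingClient class are all hypothetical, and the "authenticated but not authorized" outcome is modeled here as HTTP status 403.

```python
REALM = "MyApplicationRealm"

# Hypothetical users; in Tomcat these would be defined in tomcat-users.xml.
USERS = {"alice": ("pw-a", {"fn1"}), "bob": ("pw-b", {"fn2"})}

# url-pattern -> required role, mirroring the two security constraints above.
CONSTRAINTS = {"/hyrax/nc/fnoc1.txt": "fn1", "/hyrax/nc/fnoc2.txt": "fn2"}

def server(path, credentials=None):
    """Return (status, realm) for a request; a toy stand-in for Tomcat."""
    role = CONSTRAINTS.get(path)
    if role is None:
        return 200, None
    if credentials is None:
        return 401, REALM  # challenge: WWW-Authenticate: Basic realm="..."
    user, password = credentials
    known = USERS.get(user)
    if known is None or known[0] != password:
        return 401, REALM
    # Valid login, but access still depends on holding the required role.
    return (200, None) if role in known[1] else (403, None)

class CachingClient:
    """Caches one set of credentials per realm, as a typical browser does."""
    def __init__(self):
        self.cache = {}

    def get(self, path, prompt):
        status, realm = server(path)
        if status == 401:
            # Cached credentials for the realm win; the user is never asked again.
            creds = self.cache.get(realm) or prompt()
            self.cache[realm] = creds
            status, _ = server(path, creds)
        return status

client = CachingClient()
first = client.get("/hyrax/nc/fnoc1.txt", lambda: ("alice", "pw-a"))
second = client.get("/hyrax/nc/fnoc2.txt", lambda: ("bob", "pw-b"))
print(first, second)  # prints "200 403"
```

The second request fails even though valid credentials for fn2 are available, because the client reuses the credentials it cached for the shared realm name and never prompts for new ones.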
You should be careful to back up your modified web.xml file to a location outside the $CATALINA_HOME/webapps/opendap directory, as newly installed versions of Hyrax will overwrite it. You could use an XML ENTITY and an entity reference in the web.xml to cause a local file containing the security configuration to be included in the web.xml. For example, add the ENTITY
[<!ENTITY securityConfig SYSTEM "file:/fully/qualified/path/to/your/security/config.xml">]
to the DOCTYPE declaration at the top of the web.xml, and also add an entity reference (securityConfig, as above) to the content of the web-app element. This would cause your externally held security configuration to be included in the web.xml file.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE web-app
PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
"http://java.sun.com/j2ee/dtds/web-app_2_2.dtd"
[<!ENTITY securityConfig SYSTEM "file:/fully/qualified/path/to/your/security/config.xml">]
>
<web-app>
<!--
Loads a persistent security configuration from the content directory.
This configuration may be empty, in which case no security constraints will be
applied by Tomcat.
-->
&securityConfig;
.
.
.
</web-app>
This will not prevent you from losing your web.xml file when a new version of Hyrax is installed, but adding the ENTITY to the new web.xml file would be easier than remembering an extensive security configuration.
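Re-adding the ENTITY after an upgrade could be scripted along the following lines. This is a minimal sketch, not a tool that ships with Hyrax: the function name add_security_entity is hypothetical, and it assumes the new web.xml has a DOCTYPE declaration with no internal subset yet and a plain web-app start tag.

```python
import re

def add_security_entity(web_xml, config_path, name="securityConfig"):
    """Insert the ENTITY declaration and its reference into web.xml text."""
    reference = "&%s;" % name
    if reference in web_xml:
        return web_xml  # already patched, nothing to do
    declaration = '[<!ENTITY %s SYSTEM "file:%s">]' % (name, config_path)
    # Place the internal subset just before the '>' that closes the DOCTYPE.
    patched, count = re.subn(
        r"(<!DOCTYPE\s+web-app\b[^\[\]>]*?)\s*>",
        lambda m: m.group(1) + "\n" + declaration + "\n>",
        web_xml, count=1)
    if count == 0:
        raise ValueError("no DOCTYPE web-app declaration without an internal subset")
    # Place the entity reference directly after the <web-app> start tag.
    patched, count = re.subn(
        r"<web-app\b[^>]*>",
        lambda m: m.group(0) + "\n" + reference,
        patched, count=1)
    if count == 0:
        raise ValueError("no <web-app> start tag found")
    return patched
```

The function works on the file's text rather than a parsed tree so that comments and layout in web.xml are left untouched. Keep a backup of the original file before writing the result back.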
Many OPeNDAP clients accept compressed responses. This can greatly increase the efficiency of the client/server interaction by reducing the number of bytes actually transmitted over "the wire." Tomcat provides native support for GZIP compression; however, it is NOT turned on by default.
The following example is based on Tomcat 5.15. We recommend that you carefully read the Tomcat documentation related to this topic before proceeding:
-
Tomcat 5.x documentation (see Reference Section for the Apache Tomcat Configuration section)
To enable compression, you will need to edit the $CATALINA_HOME/conf/server.xml file. You will need to locate the <Connector> element associated with your server; typically this will be the only <Connector> element whose port attribute is set equal to 8080. You will need to add or change several of its attributes to enable compression.
With our Tomcat 7.0.76 distribution, we found this default <Connector> element definition in our server.xml file:
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" />
You will need to add three attributes:
compression="force"
compressionMinSize="2048"
compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/octet-stream,application/vnd.opendap.dap4.dataset-services+xml,application/vnd.opendap.dap4.dataset-metadata+xml,application/vnd.opendap.dap4.data,application/vnd.opendap.dap4.error+xml,application/json,application/prs.coverage+json,application/rdf+xml,application/x-netcdf;ver=4,application/x-netcdf,image/tiff;application=geotiff"
The list of compressible MIME types includes all known response types for Hyrax.
The compression attribute may have the following values:
-
compression="off" means nothing gets compressed (default if not provided).
-
compression="on" means only the compressible MIME types get compressed.
-
compression="force" means everything gets compressed (assuming the client accepts gzip and the response is bigger than compressionMinSize).
Note: You MUST set compression="force" for compression to work with the OPeNDAP data transport.
When finished, your Connector element should look like this:
<Connector port="8080" protocol="HTTP/1.1"
    connectionTimeout="20000"
    redirectPort="8443"
    compression="force"
    compressionMinSize="2048"
    compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/octet-stream,application/vnd.opendap.dap4.dataset-services+xml,application/vnd.opendap.dap4.dataset-metadata+xml,application/vnd.opendap.dap4.data,application/vnd.opendap.dap4.error+xml,application/json,application/prs.coverage+json,application/rdf+xml,application/x-netcdf;ver=4,application/x-netcdf,image/tiff;application=geotiff" />
Restart Tomcat for these changes to take effect.
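If you manage several servers, the Connector edit can be scripted. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name enable_compression is hypothetical, and the MIME type list is shortened for readability, so substitute the full list given in this section. ElementTree discards comments and the XML declaration when it rewrites a document, so keep a backup of server.xml.

```python
import xml.etree.ElementTree as ET

COMPRESSION_ATTRS = {
    "compression": "force",
    "compressionMinSize": "2048",
    # Shortened for readability; use the full MIME type list from this section.
    "compressableMimeType": "text/html,text/xml,text/plain,application/octet-stream",
}

def enable_compression(server_xml, port="8080"):
    """Return server.xml text with compression attributes set on one Connector."""
    root = ET.fromstring(server_xml)
    changed = 0
    for connector in root.iter("Connector"):
        # Only touch the HTTP connector listening on the requested port.
        if connector.get("port") == port:
            connector.attrib.update(COMPRESSION_ATTRS)
            changed += 1
    if changed == 0:
        raise ValueError("no <Connector> with port %s found" % port)
    return ET.tostring(root, encoding="unicode")
```

Matching on the port attribute leaves other connectors, such as an AJP connector on a different port, unchanged.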
You can verify the change by using curl as follows:
curl -H "Accept-Encoding: gzip" -I http://localhost:8080/opendap/data/nc/fnoc1.nc.ascii
Note: The above URL is for Hyrax running on your local system and accessing a dataset that ships with the server.
You’ll know that compression is enabled if the response to the curl command contains:
Content-Encoding: gzip
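The same check can be made from a script. This is a small sketch using only Python's standard library. The helper names is_gzip_encoded and check_compression are hypothetical, and the HEAD request mirrors the curl -I invocation above.

```python
import urllib.request

def is_gzip_encoded(headers):
    """Return True if a mapping of response headers reports gzip encoding."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "content-encoding":
            return "gzip" in value.lower()
    return False

def check_compression(url):
    """Send a HEAD request that accepts gzip and inspect the response headers."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return is_gzip_encoded(dict(response.headers.items()))
```

Calling check_compression with the URL used in the curl example requires a running server. It should return True once compression has been enabled.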
Note: If you are using Tomcat in conjunction with the Apache Web Server (our friend httpd) via AJP, you will also need to configure Apache to deliver compressed responses; Tomcat will not compress content sent over the AJP connection.