`swift://container.PROVIDER/path`. You will also need to set your Swift security credentials,
through core-site.xml or via SparkContext.hadoopConfiguration.
The current Swift driver requires Swift to use the Keystone authentication method, or
its Rackspace-specific predecessor.
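As an alternative to core-site.xml, the credentials can be set programmatically on the SparkContext's Hadoop configuration. The sketch below is illustrative, not the documented setup: the provider name `PROVIDER`, the auth URL, and the credential values are placeholders, and the `fs.swift.service.*` keys are the configuration properties defined by the hadoop-openstack module.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("swift-example"))

// Placeholder values: substitute your own Keystone endpoint and credentials.
val hc = sc.hadoopConfiguration
hc.set("fs.swift.service.PROVIDER.auth.url", "http://keystone.example.com:5000/v2.0/tokens")
hc.set("fs.swift.service.PROVIDER.tenant", "test")
hc.set("fs.swift.service.PROVIDER.username", "your-username")
hc.set("fs.swift.service.PROVIDER.password", "your-password")
hc.set("fs.swift.service.PROVIDER.public", "true")

// Once configured, Swift paths can be read like any other Hadoop filesystem.
val rdd = sc.textFile("swift://container.PROVIDER/path")
```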
# Configuring Swift for Better Data Locality
Although not mandatory, it is recommended to configure the proxy server of Swift for better data locality.
# Dependencies
The Spark application should include the hadoop-openstack dependency, which can
be done by including the `hadoop-cloud` module for the specific version of Spark used.
For example, for Maven support, add the following to the pom.xml
file:
{% highlight xml %}
<!-- Illustrative sketch: the artifact coordinates and version property are
     assumptions; match them to the Hadoop version used by your Spark build. -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-openstack</artifactId>
    <version>${hadoop.version}</version>
</dependency>
{% endhighlight %}
Create a core-site.xml file and place it inside Spark's conf directory.
The main parameters to configure are the authentication parameters required by Keystone.
The following table contains a list of Keystone mandatory parameters. PROVIDER can be
any (alphanumeric) name.
Property Name | Meaning | Required
---|---|---
fs.swift.service.PROVIDER.public | Indicates whether to use the public (off cloud) or private (in cloud; no transfer fees) endpoints | Mandatory
test. Then core-site.xml should include:
{% highlight xml %}
<configuration>
  <!-- Illustrative sketch: replace PROVIDER and the placeholder values with
       those of your own Swift/Keystone deployment. -->
  <property>
    <name>fs.swift.service.PROVIDER.auth.url</name>
    <value>http://127.0.0.1:5000/v2.0/tokens</value>
  </property>
  <property>
    <name>fs.swift.service.PROVIDER.tenant</name>
    <value>test</value>
  </property>
  <property>
    <name>fs.swift.service.PROVIDER.username</name>
    <value>your-username</value>
  </property>
  <property>
    <name>fs.swift.service.PROVIDER.password</name>
    <value>your-password</value>
  </property>
  <property>
    <name>fs.swift.service.PROVIDER.public</name>
    <value>true</value>
  </property>
</configuration>
{% endhighlight %}