From b6c37ef5a28dfa0ed07f6ab8f154fcd43a2131af Mon Sep 17 00:00:00 2001 From: Gil Vernik Date: Sun, 8 Jun 2014 10:23:41 +0300 Subject: [PATCH 1/4] Openstack Swift support --- docs/openstack-integration.md | 83 +++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) create mode 100644 docs/openstack-integration.md diff --git a/docs/openstack-integration.md b/docs/openstack-integration.md new file mode 100644 index 0000000000000..42cd3067edf80 --- /dev/null +++ b/docs/openstack-integration.md @@ -0,0 +1,83 @@ +--- +layout: global +title: Accessing Openstack Swift storage from Spark +--- + +# Accessing Openstack Swift storage from Spark + +Spark's file interface allows it to process data in Openstack Swift using the same URI formats that are supported for Hadoop. You can specify a path in Swift as input through a URI of the form `swift:///path`. You will also need to set your Swift security credentials, through `SparkContext.hadoopConfiguration`. + +#Configuring Hadoop to use Openstack Swift +Openstack Swift driver was merged in Hadoop verion 2.3.0 ([Swift driver](https://issues.apache.org/jira/browse/HADOOP-8545)) Users that wish to use previous Hadoop versions will need to configure Swift driver manually. +

## Hadoop 2.3.0 and above

+An Openstack Swift driver was merged into Hadoop 2.3.0. The current Hadoop driver requires Swift to use Keystone authentication. There are additional efforts to support temp auth for Hadoop [Hadoop-10420](https://issues.apache.org/jira/browse/HADOOP-10420).
+To configure Hadoop to work with Swift, modify Hadoop's core-sites.xml to register the Swift file system:
+
+	<configuration>
+	  <property>
+	    <name>fs.swift.impl</name>
+	    <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
+	  </property>
+	</configuration>
+
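The same property can also be set programmatically through `SparkContext.hadoopConfiguration`, mentioned at the top of this page. Below is a minimal Scala sketch, assuming a running `SparkContext` named `sc` (for example inside `spark-shell`) and the `hadoop-openstack` jar already on the classpath:

    // Illustrative sketch: register the Swift filesystem on the live Hadoop
    // configuration instead of (or in addition to) core-sites.xml.
    // Assumes a running SparkContext `sc` and hadoop-openstack on the classpath.
    sc.hadoopConfiguration.set(
      "fs.swift.impl",
      "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem")

    // Verify the property is visible to the driver
    println(sc.hadoopConfiguration.get("fs.swift.impl"))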

# Configuring Spark - standalone cluster

+You need to configure the compute-classpath.sh and add Hadoop classpath for + + + CLASSPATH = /share/hadoop/common/lib/* + CLASSPATH = /share/hadoop/hdfs/* + CLASSPATH = /share/hadoop/tools/lib/* + CLASSPATH = /share/hadoop/hdfs/lib/* + CLASSPATH = /share/hadoop/mapreduce/* + CLASSPATH = /share/hadoop/mapreduce/lib/* + CLASSPATH = /share/hadoop/yarn/* + CLASSPATH = /share/hadoop/yarn/lib/* + +Additional parameters has to be provided to the Hadoop from Spark. Swift driver of Hadoop uses those parameters to perform authentication in Keystone needed to access Swift. +List of mandatory parameters is : `fs.swift.service..auth.url`, `fs.swift.service..auth.endpoint.prefix`, `fs.swift.service..tenant`, `fs.swift.service..username`, +`fs.swift.service..password`, `fs.swift.service..http.port`, `fs.swift.service..http.port`, `fs.swift.service..public`. +Create core-sites.xml and place it under /spark/conf directory. Configure core-sites.xml with general Keystone parameters, for example + + + + fs.swift.service..auth.url + http://127.0.0.1:5000/v2.0/tokens + + + fs.swift.service..auth.endpoint.prefix + endpoints + + fs.swift.service..http.port + 8080 + + + fs.swift.service..region + RegionOne + + + fs.swift.service..public + true + + +We left with `fs.swift.service..tenant`, `fs.swift.service..username`, `fs.swift.service..password`. The best way is to provide them to SparkContext in run time, which seems to be impossible yet. +Another approach is to change Hadoop Swift FS driver to provide them via system environment variables. For now we provide them via core-sites.xml + + + fs.swift.service..tenant + test + + + fs.swift.service..username + tester + + + fs.swift.service..password + testing + + +
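If your deployment does allow setting configuration at run time, the sensitive values can be placed on `SparkContext.hadoopConfiguration` before any Swift path is read; this is the approach taken by the job-submission example in the later patches. A minimal Scala sketch, where the provider name `SparkTest` and all credential values are placeholders:

    // Illustrative sketch: supply the Keystone credentials at run time instead
    // of keeping them in core-sites.xml. Provider name and values are placeholders.
    val provider = "SparkTest"
    sc.hadoopConfiguration.set(s"fs.swift.service.$provider.tenant", "test")
    sc.hadoopConfiguration.set(s"fs.swift.service.$provider.username", "tester")
    sc.hadoopConfiguration.set(s"fs.swift.service.$provider.password", "testing")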

# Usage

+Assume you have a Swift container `logs` with an object `data.log`. You can use `swift://` scheme to access objects from Swift. + + val sfdata = sc.textFile("swift://logs./data.log") + From ce483d76a1d524800859764b967c8b5a98fbd9ea Mon Sep 17 00:00:00 2001 From: Gil Vernik Date: Sun, 8 Jun 2014 10:34:04 +0300 Subject: [PATCH 2/4] SPARK-938 - Openstack Swift object storage support This is initial documentation describing how to integrate Spark with Swift. This commit contains documentation for stand alone cluster. Next patches will contain details how to integrate Swift in other deployment of Spark. --- docs/openstack-integration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/openstack-integration.md b/docs/openstack-integration.md index 42cd3067edf80..ca422d298cc11 100644 --- a/docs/openstack-integration.md +++ b/docs/openstack-integration.md @@ -60,7 +60,7 @@ Create core-sites.xml and place it under /spark/conf directory. Configure core-s true
-We left with `fs.swift.service..tenant`, `fs.swift.service..username`, `fs.swift.service..password`. The best way is to provide them to SparkContext in run time, which seems to be impossible yet. +We left with `fs.swift.service..tenant`, `fs.swift.service..username`, `fs.swift.service..password`. The best way to provide those parameters to SparkContext in run time, which seems to be impossible yet. Another approach is to change Hadoop Swift FS driver to provide them via system environment variables. For now we provide them via core-sites.xml From eff538dd8fb7e306c84874e9b4c7da68fa0fe5d0 Mon Sep 17 00:00:00 2001 From: Gil Vernik Date: Sun, 8 Jun 2014 10:34:04 +0300 Subject: [PATCH 3/4] SPARK-938 - Openstack Swift object storage support Documentation how to integrate Spark with Openstack Swift. --- core/pom.xml | 6 +- docs/openstack-integration.md | 143 ++++++++++++++++++---------------- pom.xml | 11 +++ yarn/pom.xml | 4 + 4 files changed, 96 insertions(+), 68 deletions(-) diff --git a/core/pom.xml b/core/pom.xml index bab50f5ce2888..93dadafe57046 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -35,7 +35,11 @@ org.apache.hadoop hadoop-client - + + org.apache.hadoop + hadoop-openstack + + net.java.dev.jets3t jets3t diff --git a/docs/openstack-integration.md b/docs/openstack-integration.md index 42cd3067edf80..07f22b3f12b13 100644 --- a/docs/openstack-integration.md +++ b/docs/openstack-integration.md @@ -8,76 +8,85 @@ title: Accessing Openstack Swift storage from Spark Spark's file interface allows it to process data in Openstack Swift using the same URI formats that are supported for Hadoop. You can specify a path in Swift as input through a URI of the form `swift:///path`. You will also need to set your Swift security credentials, through `SparkContext.hadoopConfiguration`. #Configuring Hadoop to use Openstack Swift -Openstack Swift driver was merged in Hadoop verion 2.3.0 ([Swift driver](https://issues.apache.org/jira/browse/HADOOP-8545)) Users that wish to use previous Hadoop versions will need to configure Swift driver manually. -

Hadoop 2.3.0 and above.

-An Openstack Swift driver was merged into Haddop 2.3.0 . Current Hadoop driver requieres Swift to use Keystone authentication. There are additional efforts to support temp auth for Hadoop [Hadoop-10420](https://issues.apache.org/jira/browse/HADOOP-10420). +Openstack Swift driver was merged in Hadoop verion 2.3.0 ([Swift driver](https://issues.apache.org/jira/browse/HADOOP-8545)). Users that wish to use previous Hadoop versions will need to configure Swift driver manually. Current Swift driver requieres Swift to use Keystone authentication method. There are recent efforts to support also temp auth [Hadoop-10420](https://issues.apache.org/jira/browse/HADOOP-10420). To configure Hadoop to work with Swift one need to modify core-sites.xml of Hadoop and setup Swift FS. - - - fs.swift.impl - org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem - - + + + fs.swift.impl + org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem + + +#Configuring Swift +Proxy server of Swift should include `list_endpoints` middleware. More information available [here](https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py) -

Configuring Spark - stand alone cluster

-You need to configure the compute-classpath.sh and add Hadoop classpath for +#Configuring Spark +To use Swift driver, Spark need to be compiled with `hadoop-openstack-2.3.0.jar` distributted with Hadoop 2.3.0. +For the Maven builds, Spark's main pom.xml should include - - CLASSPATH = /share/hadoop/common/lib/* - CLASSPATH = /share/hadoop/hdfs/* - CLASSPATH = /share/hadoop/tools/lib/* - CLASSPATH = /share/hadoop/hdfs/lib/* - CLASSPATH = /share/hadoop/mapreduce/* - CLASSPATH = /share/hadoop/mapreduce/lib/* - CLASSPATH = /share/hadoop/yarn/* - CLASSPATH = /share/hadoop/yarn/lib/* - -Additional parameters has to be provided to the Hadoop from Spark. Swift driver of Hadoop uses those parameters to perform authentication in Keystone needed to access Swift. -List of mandatory parameters is : `fs.swift.service..auth.url`, `fs.swift.service..auth.endpoint.prefix`, `fs.swift.service..tenant`, `fs.swift.service..username`, -`fs.swift.service..password`, `fs.swift.service..http.port`, `fs.swift.service..http.port`, `fs.swift.service..public`. -Create core-sites.xml and place it under /spark/conf directory. Configure core-sites.xml with general Keystone parameters, for example - - - - fs.swift.service..auth.url - http://127.0.0.1:5000/v2.0/tokens - - - fs.swift.service..auth.endpoint.prefix - endpoints - - fs.swift.service..http.port - 8080 -
- - fs.swift.service..region - RegionOne - - - fs.swift.service..public - true - - -We left with `fs.swift.service..tenant`, `fs.swift.service..username`, `fs.swift.service..password`. The best way is to provide them to SparkContext in run time, which seems to be impossible yet. -Another approach is to change Hadoop Swift FS driver to provide them via system environment variables. For now we provide them via core-sites.xml - - - fs.swift.service..tenant - test - - - fs.swift.service..username - tester - - - fs.swift.service..password - testing - - -

Usage

-Assume you have a Swift container `logs` with an object `data.log`. You can use `swift://` scheme to access objects from Swift. - - val sfdata = sc.textFile("swift://logs./data.log") + 2.3.0 + + + + org.apache.hadoop + hadoop-openstack + ${swift.version} + + +in addition, pom.xml of the `core` and `yarn` projects should include + + + org.apache.hadoop + hadoop-openstack + + + +Additional parameters has to be provided to the Swift driver. Swift driver will use those parameters to perform authentication in Keystone prior accessing Swift. List of mandatory parameters is : `fs.swift.service..auth.url`, `fs.swift.service..auth.endpoint.prefix`, `fs.swift.service..tenant`, `fs.swift.service..username`, +`fs.swift.service..password`, `fs.swift.service..http.port`, `fs.swift.service..http.port`, `fs.swift.service..public`, where `PROVIDER` is any name. `fs.swift.service..auth.url` should point to the Keystone authentication URL. + +Create core-sites.xml with the mandatory parameters and place it under /spark/conf directory. For example: + + + + fs.swift.service..auth.url + http://127.0.0.1:5000/v2.0/tokens + + + fs.swift.service..auth.endpoint.prefix + endpoints + + fs.swift.service..http.port + 8080 +
+ + fs.swift.service..region + RegionOne + + + fs.swift.service..public + true + + +We left with `fs.swift.service..tenant`, `fs.swift.service..username`, `fs.swift.service..password`. The best way to provide those parameters to SparkContext in run time, which seems to be impossible yet. +Another approach is to adapt Swift driver to obtain those values from system environment variables. For now we provide them via core-sites.xml. +Assume a tenant `test` with user `tester` was defined in Keystone, then the core-sites.xml shoud include: + + + fs.swift.service..tenant + test + + + fs.swift.service..username + tester + + + fs.swift.service..password + testing + +# Usage +Assume there exists Swift container `logs` with an object `data.log`. To access `data.log` from Spark the `swift://` scheme should be used. +For example: + + val sfdata = sc.textFile("swift://logs./data.log") diff --git a/pom.xml b/pom.xml index 86264d1132ec4..79cf5fdc23d01 100644 --- a/pom.xml +++ b/pom.xml @@ -132,6 +132,7 @@ 3.0.0 1.7.6 0.7.1 + 2.3.0 64m 512m @@ -584,6 +585,11 @@ + + org.apache.hadoop + hadoop-openstack + ${swift.version} + org.apache.hadoop hadoop-yarn-api @@ -1024,6 +1030,11 @@ hadoop-client provided + + org.apache.hadoop + hadoop-openstack + provided + org.apache.hadoop hadoop-yarn-api diff --git a/yarn/pom.xml b/yarn/pom.xml index 6993c89525d8c..e58d8312f1a86 100644 --- a/yarn/pom.xml +++ b/yarn/pom.xml @@ -55,6 +55,10 @@ org.apache.hadoop hadoop-client + + org.apache.hadoop + hadoop-openstack + org.scalatest scalatest_${scala.binary.version} From 39a9737e16b27435f448030f1f7f7a6c506e08dc Mon Sep 17 00:00:00 2001 From: Gil Vernik Date: Thu, 12 Jun 2014 12:13:29 +0300 Subject: [PATCH 4/4] Spark integration with Openstack Swift --- core/pom.xml | 4 - docs/openstack-integration.md | 301 ++++++++++++++++++++++++---------- pom.xml | 13 +- yarn/pom.xml | 4 - 4 files changed, 215 insertions(+), 107 deletions(-) diff --git a/core/pom.xml b/core/pom.xml index 93dadafe57046..fe6b2daba0581 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -35,10 +35,6 @@ org.apache.hadoop hadoop-client - - org.apache.hadoop - hadoop-openstack - net.java.dev.jets3t jets3t diff --git a/docs/openstack-integration.md b/docs/openstack-integration.md index a1aac02f6275e..a3179fce59c13 100644 --- a/docs/openstack-integration.md +++ b/docs/openstack-integration.md @@ -1,110 +1,237 @@ -yout: global -title: Accessing Openstack Swift storage from Spark +layout: global +title: Accessing Openstack Swift from Spark --- -# Accessing Openstack Swift storage from Spark +# Accessing Openstack Swift from Spark Spark's file interface allows it to process data in Openstack Swift using the same URI formats that are supported for Hadoop. You can specify a path in Swift as input through a -URI of the form `swift:///path`. You will also need to set your -Swift security credentials, through `SparkContext.hadoopConfiguration`. - -#Configuring Hadoop to use Openstack Swift -Openstack Swift driver was merged in Hadoop verion 2.3.0 ([Swift driver](https://issues.apache.org/jira/browse/HADOOP-8545)). Users that wish to use previous Hadoop versions will need to configure Swift driver manually. Current Swift driver +URI of the form `swift:// - - fs.swift.impl - org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem - - +temp auth [Hadoop-10420](https://issues.apache.org/jira/browse/HADOOP-10420). -#Configuring Swift +# Configuring Swift Proxy server of Swift should include `list_endpoints` middleware. 
More information -available [here] (https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py) - -#Configuring Spark -To use Swift driver, Spark need to be compiled with `hadoop-openstack-2.3.0.jar` -distributted with Hadoop 2.3.0. For the Maven builds, Spark's main pom.xml should include - - 2.3.0 +available [here](https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py) +# Compilation of Spark +Spark should be compiled with `hadoop-openstack-2.3.0.jar` that is distributted with Hadoop 2.3.0. +For the Maven builds, the `dependencyManagement` section of Spark's main `pom.xml` should include + + --------- org.apache.hadoop hadoop-openstack - ${swift.version} + 2.3.0 + ---------- + -in addition, pom.xml of the `core` and `yarn` projects should include +in addition, both `core` and `yarn` projects should add `hadoop-openstack` to the `dependencies` section of their `pom.xml` + + ---------- org.apache.hadoop hadoop-openstack + ---------- + +# Configuration of Spark +Create `core-sites.xml` and place it inside `/spark/conf` directory. There are two main categories of parameters that should to be +configured: declaration of the Swift driver and the parameters that are required by Keystone. + +Configuration of Hadoop to use Swift File system achieved via + + + + + + + +
<table class="table">
<tr><th>Property Name</th><th>Value</th></tr>
<tr>
  <td>fs.swift.impl</td>
  <td>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</td>
</tr>
</table>
+
+Additional parameters are required by Keystone and should be provided to the Swift driver. Those
+parameters will be used to perform authentication in Keystone to access Swift. The following table
+contains a list of Keystone mandatory parameters. `PROVIDER` can be any name.
+
<table class="table">
<tr><th>Property Name</th><th>Meaning</th><th>Required</th></tr>
<tr><td>fs.swift.service.PROVIDER.auth.url</td><td>Keystone Authentication URL</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.auth.endpoint.prefix</td><td>Keystone endpoints prefix</td><td>Optional</td></tr>
<tr><td>fs.swift.service.PROVIDER.tenant</td><td>Tenant</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.username</td><td>Username</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.password</td><td>Password</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.http.port</td><td>HTTP port</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.region</td><td>Keystone region</td><td>Mandatory</td></tr>
<tr><td>fs.swift.service.PROVIDER.public</td><td>Indicates if all URLs are public</td><td>Mandatory</td></tr>
</table>
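To illustrate how `PROVIDER` is interpolated into the property names above, the following Scala sketch fills in the mandatory keys for a given provider on a Hadoop `Configuration`; all values are placeholders for your Keystone settings. The `core-sites.xml` example that follows shows the same parameters in file form.

    // Illustrative helper (placeholder values): set the mandatory Keystone
    // properties from the table above for a given provider name.
    import org.apache.hadoop.conf.Configuration

    def configureSwift(conf: Configuration, provider: String): Unit = {
      val prefix = s"fs.swift.service.$provider"
      conf.set(s"$prefix.auth.url", "http://127.0.0.1:5000/v2.0/tokens")
      conf.set(s"$prefix.auth.endpoint.prefix", "endpoints")
      conf.set(s"$prefix.tenant", "test")
      conf.set(s"$prefix.username", "tester")
      conf.set(s"$prefix.password", "testing")
      conf.set(s"$prefix.http.port", "8080")
      conf.set(s"$prefix.region", "RegionOne")
      conf.set(s"$prefix.public", "true")
    }

    // For example: configureSwift(sc.hadoopConfiguration, "SparkTest")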
+
+For example, assume `PROVIDER=SparkTest` and Keystone contains user `tester` with password `testing` defined for tenant `test`.
+Then `core-sites.xml` should include:
+
+	<configuration>
+	  <property>
+	    <name>fs.swift.impl</name>
+	    <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.auth.url</name>
+	    <value>http://127.0.0.1:5000/v2.0/tokens</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.auth.endpoint.prefix</name>
+	    <value>endpoints</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.http.port</name>
+	    <value>8080</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.region</name>
+	    <value>RegionOne</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.public</name>
+	    <value>true</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.tenant</name>
+	    <value>test</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.username</name>
+	    <value>tester</value>
+	  </property>
+	  <property>
+	    <name>fs.swift.service.SparkTest.password</name>
+	    <value>testing</value>
+	  </property>
+	</configuration>
+
+Notice that `fs.swift.service.PROVIDER.tenant`, `fs.swift.service.PROVIDER.username` and
+`fs.swift.service.PROVIDER.password` contain sensitive information, so keeping them in `core-sites.xml` is not always a good approach.
+We suggest keeping those parameters in `core-sites.xml` only for testing purposes when running Spark via `spark-shell`. For job submissions they should be provided via `sparkContext.hadoopConfiguration`.
+
+# Usage examples
+Assume Keystone's authentication URL is `http://127.0.0.1:5000/v2.0/tokens` and Keystone contains tenant `test` and user `tester` with password `testing`. In our example we define `PROVIDER=SparkTest`. Assume that Swift contains a container `logs` with an object `data.log`. To access `data.log` from Spark the `swift://` scheme should be used.
+
+## Running Spark via spark-shell
+Make sure that `core-sites.xml` contains `fs.swift.service.SparkTest.tenant`, `fs.swift.service.SparkTest.username` and
+`fs.swift.service.SparkTest.password`. Run Spark via `spark-shell` and access Swift via the `swift://` scheme.
+ + val sfdata = sc.textFile("swift://logs.SparkTest/data.log") + sfdata.count() + +## Job submission via spark-submit +In this case `core-sites.xml` need not contain `fs.swift.service.SparkTest.tenant`, `fs.swift.service.SparkTest.username`, +`fs.swift.service.SparkTest.password`. Example of Java usage: + + /* SimpleApp.java */ + import org.apache.spark.api.java.*; + import org.apache.spark.SparkConf; + import org.apache.spark.api.java.function.Function; + + public class SimpleApp { + public static void main(String[] args) { + String logFile = "swift://logs.SparkTest/data.log"; + SparkConf conf = new SparkConf().setAppName("Simple Application"); + JavaSparkContext sc = new JavaSparkContext(conf); + sc.hadoopConfiguration().set("fs.swift.service.ibm.tenant", "test"); + sc.hadoopConfiguration().set("fs.swift.service.ibm.password", "testing"); + sc.hadoopConfiguration().set("fs.swift.service.ibm.username", "tester"); + + JavaRDD logData = sc.textFile(logFile).cache(); + + long num = logData.count(); + + System.out.println("Total number of lines: " + num); + } + } + +The directory sturture is + + find . + ./src + ./src/main + ./src/main/java + ./src/main/java/SimpleApp.java + +Maven pom.xml is + + + edu.berkeley + simple-project + 4.0.0 + Simple Project + jar + 1.0 + + + Akka repository + http://repo.akka.io/releases + + + + + + org.apache.maven.plugins + maven-compiler-plugin + 2.3 + + 1.6 + 1.6 + + + + + + + org.apache.spark + spark-core_2.10 + 1.0.0 + + + + + +Compile and execute + + mvn package + SPARK_HOME/spark-submit --class "SimpleApp" --master local[4] target/simple-project-1.0.jar diff --git a/pom.xml b/pom.xml index 79cf5fdc23d01..92cf6bab1edf8 100644 --- a/pom.xml +++ b/pom.xml @@ -132,8 +132,7 @@ 3.0.0 1.7.6 0.7.1 - 2.3.0 - + 64m 512m @@ -585,11 +584,6 @@
- - org.apache.hadoop - hadoop-openstack - ${swift.version} - org.apache.hadoop hadoop-yarn-api @@ -1030,11 +1024,6 @@ hadoop-client provided - - org.apache.hadoop - hadoop-openstack - provided - org.apache.hadoop hadoop-yarn-api diff --git a/yarn/pom.xml b/yarn/pom.xml index e58d8312f1a86..6993c89525d8c 100644 --- a/yarn/pom.xml +++ b/yarn/pom.xml @@ -55,10 +55,6 @@ org.apache.hadoop hadoop-client - - org.apache.hadoop - hadoop-openstack - org.scalatest scalatest_${scala.binary.version}