[SPARK-7481] [build] Add spark-cloud module to pull in object store access, + documentation #12004
Changes from all commits
1ecace7
38e74f9
79b8ce0
2bcbf3a
576b72c
f9d0923
1fab96e
4065c28
797ec49
5768c42
0fcdc36
b6d2002
abae7fb
6851aa4
c738048
ea5e1fa
aa4ea89
83d9368
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
~ contributor license agreements. See the NOTICE file distributed with | ||
~ this work for additional information regarding copyright ownership. | ||
~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
~ (the "License"); you may not use this file except in compliance with | ||
~ the License. You may obtain a copy of the License at | ||
~ | ||
~ http://www.apache.org/licenses/LICENSE-2.0 | ||
~ | ||
~ Unless required by applicable law or agreed to in writing, software | ||
~ distributed under the License is distributed on an "AS IS" BASIS, | ||
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
~ See the License for the specific language governing permissions and | ||
~ limitations under the License. | ||
--> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>org.apache.spark</groupId> | ||
<artifactId>spark-parent_2.11</artifactId> | ||
<version>2.2.0-SNAPSHOT</version> | ||
Review comment: We may need to make this 2.3.0-SNAPSHOT now, because that's what's correct for master, then change it if it back-ports. Reply: I'd noticed that this morning.... |
||
<relativePath>../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>spark-hadoop-cloud_2.11</artifactId> | ||
<packaging>jar</packaging> | ||
<name>Spark Project Cloud Integration</name> | ||
<description> | ||
Contains support for cloud infrastructures, specifically the Hadoop JARs and | ||
transitive dependencies needed to interact with the infrastructures. | ||
|
||
Any project which explicitly depends upon the spark-hadoop-cloud artifact will get the | ||
dependencies; the exact versions of which will depend upon the hadoop version Spark was compiled | ||
against. | ||
|
||
The imports of transitive dependencies are managed to make them consistent | ||
with those of the Spark build. | ||
|
||
WARNING: the signatures of methods in the AWS and Azure SDKs do change between | ||
Review comment: Where does an end user need to act on this -- the profile is in theory setting all this up correctly, right? Review comment: I would only include the first sentence here. The description here should be short since nobody will likely read it. Anything substantive could go in docs. Reply: Cutting back to the first line; it can be covered in docs. One option with the docs is to trim them back and say "consult the Hadoop documentation for object store setup", and I can be more explicit there on version pain. |
||
versions: use exactly the same version with which the Hadoop JARs were | ||
built. | ||
</description> | ||
<properties> | ||
<sbt.project.name>hadoop-cloud</sbt.project.name> | ||
</properties> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>org.apache.hadoop</groupId> | ||
<artifactId>hadoop-aws</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.apache.hadoop</groupId> | ||
<artifactId>hadoop-openstack</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<!-- | ||
Add joda time to ensure that anything downstream which doesn't pull in spark-hive | ||
gets the correct joda time artifact, so doesn't have auth failures on later Java 8 JVMs | ||
--> | ||
<dependency> | ||
<groupId>joda-time</groupId> | ||
<artifactId>joda-time</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<!-- explicitly declare the jackson artifacts desired --> | ||
<dependency> | ||
<groupId>com.fasterxml.jackson.core</groupId> | ||
<artifactId>jackson-databind</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.fasterxml.jackson.core</groupId> | ||
<artifactId>jackson-annotations</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.fasterxml.jackson.dataformat</groupId> | ||
<artifactId>jackson-dataformat-cbor</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<!--Explicit declaration to force in Spark version into transitive dependencies --> | ||
<dependency> | ||
<groupId>org.apache.httpcomponents</groupId> | ||
<artifactId>httpclient</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
<!--Explicit declaration to force in Spark version into transitive dependencies --> | ||
<dependency> | ||
<groupId>org.apache.httpcomponents</groupId> | ||
<artifactId>httpcore</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
</dependencies> | ||
|
||
<profiles> | ||
|
||
<profile> | ||
<id>hadoop-2.7</id> | ||
Review comment: So this only needs to come in for Hadoop 2.7+, not 2.6? Reply: Yes. There's going to be an aggregate POM in trunk, |
||
<dependencies> | ||
<dependency> | ||
<groupId>org.apache.hadoop</groupId> | ||
<artifactId>hadoop-azure</artifactId> | ||
<scope>${hadoop.deps.scope}</scope> | ||
</dependency> | ||
</dependencies> | ||
</profile> | ||
|
||
</profiles> | ||
|
||
</project> |
Review comment: Call this hadoop-cloud perhaps?
Reply: So org/apache/spark + hadoop-cloud? It'll cause too much confusion were any JAR created thrown into a lib/ directory; you'd get
& people would be trying to understand why the hadoop-* was out of sync, who to ping, etc.
There's actually a hadoop-cloud project POM coming in hadoop-trunk to try and be a one-stop dependency for all cloud bindings (avoiding the ongoing "declare new dependencies per version"). The names are way too close.
I'd had it as spark-cloud; you'd felt spark-hadoop-cloud was better. I can't think of what else would do, but I do think spark- is the string which should go at the front.
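For reference, the module description's point that "any project which explicitly depends upon the spark-hadoop-cloud artifact will get the dependencies" would look roughly like this in a downstream POM. This is only a sketch: the groupId, artifactId and version are copied from the diff as it currently stands (groupId inherited from the parent), and both the artifact name and the version are still under discussion in the review comments, so treat them as placeholders.

```xml
<!-- Hypothetical downstream pom.xml fragment; coordinates copied from the POM in this diff. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hadoop-cloud_2.11</artifactId>
  <!-- Assumption: this should match the Spark version the application is built against. -->
  <version>2.2.0-SNAPSHOT</version>
</dependency>
```

With that single declaration, hadoop-aws and hadoop-openstack (plus hadoop-azure when the hadoop-2.7 profile was active at build time) would arrive transitively, at whatever versions the Spark build was compiled against.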