Commit

[SQL] Update SQL readme to include instructions on generating golden answer files based on Hive 0.13.1.

Author: Yin Huai <yhuai@databricks.com>

Closes #5702 from yhuai/howToGenerateGoldenFiles and squashes the following commits:

9c4a7f8 [Yin Huai] Update readme to include instructions on generating golden answer files based on Hive 0.13.1.
yhuai authored and rxin committed Apr 25, 2015
1 parent a7160c4 commit aa6966f
Showing 1 changed file with 22 additions and 1 deletion.
23 changes: 22 additions & 1 deletion sql/README.md
@@ -12,14 +12,35 @@ Spark SQL is broken up into four subprojects:

Other dependencies for developers
---------------------------------
In order to create new hive test cases , you will need to set several environmental variables.
In order to create new Hive test cases (i.e. a test suite based on `HiveComparisonTest`),
you will need to set up your development environment based on the following instructions.

If you are working with Hive 0.12.0, you will need to set several environment variables as follows.

```
export HIVE_HOME="<path to>/hive/build/dist"
export HIVE_DEV_HOME="<path to>/hive/"
export HADOOP_HOME="<path to>/hadoop-1.0.4"
```

If you are working with Hive 0.13.1, the following steps are needed:

1. Download Hive [0.13.1](https://hive.apache.org/downloads.html) and set `HIVE_HOME` with `export HIVE_HOME="<path to hive>"`. Please do not set `HIVE_DEV_HOME` (see [SPARK-4119](https://issues.apache.org/jira/browse/SPARK-4119)).
2. Set `HADOOP_HOME` with `export HADOOP_HOME="<path to hadoop>"`.
3. Download all Hive 0.13.1a jars (Hive jars actually used by Spark) from [here](http://mvnrepository.com/artifact/org.spark-project.hive) and replace corresponding original 0.13.1 jars in `$HIVE_HOME/lib`.
4. Download [Kryo 2.21 jar](http://mvnrepository.com/artifact/com.esotericsoftware.kryo/kryo/2.21) (Note: 2.22 jar does not work) and [Javolution 5.5.1 jar](http://mvnrepository.com/artifact/javolution/javolution/5.5.1) to `$HIVE_HOME/lib`.
5. This step is optional. However, if a Hive query fails while you are generating golden answer files and you find that Hive tries to talk to HDFS, or you see unexpected runtime NPEs, set the following in your test suite...

```
val testTempDir = Utils.createTempDir()
// We have to use kryo to let Hive correctly serialize some plans.
sql("set hive.plan.serialization.format=kryo")
// Explicitly set fs to local fs.
sql(s"set fs.default.name=file://$testTempDir/")
// Ask Hive to run jobs in-process as a single map and reduce task.
sql("set mapred.job.tracker=local")
```
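
Before generating golden answer files, it may help to confirm that the Hive 0.13.1 environment described in the steps above is in place. The following is a minimal sketch, not part of Spark or Hive: `check_hive_env` is a hypothetical helper name, and the jar file names assume the usual Maven `<artifact>-<version>.jar` naming. It does not check whether the Hive 0.13.1a jars from step 3 were swapped in.

```shell
#!/usr/bin/env bash
# Sanity-check sketch for the Hive 0.13.1 setup steps.
# check_hive_env is a hypothetical helper, not part of Spark or Hive.
check_hive_env() {
  local status=0
  local jar

  # Steps 1 and 2: HIVE_HOME and HADOOP_HOME must be set.
  if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set"
    status=1
  fi
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set"
    status=1
  fi

  # Step 1: HIVE_DEV_HOME must not be set (see SPARK-4119).
  if [ -n "${HIVE_DEV_HOME:-}" ]; then
    echo "HIVE_DEV_HOME is set; unset it"
    status=1
  fi

  # Step 4: the Kryo 2.21 and Javolution 5.5.1 jars must be in $HIVE_HOME/lib.
  # The file names below are an assumption based on Maven naming.
  for jar in kryo-2.21.jar javolution-5.5.1.jar; do
    if [ ! -f "${HIVE_HOME:-}/lib/${jar}" ]; then
      echo "missing ${jar} in \$HIVE_HOME/lib"
      status=1
    fi
  done

  return "$status"
}
```

After sourcing the function, `check_hive_env && echo "environment looks OK"` prints one line per problem found and returns a non-zero status if any check fails.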

Using the console
=================
An interactive scala console can be invoked by running `build/sbt hive/console`.