A minimal Apache Hive server in a Docker image

fredrikhgrelland/weehive

 
 


WeeHive

A minimal-as-possible Docker container running Apache Hive on Hadoop. Intended for non-production use cases like testing out Hive code or running integration tests.

Setup

  1. Install Docker.

  2. Make sure that Docker has at least a few GB of memory allocated. See the Docker documentation for your platform for how to change this setting.

  3. Clone this repository.

  4. From the repository root, build the Docker image.

    docker build -t weehive .
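
For step 2, the memory available to Docker can be checked from the command line before building. The sketch below is an illustration, not part of this repository: `has_enough_memory` is a hypothetical helper, and the 4 GiB threshold in the usage note is an assumed reading of "a few GB".

```shell
# has_enough_memory BYTES MIN_GIB
# Succeeds (exit status 0) when BYTES is at least MIN_GIB gibibytes.
# Hypothetical helper for illustration; not shipped with WeeHive.
has_enough_memory() {
  bytes="$1"
  min_gib="$2"
  [ "$bytes" -ge $((min_gib * 1024 * 1024 * 1024)) ]
}
```

`docker info --format '{{.MemTotal}}'` prints the total memory visible to the Docker daemon in bytes, so a check could look like `has_enough_memory "$(docker info --format '{{.MemTotal}}')" 4 || echo "Allocate more memory to Docker"`.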

Usage

Beeline

docker run --rm -it \
  -v weehive_hadoop:/usr/local/hadoop/warehouse \
  -v weehive_meta:/usr/local/hadoop/metastore_db \
  weehive

This opens the Beeline shell. The weehive_hadoop and weehive_meta volumes persist the warehouse data and the metastore between runs; rename them to project-specific names if you want to keep projects separate.
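
As a sketch of project-specific volume names, the function below prints the run command with both volume names derived from a single prefix. `weehive_run_cmd` and the `myproject` prefix are hypothetical names used only for illustration; the mount paths and image name are the ones from the command above.

```shell
# weehive_run_cmd PROJECT
# Prints a docker run command whose volumes are named PROJECT_hadoop and PROJECT_meta.
# It only prints the command; it does not start a container.
weehive_run_cmd() {
  project="$1"
  printf 'docker run --rm -it -v %s_hadoop:/usr/local/hadoop/warehouse -v %s_meta:/usr/local/hadoop/metastore_db weehive\n' \
    "$project" "$project"
}
```

Running `weehive_run_cmd myproject` prints a command that can be copied into a terminal. Each prefix gets its own warehouse and metastore, so tables created for one project do not appear in another.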

Remote connection

  1. Run the server.

    docker run --rm -it -p 10000:10000 \
      -v weehive_hadoop:/usr/local/hadoop/warehouse \
      -v weehive_meta:/usr/local/hadoop/metastore_db \
      weehive hiveserver2

  2. Wait ~90 seconds for Hive to fully start.

  3. Connect using the JDBC URL jdbc:hive2://localhost:10000. Example using a Beeline client installed outside the container:

    beeline -u jdbc:hive2://localhost:10000
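
Instead of waiting a fixed 90 seconds, a script (for example in an integration test) could retry the connection until it succeeds. This is a minimal sketch; `wait_until` is a hypothetical helper, and the attempt count and delay in the usage note are assumptions to tune for your machine.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS between tries.
# Returns 0 as soon as COMMAND succeeds, or 1 if every attempt fails.
# The command's output is discarded.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while :; do
    "$@" >/dev/null 2>&1 && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, `wait_until 30 5 beeline -u jdbc:hive2://localhost:10000 -e 'SHOW DATABASES;'` tries a trivial query every 5 seconds for up to 30 attempts and returns once HiveServer2 answers.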

Loading data from file

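  1. Mount the data as a volume by adding -v <sourcedir>:/usr/local/hadoop/data to one of the docker run commands above, where <sourcedir> is the host directory that contains your data files.
  2. Follow the Hive instructions for loading data, for example with a LOAD DATA statement that reads from /usr/local/hadoop/data inside the container.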
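
As an illustration of step 2, the sketch below prints HiveQL that creates a comma-delimited text table and loads one mounted file into it. Everything specific here is hypothetical: the helper name `print_load_hql`, the table name, the two-column schema (id INT, name STRING), and the file name. Only the /usr/local/hadoop/data mount path comes from step 1.

```shell
# print_load_hql TABLE FILE
# Prints HiveQL that creates TABLE (if missing) and loads FILE from the mounted data directory.
# The column list is a placeholder; replace it with the schema of your own file.
print_load_hql() {
  table="$1"
  file="$2"
  cat <<EOF
CREATE TABLE IF NOT EXISTS ${table} (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/usr/local/hadoop/data/${file}' INTO TABLE ${table};
EOF
}
```

The printed statements can be pasted into the Beeline shell, or saved with `print_load_hql people people.csv > load.hql` and run with `beeline -u jdbc:hive2://localhost:10000 -f load.hql`. This assumes that LOCAL resolves against the filesystem of the container running Hive, which is where the data directory is mounted.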

Development

docker build -t weehive:local .
docker run --rm -it weehive:local
