This repository has been archived by the owner on Jun 12, 2023. It is now read-only.

Building TileDB

Stavros Papadopoulos edited this page Mar 25, 2017 · 95 revisions

Requirements

TileDB has been tested on Ubuntu Linux (v.15.10), CentOS Linux (v.7.3.1611) and Mac OS X El Capitan (v.10.11.2), but TileDB should work with any reasonably recent version of Ubuntu, CentOS or Mac OS X. Moreover, it has been tested with g++ v.5.3.0 on Ubuntu, g++ v.4.8.5 on CentOS, and Clang Apple LLVM version 7.3.0 on Mac OS X.

Library Dependencies

TileDB has minimal dependencies, relying only on the following standard libraries:

Required dependencies:

  • zlib
  • OpenSSL
  • LZ4
  • Blosc
  • Zstandard

Optional dependencies:

  • MPI (for multi-processing operations)
  • OpenMP (for multi-threaded operations)
  • Google Test (for code testing)
  • Doxygen (for code documentation)

Build on Ubuntu Linux

Step 1 - Install cmake, wget and git

You first need to install three useful programs, cmake, GNU wget and git. Simply run:

sudo apt-get install cmake
sudo apt-get install wget
sudo apt-get install git
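Before moving on, it can help to confirm that all three tools are actually on your PATH. The following is a minimal sketch of such a check; the function name missing_tools is purely illustrative and not part of TileDB:

```shell
# Print the name of every listed tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report which of the three prerequisites still need to be installed.
missing=$(missing_tools cmake wget git)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "cmake, wget and git are all available"
fi
```

The same function can be reused later for other command-line tools, e.g. missing_tools doxygen.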

Step 2 - Install zlib, OpenSSL, LZ4 and MPI

Simply run:

sudo apt-get install zlib1g-dev
sudo apt-get install libssl-dev
sudo apt-get install liblz4-dev

For MPI, you can either install mpich with

sudo apt-get install mpich

or OpenMPI with

sudo apt-get install libopenmpi-dev

Step 3 - Install Blosc

First clone the Blosc Github repo:

git clone https://github.com/Blosc/c-blosc.git

Then run:

cd c-blosc
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX='/usr' ..
cmake --build .
ctest
sudo cmake --build . --target install

Step 4 - Install Zstandard

First download and extract the source archive:

wget https://github.com/facebook/zstd/archive/v1.1.0.tar.gz
tar xf v1.1.0.tar.gz

Then install the library:

cd zstd-1.1.0
sudo make install PREFIX='/usr'

Step 5 - Install Google Test

First run:

sudo apt-get install libgtest-dev

This command does not create the library - you need to do some manual building. Run the following:

cd /usr/src/gtest
sudo cmake .
sudo make
sudo mv libgtest* /usr/lib/

Step 6 - Install Doxygen

If you wish to compile the TileDB code documentation, you need Doxygen. Simply run:

sudo apt-get install doxygen

Step 7 - Clone the TileDB Repo

Run:

git clone https://github.com/Intel-HLS/TileDB.git

This will create a directory TileDB in your current working directory, containing the TileDB source files.

Step 8 - Build TileDB

Now you are ready to build TileDB. First, enter the TileDB source file directory:

cd <TileDB source file directory path>

Then, create a new directory, say build, and enter this directory:

mkdir build
cd build

Next, run

cmake ..

This automatically finds all TileDB dependencies and creates the Makefile. Subsequently, create the TileDB C library:

make -j X

where X is the level of parallelism for make (e.g., we use 4 on our machine).
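If you are unsure which value to use for X, a common choice is the number of CPU cores. The following sketch tries nproc (Linux), then sysctl (Mac OS X), and falls back to 1; the function name detect_jobs is illustrative only:

```shell
# Print a reasonable job count for make: the number of available CPU cores.
detect_jobs() {
  if command -v nproc >/dev/null 2>&1; then
    nproc                      # GNU coreutils, available on most Linux systems
  elif sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu          # Mac OS X
  else
    echo 1                     # conservative fallback
  fi
}

# Show the command that would be run, without running it.
echo "make -j $(detect_jobs)"
```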

To install the (static and shared) libraries, run

sudo make install

This typically installs the TileDB libraries in /usr/local/lib and the header files in /usr/local/include.

To create the examples, run:

make examples -j X

To run the code tests, run:

make check -j X

You can create the Doxygen documentation by running:

make doc

Finally, you can clean up the built files by running:

make clean

Please make sure to read the Compilation Flags and Troubleshooting sections below for some important information you may need depending on your machine and preferences.

Build on CentOS Linux

Step 1 - Install cmake, wget and git

You first need to install three useful programs, cmake, GNU wget and git. Simply run:

sudo yum install cmake
sudo yum install wget
sudo yum install git

Step 2 - Install zlib, OpenSSL, LZ4, Blosc and MPI

Simply run:

sudo yum install zlib-devel
sudo yum install openssl-devel
sudo yum install epel-release
sudo yum install lz4-devel
sudo yum install blosc-devel

For MPI, you can either install mpich with

sudo yum install mpich-devel

or OpenMPI with

sudo yum install openmpi-devel

Step 3 - Install Zstandard

First download and extract the source archive:

wget https://github.com/facebook/zstd/archive/v1.1.0.tar.gz
tar xf v1.1.0.tar.gz

Then install the library:

cd zstd-1.1.0
sudo make install PREFIX='/usr'

Step 4 - Install Google Test

Get the Google Test development release:

wget https://github.com/google/googletest/archive/release-1.7.0.tar.gz
tar xf release-1.7.0.tar.gz

Compile the sources:

cd googletest-release-1.7.0
cmake .
make

Move the header and library files:

sudo mv include/gtest /usr/include/gtest
sudo mv libgtest_main.a libgtest.a /usr/lib/

Step 5 - Install Doxygen

If you wish to compile the TileDB code documentation, you need Doxygen. Simply run:

sudo yum install doxygen

Step 6 - Clone the TileDB Repo

Run:

git clone https://github.com/Intel-HLS/TileDB.git

This will create a directory TileDB in your current working directory, containing the TileDB source files.

Step 7 - Build TileDB

Now you are ready to build TileDB. First, enter the TileDB source file directory:

cd <TileDB source file directory path>

Then, create a new directory, say build, and enter this directory:

mkdir build
cd build

Next, run

cmake ..

This automatically finds all TileDB dependencies and creates the Makefile. Subsequently, create the TileDB C library:

make -j X

where X is the level of parallelism for make (e.g., we use 4 on our machine).

To install the (static and shared) libraries, run

sudo make install

This typically installs the TileDB libraries in /usr/local/lib and the header files in /usr/local/include.

To create the examples, run:

make examples -j X

To run the code tests, run:

make check -j X

You can create the Doxygen documentation by running:

make doc

Finally, you can clean up the built files by running:

make clean

Please make sure to read the Compilation Flags and Troubleshooting sections below for some important information you may need depending on your machine and preferences.

Build on Mac OS X

Mac OS X uses Clang as its default C++ compiler, which unfortunately does not support OpenMP. Therefore, the TileDB builder deactivates OpenMP under Mac OS X, which comes at some small performance cost (e.g., internal sorting is not parallel any more when loading unsorted sparse cells into an array).

Step 1 - Install Homebrew

First, you need to install Homebrew, which is a package manager for Mac OS X. After installing Homebrew, you will be able to install packages to your Mac by running:

brew install <package-name>

You can find the installation guidelines on its website, but an easy way to install Homebrew is by running the following command:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Step 2 - Install cmake, wget and git

You also need to install three useful programs, cmake, GNU wget and git. Simply run

brew install cmake
brew install wget
brew install git

Step 3 - Install zlib, OpenSSL, LZ4 and MPI

Simply run

brew install zlib
brew install openssl
brew install lz4

For MPI, you can either install mpich with

brew install mpich

or OpenMPI with

brew install openmpi

Step 4 - Install Blosc

First clone the Blosc Github repo:

git clone https://github.com/Blosc/c-blosc.git

Then run:

cd c-blosc
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX='/usr' ..
cmake --build .
ctest
sudo cmake --build . --target install

Step 5 - Install Zstandard

First download and extract the source archive:

wget https://github.com/facebook/zstd/archive/v1.1.0.tar.gz
tar xf v1.1.0.tar.gz

Then install the library:

cd zstd-1.1.0
sudo make install PREFIX='/usr'

Step 6 - Install Google Test

Get the Google Test development release:

wget https://github.com/google/googletest/archive/release-1.7.0.tar.gz
tar xf release-1.7.0.tar.gz

Compile the sources:

cd googletest-release-1.7.0
cmake .
make

Move the header and library files:

sudo mv include/gtest /usr/include/gtest
sudo mv libgtest_main.a libgtest.a /usr/lib/

Note: The last commands may not work on your Mac due to Apple's System Integrity Protection. If that is the case, you need to reboot your system and press cmd+r while it is booting to enter recovery mode. Then go to Utilities > Terminal from the menu at the top, and run the following commands:

csrutil disable
reboot 

This will reboot your system, and then you will be able to move the header and library files as described above.
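To find out whether System Integrity Protection is what blocks the move in the first place, you can query its state from a normal terminal session with csrutil status. The sketch below wraps that check; the function name sip_is_enabled is illustrative, and the sketch assumes the status line has the form "System Integrity Protection status: enabled.":

```shell
# Read `csrutil status` output on stdin and succeed if SIP is reported as enabled.
sip_is_enabled() {
  grep -q 'System Integrity Protection status: enabled'
}

# csrutil only exists on Mac OS X, so skip the check elsewhere.
if command -v csrutil >/dev/null 2>&1; then
  if csrutil status | sip_is_enabled; then
    echo "SIP is enabled; moving files into /usr may be blocked"
  else
    echo "SIP is not reported as enabled"
  fi
fi
```

Once the files are in place, you will probably want to turn the protection back on, which should be possible by running csrutil enable from the same recovery-mode terminal.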

Step 7 - Install Doxygen

If you wish to compile the TileDB code documentation, you need Doxygen. Simply run:

brew install doxygen

Step 8 - Clone the TileDB Repo

Run:

git clone https://github.com/Intel-HLS/TileDB.git

This will create a directory TileDB in your current working directory, containing the TileDB source files.

Step 9 - Build TileDB

Now you are ready to build TileDB. First, enter the TileDB source file directory:

cd <TileDB source file directory path>

Then, create a new directory, say build, and enter this directory:

mkdir build
cd build

Next, run

cmake ..

This automatically finds all TileDB dependencies and creates the Makefile. Subsequently, create the TileDB C library:

make -j X

where X is the level of parallelism for make (e.g., we use 4 on our machine).

To install the (static and shared) libraries, run

sudo make install

This typically installs the TileDB libraries in /usr/local/lib and the header files in /usr/local/include.

To create the examples, run:

make examples -j X

To run the code tests, run:

make check -j X

You can create the Doxygen documentation by running:

make doc

Finally, you can clean up the built files by running:

make clean

Please make sure to read the Compilation Flags and Troubleshooting sections below for some important information you may need depending on your machine and preferences.

Build on Lustre

TileDB runs on Lustre. Lustre is a parallel block-level file system (FS) enabling clients from multiple nodes to simultaneously read from or write to the mounted directories. It is also POSIX-compliant, which means that local FS file control functions such as read, write and fcntl work out of the box.

TileDB uses POSIX file locks to ensure consistency. Lustre supports POSIX file locking semantics and exposes local (mount with -o localflock) and cluster (mount with -o flock) level consistent locking. Hence, to run TileDB on Lustre, the file system must be mounted with one of these options. If the concurrent processes writing to a file all run on the same node, consider mounting the Lustre client with localflock instead of flock, as this reduces remote procedure calls to the Lustre metadata server.
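A quick way to see whether an existing Lustre mount already has one of these options is to inspect its mount options. The sketch below is a rough check under two assumptions: that the client is a Linux machine exposing /proc/mounts, and that the flock or localflock option appears verbatim in the option string there. The function name has_lustre_flock is illustrative:

```shell
# Succeed if a comma-separated mount option string contains flock or localflock.
has_lustre_flock() {
  case ",$1," in
    *,flock,*|*,localflock,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Walk /proc/mounts (device, mount point, type, options, ...) and report
# every Lustre mount together with its locking state.
if [ -r /proc/mounts ]; then
  while read -r dev mnt fstype opts rest; do
    [ "$fstype" = "lustre" ] || continue
    if has_lustre_flock "$opts"; then
      echo "$mnt: file locking enabled"
    else
      echo "$mnt: mounted without flock/localflock"
    fi
  done < /proc/mounts
fi
```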

Compilation Flags

You can pass various flags to cmake before building TileDB. Specifically:

  • To change the default installation directory (typically /usr/local), set -DCMAKE_INSTALL_PREFIX. For instance, cmake -DCMAKE_INSTALL_PREFIX=/usr sets the installation library directory to /usr/lib and header directory to /usr/include. Note that you will later need to run make install with sudo if the installation directory requires elevated permissions.

  • To control the type of build, set -DCMAKE_BUILD_TYPE=Debug for debug mode, and -DCMAKE_BUILD_TYPE=Release for release mode (the default is release mode if you omit this flag).

  • To enable verbose error messages, add flag -DTILEDB_VERBOSE=1 (the default is 0, i.e., no error messages).

  • To enable MPI, use flag -DUSE_MPI=1.

  • To enable OpenMP for multi-threading, add flag -DUSE_OPENMP=1 (the default is 0, i.e., OpenMP is disabled). Note that OpenMP is not supported with Clang on Mac OS X and, thus, this flag will be ignored.

  • TileDB needs your MAC address for creating certain folder names (this is important if you use TileDB on a parallel file system). The MAC address is typically retrieved through a network interface, e.g., "eth0" on recent Ubuntu versions and "en0" on Mac OS X. If your network interface is not "eth0" on Ubuntu or "en0" on Mac OS X, you need to find which interface carries your MAC address (by running ifconfig in a terminal) and then set it with flag -DMAC_ADDRESS_INTERFACE, e.g., cmake -DMAC_ADDRESS_INTERFACE=em0.

  • You can tune the compression level of GZIP, Zstandard and Blosc by setting flags -DCOMPRESSION_LEVEL_GZIP, -DCOMPRESSION_LEVEL_ZSTD and -DCOMPRESSION_LEVEL_BLOSC, respectively. If you omit these flags, TileDB will use their respective defaults.

  • You can enable/disable parallel sort in the case of sparse arrays when you write in unsorted mode through flag -DUSE_PARALLE_SORT, e.g., cmake -DUSE_PARALLE_SORT=1 enables the parallel sort. Note that parallel sort is disabled by default.

  • In case cmake does not find your (already installed) OpenSSL library (this usually happens in Mac OS X), you need to manually specify the OpenSSL root directory with flag -DOPENSSL_ROOT_DIR.

  • In case you have multiple MPI compilers (e.g., both mpich and OpenMPI) and you need to choose a particular one, or in case cmake does not properly find your MPI compiler or library, depending on your machine, you may need to set one or more MPI flags that come with cmake, such as the following (use full paths): -DMPI_C_COMPILER, -DMPI_CXX_COMPILER, -DMPI_C_INCLUDE_PATH, -DMPI_CXX_INCLUDE_PATH, -DMPI_C_LIBRARIES, and -DMPI_CXX_LIBRARIES. Contact us if you need more help here.
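Several of the flags above can be combined in a single cmake invocation. The sketch below only assembles and prints such a command so that you can review it before running it from your build directory. The function name tiledb_cmake_command and the particular choice of flag values are illustrative, not recommended settings:

```shell
# Print a cmake command line that combines a few of the flags described above.
# An optional first argument names the network interface carrying the MAC address.
tiledb_cmake_command() {
  iface=${1:-}
  set -- cmake \
    -DCMAKE_INSTALL_PREFIX=/usr \
    -DCMAKE_BUILD_TYPE=Debug \
    -DTILEDB_VERBOSE=1
  if [ -n "$iface" ]; then
    set -- "$@" "-DMAC_ADDRESS_INTERFACE=$iface"
  fi
  # The trailing ".." points cmake at the source directory above build/.
  printf '%s ..\n' "$*"
}

tiledb_cmake_command em0
```

Printing the command instead of executing it keeps the sketch free of side effects; to actually configure the build, copy the printed line into your terminal.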

Troubleshooting

Trouble compiling TileDB with (mpich) MPI enabled in Mac OS X: When building TileDB with make and using mpich, if you see an error message complaining about the gfortran library missing, simply run:

brew reinstall gcc
brew reinstall mpich

This should fix any potential inconsistencies between the gcc and mpich versions.

Trouble running (mpich) MPI programs in Mac OS X: We came across the following problem in Mac OS X when running an MPI-based program with mpiexec (with the mpich version of MPI): the program failed with an unclear error message about the hostname. To fix it, first go to System Preferences > Sharing and make sure Remote Login is enabled. Then, set a proper host name by running:

sudo scutil --set HostName your-host-name

where your-host-name is a name of your choice. Next, add the line 127.0.0.1 your-host-name to the /etc/hosts file. Finally, add your public key to the authorized keys file by running:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

The above should do the trick.
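If the problem persists, it may be worth double-checking that the host name really made it into the hosts file. The following sketch performs that check; the function name hosts_maps_loopback is illustrative, and the sketch assumes the usual hosts file layout of an address followed by one or more names:

```shell
# Succeed if hosts file $1 contains a 127.0.0.1 entry listing host name $2.
hosts_maps_loopback() {
  awk -v name="$2" '
    $1 == "127.0.0.1" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break        # ignore trailing comments
        if ($i == name) found = 1
      }
    }
    END { exit !found }
  ' "$1"
}

# On Mac OS X, compare the configured HostName against /etc/hosts.
if command -v scutil >/dev/null 2>&1; then
  name=$(scutil --get HostName 2>/dev/null || true)
  if [ -n "$name" ] && hosts_maps_loopback /etc/hosts "$name"; then
    echo "$name is mapped to 127.0.0.1"
  else
    echo "No 127.0.0.1 entry found for the configured host name"
  fi
fi
```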

Trouble running (openmpi) MPI programs in CentOS: When running (OpenMPI) MPI programs with mpirun in CentOS, we came across an error where the program was hanging on MPI_Init and then timed out with an error mentioning hfi_wait_for_device. In this case, try to pass --mca pml ob1 to the mpirun command. This fixed the problem on our machine.
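If you need this workaround regularly, a small wrapper saves retyping the option. This is only a convenience sketch; the function name mpirun_ob1 and the program name in the usage comment are hypothetical:

```shell
# Run mpirun with the ob1 point-to-point layer selected, passing all
# remaining arguments through unchanged.
mpirun_ob1() {
  mpirun --mca pml ob1 "$@"
}

# Usage (my_mpi_program is a placeholder for your own binary):
#   mpirun_ob1 -np 4 ./my_mpi_program
```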