FEMTO is an indexing and search system for queries on sequences of bytes. FEMTO stands for the FM-index for External Memory with Throughput Optimizations. This tool supports building large indexes in parallel with MPI and then searching large indexes with a multithreaded server.
FEMTO requires a 64-bit machine to build and test. 32-bit machines are supported for search only. FEMTO is known to build with GCC for Linux/x86-64.
To build FEMTO from a release tarball, you will need a C++ compiler, libssl-dev, and optionally MPI. When building from a source checkout, you will also need flex, bison, autotools, and libtool. FEMTO has been built successfully with GNU bison 2.5 and 2.4.1.
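One way to confirm that the tools named above are on the PATH before configuring is sketched below. The helper name check_tools is hypothetical and not part of FEMTO, g++ stands in for whichever C++ compiler you use, and executable names such as libtoolize can differ between distributions, so treat the tool list as an assumption to adjust.

```shell
#!/bin/sh
# check_tools: hypothetical helper, not part of FEMTO.
# Prints each named tool that is not on the PATH; returns 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools assumed for a source checkout; names may vary on your system.
check_tools g++ flex bison autoconf automake libtoolize \
    || echo "install the missing tools before running autogen.sh"
```

For a release tarball, only the compiler check applies, since the configure script is already generated.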
MPI is required for parallel index construction. Note that MPI runs across machines of different endianness are not supported.
If you'd like to use MPI-parallel index construction, you'll need to install an MPI implementation that supports threads. We have used OpenMPI 1.8.8, configured in the following way:
./configure --prefix=/opt/openmpi1.8.8 --enable-mpirun-prefix-by-default --enable-mpi-thread-multiple --with-threads
make
make install # on all compute nodes
# To make sure mpirun and mpicc are in the path for use with FEMTO
export PATH=$PATH:/opt/openmpi1.8.8/bin
export LD_LIBRARY_PATH=/opt/openmpi1.8.8/lib
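After installing, it can help to confirm that the MPI build actually reports thread support. The sketch below assumes OpenMPI's ompi_info tool is on the PATH and that its output contains a "Thread support" line; the function name mpi_thread_report is hypothetical, and other MPI implementations will need a different check.

```shell
#!/bin/sh
# mpi_thread_report: hypothetical helper, not part of FEMTO or OpenMPI.
# Prints the thread-support line(s) from the given info command
# (default: ompi_info). Returns non-zero if the command is missing
# or no such line is found.
mpi_thread_report() {
    info_cmd=${1:-ompi_info}
    if ! command -v "$info_cmd" >/dev/null 2>&1; then
        echo "$info_cmd not found; is the OpenMPI bin directory on PATH?"
        return 1
    fi
    "$info_cmd" | grep -i 'thread support'
}

mpi_thread_report || true
```

If the reported line does not mention MPI_THREAD_MULTIPLE as enabled, the configure flags above were probably not applied to the MPI installation that is first on the PATH.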
Make sure you have met the requirements first!
We recommend starting with a FEMTO release tarball from https://github.com/femto-dev/femto/releases .
If you prefer to use a source checkout, there are additional build dependencies: flex, bison, autotools, and libtool, as listed in the requirements above.
If you're starting from a source checkout, as with
git clone https://github.com/femto-dev/femto.git
cd femto
you will need to also generate the configure script:
sh autogen.sh
To build FEMTO, issue the following commands:
./configure
make
You'll see many warnings about things that are declared or defined but not used; this is normal and not a problem. If you get errors and compilation fails, you may not have all the required development libraries installed. For example, if the build is running g++ and fails to find -lssl, you need to install the libssl development package (libssl-dev on Debian-based systems).
To run the included unit tests, use
make check
To install FEMTO in a particular place, be sure to include --prefix in your configure line, as in
./configure --prefix ~/femto_install
As usual,
make install
will install the FEMTO tools to the destination specified by ./configure.
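To call the installed tools by name, their directory needs to be on the PATH. The fragment below is a sketch that assumes the prefix from the example above and the usual autotools layout, in which executables land in bin/ under the prefix; this README does not state the install layout, so check where make install actually placed the tools. The variable name FEMTO_PREFIX is only illustrative.

```shell
# Assumes --prefix ~/femto_install and the conventional bin/ subdirectory.
FEMTO_PREFIX="$HOME/femto_install"
export PATH="$PATH:$FEMTO_PREFIX/bin"
```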
You can also run the commands from the build directory.
See src/mod_femto/README for information about installing the FEMTO apache module.
To build an index, run
femto/src/dcx_cc/femto_index --tmp /path/to/tmp_dir \
  --outfile index.femto \
  files_or_directories_to_index
Then, to query the index, use femto_search. To count the number of occurrences (quickly!), use:
femto/src/main_cc/femto_search /path/to/index_dir --count 'pattern'
To report documents that matched (time depends on the number reported), use:
femto/src/main_cc/femto_search /path/to/index_dir 'pattern'
To report documents and offsets that matched (time depends on the number reported), use:
femto/src/main_cc/femto_search /path/to/index_dir --offsets 'pattern'
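When several patterns need counting, the --count form above can be wrapped in a small loop. This is a sketch only: count_patterns and the FEMTO_SEARCH variable are hypothetical names, not part of FEMTO, and the script makes no assumption about what femto_search prints beyond writing it after each pattern.

```shell
#!/bin/sh
# Path to femto_search in the build tree; set FEMTO_SEARCH to override it.
: "${FEMTO_SEARCH:=femto/src/main_cc/femto_search}"

# count_patterns INDEX PATTERN...: hypothetical helper, not part of FEMTO.
# Runs one --count query per pattern, printing the pattern before each result.
count_patterns() {
    index=$1
    shift
    for pattern in "$@"; do
        printf '%s: ' "$pattern"
        "$FEMTO_SEARCH" "$index" --count "$pattern"
    done
}

# Example: count_patterns /path/to/index_dir 'error' 'warning'
```

Each pattern starts a separate femto_search process, which keeps the sketch simple at the cost of some startup overhead per query.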
To learn more about what kinds of patterns you can use, see femto/src/main/QUERY_FORMAT.txt
FEMTO source includes the Google RE2 package, jQuery, jQuery SlickGrid, and jQuery SVG.