These instructions cover a single-node benchmark. We provide two methods for cluster setup:
- k8s/README.md – Deploy TigerGraph containers using Kubernetes (k8s)
- benchmark_on_cluster/README.md – Manually install and configure TigerGraph
The TigerGraph implementation expects the data to be in the composite-projected-fk CSV layout. To generate data that conforms to this requirement, run Datagen with the `--explode-edges` option. In Datagen's directory (`ldbc_snb_datagen_spark`), issue the following commands. We assume that the Datagen project is built and that the `${PLATFORM_VERSION}` and `${DATAGEN_VERSION}` environment variables are set correctly.
```bash
export SF=desired_scale_factor # for example SF=1
export LDBC_SNB_DATAGEN_MAX_MEM=available_memory # for example LDBC_SNB_DATAGEN_MAX_MEM=8G
export LDBC_SNB_DATAGEN_JAR=$(sbt -batch -error 'print assembly / assemblyOutputPath')
rm -rf out-sf${SF}/
tools/run.py \
  --cores $(nproc) \
  --memory ${LDBC_SNB_DATAGEN_MAX_MEM} \
  -- \
  --format csv \
  --scale-factor ${SF} \
  --explode-edges \
  --mode bi \
  --output-dir out-sf${SF} \
  --format-options compression=gzip
```
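The `LDBC_SNB_DATAGEN_MAX_MEM` value must fit the machine. As a rough, unofficial helper (an assumption on our part, not part of the Datagen tooling), it can be derived from `/proc/meminfo` on Linux:

```bash
# Hypothetical helper: reserve ~80% of total memory for Datagen (Linux only).
total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
export LDBC_SNB_DATAGEN_MAX_MEM="$(( total_kb * 8 / 10 / 1024 / 1024 ))G"
echo "LDBC_SNB_DATAGEN_MAX_MEM=${LDBC_SNB_DATAGEN_MAX_MEM}"
```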
- To download and use the sample data set, run:

  ```bash
  scripts/get-sample-data-set.sh
  ```
- To use other data sets, adjust the variables in scripts/configure-data-set.sh:
  - `TG_DATA_DIR` – a directory containing the `initial_snapshot`, `inserts`, and `deletes` directories
  - `TG_LICENSE` – optional; if not specified, a trial license is used, which is sufficient for SF30 and smaller
  - `SF` – scale factor
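As an illustration, the variables could be exported as below before sourcing the script, with a quick sanity check that the expected directory layout exists. The paths and values here are placeholders, not defaults:

```bash
# Placeholder values -- substitute your own data directory and scale factor
export TG_DATA_DIR=/data/ldbc/out-sf1
export SF=1

# Sanity check: the layout described above should be present
for d in initial_snapshot inserts deletes; do
  if [ ! -d "${TG_DATA_DIR}/${d}" ]; then
    echo "warning: missing ${TG_DATA_DIR}/${d}"
  fi
done
```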
- Run:

  ```bash
  . scripts/configure-data-set.sh
  ```
- Load the data:

  ```bash
  scripts/load-in-one-step.sh
  ```

  This step may take a while, as it is responsible for defining the schema, loading the data, and installing the queries. The TigerGraph container's terminal can be accessed via:

  ```bash
  docker exec --user tigergraph -it snb-bi-tg bash
  ```

  If a web browser is available, TigerGraph GraphStudio can be accessed at http://localhost:14240/.
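To check from the shell whether GraphStudio is up, a probe along these lines can be used. This is a convenience sketch, not part of the provided scripts, and assumes `curl` is installed:

```bash
# Probe the GraphStudio endpoint; -s silences progress, -f fails on HTTP errors
if curl -sf http://localhost:14240/ > /dev/null; then
  echo "GraphStudio is reachable"
else
  echo "GraphStudio is not reachable"
fi
```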
- The substitution parameters should be generated using the `paramgen` tool.
Test loading the microbatches:

```bash
scripts/batches.sh
```

Note that loading the batches changes the data set, so the database should be restored to its freshly loaded state before running the benchmark; use the scripts/backup-database.sh and scripts/restore-database.sh scripts to achieve this.
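The three steps can be chained as sketched below. This is illustrative rather than authoritative: it assumes it is run from the repository root and skips any script that is missing:

```bash
# Back up, test the microbatches, then restore the freshly loaded state
for s in scripts/backup-database.sh scripts/batches.sh scripts/restore-database.sh; do
  if [ -x "$s" ]; then
    "$s"
  else
    echo "skipping: $s not found or not executable"
  fi
done
```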
To run the queries, issue:

```bash
scripts/queries.sh
```

For a test run, use:

```bash
scripts/queries.sh --test
```

Results are written to output/output-sf${SF}/results.csv and output/output-sf${SF}/timings.csv.
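For a quick look at the measurements, the CSV files can be post-processed with standard tools. The header and columns below are hypothetical -- check the actual file and adjust the field index to match:

```bash
# Build a small example file in an assumed format
cat > /tmp/timings-example.csv <<'EOF'
query,duration_ms
bi1,1200
bi2,800
EOF

# Average the duration column (field 2), skipping the header line
awk -F, 'NR > 1 { sum += $2; n++ } END { printf "mean: %.1f ms\n", sum / n }' /tmp/timings-example.csv
```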
To run the benchmark, issue:

```bash
scripts/benchmark.sh
```
- TigerGraph's datetime type stores epoch timestamps in seconds, while datetimes in the LDBC SNB benchmark have millisecond precision. We therefore store datetime values as INT64 (milliseconds) and use user-defined functions for conversion. Datetime values in the data set are treated as local time. An INT64 datetime `value` in milliseconds can be converted to a datetime using `epoch_to_datetime(value/1000)`.
- The user-defined functions are in `ExprFunctions.hpp` (for queries) and `TokenBank.cpp` (for the loader).
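The millisecond-to-second conversion can be sanity-checked from the shell. This is only an illustration of the division the UDFs perform, not the UDFs themselves, and it assumes GNU `date` (the timestamp is an arbitrary example, interpreted here as UTC rather than local time):

```bash
ms=1277510400000       # example INT64 datetime value in milliseconds
s=$(( ms / 1000 ))     # drop the millisecond precision, as the conversion does
date -u -d "@${s}" +%Y-%m-%dT%H:%M:%S
# 2010-06-26T00:00:00
```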