- Main contributions of the paper
- The paper presents a high-speed Minimum Spanning Tree (MST) implementation for GPUs
- The paper describes the optimizations that make the implementation fast, and provides empirical evidence to demonstrate the benefit of these optimizations
- The paper evaluates the resulting implementation on 2 GPUs and 2 CPUs using 17 input graphs from various domains to compare with state-of-the-art GPU-parallel, CPU-parallel, and CPU-serial MST implementations
- Role of the computational artifact
- The computational artifact can run the described MST implementation as well as the codes from the related work with the 17 input graphs and compute the median runtimes
- It can also run various de-optimized versions of our MST code and compute the median runtimes
- Software architecture
- The DOI of the artifact is 10.5281/zenodo.8214364
- The Git repository is at https://github.com/alexfallin/MST-Artifact-SC23/
- The codes from the related work, our MST code, the de-optimized codes, and the input files each have a dedicated directory
- Each directory contains build and run scripts that install and run the code in that directory, respectively
- The “set_up.sh” script downloads the 17 inputs, converts them into the various needed formats, and runs each build script to install the codes
- The “run_all_compare.sh” script executes the timed experiments for each code and writes the results to [code]_out.csv files that are placed in the root folder
- The “run_all_deoptimize.sh” script runs the de-optimization experiments and writes the results to ecl_mst_[deopts]_out.csv files
- The “generate_compare_tables.py” and “generate_deopt_tables.py” scripts read the output files and convert them into CSV tables similar to those shown in the paper
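The table-generation step boils down to grouping the per-run timings by input graph and taking the median of each group. The following is a minimal sketch of that aggregation, not the artifact's actual script; the CSV column names ("input", "runtime") and the example graph names are assumptions for illustration.

```python
# Sketch of the median-runtime aggregation performed when turning the
# [code]_out.csv files into result tables. Column names are assumed.
import csv
import io
import statistics
from collections import defaultdict

def median_runtimes(csv_text):
    """Group per-run timings by input graph and return the median of each."""
    runs = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        runs[row["input"]].append(float(row["runtime"]))
    return {graph: statistics.median(times) for graph, times in runs.items()}

# Hypothetical excerpt of one code's output file (three runs, then two).
sample = """input,runtime
USA-road.egr,0.51
USA-road.egr,0.49
USA-road.egr,0.50
kron21.egr,1.20
kron21.egr,1.10
"""
print(median_runtimes(sample))
```

Taking the median rather than the mean makes the reported runtimes robust against occasional slow runs caused by system noise.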
- Extent of reproducibility
- The computational artifacts can run our code, the codes from the literature we used for the performance comparison, and our de-optimized codes and can reproduce the result figures from the paper
- Experiment workflow
- Install all the codes and input files using the “set_up.sh” script
- Run the “run_all_compare.sh” and “run_all_deoptimize.sh” scripts to produce the result CSV files
- Run the “generate_*_tables.py” scripts to generate the tables that show the results of the above experiments
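The workflow above amounts to the following command sequence. This is a sketch assuming the scripts are invoked from the artifact's root directory; consult the repository README for the exact invocations.

```shell
# Sketch of the full experiment workflow, run from the artifact root.
./set_up.sh                          # download inputs, convert formats, build all codes
./run_all_compare.sh                 # ~48 h: timed comparison runs -> [code]_out.csv
./run_all_deoptimize.sh              # ~2 h: de-optimization runs -> ecl_mst_[deopts]_out.csv
python3 generate_compare_tables.py   # combined comparison tables
python3 generate_deopt_tables.py     # combined de-optimization tables
```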
- Execution time
- About 48 hours to run the related work comparisons
- About 2 hours to run the de-optimization experiments
- Expected results
- The CSV files that contain the runtimes for each code
- Combined CSV tables for both throughput and runtime
- How the results relate to the paper results
- The results should show runtime and throughput relationships between the implementations that are similar to those reported in the paper
- The results will differ depending on the CPU and GPU that are used
- Hardware resources
- We used a 3.5 GHz Ryzen Threadripper 2950X CPU, a 2.9 GHz Xeon Gold 6226R CPU, a 1.2 GHz TITAN V GPU, and a 1.67 GHz RTX 3080 Ti GPU for the experiments
- The minimum hardware requirement is a multi-core CPU and a CUDA-capable GPU with a Compute Capability of at least 7.0
- Operating system
- We used Fedora Linux 34 and 36. No specific operating system is required to run the experiments
- Software libraries needed
- We used GCC 11.3.1 and 12.2.1 with the “-O3 -march=native” flags to compile the codes and NVCC 11.7 and 12.0 with the “-arch=sm_70” and “-arch=sm_86” flags for the CUDA codes
- The related works require:
- Boost library 1.58+
- libllvm 7.0+
- libfmt 4.0+
- gcc version 9.3+
- cmake version 3.26+
- CUDA 11.0+
- NVIDIA driver 450.80.02+
- Anaconda 23.5+
- Python 3.0+
- Matplotlib
- NumPy
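The compiler settings listed above correspond roughly to the following invocations. This is a sketch: the source and binary file names are placeholders, not taken from the artifact, and the build scripts handle the real invocations.

```shell
# Host (CPU) codes: GCC with full optimization, tuned for the local machine.
g++ -O3 -march=native -o mst_cpu mst_cpu.cpp

# CUDA codes: NVCC targeting Compute Capability 7.0 (Titan V)
# or 8.6 (RTX 3080 Ti).
nvcc -O3 -arch=sm_70 -o ecl_mst ecl_mst.cu   # Volta
nvcc -O3 -arch=sm_86 -o ecl_mst ecl_mst.cu   # Ampere
```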
- Input datasets needed
- We used 17 inputs for the experiments. They can be downloaded using either “set_up.sh” in the root directory of the artifact or “download_inputs.sh” in the Inputs folder
- The only constraints on inputs are that they must be in binary 32-bit CSR format and that they must fit in the selected GPU’s memory
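To illustrate the expected input layout, here is a minimal sketch of converting an edge list into 32-bit CSR arrays. The artifact's converter scripts produce the real files; the exact on-disk binary header is not specified here, so this shows only the in-memory CSR arrays, and the example graph is hypothetical.

```python
# Minimal sketch: build a 32-bit CSR representation of a weighted graph.
# array("I") gives unsigned 32-bit integers on typical platforms.
from array import array

def edges_to_csr(num_nodes, edges):
    """edges: list of (src, dst, weight); returns (indptr, indices, weights)."""
    # Count outgoing edges per node to size each adjacency row.
    counts = [0] * num_nodes
    for src, _, _ in edges:
        counts[src] += 1
    # Prefix sums give the row offsets (indptr has num_nodes + 1 entries).
    indptr = array("I", [0] * (num_nodes + 1))
    for v in range(num_nodes):
        indptr[v + 1] = indptr[v] + counts[v]
    # Scatter each edge into its row's next free slot.
    indices = array("I", [0] * len(edges))
    weights = array("I", [0] * len(edges))
    fill = list(indptr[:-1])
    for src, dst, w in edges:
        pos = fill[src]
        indices[pos] = dst
        weights[pos] = w
        fill[src] += 1
    return indptr, indices, weights

# Tiny example: triangle 0-1-2, each undirected edge listed in both directions.
indptr, indices, weights = edges_to_csr(
    3, [(0, 1, 5), (0, 2, 3), (1, 0, 5), (1, 2, 1), (2, 0, 3), (2, 1, 1)]
)
```

The GPU-memory constraint follows directly from this layout: the graph occupies roughly 4 * (num_nodes + 1 + 2 * num_edges) bytes, which must fit alongside the algorithm's working set.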
- Process to install and compile the codes
- The “set_up.sh” script downloads the inputs, converts them to the various required formats, and downloads and builds the related works
- Process to deploy the codes
- Use the “run_all_compare.sh” script to run each code and compare their runtimes. The estimated time to complete these runs is 48 hours, depending on system speed
- Use the “run_all_deoptimize.sh” script to run the de-optimization experiments from the paper. The estimated runtime of these experiments is 2 hours, depending on system speed