Distributed Join on Social Network Data

Application to perform join operations using different distributed algorithms. In particular, can be used to detect subgraphs in social network graphs.

Authors

Cauim de Souza Lima
Victor Hugo Vianna

Requirements

OpenMPI
Boost lib

Build

If you're running the code from Polytechnique's cluster, set SALLES_DINFO=true in the Makefile (be careful with the blank spaces), otherwise set SALLES_DINFO=false. Then open a shell and run "make" from the root directory. This will prompt the creation of two files: bin/test_join and bin/test_triangles

Testing join

tests/join/relations should contain all the relation files you want to join, in the usual format. Ex:

tests/join/relations/relation1.txt
10 0
10 1

tests/join/inputs should contain a file with the join information. Such file must start with the number N of relations you want to join, followed by N lines. Each line must have the arity of the relation, the name of its file and the variable indices. Ex:

tests/join/input/test1.txt
2
2 relation1.txt 0 1
2 relation1.txt 0 2

In the example above, the first line means that we are going to join 2 relations. The first has an arity of 2, it's contained in "tests/join/inputs/relation1.txt" and the variables are (x₀, x₁). Similarly, the second one has arity 2, the same file will be used, but now its variables are (x₀, x₂).
If you want, you can add a verification file to tests/join/output with the SAME NAME as your input. This file will be compared to the output of the program. In this case we could have

tests/join/output/test1.txt
10 0 0
10 0 1
10 1 0
10 1 1

To run the program, open a shell from the ROOT DIRECTORY and type in

salloc -n <# of processes> mpirun -x LD_LIBRARY_PATH=./lib bin/test_join test1.txt <algorithm> <hash>

(If you're not running from Polytechnique, remove the "-x LD_LIBRARY_PATH=./lib")
<algorithm> option can be (without quotes): "sequential", "normal_distrib", "optimized_distrib", "hypercube"
<hash> option can be (without quotes): "mod_hash", "mult_hash", "murmur_hash" (if no hash is provided, mod_hash will be used)

Testing triangles

Same as above, except that the input to the program is just a single relation file tests/triangles/inputs/input1.txt where we will detect every possible triangle. Ex:

tests/triangles/inputs/input1.txt
0 1
1 2
2 0

Should output
0 1 2
1 2 0
2 0 1

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
include		include
lib		lib
src		src
tests		tests
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Distributed Join on Social Network Data

Authors

Requirements

Build

Testing join

Testing triangles

About

Releases

Packages

Languages

victorvianna/network-join

Folders and files

Latest commit

History

Repository files navigation

Distributed Join on Social Network Data

Authors

Requirements

Build

Testing join

Testing triangles

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages