CXS

CXS (originally CXS338) is a fork of MIT Haystack's CorrelX VLBI correlator, which A.J. Vazquez Alvarez developed during a postdoctoral research position at MIT Haystack in 2015-2017. The original project's main objectives were "scalability, flexibility and simplicity". CXS aims to add "performance" to that list.

CXS started as a migration of CorrelX to Apache Spark, carried out in 2021 as part of this author's Master's thesis on Big Data at UNED. It was a proof of concept with the following objectives:

Simplifying architecture and usage (simplicity).
Migrating from Python 2 to Python 3 (flexibility).
Migrating from Hadoop to Spark (performance).
Running a test correlation on a cloud computing service (scalability).

Versions

About the naming convention:

CXH227: CorrelX on Hadoop 2, Python 2.7 (CorrelX legacy).
CXPL38: CorrelX on Pipeline, Python 3.8.
CXS338: CorrelX on Spark 3, Python 3.8.
CXS3311: CorrelX on Spark 3, Python 3.11.
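
The convention above can be read as "CX", a platform code, an optional platform version digit, and the Python version digits. The following is a minimal sketch of a decoder under that reading. It is not part of the repository: the function name `decode_version_name` is hypothetical, and the rules (one-digit platform version for Hadoop and Spark, none for Pipeline, one-digit Python major version) are assumptions inferred from the four names listed.

```python
import re

# Platform codes as they appear in the version names listed above.
PLATFORMS = {"H": "Hadoop", "PL": "Pipeline", "S": "Spark"}


def decode_version_name(name):
    """Split a name such as 'CXS3311' into (platform, platform_version, python).

    Assumption: Hadoop and Spark names carry a one-digit platform version,
    Pipeline names carry none, and the Python major version is one digit.
    """
    match = re.fullmatch(r"CX(PL|H|S)(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized version name: {name!r}")
    code, digits = match.groups()

    platform_version = None
    if code != "PL":
        # The first digit is the platform version (e.g. Spark 3).
        platform_version, digits = digits[0], digits[1:]

    if len(digits) < 2:
        raise ValueError(f"missing Python version in: {name!r}")

    # The remaining digits are Python major (one digit) and minor.
    python_version = f"{digits[0]}.{digits[1:]}"
    return PLATFORMS[code], platform_version, python_version


if __name__ == "__main__":
    for name in ("CXH227", "CXPL38", "CXS338", "CXS3311"):
        print(name, decode_version_name(name))
```

Treating the Pipeline code separately avoids an ambiguity: without that rule, a name with three trailing digits could be read either as a platform version plus Python x.y or as Python x.yz.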

Configuration

Download Apache Spark 3.5.1 pre-built for Apache Hadoop 3:

wget https://ftp.cixug.es/apache/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3.tgz
tar -xvzf spark-3.5.1-bin-hadoop3.tgz

Create a virtual environment and install the requirements:

python3.11 -m venv venv3
source venv3/bin/activate
pip install -r requirements.pkg.txt
python cxs/tools/gen_symlinks.py

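Add the following lines to venv3/bin/activate (replace the SPARK_HOME path as required). Because `pwd` is evaluated each time the script is sourced, activate the environment from the repository root: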

export SPARK_HOME=/home/aj/spark-3.5.1-bin-hadoop3
export PYTHONPATH=$PYTHONPATH:`pwd`/src
export PYTHONPATH=$PYTHONPATH:`pwd`/cxs

Reactivate the environment so that the new variables take effect:

source venv3/bin/activate
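
After reactivating, a quick check can confirm that the variables are visible to Python. The following is a minimal sketch, not a script shipped with CXS: the function name `check_cxs_env` is hypothetical, and it only verifies that SPARK_HOME is set and that PYTHONPATH contains the `src` and `cxs` directories exported above.

```python
import os


def check_cxs_env(environ, repo_root):
    """Return a list of problems found in the given environment mapping.

    `environ` is a mapping such as os.environ; `repo_root` is the path of
    the CXS checkout. Both names are illustrative, not part of CXS.
    """
    problems = []

    # SPARK_HOME is exported in venv3/bin/activate.
    if not environ.get("SPARK_HOME", ""):
        problems.append("SPARK_HOME is not set")

    # PYTHONPATH should contain <repo_root>/src and <repo_root>/cxs.
    entries = [
        os.path.normpath(path)
        for path in environ.get("PYTHONPATH", "").split(os.pathsep)
        if path  # skip empty entries left by a leading separator
    ]
    for subdir in ("src", "cxs"):
        expected = os.path.normpath(os.path.join(repo_root, subdir))
        if expected not in entries:
            problems.append(f"PYTHONPATH is missing {expected}")

    return problems


if __name__ == "__main__":
    # Run from the repository root, mirroring the `pwd` used in activate.
    for problem in check_cxs_env(os.environ, os.getcwd()):
        print("WARNING:", problem)
```

Run from the repository root, the script prints nothing when the environment matches the configuration above and one warning line per missing setting otherwise.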

Basic Correlation

Pipeline

bash examples/run_example_vgos.sh

Hadoop

bash sh/configure_hadoop_cx.sh
bash examples/run_example_vgos_hadoop.sh

