
Pet project to create a Starhopping website for Astronomy. Exploring Concepts from Graph Databases, Apache Spark and Static Website hosting.

Star-Chart

The purpose of this code is to build a starhopping app using the features of AWS. (Starhopping, in amateur astronomy, is the process of finding objects of interest in the sky with a telescope or binoculars by navigating from nearby, easily identified stars.)

My data source is the HYG Database, available here: https://www.astronexus.com/hyg AND https://github.com/astronexus. The source dataset combines data from the Hipparcos Catalog, the Yale Bright Star Catalog and the Gliese Catalog. (I DO NOT own the dataset; refer to the links above for the original source.)

Video Tutorial explaining what I did: Video Tutorial

Screenshot of Front End...

Methodology:

Below is the planned methodology.

  1. Build the dataset of interesting Stars and DSOs.
  2. Calculate angular distances between them (I expect this to be a compute-intensive process and will use PySpark on AWS EMR for the computation; see the sketch after this list). Store the distances in a separate set of files.
  3. Load the dataset and distances into S3 as multiple .CSVs and use AWS Athena to query the data. This is a pseudo Graph Database with the Stars & DSOs table as the NODES and the distances between them as the EDGES.
  4. Build a front end using AWS Lambda and a static website hosted on S3, and expose it publicly.
  5. Optionally also load the data into Neo4j. This optimizes querying based on relationships.
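
For context on steps 2 and 3, here is a minimal sketch of what the pair-building and distance computation could look like in PySpark. It is illustrative only: the column names (`id`, `ra`, `dec`), the bucket paths, and the 10° cutoff are assumptions, not values taken from this repo. The angular separation between two sky positions follows from cos(theta) = sin(dec1)sin(dec2) + cos(dec1)cos(dec2)cos(ra1 - ra2).

```python
# Illustrative sketch of the pairwise angular-distance job (steps 2-3).
# Assumes a CSV with columns: id, ra (degrees), dec (degrees). Note the HYG
# catalog stores RA in hours, so you may need F.col("ra") * 15 first.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("angular-distances").getOrCreate()

stars = spark.read.csv("s3://your-bucket/combinedV2.csv",
                       header=True, inferSchema=True)

# Form all pairs (what create_pairs.py prepares), keeping only one ordering
# of each pair to halve the work.
a = stars.select(F.col("id").alias("id1"), F.col("ra").alias("ra1"),
                 F.col("dec").alias("dec1"))
b = stars.select(F.col("id").alias("id2"), F.col("ra").alias("ra2"),
                 F.col("dec").alias("dec2"))
pairs = a.crossJoin(b).where(F.col("id1") < F.col("id2"))

# Angular separation formula; clamp to [-1, 1] before acos to guard
# against floating-point overshoot.
r = F.radians
cos_theta = (F.sin(r("dec1")) * F.sin(r("dec2")) +
             F.cos(r("dec1")) * F.cos(r("dec2")) * F.cos(r("ra1") - r("ra2")))
theta = F.degrees(F.acos(F.greatest(F.lit(-1.0), F.least(F.lit(1.0), cos_theta))))

edges = pairs.withColumn("angular_distance_deg", theta)

# Keep only nearby pairs so the edge list stays a manageable size, then
# write the EDGES out for Athena to query.
edges.where(F.col("angular_distance_deg") < 10.0) \
     .write.mode("overwrite").csv("s3://your-bucket/distances/", header=True)
```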

Current State:

I have been able to create a queryable database using AWS Athena, query it from Lambda, and provide external access via API Gateway and a static webpage hosted on S3. I am currently cleaning things up and adding bells and whistles.

Explanation of files:

  1. htgdata_v3.zip & dso.zip - source data files from the links above. Please refer to the original links for the data.
  2. Extract_subset_and_combine.py - pulls out a subset of the data. I am focusing on stars of magnitude 7 or brighter and DSOs of magnitude 13 or brighter. combinedV2.csv contains the resulting dataset.
  3. create_pairs.py - combines stars and DSOs into pairs. The purpose of this step is to ease creation of a PySpark dataframe, which lets me compute angular distances on a large dataset without crashing my machine.
  4. Calculate_Distances_using_Pyspark.ipynb - the notebook I used to do the compute with PySpark; I spun up AWS EMR instances for the computation. (Calculate_Distances_using_Standard_Python.py is an alternative without PySpark, but it is a lot slower and not scalable.)
  5. Lambda-to-search-Athena-based-on-Catalog-Id.py - Lambda function that accepts a Catalog Name and Id, queries Athena, and returns the closest Stars and DSOs (see the sketch after this list).
  6. index.html - static webpage for the query. I am working to clean it up and add bells and whistles.
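
As a rough illustration of item 5, below is a hypothetical shape for an Athena-backed Lambda. The table name (`star_edges`), database name, schema, and S3 output location are placeholders, not this repo's actual layout; only the boto3 Athena calls themselves (`start_query_execution`, `get_query_execution`, `get_query_results`) are the real API.

```python
# Hypothetical sketch of an Athena-backed Lambda. Table, database, and
# column names are placeholders, not the repo's actual schema.
import json
import time
import boto3

athena = boto3.client("athena")

def lambda_handler(event, context):
    star_id = int(event["queryStringParameters"]["id"])  # e.g. a catalog id

    # Edges table: (id1, id2, angular_distance_deg). Find nearest neighbours.
    query = f"""
        SELECT id2 AS neighbour, angular_distance_deg
        FROM star_edges
        WHERE id1 = {star_id}
        ORDER BY angular_distance_deg
        LIMIT 10
    """
    qid = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": "starchart"},
        ResultConfiguration={"OutputLocation": "s3://your-bucket/athena-results/"},
    )["QueryExecutionId"]

    # Poll until the query finishes (simplest approach for a demo Lambda).
    while True:
        status = athena.get_query_execution(QueryExecutionId=qid)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(0.5)

    if state != "SUCCEEDED":
        return {"statusCode": 500, "body": json.dumps({"error": state})}

    # First row returned by get_query_results is the header; skip it.
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"][1:]
    neighbours = [
        {"id": r["Data"][0]["VarCharValue"],
         "distance_deg": r["Data"][1]["VarCharValue"]}
        for r in rows
    ]
    return {"statusCode": 200, "body": json.dumps(neighbours)}
```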

Potential Future Upgrades:

  1. Modify the Spark jobs to use Transient Clusters. This would be useful in case I need to build out a pipeline to ingest data on a regular basis
  2. Enhance the front and back ends. I could potentially use Amplify to build out a React frontend while retaining a Lambda-based REST API as the backend. The datastore could be moved to DynamoDB for vastly improved performance. An advantage of Amplify is that it could be expanded into a mobile app.
  3. Some sort of logic to traverse from a source to a destination node using the edges. This would give the 'star-hopping' functionality. It can be optimized with an appropriate shortest-path algorithm (see the sketch after this list).
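
For item 3, one straightforward option would be Dijkstra's algorithm over the edge list from step 2, capping each hop at a finder-field-sized angular distance. This is a sketch under assumptions: the CSV column names and the 5° hop limit are mine, not the repo's.

```python
# Illustrative star-hopping pathfinder: Dijkstra's algorithm over the
# edge list. Edge file format and hop limit are assumptions.
import csv
import heapq
from collections import defaultdict

def load_graph(path, max_hop_deg=5.0):
    """Build an adjacency map from the distances CSV, keeping only hops
    short enough to fit in a finder or binocular field of view."""
    graph = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            d = float(row["angular_distance_deg"])
            if d <= max_hop_deg:
                graph[row["id1"]].append((row["id2"], d))
                graph[row["id2"]].append((row["id1"], d))
    return graph

def star_hop(graph, source, target):
    """Chain of hops from source to target minimizing total angular distance."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[node]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None  # no chain of short hops connects the two objects
    # Walk the predecessor map back from target to recover the hop sequence.
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return list(reversed(path))
```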
