PASTEL (Predictions of AtmoSpheric Trace substances in the Environment using machine Learning)

To view the associated publication for this project, see here: link (upcoming!). For all data used alongside the publication, see here: link (upcoming!).

MAJOR UPDATES IN PROGRESS!

About

The purpose of this project is to explore the predictability of atmospheric trace substance concentrations from the underlying meteorological variables prior to their collection.

This project and its model are very much in the development stage. Results are preliminary.

The PASTEL model uses meteorological variables calculated along backward trajectories generated by the NOAA HYSPLIT model as inputs, and predicts atmospheric trace substance concentrations using scikit-learn's Random Forest Regression algorithm.
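As a minimal sketch of the approach described above, the snippet below fits scikit-learn's `RandomForestRegressor` to synthetic data. The three feature columns (stand-ins for meteorological variables along a trajectory), the synthetic target, and the hyperparameters are all assumptions for illustration; none of them are taken from the PASTEL notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for trajectory meteorology (hypothetical features);
# the columns could represent e.g. temperature, pressure, relative humidity.
n_samples = 500
X = rng.normal(size=(n_samples, 3))

# Synthetic "concentration" that depends nonlinearly on the features, plus noise.
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=n_samples)

# Hold out a quarter of the samples to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters here are illustrative, not tuned values from the project.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
score = r2_score(y_test, y_pred)
print(f"R^2 on held-out synthetic data: {score:.2f}")
```

With real data, `X` would hold the trajectory-derived variables and `y` the measured concentration of one target compound.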

Disclaimer: I am not a software engineer, and I apologize for the messy code!

Getting Started

Clone the repository and follow the Zenodo link to get the final cleaned CSV used by the main "code/v4.5.5_PASTEL.ipynb" file. This is a preliminary file, so it will be necessary to change the directory paths within the file itself.
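One way to make the path changes less error-prone is to define the data location once and build every path from it. The sketch below uses only the standard library; `DATA_DIR` and `pastel_cleaned.csv` are hypothetical placeholders, not names used in the repository or on Zenodo.

```python
import csv
from pathlib import Path

# Hypothetical locations: point DATA_DIR at wherever the Zenodo CSV was saved.
DATA_DIR = Path("data")
CSV_NAME = "pastel_cleaned.csv"  # placeholder, not the real file name


def load_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by column name."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected the cleaned CSV at {csv_path}")
    with csv_path.open(newline="") as handle:
        return list(csv.DictReader(handle))


csv_path = DATA_DIR / CSV_NAME
if csv_path.is_file():
    rows = load_rows(csv_path)
    print(f"Loaded {len(rows)} rows from {csv_path}")
else:
    print(f"No file at {csv_path}; update DATA_DIR and CSV_NAME first.")
```

Keeping the path in one variable means a new user only edits a single line before running the rest of the code.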

Be sure to check the 'Prerequisites' section for the necessary packages.

Prerequisites

The provided 'pvocal_env.yml' file defines the environment with the packages currently required to run the main "code/v4.5.5_PASTEL.ipynb" file.

To create the environment from this file, run this line in an Anaconda Prompt window:

conda env create -f pvocal_env.yml
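Once the environment is created and activated, a quick check like the one below can confirm that key packages are importable. This is only a sketch: the package names listed are assumptions for illustration, and the authoritative list is whatever 'pvocal_env.yml' specifies.

```python
import importlib.util

# Assumed package names for illustration; the real list is in pvocal_env.yml.
ASSUMED_PACKAGES = ["numpy", "pandas", "sklearn"]


def find_missing(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = find_missing(ASSUMED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All assumed packages are importable.")
```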

Installation

Make sure to use the provided environment file and find the data in the Zenodo link.

Usage

This project is in its early stages and still has some significant challenges, but please feel free to explore what you can with the model. There is most likely a way to speed up the code I've written, but this is what I have so far!

Features

The current code can predict six target compounds: DMS, Ethane, Ozone, Carbon Monoxide, Methane, and Methyl Bromide. The input dataset has room for many more to be incorporated.
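As a hedged sketch of how more compounds could be incorporated, the targets can live in a single list so that adding one is a one-line change. The column names, the feature names, and the sample rows below are hypothetical; the real CSV headers may differ.

```python
# Hypothetical column names for the six targets named above; the real CSV
# headers may differ.
TARGETS = ["DMS", "Ethane", "Ozone", "CO", "Methane", "Methyl Bromide"]


def split_features_target(rows, target, feature_columns):
    """Build (X, y) lists for one target, skipping rows without a usable value."""
    X, y = [], []
    for row in rows:
        try:
            value = float(row[target])
            features = [float(row[name]) for name in feature_columns]
        except (KeyError, ValueError, TypeError):
            continue  # missing column, blank cell, or non-numeric entry
        X.append(features)
        y.append(value)
    return X, y


# Tiny made-up rows, shaped like csv.DictReader output.
rows = [
    {"temperature": "281.5", "pressure": "1002.1", "DMS": "12.0", "Ethane": ""},
    {"temperature": "279.0", "pressure": "998.4", "DMS": "", "Ethane": "850.0"},
]
met_columns = ["temperature", "pressure"]

for target in TARGETS:
    X, y = split_features_target(rows, target, met_columns)
    print(f"{target}: {len(y)} usable rows")
```

Each `(X, y)` pair could then be passed to a separate regressor, one per compound.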

Contributing

The most useful contributions would be making the model compatible with all VOCs in the spreadsheet and adapting the code so that this process is more streamlined.

A way to speed up or modularize this project is probably the main thing it needs right now!

Additionally, a package for utility-based regression exists in R (https://github.com/paobranco/UBL). I believe that implementing this in Python as part of this project would be beneficial for more accurate predictions when using the SMOGN algorithm.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

All contributors for PySPLIT!

The NASA SARP program for giving me the courage to chase the idea this project is based on.
