vwgeiser/PASTEL

PASTEL (Predictions of AtmoSpheric Trace substances in the Environment using machine Learning)

To view the associated publication for this project, see here: link. (upcoming!) For all associated data used alongside the publication, see here: link. (upcoming!)

MAJOR UPDATES IN PROGRESS!

About

The purpose of this project is to explore the predictability of atmospheric trace substance concentrations from the underlying meteorological variables in the period before the samples were collected.

This project and its model are still in development. Results are preliminary.

The PASTEL model takes meteorological variables calculated along backward trajectories generated by the NOAA HYSPLIT model as inputs and predicts atmospheric trace substance concentrations using scikit-learn's Random Forest regression algorithm.
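As a rough illustration of the approach described above, here is a minimal sketch of Random Forest regression with scikit-learn. It is not the project's actual code: the feature names are hypothetical, and the data is synthetic, standing in for meteorological variables computed along HYSPLIT back trajectories and a measured concentration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical feature names (assumption); the real inputs are meteorological
# variables calculated along HYSPLIT back trajectories.
feature_names = ["temperature", "pressure", "relative_humidity", "mixing_depth"]

# Synthetic stand-in data: one row per air sample.
X = rng.normal(size=(300, len(feature_names)))
# Synthetic target concentration driven by two of the features plus noise.
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R^2 on held-out samples and the relative importance of each input.
r2 = model.score(X_test, y_test)
importances = dict(zip(feature_names, model.feature_importances_))
```

Here `r2` gives a quick check of predictive skill on held-out samples, and `importances` hints at which inputs the forest relies on most.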

Disclaimer! I am not a software engineer, and I apologize for messy code!

Getting Started

Clone the repository and use the Zenodo link to get the final cleaned CSV used by the main "code/v4.5.5_PASTEL.ipynb" file. This is a preliminary file, so you will need to change the directory paths within the file itself.

Be sure to check the 'Prerequisites' section for the necessary packages.
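One way to make the path changes less error-prone is to keep them in a single place and fail early with a clear message. The sketch below is only a suggestion: `REPO_ROOT`, `DATA_FILE`, the `data` folder, the placeholder file name, and `resolve_data_file` are all hypothetical and do not come from the notebook.

```python
from pathlib import Path

# Hypothetical locations (assumption): adjust both to match your machine.
REPO_ROOT = Path("PASTEL")
DATA_FILE = REPO_ROOT / "data" / "cleaned_input.csv"  # placeholder file name


def resolve_data_file(path: Path) -> Path:
    """Return the path if the file exists, otherwise raise a clear error."""
    if not path.is_file():
        raise FileNotFoundError(
            f"Data file not found: {path}. Update DATA_FILE to point at the CSV."
        )
    return path
```

Calling `resolve_data_file(DATA_FILE)` before loading the data surfaces a wrong path right away rather than partway through a run.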

Prerequisites

The provided 'pvocal_env.yml' file defines the environment with the packages currently required to run the main notebook.

To create the environment from this file, run this line in an Anaconda Prompt window:

conda env create -f pvocal_env.yml

Installation

Make sure to use the provided environment file and find the data in the Zenodo link.

Usage

This project is in its early stages and still has some significant challenges, but please feel free to explore what you can with the model. There is most likely a way to speed up the code I've written, but this is what I have so far!

Features

With the current code, the model can predict 6 target compounds: DMS, Ethane, Ozone, Carbon Monoxide, Methane, and Methyl Bromide. The input dataset has room for many more to be incorporated.
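To show how further targets might be incorporated, here is a hedged sketch that fits one model per compound, so adding a compound amounts to adding a name to a list. The data is synthetic and the structure (`TARGETS`, `Y`, `models`) is an assumption for illustration, not how the notebook is organized.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# The six compounds named above; extend this list to add more targets.
TARGETS = ["DMS", "Ethane", "Ozone", "Carbon Monoxide", "Methane", "Methyl Bromide"]

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))  # synthetic meteorological features

# Synthetic concentrations, one array per target compound.
Y = {
    name: X[:, 0] * (i + 1) + rng.normal(scale=0.1, size=120)
    for i, name in enumerate(TARGETS)
}

# Fit an independent Random Forest for each target.
models = {}
for name in TARGETS:
    models[name] = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, Y[name])
```

Keeping one model per compound lets each target be retrained or evaluated on its own.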

Contributing

The main contributions needed are making the model compatible with all VOCs in the spreadsheet and adapting the code so that this process is more streamlined.

A way to speed up or modularize this project is probably the main thing it needs right now!

Additionally, there is an R package for Utility-Based Regression (https://github.com/paobranco/UBL). I believe that implementing this in Python as part of this project would be beneficial for more accurate predictions while using the SMOGN algorithm.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

All contributors for PySPLIT!

The NASA SARP program for giving me the courage to chase the idea this project is based on.


Copyright (C) 2024 Victor Geiser. Licensed under the General Public License v3.0 (GPLv3).
