
rqthomas/NEON-forecast-code

 
 


Water temperature forecasts at National Ecological Observatory Network (NEON) lakes using FLARE (Forecasting Lake And Reservoir Ecosystems)


👥 R. Quinn Thomas, Ryan P. McClure, Tadhg N. Moore, Whitney M. Woelmer, Carl Boettiger, Renato J. Figueiredo, Robert T. Hensley, Cayelan C. Carey

Questions? ✉️ rqthomas@vt.edu


Motivation

Freshwater lakes globally are increasingly threatened as a result of rapidly changing land use and climate (Carpenter et al., 2011). In response, developing forecast workflows has emerged as a powerful tool to predict future environmental conditions in lakes in order to make informed management decisions for safety, health, and conservation (Carey et al., 2021; Baracchini et al., 2020; Page et al., 2018). However, the discipline of forecasting in lakes is still in the early stages of making forecasts that are robust and reproducible. As a result, there is a dire need for open-source forecast workflows that are broadly applicable to many different lake ecosystems and flexible to different datastreams and local needs.

Here, we applied the FLAREr forecasting system (Thomas et al., 2020) to six NEON lakes to test FLAREr's robustness and scalability to other sites. The NEON lakes serve as an exemplar case to test FLAREr because they have reliable, open-source datastreams in which new data can be acquired at relatively low latencies (<1.5 months). The goal of our forecast scaling study was to show that FLAREr is scalable to other lake ecosystems and can produce robust forecasts of water temperatures up to 35 days into the future. Altogether, we hope this workflow is a first step to building a community of lake and reservoir forecast practitioners that develop reliable forecast workflows and make informed decisions for future lake conservation and management.

Prerequisites

FLAREr has been tested across Windows, Mac, and Linux OS. It also requires R version 4.0.x or higher.
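A quick way to confirm the R requirement before installing anything is sketched below. The helper name `r_version_ok` and the choice of 4.0.0 as the concrete minimum are assumptions made for illustration; they are not part of the repository.

```r
# Minimum R version implied by the prerequisites ("4.0.x or higher").
# Treating 4.0.0 as the cutoff is an assumption for this sketch.
min_r_version <- "4.0.0"

# Hypothetical helper: TRUE when `current` meets the minimum version.
r_version_ok <- function(current = getRversion(), minimum = min_r_version) {
  # package_version objects compare component by component,
  # so "4.10.0" is correctly treated as newer than "4.9.0".
  package_version(as.character(current)) >= package_version(minimum)
}

if (!r_version_ok()) {
  stop("R ", min_r_version, " or higher is required; found ",
       as.character(getRversion()))
}
```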

Workflow

We have provided all code used to generate forecasts, analyze forecasts, and recreate figures in this manuscript as a GitHub repository that has been archived on Zenodo (Thomas et al. 2022a). There are three steps to the analysis that are documented as separate R scripts within the repository. First, the 01_combined_paper_workflow.R in the workflows/neon_lakes_ms/ directory of the repository obtains the NEON data and NOAA GEFS weather forecasts and then runs FLARE on the six sites. Since this script runs 159 separate 35-day horizon forecasts for the six lakes, the time required to generate all forecasts depends on the number and speed of computer processors available and can be a multi-day execution. This first step produces a set of output files for the GLM-based and day-of-year null forecasts in a "forecasts" directory.

Second, each ensemble forecast from the first step is aggregated to a mean with predictive intervals and scored (by matching to the corresponding observation, if available), with the summary statistics and observations saved as a set of scored files (one per output file) in a scores directory in the repository. The scoring is generated by the "02_score_forecasts.R" script located in the workflows/neon_lakes_ms/ directory of the repository. While the scores can be generated using output files from the first step, we also provide the output files as an additional Zenodo repository (Thomas et al. 2022b) that can be downloaded and scored using the script without needing to re-run the forecasts.
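The aggregation described in this step can be illustrated with a minimal base R sketch. It is not the code in 02_score_forecasts.R: the function name `summarize_ensemble`, the 90% interval, and the use of the ensemble-mean error as the only comparison to the observation are all assumptions, and the ensemble values are simulated.

```r
# Hypothetical helper: collapse one ensemble forecast (a numeric vector of
# ensemble members for a single site, depth, and time) into summary statistics.
summarize_ensemble <- function(ensemble, observation = NA_real_, level = 0.90) {
  alpha <- 1 - level
  # Central predictive interval from the empirical ensemble quantiles
  bounds <- stats::quantile(ensemble,
                            probs = c(alpha / 2, 1 - alpha / 2),
                            na.rm = TRUE, names = FALSE)
  ens_mean <- mean(ensemble, na.rm = TRUE)
  data.frame(
    mean = ens_mean,
    sd = stats::sd(ensemble, na.rm = TRUE),
    lower = bounds[1],
    upper = bounds[2],
    observation = observation,
    # Error of the ensemble mean; stays NA when no observation is available
    error = ens_mean - observation
  )
}

# Simulated water temperatures (degrees C) standing in for real forecast output
set.seed(1)
ensemble <- rnorm(200, mean = 18.5, sd = 0.8)
summarize_ensemble(ensemble, observation = 18.9)
```

Leaving `observation` at its default returns the same summary with `NA` in the observation and error columns, which mirrors the "if available" matching described above.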

Third, the scored files are analyzed using an Rmarkdown script located in the main directory of the repository (analysis_notebook.Rmd) to produce the figures and data reported in the text. The Rmarkdown script can use the scored files produced by the second step or the scored files available in the additional Zenodo repository (Thomas et al. 2022b).

Our analysis can be reproduced by downloading the Zenodo GitHub repository and running the three scripts associated with the steps described above. Re-running the full analysis requires downloading R, Rstudio, and all the required packages, and as noted above, can take multiple days of execution, depending on the computation available. We provide a script that downloads the required packages (install.R in the main directory of the repository). However, there is no guarantee that other versions of R and packages will produce the same results as presented here.
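The order of the steps can be sketched as a small R driver. The file paths come from the text above, but the functions `analysis_steps` and `run_analysis`, the `repo_dir` and `run_forecasts` arguments, and the idea of sourcing each script from a single session are assumptions; the scripts may expect a particular working directory or be meant to be run interactively.

```r
# Hypothetical helper: paths to the scripts named in the workflow description,
# relative to a local copy of the repository.
analysis_steps <- function(repo_dir = ".") {
  workflow_dir <- file.path(repo_dir, "workflows", "neon_lakes_ms")
  c(
    install   = file.path(repo_dir, "install.R"),
    forecasts = file.path(workflow_dir, "01_combined_paper_workflow.R"),
    scores    = file.path(workflow_dir, "02_score_forecasts.R"),
    notebook  = file.path(repo_dir, "analysis_notebook.Rmd")
  )
}

# Hypothetical driver: runs the steps in order. Forecast generation is off by
# default because it can take multiple days; skipping it assumes the forecast
# output files were downloaded separately (Thomas et al. 2022b).
run_analysis <- function(repo_dir = ".", run_forecasts = FALSE) {
  steps <- analysis_steps(repo_dir)
  missing <- steps[!file.exists(steps)]
  if (length(missing) > 0) {
    stop("Missing files: ", paste(missing, collapse = ", "))
  }
  source(steps[["install"]])          # install required packages
  if (run_forecasts) {
    source(steps[["forecasts"]])      # step 1: generate forecasts
  }
  source(steps[["scores"]])           # step 2: score forecasts
  rmarkdown::render(steps[["notebook"]])  # step 3: figures and values
}

# Usage, from the top level of a local copy of the repository:
# run_analysis(".", run_forecasts = FALSE)
```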

To enable greater reproducibility, we adapted the GitHub repository (Thomas et al. 2022a) to generate a Binder that is produced by mybinder.org. Mybinder.org provides a web-based version of Rstudio for re-running our GitHub repository code that uses the same version of R and R packages that we used in this analysis (https://mybinder.org/v2/zenodo/10.5281/zenodo.6267616/?urlpath=rstudio). As a result, there is more confidence that the analysis can be reproduced by harnessing the Binder infrastructure, which directly re-runs the analysis on a remote server and provides an Rstudio interface via a web browser for running the scripts described above for each of the three analysis steps.

There are important caveats to using the Binder. First, at the time of this analysis, mybinder.org is free to use, and therefore its computational resources have limits and processing times can be slow. Consequently, we do not recommend running the full generation of the 35-day forecasts in the Binder. The Binder is ideally suited for exploring the scored forecasts and reproducing the figures and values presented in the text (i.e., the analysis_notebook.Rmd script described in the third step above). Second, at the time of this analysis, the Binder does not always launch when accessing the Binder link, and the connection occasionally times out. It may require accessing the Binder link again to get a successful launch of the Rstudio interface.

References

Thomas RQ, McClure RP, Moore TM, et al. 2022a. Near-term forecasts of NEON lakes reveal gradients of environmental predictability across the U.S.: code (v1.0). Zenodo repository. https://doi.org/10.5281/zenodo.6267616

Thomas RQ, McClure RP, Moore TM, et al. 2022b. Near-term forecasts of NEON lakes reveal gradients of environmental predictability across the U.S.: data, forecasts, and scores. Zenodo repository. https://doi.org/10.5281/zenodo.6643596
