Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions to access GEFS ensemble forecasts. #106

Merged
merged 4 commits into from
Sep 10, 2015
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,11 @@ Imports:
scales,
rgdal,
XML,
jsonlite
jsonlite,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove trailing comma

Suggests:
testthat,
roxygen2,
knitr,
taxize,
ncdf
ncdf,
ncdf4,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove trailing comma

1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ export(erddap_grid)
export(erddap_info)
export(erddap_search)
export(erddap_table)
export(gefs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you used @export tags in the gefs.R file, but maybe you need to re-document/roxygenize as those exported functions should be in the NAMESPACE file

export(ghcnd)
export(ghcnd_clear_cache)
export(ghcnd_countries)
Expand Down
211 changes: 211 additions & 0 deletions R/gefs.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,211 @@
#' Get GEFS ensemble forecast data for a specific lat/lon.
#'
#' Fetches GEFS forecast data for every 6 hours out to 384 hours past selected date. GEFS
#' is an ensemble of 21 models that can be summarized to estimate likelihoods of forecasts.
#'
#' @importFrom ncdf4 nc_open ncvar_get
#' @importFrom tidyr gather
#' @export
#'
#' @param var the variable to get. Must be one of the variables listed in \code{gefs_variables()}.
#' @param lat,lon the longitude. Will be converted to the nearest GEFS available
#' longitude. If lon is a list of vlaues, it must be a sequential list, and
#' data are returned for the number of longitudes in the list starting with
#' the maximum value and incrementing through the indexed values for the
#' length of the list.
#' @param date A date/string formatted as YYYYMMDD.
#' @param forecast_time a string indicating which time of day UTC the forecast is from. Options are "0000", "0600", "1200", "1800".
#' @param ens_idx sequential list of ensembles to fetch. Default is all 21. Note that the ensembles are labelled 0-20, so ens_idx=1:3 will return ensembles 0, 1, and 2.
#' @param time_idx sequential list of time increments to return. List is the index
#' of times, which are in 6 hour increments. (e.g. c(1,2) fetches the 6 and 12 hour forecast.)
#' @param raw logical to indicate whether to return raw data matrix or reshaped data frame.
#' @param ... additional parameters passed to \code{ncvar_get}.
#' @return a list containing metadata and accompanying data frame of forecast
#' values. If lat/lon are not specified, the $data is an unprocessed matrix.
#'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a @references tag or similar with link to info on the web for this data

#' @author Nicholas Potter \email{potterzot@@gmail.com}
#' @examples \dontrun{
#'
#' #avialable latitudes and longitudes
#' gefs_latitudes()
#' gefs_longitudes()
#'
#' #get a list of all gefs variables
#' gefs_variables()
#'
#' #All GEFS dimensions
#' gefs_dimensions()
#'
#' #values for a specific dimension
#' gefs_dimension_values("height_above_ground")
#'
#' #example location.
#' lat = 46.28125
#' lon = -116.2188
#'
#' #Get forecast for a certain variable.
#' forecast = gefs("Total_precipitation_surface_6_Hour_Accumulation_ens", lat, lon)
#'
#' #Fetch a different date (available up to 10 days prior to today)
#' forecast_yesterday_prec = gefs("Total_precipitation_surface_6_Hour_Accumulation_ens",
#' lat,
#' lon,
#' date=format(as.Date(Sys.time()) - 1, "%Y%m%d"))
#'
#' #specific ensemble and times, for the 1800 forecast.
#' # here ensembles 1-3 (ensembles are numbered starting with 0)
#' # and time for 2 days from today at 1800
#' gefs(var, lat, lon, forecast_time = "1800", ens_idx=2:4, time_idx=1:8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The var variable is never defined, AFAI can tell

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@potterzot This eg doesn't work for me, thoughts?

gefs(var, lat, lon, forecast_time = "1800", ens_idx=2:4, time_idx=1:8)
#> Error in R_nc_open: NetCDF: Can't read file
#> Error in ncdf::open.ncdf(gefs_url) : 
#>   Error in open.ncdf trying to open file #> http://motherlode.ucar.edu/thredds/dodsC/grib/NCEP/GEFS/Global_1p0deg_Ensemble/members/GEFS_Global_1p0deg_Ensemble_20150909_1800.grib2

i think that URL doesn't exist, perhaps the URL is not being built correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, It's not yet 1800 UTC so it fails. I will rewrite to add a date that is the previous day so that it will work regardless of the hour that someone runs it.

If you try `forecast_time = "0600" for ex., it should work. Will push an update in a sec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, thanks, guess I should have caught that

#'
#' #One ensemble, all latitudes and longitudes (this is a big file) for the
#' # next 3 days.
#' gefs(var, ens=1, time=1:12)
#' }
#'
gefs <- function(var, lat, lon, ...) {
gefs_GET(var, lat, lon, ...)
}

#' @importFrom ncdf4 nc_open
#' @rdname gefs
gefs_CONNECT <- function(date = format(Sys.time(), "%Y%m%d"),
forecast_time = c("0000", "0600", "1200", "1800")) {

#forecast time
forecast_time = match.arg(forecast_time)

#url parts
gefs_url_pre = 'http://motherlode.ucar.edu/thredds/dodsC/grib/NCEP/GEFS/Global_1p0deg_Ensemble/members/GEFS_Global_1p0deg_Ensemble_'
gefs_url_suf = ".grib2"

#final url
gefs_url = paste0(gefs_url_pre, date, "_", forecast_time, gefs_url_suf)

#open the connection
nc_open(gefs_url)
}

#' @importFrom ncdf4 ncvar_get
#' @rdname gefs
gefs_GET <- function(var, lat, lon,
date = format(Sys.time(), "%Y%m%d"),
forecast_time = c("0000", "0600", "1200", "1800"),
ens_idx = 1:21,
time_idx = 1:65,
dims = NULL,
raw = FALSE,
...) {

#Sanity Checks
if (missing(var)) stop("Need to specify the variable to get. A list of variables is available from gefs_variables().")

#get a connection
con = gefs_CONNECT(date, forecast_time)

#Get a subset of data to speed up access
v = con$var[[var]] # lon, lat, height_above_ground, ens (ensemble), time1
varsize = v$varsize
ndims = v$ndims
n_time = varsize[ndims] #time is always the last dimension
dims = c()
for (i in 1:length(v$dim)) { dims[i] = v$dim[[i]]$name }

#get lat/lon indices
lon_start_idx = if(!missing(lon)) which(v$dim[[1]]$vals==round(max(lon) %% 360, 0)) else 1
lat_start_idx = if(!missing(lat)) which(v$dim[[2]]$vals==round(max(lat), 0)) else 1

#indices of dimensions to read from data
start = rep(1,ndims) #number of dims
start[1] = lon_start_idx #first is always longitude
start[2] = lat_start_idx #first is always latitude
start[ndims - 1] = ens_idx[1] #first ensemble
start[ndims] = time_idx[1] #first time

count_n = rep(1,ndims)

#if not specified, get all locations.
count_n[1] = if(missing(lon)) -1 else length(lon)
count_n[2] = if(missing(lat)) -1 else length(lat)

#ensemble is always the 2nd to last dimension. Take either the variable
#max or the last value indicated by the parameter ens
count_n[ndims-1] = min(varsize[ndims-1], length(ens_idx))

#time is always the last dimension
#take the minimum of the variable max or the value indicated by the parameter time
count_n[ndims] = min(varsize[ndims], length(time_idx))

# actual data
# Do not modify the data, so don't convert (- 273.15) * 1.8 + 32. #convert from K to F
d = ncvar_get(con, v, start = start, count = count_n, ...)

#create the data frame
#For now, if lat/lon are not specified, just return a matrix.
if (raw==FALSE) {
d = gather(as.data.frame(d))

for (i in 1:length(count_n)) {
dim_vals = v$dim[[i]]$vals
if(count_n[i]==1) {
d[dims[i]] = dim_vals[start[i]]
}
else {
d[dims[i]] = dim_vals[0:(count_n[i]-1) + start[i]]
}
}
names(d) = c("key", var, dims)
d = d[,c(dims, var)]
}

fname = strsplit(con$filename, "_")[[1]]
date = fname[7]
forecast_time = strsplit(fname, ".grib2")[[8]]
list(forecast_date = date,
forecast_time = forecast_time,
dimensions = dims,
data = d)
}

########################
# helper functions

#' @export
#'
#' @param con an ncdf4 connection.
#' @rdname gefs
gefs_latitudes <- function(con = NULL, ...) {
gefs_dimension_values("lat", con)
}

#' @export
#' @rdname gefs
gefs_longitudes <- function(con = NULL, ...) {
gefs_dimension_values("lon", con)
}

#' @export
#' @rdname gefs
gefs_variables <- function(con = NULL, ...) {
if (is.null(con)) con = gefs_CONNECT(...)
names(con$var)
}

#' @export
#' @rdname gefs
gefs_dimensions <- function(con = NULL, ...) {
if (is.null(con)) con = gefs_CONNECT(...)
names(con$dim)
}

#' @export
#'
#' @param dim (character) the dimension.
#' @rdname gefs
gefs_dimension_values <- function(dim, con = NULL, ...) {
if (is.null(dim) || missing(dim)) stop("dim cannot be NULL or missing.")
if (is.null(con)) con = gefs_CONNECT(...)
con$dim[[dim]]$vals
}



6 changes: 5 additions & 1 deletion R/rnoaa-package.r
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
#'
#' \itemize{
#' \item \code{buoy_*} - NOAA Buoy data from the National Buoy Data Center
#' \item \code{gefs} - GEFS forecast ensemble data
#' \item \code{ghcnd_*} - GHCND daily data from NOAA
#' \item \code{isd_*} - ISD/ISH data from NOAA
#' \item \code{homr_*} - Historical Observing Metadata Repository (HOMR) vignette
Expand All @@ -21,14 +22,17 @@
#' \item \code{tornadoes} - From the NOAA Storm Prediction Center
#' }
#'
#' @section A note about ncdf:
#' @section A note about ncdf/ncdf4:
#'
#' Functions to work with buoy data use netcdf files. You'll need the \code{ncdf}
#' package for those functions, and those only. \code{ncdf} is in Suggests in
#' this package, meaning you only need \code{ncdf} if you are using the buoy
#' functions. You'll get an informative error telling you to install \code{ncdf}
#' if you don't have it and you try to use the buoy functions.
#'
#' The gefs function uses \code{ncdf4}, which is available on CRAN and should be
#' straightfoward to install.
#'
#' Installation of \code{ncdf} should be straightforward on Mac and Windows, but
#' on Linux you may have issues. See http://cran.r-project.org/web/packages/ncdf/INSTALL
#'
Expand Down
79 changes: 79 additions & 0 deletions tests/testthat/test-gefs.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
context("gefs")

#set a location
lat = 46.28125
lon = -116.2188

#variable
var = "Temperature_height_above_ground_ens"

test_that("gefs errors", {

expect_error(gefs(lat=lat, lon=lon), "Need to specify the variable to get. A list of variables is available from gefs_variables().")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this and the below two test blocks not make a web call? if not, then you're right that no skip_on_cran() needed

})

test_that("gefs time and ensemble selection returns correct indices.", {
ens_idx = 2:4
time_idx = 5:10
d = gefs(var, lat, lon, ens_idx = ens_idx, time_idx = time_idx)

expect_equal(dim(d$data), c(length(ens_idx) * length(time_idx), 6))
expect_equal(unique(d$data$ens), ens_idx-1)
expect_equal(unique(d$data$time), (time_idx-1) * 6)
})

test_that("gefs metadata", {
today = format(as.Date(Sys.time()) - 1, "%Y%m%d")
forecast_time = "0600"
d = gefs(var, lat, lon, ens=1, date=today, forecast_time=forecast_time)

expect_equal(d$forecast_date, today)
expect_equal(d$forecast_time, forecast_time)
expect_equal(d$dimensions, c("lon", "lat", "height_above_ground", "ens", "time1"))
})

test_that("gefs_variables returns characters.", {
skip_on_cran()

vars = gefs_variables()

expect_is(vars, "character")
expect_is(vars[1], "character")
})

test_that("gefs_latitudes returns numeric.", {
skip_on_cran()

lats = gefs_latitudes()
expect_is(lats, "array")
expect_is(lats[1], "numeric")
})

test_that("gefs_longitudes returns numeric.", {
skip_on_cran()

lats = gefs_longitudes()
expect_is(lats, "array")
expect_is(lats[1], "numeric")
})

test_that("gefs_dimensions returns character list.", {
skip_on_cran()

dims = gefs_dimensions()
expect_is(dims, "character")
expect_is(dims[1], "character")
})

test_that("gefs_dimension_values returns numeric array.", {
skip_on_cran()

vals = gefs_dimension_values("lat")
expect_is(vals, "array")
expect_is(vals[1], "numeric")

#expect_error(gefs_dimension_values(dim=NULL), "dim cannot be NULL or missing.")
})