Releases: creare-com/podpac

3.1.1 Release

21 Jul 20:18
81059ca

This release supports the GeoWATCH application. The features and bug fixes were added to support server deployment.

Features

  • Added OGR datasource node for reading shapefiles
  • Compositors.multithreading: For some compositors, it is important to evaluate the nodes in serial for performance reasons, regardless of the global multithreading setting. Compositors now use settings['MULTITHREADING'] by default, but OrderedCompositors always set this to False. In either case, it can be overridden on a node-by-node basis.
  • RasterioSource.prefer_overviews_closest: when selecting overview levels, we can either select the coarsest overview whose step size is smaller than that of the eval coordinates OR select the overview whose step size is closest to that of the eval coordinates (this may be coarser than the eval coordinates). Setting this attr to True selects the closest overview instead of the closest higher-resolution overview.
  • Improved speed of evaluations by eliminating unnecessary CRS validations
  • Added decode_cf attribute to Dataset data source node
  • Default interpolation can now be specified application-wide through the podpac.settings["DEFAULT_INTERPOLATION"] setting
  • Added MockWCSClient to ogc.py for WCS endpoints that do not implement get_coverage. This makes it easy to turn PODPAC into a lightweight WCS server, and then use a PODPAC WCS client.
  • Added prefer_overviews and prefer_overviews_closest attributes to Rasterio data source node. These attributes allow users to pull from the overviews directly for coarse requests.
  • Added the point prober. This allows users to probe the values of an algorithm pipeline at a point. See Node.probe
  • Added the from_name_params method to Node, allowing nodes to be created from the node name + additional parameters.
  • Renamed set_unsafe_eval to allow_unrestricted_code_execution for a more descriptive name.
  • Improved specification of enumerated colormaps in the Style
  • Enabled saving to a geotiff memory file to support WCS calls
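
The two overview selection strategies described for prefer_overviews_closest can be sketched in plain Python. This is only an illustration of the selection rule, not PODPAC's implementation: the function name pick_overview, the list-of-step-sizes input, and the fallback to the finest level are assumptions.

```python
from typing import Sequence


def pick_overview(overview_steps: Sequence[float], eval_step: float, closest: bool = False) -> int:
    """Return the index of the overview level to read (hypothetical helper).

    overview_steps: step size of each available overview level.
    eval_step: step size of the requested (eval) coordinates.
    """
    indices = range(len(overview_steps))
    if closest:
        # Closest step size wins, even if it is coarser than the request.
        return min(indices, key=lambda i: abs(overview_steps[i] - eval_step))
    # Otherwise: the coarsest level that is still at least as fine as the request.
    finer = [i for i in indices if overview_steps[i] <= eval_step]
    if not finer:
        # Assumption: if nothing is fine enough, fall back to the finest level.
        return min(indices, key=lambda i: overview_steps[i])
    return max(finer, key=lambda i: overview_steps[i])


steps = [1.0, 2.0, 4.0, 8.0]
default_level = pick_overview(steps, eval_step=3.5)                 # level with step 2.0
closest_level = pick_overview(steps, eval_step=3.5, closest=True)   # level with step 4.0
```

With a request step of 3.5, the default rule picks the 2.0 level (finer than the request), while the closest rule picks the 4.0 level (slightly coarser, but nearer in step size).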

Bugfixes

  • Fixed crs mismatch bug in Reproject node
  • Fixed lat/lon ordering bug for different versions of WMS/WCS in from_url method of Coordinates
  • Fixed bug in Coordinates.transform where ArrayCoordinates turned into UniformCoordinates for two CRS with linear mapping.
  • Fixed bug in DataSource node where get_data returns coordinates that are different from the request (this happens in the case where raw data is returned)
  • Fixed BBOX order specification error in WCS node, where different versions of WCS change the order of lat/lon. This is now handled correctly.
  • Fixed a number of interpolation errors:
    • InterpolationMixin no longer caches internal evaluations, which led to strange caching errors
    • Fixed selector bugs related to negative step sizes
    • Fixed nearest neighbor interpolation bugs related to negative step sizes
    • Fixed Selector uniform coordinates short-cut
  • Fixed bug where DataArray attributes were dropped when doing basic math operations
  • Fixed bug in to_geotiff export function (misplaced parenthesis)
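
The lat/lon ordering issue in the from_url and BBOX fixes comes from the OGC versions themselves: WMS 1.1.1 orders the BBOX as x,y (lon,lat), while WMS 1.3.0 follows the axis order of the CRS, which for EPSG:4326 is lat,lon. A minimal sketch of that rule follows. The helper name wms_bbox and its arguments are hypothetical, and the sketch covers WMS only, since WCS ordering is not spelled out in these notes.

```python
def wms_bbox(version: str, lat_bounds, lon_bounds, crs: str = "EPSG:4326") -> str:
    """Build a WMS BBOX parameter string with version-dependent axis order.

    Hypothetical helper for illustration; not part of PODPAC.
    """
    south, north = lat_bounds
    west, east = lon_bounds
    parsed = tuple(int(part) for part in version.split("."))
    if parsed >= (1, 3, 0) and crs.upper() == "EPSG:4326":
        # WMS 1.3.0 honors the CRS axis order: lat,lon for EPSG:4326.
        values = (south, west, north, east)
    else:
        # WMS 1.1.1 and earlier always use x,y (lon,lat).
        values = (west, south, east, north)
    return ",".join(str(v) for v in values)


old_style = wms_bbox("1.1.1", (40, 45), (-75, -70))  # "-75,40,-70,45"
new_style = wms_bbox("1.3.0", (40, 45), (-75, -70))  # "40,-75,45,-70"
```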

Interpolation Refactor

20 Dec 21:31

3.0.0

Interpolation refactoring. Interpolation now lives as an Algorithm Node. As such,
interpolation can exist in any part of a pipeline, and even multiple times. As
part of this improvement, we also implemented "Selectors" which subselect data
based on the interpolation method specified BEFORE data is pulled from remote
servers.
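
As a rough illustration of what a selector does, the sketch below picks the source indices that nearest-neighbor interpolation would need, so that only those values have to be fetched. It is a plain-Python sketch under assumed names (nearest_indices, ascending 1-D coordinates), not the PODPAC Selector class.

```python
from bisect import bisect_left


def nearest_indices(source, request):
    """Return the sorted, unique indices of `source` nearest to each requested value.

    Assumes `source` is a non-empty ascending list of coordinate values.
    """
    needed = set()
    for value in request:
        i = bisect_left(source, value)
        # Only the neighbors on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(source)]
        needed.add(min(candidates, key=lambda j: abs(source[j] - value)))
    return sorted(needed)


source = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
request = [1.2, 3.9]
indices = nearest_indices(source, request)   # [1, 4]
selected = [source[i] for i in indices]      # [1.0, 4.0]
```

Only two of the six source values are pulled, and the selected coordinates (1.0 and 4.0) differ from the requested ones (1.2 and 3.9) until an interpolation step is applied.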

Because this refactor changed the interface somewhat, we bumped the major version number.

The MAJOR change in PODPAC functionality is that some Nodes may now return coordinates that are DIFFERENT from the eval coordinates (that is, not interpolated).

Features

  • Added Interpolation Node and InterpolationMixin to restore backwards compatibility with most nodes.
  • Replaced the WCS node with a new version that uses owslib under the hood. Also added authentication support.
  • Added SoilGrids WCS data sources
  • Added an "Xarray" interpolator, which uses xarray's interpolation methods. This now allows linear interpolation in time, for example.
  • Interpolators now issue a warning if the user specifies an interpolation parameter that is not used.
  • Improved interpolation documentation
  • Added "Autozoom" functionality for TerrainTiles datasource
  • Added Compositor nodes that combine multiple files/tiles of a single datasource BEFORE interpolation
  • Removed SMAP PyDAP datalib -- it was always unstable whereas the EGI version usually works
  • Improved Rasterio node -- it now reads datasources directly using Rasterio instead of going through s3fs.

Bugfixes

  • Can now clear ram cache before cache is eliminated
  • Fixed #303, UnitsDataArray deserialization
  • Removed support for "numpy" return type in Algorithm nodes, since coordinates can now be altered in Algorithm Nodes
  • Fixed how styling and plugin information is set 7aef43b5a
  • Fixed some floating point rounding issues at tile edges 8ac834d
  • Fixed Coordinates.from_url to work correctly with different versions of OGC WMS calls (and possibly WCS calls, but the WCS documentation and my reference servers disagree...)

SoilScape + Cache Expiration

17 Oct 01:58

Introduction

Added subdataset support for hdf4 data sources (e.g. a downloaded MODIS netcdf file), wrapped SoilScape data, and added
expiration to the cache.

This release also drops Python 3.5 support.

Features

  • Subdataset support in Rasterio Node, see #410
  • Adding SoilScape data source, and disk cache expiration, see #419
  • PyDAP node will now retry requests in case of server throttling, see 514dc5d
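
The retry behavior mentioned for throttled servers can be sketched generically as follows. This is not the PyDAP node's code: the function name fetch_with_retry, the use of ConnectionError as the throttling signal, and the doubling delay are all assumptions made for the example.

```python
import time


def fetch_with_retry(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call `fetch()` and retry on ConnectionError, doubling the wait each time.

    Hypothetical helper: `fetch` is any zero-argument callable that performs the request.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError as err:  # assumed stand-in for a throttling error
            last_error = err
            if attempt < retries:
                sleep(delay * 2 ** attempt)
    # All attempts failed: surface the last error to the caller.
    raise last_error
```

Passing the sleep function as an argument keeps the helper easy to exercise without real waiting.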

Bugfixes

  • Added dimensions to modis and cosmos compositors
  • Fixed version numbers in smap_egi datasource, and these are now looked up automatically
  • Fixed a precision bug on selection with time coordinates

2.2.1 Landsat8, Sentinel, and MODIS

14 Jul 14:46

Introduction

Wrapping Landsat8, Sentinel2, and MODIS data and improving interpolation.

Features

  • Added datalib.satutils which wraps Landsat8 and Sentinel2 data
  • Added datalib.modis_pds which wraps MODIS products ["MCD43A4.006", "MOD09GA.006", "MYD09GA.006", "MOD09GQ.006", "MYD09GQ.006"]
  • Added settings['AWS_REQUESTER_PAYS'] and authentication.S3Mixin.aws_requester_pays attribute to support Sentinel2 data
  • Added issubset method to Coordinates which allows users to test if a coordinate is a subset of another one
  • Added environment variables in Lambda function deployment allowing users to specify the location of additional
    dependencies (FUNCTION_DEPENDENCIES_KEY) and settings (SETTINGS). This was in support of the WMS service.
  • Intake nodes can now filter inputs by additional data columns for .csv files / pandas dataframes by using the pandas
    query method.
  • Added documentation on Interpolation and Wrapping Datasets
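
To make the idea behind the issubset method concrete, here is a minimal sketch that treats coordinates as a mapping from dimension name to values. The dict representation and the exact-match rule are assumptions for illustration; the actual Coordinates.issubset may handle tolerances, stacked dimensions, and CRS differently.

```python
def issubset(coords, other):
    """Return True if every dimension of `coords` exists in `other`
    and all of its values are present there (hypothetical stand-in)."""
    return all(
        dim in other and set(values) <= set(other[dim])
        for dim, values in coords.items()
    )


grid = {"lat": [0, 1, 2, 3], "lon": [10, 20, 30]}
inside = issubset({"lat": [1, 2]}, grid)    # True
outside = issubset({"lat": [1, 9]}, grid)   # False: 9 is not in the grid
```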

Bug Fixes

  • Added dims attributes to Compositor nodes which indicate the dimensions that sources are expected to have. This
    fixes a bug where Nodes threw an error if Coordinates contained extra dimensions when the Compositor sources were missing
    those dimensions.
  • COSMOSStations will no longer fail for sites with no data or one data point. These sites are now automatically filtered.
  • Fixed core.data.file_source closing files prematurely due to using context managers
  • Fixed heterogeneous interpolation (where lat/lon uses a different interpolator than time, for example)
  • datalib.TerrainTiles now accesses S3 anonymously by default. Interpolation specified at the compositor level is
    also now passed down to the sources.

Breaking changes

  • Fixed core.algorithm.signal.py and in the process removed SpatialConvolution and TemporalConvolutions. Users now
    have to label the dimensions of the kernel -- which prevents results from being modified if the eval coordinates are
    transposed. This was a major bug in the Convolution node, and the new change obviates the need for the removed Nodes,
    but it may break some pipelines.

2.0.0 Parallel Computation and MODIS

23 Apr 11:20

Introduction

This is the final release supported by NASA under Contract No 80NSSC18C0061.

Features

  • Added MODIS datasource datalib.modis_pds
  • Added datalib.weathercitizen to retrieve weathercitizen data
  • Added datalib.cosmos_stations to retrieve soil moisture data from the stationary COSMOS soil moisture network
  • Added algorithm.ResampleReduce, which allows users to coarsen a dataset based on a reduce operation (such as mean, max, etc.).
  • Added the managers.parallel submodule that enables parallel computation with PODPAC in a multi-threaded, multi-process, or multi-AWS-Lambda-function way
  • Added the managers.multi_process submodule that enables PODPAC nodes to be run in another process.
  • Added the compositor.UniformTileCompositor and compositor.UniformTileMixin to enable compositing of data sources BEFORE harmonization (so that interpolation can happen across data sources with the same coordinate systems)
  • Added the validate_crs flag to Coordinates so that validation of the crs can be skipped if needed (this improved speed)
  • Added managers.aws.Lambda.eval_timeout attribute so that Python will only wait a certain amount of time before returning from an invoke call.
  • Added asynchronous evaluation of managers.aws.Lambda Nodes. Note! This does not do error checking (e.g. exceeding AWS allocation of the number of allowed concurrent threads).
  • Added alglib.climatology, which allows computation of Beta distribution fits to time series for each day of the year
  • Added core.algorithm.stats.DayOfYearWindow, which is similar to GroupReduce, but also allows a window for the computation and only works for time
  • Coordinates will now be automatically simplified when transformed from one crs to another. Previously, transformations always resulted in DependentCoordinates. This change adds the .simplify method to some Coordinates1d classes.
  • Improved performance of PODPAC AWS Lambda function by only downloading dependencies when the function is not already "hot". "Hot" functions retain the files in the /tmp directory of the Lambda function.
  • data.Zarr will now use zarr.open_consolidated whenever the file_mode='r' for more efficient read operations on S3
  • Added PODPAC version to pipeline definitions
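
The coarsening idea behind algorithm.ResampleReduce can be illustrated on a plain 1-D list: group consecutive values into blocks and apply a reduce operation to each block. This is a conceptual sketch only. The function name resample_reduce and the fixed block size are assumptions, and the real node works on coordinates rather than list positions.

```python
from statistics import mean


def resample_reduce(values, factor, reduce=mean):
    """Coarsen `values` by applying `reduce` to consecutive blocks of `factor` items."""
    return [reduce(values[i:i + factor]) for i in range(0, len(values), factor)]


data = [1, 2, 3, 4, 5, 6]
coarse_mean = resample_reduce(data, 2)             # [1.5, 3.5, 5.5]
coarse_max = resample_reduce(data, 2, reduce=max)  # [2, 4, 6]
```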

Bug Fixes

  • Fixed algorithm.GroupReduce to accept dayofyear, weekofyear, season, and month. It also now returns the time coordinate in one of these units.
  • Implemented a circular dependency check to avoid infinite recursion and locking up due to cache accessing. This change also defined the NodeDefinitionError exception.
  • Fixed the UnitsDataArray.to_format function's zarr_part format to work properly with parallel computations
  • Added the [algorithm] dependencies as part of the AWS Lambda function build -- previously the numexpr Python package was missing
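
A circular dependency check of the kind mentioned above can be sketched as a depth-first search over a node definition. The dict-of-inputs representation and the function name check_no_cycles are assumptions for this example, and NodeDefinitionError is redefined locally as a stand-in rather than imported from PODPAC.

```python
class NodeDefinitionError(Exception):
    """Local stand-in for the exception named in the release notes."""


def check_no_cycles(definition):
    """Raise NodeDefinitionError if any node depends on itself, directly or indirectly.

    definition: dict mapping node name -> list of input node names (assumed format).
    """
    visiting, done = set(), set()

    def visit(name, path):
        if name in done:
            return
        if name in visiting:
            cycle = " -> ".join(path + [name])
            raise NodeDefinitionError("circular dependency: " + cycle)
        visiting.add(name)
        for dep in definition.get(name, []):
            visit(dep, path + [name])
        visiting.discard(name)
        done.add(name)

    for name in definition:
        visit(name, [])


check_no_cycles({"sum": ["a", "b"], "a": [], "b": ["a"]})  # passes silently
```

Failing fast with an explicit error is preferable to discovering the cycle through infinite recursion at evaluation time.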

Breaking changes

  • Renamed native_coordinates to coordinates
  • Deprecated Pipeline nodes
  • Removed ctype and segment_length from Coordinates. Now specify the boundary attribute for DataSource nodes instead.
  • Fixed datalib.smap_egi module to work with xarray version 0.15
  • Modified PODPAC's root setting directory from ~/.podpac to ~/.config/podpac
  • Changed default cache behaviour to overwrite by default (prevents certain cache lock situations)
  • Removed datalib.airmoss -- it was no longer working!

Maintenance

  • Refactored the way PODPAC keeps track of Node definitions. Almost all of it is now handled by the base class; previously DataSource, Algorithm, and Compositor had to implement specialized functions.
  • Refactored datalib nodes to prefer using the new cached_property decorator instead of defaults which were causing severe circular dependencies
  • Refactored DataSource nodes that access files on S3 to use a common Mixin
  • Refactored authentication to use a more consistent approach across the library

Incorporate Lessons from Drought Monitor Application

07 Feb 20:30

Introduction

The purpose of this release was to make the software more robust and to improve the Drought Monitor Application https://creare-com.github.io/podpac-drought-monitor/src/.

Features

  • Algorithm arrays can now be multi-threaded. This allows an algorithm with multiple S3 data sources to fetch the data
    in parallel before doing the computation, speeding up the process. See #343
  • Improvements to AWS interface. See #336
  • Added budgeting / billing capability to manage AWS resources. See #361
  • Added GeoTIFF export / import capability. Lots of work with geotransforms in the Coordinates object. See #364.
  • Nodes can now have multiple output channels. This supports multispectral or multichannel data. See #348.

Bug Fixes

  • Fixed a bug where intersecting time coordinates of different precision resulted in no intersection. See #344
  • Fixed Array datasource serialization 55fcf30

Backwards Incompatible Changes

  • The H5PY, CSV, and Zarr node interfaces were unified. As such, the following attributes have changed:
    • datakey --> data_key
    • latkey --> lat_key
    • lonkey --> lon_key
    • altkey --> alt_key
    • timekey --> time_key
    • keys --> available_keys
    • CSV.lat_col --> lat_key
    • CSV.lon_col --> lon_key
    • CSV.time_col --> time_key
    • CSV.alt_col --> alt_key

AWS Automation Release

06 Nov 20:37

Introduction

The purpose of this release was to develop a short course for AMS2020. A major feature of this release is automated
creation of the PODPAC Lambda function. As part of this we implemented a few additional
features and fixed a number of bugs.

Features

  • Automated Lambda Function creation using PODPAC. See #326, #306
  • Added a context manager for easy temporary settings. See #329
  • Added generic algorithm module with Mask and Generic nodes. See #325, #323
    • Note, this required the new unsafe evaluation setting ce8dd68 bbe251a
  • Added styles to node serialization, enabling customization of WMS layers. See #317
  • Made node attrs read-only by default. See #315
  • Updated the definition of advanced interpolation for nodes 05163b4
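
The pattern behind a context manager for temporary settings looks roughly like this. The sketch uses a plain dict as a stand-in for the settings object, and the name temporary_settings is hypothetical; see #329 for the actual interface.

```python
from contextlib import contextmanager

# Plain dict standing in for an application-wide settings object (assumption).
settings = {"MULTITHREADING": False}


@contextmanager
def temporary_settings(**overrides):
    """Apply `overrides` inside the with-block, then restore the previous values."""
    saved = dict(settings)
    settings.update(overrides)
    try:
        yield settings
    finally:
        # Restore even if the block raised an exception.
        settings.clear()
        settings.update(saved)


with temporary_settings(MULTITHREADING=True):
    inside = settings["MULTITHREADING"]   # True while inside the block
after = settings["MULTITHREADING"]        # back to False afterwards
```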

Bug

  • Corrected string comparison in AWS Lambda function 7dbaf3f
  • _first_init usage should have called super c02dc03
  • Made style definition consistent 279eab9
  • Fixed DroughtCategory algorithm -- the upper limit was not correct 9b92bb7
  • Fixed failure on pre-commit hook installation for dev version of podpac aebe6b0 #322
  • Fixes to interpolation #320 f9ad493 f9ad493 fadf939
  • Fixed error due to missing quality flag in SMAP(EGI) node 26fcb16

Deprecated Features

  • The Pipeline Node is being deprecated, targeted for 2.0 release. Use Node.from_json instead.

Drought-Monitor application release

31 Jul 20:26

Introduction

The purpose of this release was to develop features and fix bugs found while developing a drought monitor web application: https://github.com/creare-com/podpac-drought-monitor

FEATURES

  • Made the file types used to cache items object-specific.
    • Outputs of nodes now cache to hdf5, making them more transferable between systems
    • Coordinates cache as pickle files (fallback file type)
    • Changed the names of cached outputs to be more robust. Now uses: <cleaned-node-prefix>-<cleaned-key>_<node-hash>_<coordinates-hash>_<key-hash>
  • Added the podpac.utils.clear_cache method to globally clear PODPAC cache
  • Implemented interface for retrieving data from the Earthdata Gateway Interface
    • Implemented specific EGI interface to SMAP data
  • Added Zarr DataSource Node
    • Can now read local or remote (s3) Zarr datasources using PODPAC
  • Enabled Python Black, an automated formatter for Python.
  • Removed the S3 DataSource type
  • Implemented Drought Monitor Example application
  • Made PODPAC's JSON serialization consistent with JavaScript (in terms of white space). Now produces the same md5 hash in Python and JavaScript
  • Added substitute_eval_coords option to coord_select algorithms to avoid awkwardness introduced by unmatched xarray coordinates when doing computations to compare different years (for example)
  • Added to_format method to UnitsDataArray objects to allow serialization into multiple different formats programmatically
    • e.g. binary_array = output.to_format('png') # Get byte array of image data
    • e.g. output.to_format('nc', file_name) # Save data to disk in netcdf format
  • Added a public interface to managers: podpac.managers
  • Enabled JSON output of data from pipelines and the AWS Lambda function
  • Added checking and filtering of SMAP data based on the quality flag
  • Optimization on Lambda functions, where computations are short-circuited in case the output file already exists on S3
  • Added from_url methods to Coordinates and Node objects to create objects based on WMS/WCS style urls
  • Enabled WMS requests through PODPAC AWS Lambda function handler
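
The note on JavaScript-consistent JSON serialization comes down to whitespace: JavaScript's JSON.stringify emits no spaces after separators, whereas Python's json.dumps adds them by default. A minimal sketch of matching the compact form before hashing follows. The function names compact_json and definition_hash are hypothetical, and this is not PODPAC's own serialization code.

```python
import hashlib
import json


def compact_json(definition):
    """Serialize without spaces after ',' and ':', as JSON.stringify does."""
    return json.dumps(definition, separators=(",", ":"))


def definition_hash(definition):
    """md5 hex digest of the compact JSON text."""
    return hashlib.md5(compact_json(definition).encode("utf-8")).hexdigest()


definition = {"node": "Arange", "attrs": {"scale": 2}}
text = compact_json(definition)  # '{"node":"Arange","attrs":{"scale":2}}'
digest = definition_hash(definition)
```

Whitespace is only one source of mismatch. Key order, non-ASCII escaping, and number formatting (for example 1.0 versus 1) also need to agree for the hashes to match across languages.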

BUGS:

  • Fixed naming convention for cached files: previously, hash conflicts were possible
  • Fixed order of dimensions (x/y versus lat/lon) in pyproj. Requires version 2.2 of pyproj.
  • Made Lambda manager more robust to bucket naming conventions (trailing slash is now optional)