provide support for granule wildcard patterns in data downloader #138

Merged
merged 7 commits into from
Jun 20, 2023
Changes from 6 commits
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,14 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)

## [unreleased]
### Added
- Added support for wildcard search patterns in podaac-data-downloader when executed with the -gr option (i.e. search/download by CMR Granule UR/ID). Also added usage details to Downloader.md to describe this new feature [138](https://github.com/podaac/data-subscriber/pull/138).

## 1.13.1
### Fixed
- Fixed an issue where a required library wasn't being included in the installation.

## 1.13.0
### Added
- Added --dry-run option to subscriber and downloader to view the files that _would_ be downloaded without actually downloading them. [102](https://github.com/podaac/data-subscriber/issues/102)
3 changes: 2 additions & 1 deletion Downloader.md
@@ -32,7 +32,7 @@ optional arguments:
-e EXTENSIONS, --extensions EXTENSIONS
Regexps of extensions of products to download. Default is [.nc, .h5, .zip, .tar.gz, .tiff]
-gr GRANULENAME, --granule-name GRANULENAME
Flag to download specific granule from a collection. This parameter can only be used if you know the granule name. Only one granule name can be supplied
Flag to download specific granule from a collection. This parameter can only be used if you know the granule name. Only one granule name can be supplied. Supports wildcard search patterns allowing the user to identify multiple granules for download by using `?` for single- and `*` for multi-character expansion.
--process PROCESS_CMD
Processing command to run on each downloaded file (e.g., compression). Can be specified multiple times.
--version Display script version information and exit.
@@ -131,6 +131,7 @@ The `-gr` option works by taking the file name, removing the suffix and searchin

Because of this behavior, granules without data suffixes and granules where the UR does not directly follow this convention may not work as anticipated. We will be adding the ability to download by granuleUR in a future enhancement.

The -gr option supports wildcard search patterns (using `?` for single- and `*` for multi-character expansion) to select and download multiple granules based on the filename pattern. This feature is supported through wildcard search functionality provided through CMR, which is described in the [CMR Search API documentation](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#parameter-options).
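
To get a feel for which granule names a pattern would select before sending it to CMR, the two wildcards can be approximated locally. The following is a minimal sketch, not part of the downloader: the helper names `wildcard_to_regex` and `preview_matches` and the granule names are hypothetical, and it assumes that `*` matches zero or more characters and `?` exactly one. The authoritative matching is done by CMR.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern using ? and * into a compiled regular expression.

    Assumption: `*` stands for zero or more characters and `?` for exactly
    one; every other character is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def preview_matches(pattern, names):
    """Return the names that the whole pattern matches, in input order."""
    rx = wildcard_to_regex(pattern)
    return [name for name in names if rx.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical granule names, used only for illustration.
    candidates = ["granule_001", "granule_002", "granule_010", "other_001"]
    for pat in ("granule_00?", "granule_*", "*_001"):
        print(pat, preview_matches(pat, candidates))
```

A local preview like this only helps with reasoning about a pattern; the set of granules actually downloaded depends on what CMR returns for the collection.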

### Download data by cycle

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "podaac-data-subscriber"
version = "1.13.0"
version = "1.13.1"
description = "PO.DAAC Data Subscriber Command Line Tool"
authors = ["PO.DAAC <podaac@podaac.jpl.nasa.gov>"]
readme = "README.md"
@@ -15,6 +15,7 @@ packages = [
python = "^3.7"
requests = "^2.27.1"
tenacity = "^8.0.1"
packaging = "^23.0"

[tool.poetry.dev-dependencies]
pytest = "^7.1.2"
2 changes: 1 addition & 1 deletion subscriber/podaac_access.py
@@ -30,7 +30,7 @@
import tenacity
from datetime import datetime

__version__ = "1.13.0"
__version__ = "1.13.1"
extensions = ["\\.nc", "\\.h5", "\\.zip", "\\.tar.gz", "\\.tiff"]
edl = "urs.earthdata.nasa.gov"
cmr = "cmr.earthdata.nasa.gov"
5 changes: 4 additions & 1 deletion subscriber/podaac_data_downloader.py
@@ -93,7 +93,7 @@ def create_parser():
# Get specific granule from the search
# https://github.com/podaac/data-subscriber/issues/109
parser.add_argument("-gr", "--granule-name", dest="granulename",
help="Flag to download specific granule from a collection. This parameter can only be used if you know the granule name. Only one granule name can be supplied",
help="Flag to download specific granule from a collection. This parameter can only be used if you know the granule name. Only one granule name can be supplied. Supports wildcard search patterns allowing the user to identify multiple granules for download by using `?` for single- and `*` for multi-character expansion.",
default=None)

parser.add_argument("--process", dest="process_cmd",
@@ -190,6 +190,9 @@ def run(args=None):
        ('GranuleUR[]', cmr_granule),
        ('token', token),
    ]
    #jmcnelis, 2023/06/14 - provide for wildcards in granuleur-based search
    if '*' in cmr_granule or '?' in cmr_granule:
        params.append(('options[GranuleUR][pattern]', 'true'))
    if args.verbose:
        logging.info("Granule: " + str(cmr_granule))
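
The effect of this hunk can be tried in isolation with a small sketch. It is an illustration under assumptions rather than the downloader's actual code: the function name `build_granule_params` and the placeholder token are hypothetical, and only the two parameters visible in the hunk are included (the real `params` list holds further entries that the diff does not show).

```python
from urllib.parse import urlencode

# Characters that switch the GranuleUR search into pattern mode.
WILDCARD_CHARS = ("*", "?")


def build_granule_params(cmr_granule, token):
    """Build the query parameters for a GranuleUR search (hypothetical helper).

    Only the parameters visible in the diff are reproduced here.
    """
    params = [
        ("GranuleUR[]", cmr_granule),
        ("token", token),
    ]
    # Same check as in the diff: ask CMR for pattern matching only when the
    # granule name actually contains a wildcard.
    if any(ch in cmr_granule for ch in WILDCARD_CHARS):
        params.append(("options[GranuleUR][pattern]", "true"))
    return params


if __name__ == "__main__":
    # Hypothetical granule names and a placeholder token.
    for name in ("granule_001", "granule_00?", "granule_*"):
        print(urlencode(build_granule_params(name, "EXAMPLE_TOKEN")))
```

Adding the option only when a wildcard is present leaves exact-name lookups unchanged, so existing `-gr` invocations send the same query as before.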

Expand Down