Registry Workshop 2022‐06‐28
Walk through a basic procedure for publishing data to your PDS Node Registry.
Meeting Recording: https://jpl.webex.com/jpl/ldr.php?RCID=5e8a6ef8c4de62963b71ab22f39f0964
Install the following tools (see the installation guide: https://nasa-pds.github.io/registry/install/tools.html#tools):
- Registry Manager
- Harvest
Check that you can connect to your OpenSearch database from an authorized IP address on your institution's network.
For example:
% curl -u tloubrieu_en 'https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com'
Enter host password for user 'tloubrieu_en':
{
"name" : "c297449108b402887f1dfbd4d66c2ea6",
"cluster_name" : "445837347542:sbnpsi-prod",
"cluster_uuid" : "GmDxA8ULQNy6JOdBBsRV5A",
"version" : {
"number" : "7.10.2",
"build_type" : "tar",
"build_hash" : "unknown",
"build_date" : "2022-04-15T09:52:37.749040Z",
"build_snapshot" : false,
"lucene_version" : "8.9.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "The OpenSearch Project: https://opensearch.org/"
}
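A quick way to confirm that the response really comes from your registry database is to pull the version number out of it. The sketch below is only an illustration: `opensearch_version` is a hypothetical helper name, and it assumes the response has the same shape as the example output above (a "number" field inside "version").

```shell
# Hypothetical helper: print the first "number" value found in a
# cluster-info JSON response read from standard input.
opensearch_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For instance, `curl -s -u <your user> '<your OpenSearch URL>' | opensearch_version` should print 7.10.2 for the cluster shown above.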
- lidvid (for example urn:nasa:pds:insight_rad::2.1)
- local file path (for example /Users/loubrieu/Documents/pds/registry_workshop_20220628/data/urn-nasa-pds-insight_rad)
- Web URL (for example https://pds.nasa.gov/data/pds4/test-data/registry/urn-nasa-pds-insight_rad/)
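A lidvid is a logical identifier (LID) and a version identifier (VID) joined by `::`. As a small sketch of that structure, the helpers below split a lidvid into its two parts; `lid_of` and `vid_of` are hypothetical names used only for illustration.

```shell
# Hypothetical helpers: split a lidvid of the form <LID>::<VID>.
lid_of() {
  # Drop everything from the first "::" onward.
  printf '%s\n' "${1%%::*}"
}

vid_of() {
  # Drop everything up to and including the last "::".
  printf '%s\n' "${1##*::}"
}

lid_of "urn:nasa:pds:insight_rad::2.1"   # prints urn:nasa:pds:insight_rad
vid_of "urn:nasa:pds:insight_rad::2.1"   # prints 2.1
```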
- Create a new file named ‘auth.cfg’.
  - This can be anywhere, but you will need this file anytime you register data, so your $HOME directory is probably a good spot, or somewhere else you can easily access.
- Add the following to the file:
# true - trust self-signed certificates; false - don't trust.
trust.self-signed = true
user = <your personal user name>
password = <your personal password>
- Save the path to that auth.cfg file for the next step, for example: /Users/loubrieu/Documents/pds/registry_workshop_20220628/auth.cfg
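Because this file holds a password in clear text, it may be worth making it readable only by you. The sketch below writes the same four lines shown above and restricts the file permissions; `write_auth_cfg` is a hypothetical helper, and the permission step is a suggestion rather than something the tools require.

```shell
# Hypothetical helper: write an auth.cfg readable only by its owner.
# $1: destination path, $2: user name, $3: password
write_auth_cfg() {
  (
    umask 077   # new files are created without group/other permissions
    printf '%s\n' \
      "# true - trust self-signed certificates; false - don't trust." \
      "trust.self-signed = true" \
      "user = $2" \
      "password = $3" > "$1"
    chmod 600 "$1"   # also tighten a file that already existed
  )
}
```

Usage could look like `write_auth_cfg "$HOME/auth.cfg" '<your personal user name>' '<your personal password>'`.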
- Start from an example found in the harvest installation folder:
cp ${HARVEST_HOME}/conf/examples/bundles.xml jobs/my-bundle-job.xml
- Edit your job file jobs/my-bundle-job.xml:
  - Add your node short name, see online docs here
  - Edit the path where your bundle is, see online docs here
  - Edit the file’s URL replacement rule, see online docs here
  - Registry integration, see online docs here (reuse the path of the auth.cfg file created in the Create your OpenSearch Authentication File step above)
- Example of base URL:

| Node | Base URL |
| --- | --- |
| IMG Node | https://pds-imaging.jpl.nasa.gov/data/ |
| EN | https://pds.nasa.gov/data/ |
Save the path for the next step, for example: ./jobs/my-bundle-job.xml
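The URL replacement rule swaps the local path prefix of each file for the public base URL. The sketch below only mirrors that idea in plain shell so you can predict the resulting URLs; it is not Harvest's implementation, `local_path_to_url` is a hypothetical helper, and the file name bundle.xml is made up for the example.

```shell
# Hypothetical helper: replace a local path prefix by a base URL.
# $1: local file path, $2: local prefix to strip, $3: base URL replacing it
local_path_to_url() {
  case "$1" in
    "$2"*) printf '%s%s\n' "$3" "${1#"$2"}" ;;
    *) return 1 ;;   # the path does not start with the given prefix
  esac
}

# Example with the workshop paths (bundle.xml is a made-up file name):
local_path_to_url \
  "/Users/loubrieu/Documents/pds/registry_workshop_20220628/data/urn-nasa-pds-insight_rad/bundle.xml" \
  "/Users/loubrieu/Documents/pds/registry_workshop_20220628/data/" \
  "https://pds.nasa.gov/data/pds4/test-data/registry/"
```

If the printed URL does not match where the file is actually served, the prefix or the base URL in the job file probably needs adjusting.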
harvest -c jobs/my-bundle-job.xml
See Online Docs Here
Get:
- The URL of your OpenSearch database
- The lidvid of your bundle and the URL of your bundle
curl -u tloubrieu_en 'https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com/registry/_search?q={_id:"urn:nasa:pds:insight_rad::2.1"}&pretty=true' \
| json_pp
- Check that the files’ URLs are reachable
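One possible way to run that check is to send a HEAD request to each file URL copied from the OpenSearch result. This is a minimal sketch, assuming the URLs are supplied one per line on standard input; `check_urls` is a hypothetical helper, and some servers may not answer HEAD requests the same way as GET.

```shell
# Hypothetical helper: read URLs from stdin (one per line) and report
# which ones answer successfully. Returns 1 if any URL failed.
check_urls() {
  failed=0
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    # -s: quiet, -f: fail on HTTP errors, -I: HEAD request, -L: follow redirects
    if curl -sfIL -o /dev/null "$url" </dev/null; then
      printf 'OK   %s\n' "$url"
    else
      printf 'FAIL %s\n' "$url"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `printf '%s\n' '<file URL 1>' '<file URL 2>' | check_urls` prints one OK or FAIL line per URL.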
In your node's API:
curl https://pds.nasa.gov/api/search-en/1.0/products/urn:nasa:pds:insight_rad::2.1
See other Node API endpoints: https://nasa-pds.github.io/pds-api/search-api-user-guide/endpoints.html#endpoints
Across the PDS:
curl https://pds.nasa.gov/api/search/1.0/products/urn:nasa:pds:insight_rad::2.1
It is not! Why? Because the archive_status is “staged”.
Find the status in the OpenSearch request result.
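To avoid scanning the whole JSON document by eye, the status can be filtered out of the response. The sketch below is an assumption-laden illustration: `archive_status_of` is a hypothetical helper, and it assumes the status is stored under a key ending in archive_status whose value is either a string or a one-element list.

```shell
# Hypothetical helper: print the value of the first key ending in
# "archive_status" found in a JSON document read from standard input.
archive_status_of() {
  # Flatten to one line so a value printed on the next line is still found.
  { tr '\n' ' '; echo; } |
    sed -n 's/.*archive_status"[[:space:]]*:[[:space:]]*\[\{0,1\}[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

It can be appended to the OpenSearch request used earlier, in place of `json_pp`, for example `curl -s -u <your user> '<your OpenSearch search URL>' | archive_status_of`.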
See Online Docs Here
- First explore registry manager options:
registry-manager --help
registry-manager set-archive-status -help
- Set the status to ‘archived’:
registry-manager set-archive-status -es https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443 -auth auth.cfg -lidvid urn:nasa:pds:insight_rad::2.1 -status archived
See Online Docs Here
On your Node’s API server:
curl https://pds.nasa.gov/api/search-en/1.0/products/urn:nasa:pds:insight_rad::2.1
On PDS API server:
curl https://pds.nasa.gov/api/search/1.0/products/urn:nasa:pds:insight_rad::2.1
or view through a web browser: https://pds.nasa.gov/api/search/1.0/products/urn:nasa:pds:insight_rad::2.1
Query someone else’s lidvid. Share your lidvids in the Webex chat and query for someone else’s registered bundle product:
curl https://pds.nasa.gov/api/search/1.0/products/urn:nasa:pds:insight_rad::2.1
Other query examples:
Query for the Bundle’s Collections:
curl https://pds.nasa.gov/api/search/1.0/bundles/urn:nasa:pds:insight_rad::2.1/collections
Query for Bundle’s Products:
curl https://pds.nasa.gov/api/search/1.0/bundles/urn:nasa:pds:insight_rad::2.1/products
See Search API User Guide for more details.
If this is test data that you don’t want to leave in your registry, remove it as follows.
- Find the package-id in the OpenSearch results by querying the Registry again, and searching for "package-id".
- Also known as a "run ID", this package ID can be used to access/remove past ingestions.
curl -u tloubrieu_en 'https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com/registry/_search?q={_id:"urn:nasa:pds:insight_rad::2.1"}' \
| json_pp
- Then, based upon that package ID, delete the data:
registry-manager delete-data -es https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443 -auth auth.cfg -packageId 3e755f49-0cde-4d80-bfe8-020fa6537a36