As a node operator, I want to ingest metadata regarding secondary products that belong to a collection. #108

Closed
jordanpadams opened this issue Mar 18, 2021 · 6 comments · Fixed by NASA-PDS/harvest#47 or NASA-PDS/registry-mgr#21
Labels: p.must-have, requirement (the current issue is a requirement)

Comments

@jordanpadams
Member

jordanpadams commented Mar 18, 2021

Motivation

...so that I can enable traceability between collections and all associated products, even if they are secondary.

Additional Details

N/A

Acceptance Criteria

Given I have a collection containing secondary products
When I perform harvest and ingest of the collection into the registry
Then I expect to ingest the LID/LIDVID of the secondary product(s)
Then I expect to ingest some identifying information that this is a secondary product versus primary
Then I expect to query that information from the registry

@jordanpadams added the p.must-have, requirement, and sprint-backlog labels Mar 18, 2021
@jordanpadams changed the title from "As a user, I want to search secondary products that belong to a collection." to "As a user, I want to ingest secondary products that belong to a collection." Mar 26, 2021
@jordanpadams changed the title from "As a user, I want to ingest secondary products that belong to a collection." to "As a user, I want to ingest metadata regarding secondary products that belong to a collection." Mar 26, 2021
@jordanpadams changed the title from "As a user, I want to ingest metadata regarding secondary products that belong to a collection." to "As a node operator, I want to ingest metadata regarding secondary products that belong to a collection." Mar 29, 2021
@tdddblog

Elasticsearch query: Get the latest version of a collection (query the "registry" index):
NOTE: I changed the "vid" datatype to "float", so you need to delete and recreate the registry.

{
  "query": {
    "match": {
      "lid": "urn:nasa:pds:orex.spice:document"
    }
  },
  "size": 0,
  "aggregations": {
    "max_vid": {"max": {"field": "vid"}}
  }
}
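
For anyone scripting this check, here is a minimal Python sketch that builds the same request body and reads the aggregated value back out of a search response. The function names are hypothetical, and the sketch only constructs and parses dictionaries; sending the body to the "registry" index is left to whatever HTTP client you use.

```python
import json


def latest_version_query(lid):
    """Build the search body: match on LID, return no hits, aggregate the max vid."""
    return {
        "query": {"match": {"lid": lid}},
        "size": 0,
        "aggregations": {"max_vid": {"max": {"field": "vid"}}},
    }


def extract_max_vid(response):
    """Return the aggregated max vid from a search response, or None if absent."""
    return response.get("aggregations", {}).get("max_vid", {}).get("value")


if __name__ == "__main__":
    body = latest_version_query("urn:nasa:pds:orex.spice:document")
    print(json.dumps(body, indent=2))
```

The response of a max aggregation carries the result under aggregations.max_vid.value, which is what extract_max_vid reads.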

@tdddblog

Elasticsearch query: Get primary / secondary product references from a collection (query "registry-refs" index):

Version 1:

{
  "query": {
    "bool" : { 
      "must": { "term" : { "collection_lidvid": "urn:nasa:pds:orex.spice:document::6.0" }},
      "filter": { "term" : { "reference_type": "primary" }}
    }
  }
}

Version 2:

{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": [
        { "term": { "collection_lidvid": "urn:nasa:pds:orex.spice:document::6.0" }},
        { "term": { "reference_type": "secondary" }}
      ]
    }
  }
}
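
Both versions can be generated from one helper. The following is a minimal Python sketch, assuming the two field names used above (collection_lidvid and reference_type) and assuming "primary" and "secondary" are the only reference types; the function name is hypothetical. It uses filter clauses only, which should match the same documents as Version 2, since the match_all clause there does not narrow the result set.

```python
def collection_refs_query(collection_lidvid, reference_type):
    """Build a search body for the product references of one collection version."""
    # Assumption: only these two reference types exist in the index.
    if reference_type not in ("primary", "secondary"):
        raise ValueError("reference_type must be 'primary' or 'secondary'")
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"collection_lidvid": collection_lidvid}},
                    {"term": {"reference_type": reference_type}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(collection_refs_query("urn:nasa:pds:orex.spice:document::6.0", "secondary"))
```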

@rchenatjpl

@tdddblog @jordanpadams @tloubrieu-jpl
I added a secondary product to a collection, and I see it in refs-docs.json. Is that sufficient? If not, how do I get 'registry-manager load-data' to load it? Thanks

@rchenatjpl rchenatjpl reopened this May 27, 2021
@rchenatjpl

rchenatjpl commented May 27, 2021

I am not seeing what I expect, so I reopened the issue, but feel free to convince me my expectations are wrong. To reproduce, unzip the attached file and run the following on an empty registry:
% harvest -c regapp143/regapp143.cfg
[SUMMARY] Output directory: /tmp/harvest/out
[SUMMARY] Output format: json
[SUMMARY] Reading configuration from /Users/rchen/Desktop/regapp143/regapp143.cfg
[WARN] Registry is not configured
[INFO] Processing bundle directory /Users/rchen/Desktop/regapp143
[INFO] Processing bundle /Users/rchen/Desktop/regapp143/bundle_insight_seis.xml
[INFO] Processing collection /Users/rchen/Desktop/regapp143/data_derived/collection_data_derived.xml
[INFO] Processing collection /Users/rchen/Desktop/regapp143/xml_schema/collection_xml_schema.xml
[INFO] Processing products...
[INFO] Processing product /Users/rchen/Desktop/regapp143/data_derived/changelog_extended_multiorigin_v5_2020-10-12.xml
[SUMMARY] Summary:
[SUMMARY] Skipped files: 1
[SUMMARY] Processed files: 4
[SUMMARY] File counts by type:
[SUMMARY] Product_Bundle: 1
[SUMMARY] Product_Collection: 2
[SUMMARY] Product_Observational: 1
[SUMMARY] Package ID: 7f9c4719-bbd4-4500-8874-94a0c60aa155
%
%
% registry-manager load-data -dir /tmp/harvest/out
Elasticsearch URL: http://localhost:9200
Index: registry
Updating schema with fields from /tmp/harvest/out/fields.txt
Updated 62 fields
Loading file: /tmp/harvest/out/refs-docs.json
Loaded 2 document(s)
Loading file: /tmp/harvest/out/registry-docs.json
Loaded 4 document(s)
%
%
% cat regapp143/xml_schema/collection_xml_schema.csv
S,urn:nasa:pds:system_bundle:xml_schema:insight-xml_schema::1.17
S,urn:nasa:pds:system_bundle:xml_schema:pds-xml_schema::1.17
%
%
% curl "http://localhost:9200/registry/_search?q=*&pretty " |& grep urn:nasa:pds:system_bundle:xml_schema
%
I expected the two secondary members (LIDVID prefix urn:nasa:pds:system_bundle:xml_schema) to show up somewhere.
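
To list which LIDVIDs one would expect to find after ingest, the inventory file can be parsed directly. This is a minimal Python sketch, assuming the two-column layout shown in the listing above (member status, then LIDVID) and assuming "P" marks primary members the way "S" marks secondary ones; the function names are hypothetical.

```python
import csv
import io


def read_inventory(text):
    """Group the LIDVIDs of a collection inventory by member status."""
    members = {"primary": [], "secondary": []}
    status_names = {"P": "primary", "S": "secondary"}  # "P" is an assumption
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        key = status_names.get(row[0].strip().upper())
        if key is not None:
            members[key].append(row[1].strip())
    return members


def split_lidvid(lidvid):
    """Split 'lid::vid' into (lid, vid); vid is None when only a LID is given."""
    lid, sep, vid = lidvid.partition("::")
    return lid, (vid if sep else None)


if __name__ == "__main__":
    with open("regapp143/xml_schema/collection_xml_schema.csv") as f:
        for lidvid in read_inventory(f.read())["secondary"]:
            print(split_lidvid(lidvid))
```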

regapp143.zip

@jordanpadams
Member Author

@rchenatjpl Sorry no one got back to you. I will let @tdddblog check this out.

In the future, instead of re-opening the feature request, can we create an I&T bug and just add a reference to this ticket? Not to be too much of a manager, but it helps enable traceability and metrics :-)

Screen Shot 2021-05-27 at 9 42 34 AM

@jordanpadams
Member Author

new ticket created here: nasa-pds-engineering-node/pds-registry-app#168

@jordanpadams jordanpadams transferred this issue from nasa-pds-engineering-node/pds-registry-app Oct 6, 2022