As a user, I want to check that all Internal References are valid references to other PDS4 products within the current validating bundle #308
Comments
thanks @mit3ch thought we had a ticket for this already, but apparently not. definitely on our radar, but may be a little more complicated than Richard's script because it has to encompass the entire PDS4 archive. may have to wait until the Registries are installed and all PDS4 data is ingested |
Hi Jordan,
Can we do a phased introduction? For now have it check against LIDs/LIDVIDs in the bundle and give a warning if the referenced product is not in the bundle. Later we can expand the functionality to check against the registry. Richard's tool provides the short term functionality. It identified missing products in otherwise validated bundles. He sent me his code, but that doesn't help anyone else.
Mitch
Dr. Mitch Gordon
SETI Institute
Deputy Manager
PDS Ring-Moon Systems Node
276-393-8822
Pronouns: he, him, his
|
@mit3ch This is definitely a reasonable request. I will discuss with the engineer leading this dev to get an idea of how much effort he thinks this will be, so we can weigh that into the prioritization.
|
Hi Jordan,
I was writing a heads up email to let you know about the last minute issue, but GitHub got there first.
I think we should have phased introduction with an intermediate stage now and full capability once the registry is up. For the short term solution, just check against the products in the bundle and give a warning if there isn't a match. Richard's tool provides this functionality; it identified missing products in several bundles which passed validation. He's given me his code, but that doesn't help anyone else. We can add checking against the registry later.
Mitch
|
Would such a warning/check fire only when the validation context was set to bundle? (most of the time I am validating product deliveries, so I wouldn't want warnings that referenced products were not found, simply because they were in a separate delivery, or had previously been delivered etc.) |
@msbentley great question. i wasn't thinking this would apply only to bundles, but maybe that would make more sense. we can maybe bring this to SWG for more clarification. |
@jordanpadams Is there a good representative bundle in our test resources? There's no test resources provided for this ticket. |
here is some test data. I will send you the path on our servers. But there are only a few products in there that contain references. Here is a snippet from one of the examples:
<Reference_List>
  <Internal_Reference>
    <lidvid_reference>urn:nasa:pds:compil-comet:polarimetry:filters::1.0</lidvid_reference>
    <reference_type>data_to_document</reference_type>
  </Internal_Reference>
</Reference_List>
You really just need to take any test bundle we have out there, and add a reference similar to this to a LIDVID of another product in the bundle. If the LIDVID does not exist anywhere in that bundle, we should throw an error. |
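For anyone building test cases, here is a minimal Python sketch of pulling the Internal_Reference entries out of a label like the snippet above. It is not validate's implementation (validate is Java); the function names `local_name` and `internal_references` are hypothetical, and the sketch assumes only that references live in `lid_reference` or `lidvid_reference` children, in any XML namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def internal_references(label_xml):
    """Return (reference, reference_type) pairs for each Internal_Reference."""
    root = ET.fromstring(label_xml)
    refs = []
    for elem in root.iter():
        if local_name(elem.tag) != "Internal_Reference":
            continue
        ref = ref_type = None
        for child in elem:
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name in ("lid_reference", "lidvid_reference"):
                ref = text
            elif name == "reference_type":
                ref_type = text
        if ref:
            refs.append((ref, ref_type))
    return refs


snippet = """<Reference_List>
  <Internal_Reference>
    <lidvid_reference>urn:nasa:pds:compil-comet:polarimetry:filters::1.0</lidvid_reference>
    <reference_type>data_to_document</reference_type>
  </Internal_Reference>
</Reference_List>"""

print(internal_references(snippet))
```

Matching on the local tag name keeps the sketch independent of which PDS4 namespace URI or prefix a given label declares.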
Jordan,
It should throw a warning, not an error. Products in one bundle routinely reference a product in another bundle. We just want to give the provider/user a heads up that there is a reference to a LID not in the bundle. A more complicated test would parse the LID to determine whether it indicates the product is part of the bundle (the first segment after urn:nasa:pds is the bundle base), throwing an error for a failure in that case and a warning otherwise. However, giving a warning in all cases should be sufficient.
Thanks,
Mitch
|
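The "more complicated test" described above could look roughly like the following Python sketch. It assumes LIDs of the form urn:nasa:pds:bundle:collection:product and treats the field after urn:nasa:pds as the bundle base; the function names are hypothetical, and LIDs from other namespaces simply fall back to a warning.

```python
def split_lidvid(reference):
    """Split 'lid::vid' into (lid, vid); vid is None for a bare LID."""
    lid, sep, vid = reference.partition("::")
    return lid, (vid if sep else None)


def bundle_segment(lid):
    """Return the field after 'urn:nasa:pds', or None for any other shape."""
    parts = lid.split(":")
    if len(parts) >= 4 and parts[:3] == ["urn", "nasa", "pds"]:
        return parts[3]
    return None


def missing_reference_severity(reference, bundle_lid):
    """Severity for a reference already known to be absent from the bundle.

    'error' when the reference claims to live in the bundle being validated,
    'warning' when it points at some other bundle.
    """
    lid, _vid = split_lidvid(reference)
    ref_bundle = bundle_segment(lid)
    if ref_bundle is not None and ref_bundle == bundle_segment(bundle_lid):
        return "error"
    return "warning"


print(missing_reference_severity(
    "urn:nasa:pds:kaguya_grs_spectra:document:kgrs_ephemerides_doc",
    "urn:nasa:pds:kaguya_grs_spectra",
))
```

Returning a warning for anything that cannot be parsed keeps the simpler "warn in all cases" behavior as the default.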
copy that @mit3ch . we should be able to handle that logic. @qchaupds I think it shouldnt be too complicated to make this happen. the way I see this is we should do this as part of the referential integrity checking we already do with |
We have good success so far. Running validate against a bundle we know has issues:
% validate -R pds4.bundle -r report_github308_bundle_invalid.json -s json -t src/test/resources/github308/invalid/bundle_kaguya_derived.xml >& t2
There are 3 warnings for 2 labels regarding a reference pointing to a non-existent logical identifier.
{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate 110 % egrep "label|not found" report_github308_bundle_invalid.json
The reference urn:nasa:pds:kaguya_grs_spectra:document:kgrs_calibrated_spectra does not occur anywhere as a logical_identifier:
{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate/src/test/resources/github308/invalid 119 % grep -rn "urn:nasa:pds:kaguya_grs_spectra:document:kgrs_calibrated_spectra" . | grep logical_identifier
The reference urn:nasa:pds:kaguya_grs_spectra:document:kgrs_ephemerides_doc does not occur anywhere as a logical_identifier:
{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate/src/test/resources/github308/invalid 120 % grep -rn "urn:nasa:pds:kaguya_grs_spectra:document:kgrs_ephemerides_doc" . | grep logical_identifier
There is a label src/test/resources/github308/invalid/VALID_odf07155_msgr_11.xml
{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate/src/test/resources/github308/invalid 124 % grep logical_identifier /home/qchau/sandbox/validate/src/test/resources/github308/invalid/VALID_odf07155_msgr_11.xml
However, the label does get a warning for not belonging to anyone, which is expected.
|
…ucts within the current bundle
1. Add test resources for github308 to src/test/resources
2. Add functions to support parsing for lid_reference, lidvid_reference and logical_identifier tags and move some constants from function to private class variables for readability in LabelUtil.java
3. Add new check if a reference is pointing to logical_identifier not in the current bundle in BundleReferentialIntegrityRule.java
4. Add debugs to CollectionReferentialIntegrityRule.java
5. Add new tests and update github51 message count in validate.feature
Refs: #308 As a user, I want to check that all Internal References are valid references to other PDS4 products within the current validating bundle
…eference or lidvid_reference to map to a logical_identifier
1. Add getIdentifiersCommon() function to refactoring in LabelUtil.java
2. Use lid_reference or lidvid_reference to map to a logical_identifier instead of using a filename in BundleReferentialIntegrityRule.java
3. Remove slash when checking for combination of two string to avoid confusion in BundleReferentialIntegrityRule.java
Refs: #308 As a user, I want to check that all Internal References are valid references to other PDS4 products within the current validating bundle
Check that all internal references are valid references to other prod…
@qchaupds @jordanpadams val308b.zip In the attached, validate should catch that the browse product's reference to a LID in this bundle doesn't exist. Eventually and maybe ideally, validate should catch that the data product's reference to a LID outside this bundle doesn't exist. Search for "xxx" in the .xml files. Validate now catches neither, though it does erroneously catch something related to validate#69 |
thanks @rchenatjpl . I created a new ticket for the bug you found here: #432. per your comment about catching LIDs outside this bundle, that is in our plans for the next build once we have the data ingested into the registry |
Fix bug introduced by #308 when converting URLs to local paths on Windows
For more information on how to populate this new feature request, see the PDS Wiki on User Story Development: https://github.com/NASA-PDS/nasa-pds.github.io/wiki/Issue-Tracking#user-story-development
Do the best you can with template. If it is too difficult to create a "story" just jot down as much info as you can.
Motivation
...so that I can ensure referential integrity of references within the bundle to other products within the same parent bundle
Additional Details
Need to confirm that every LID or LIDVID referenced in an Internal_Reference class exists.
Concatenate every LID/LIDVID in the Identification_Area from every label in a bundle. Then, for each XML label, verify that each LID/LIDVID not in the Identification_Area is included in the concatenated list. If not, give a warning that there is a missing product. Ideally, also check against all registered LIDs & LIDVIDs.
Check with Richard Chen, he has a python script that does this checking.
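The approach described above can be sketched in Python as follows. This is not Richard Chen's script and not validate's implementation; the function names are hypothetical, and the sketch assumes every `.xml` file under the bundle directory is a label whose Identification_Area holds `logical_identifier` and `version_id`.

```python
import os
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def scan_label(path):
    """Return (identifiers the label declares, identifiers it references)."""
    root = ET.parse(path).getroot()
    declared, referenced = set(), set()
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "Identification_Area":
            fields = {local_name(c.tag): (c.text or "").strip() for c in elem}
            lid = fields.get("logical_identifier")
            vid = fields.get("version_id")
            if lid:
                declared.add(lid)  # matches lid_reference
                if vid:
                    declared.add(lid + "::" + vid)  # matches lidvid_reference
        elif name in ("lid_reference", "lidvid_reference"):
            referenced.add((elem.text or "").strip())
    return declared, referenced


def find_missing_references(bundle_dir):
    """Return sorted (label path, reference) pairs with no match in the bundle."""
    known = set()
    per_label = {}
    for dirpath, _dirs, files in os.walk(bundle_dir):
        for filename in files:
            if not filename.lower().endswith(".xml"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                declared, referenced = scan_label(path)
            except ET.ParseError:
                continue  # not well-formed XML; left to other checks
            known |= declared
            per_label[path] = referenced
    return sorted(
        (path, ref)
        for path, refs in per_label.items()
        for ref in refs
        if ref not in known
    )
```

Calling `find_missing_references` on a bundle directory returns one entry per unmatched reference, which a caller could report as warnings. The first pass collects all declared identifiers before any reference is checked, so the order in which labels are read does not matter.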
Acceptance Criteria
Given a product that contains one or more Internal_References to product LID/LIDVIDs within the same parent bundle
When I perform validation of the bundle
Then I expect to validate that all LIDs/LIDVIDs to products within the bundle are valid references