Requirement #308 does not appear to be working for checking referential integrity from products to others in the bundle #432

Closed
jordanpadams opened this issue Oct 26, 2021 · 44 comments · Fixed by #440 or #762

Comments

@jordanpadams
Member

jordanpadams commented Oct 26, 2021

🐛 Describe the bug identified during I&T

Caught by @rchenatjpl here: #308 (comment)

Looks like #308 implementation was refactored out.

🥼 Related Test Case(s)

Related issues


➕ Additional Details

📜 To Reproduce

See the comment referenced above. In the attached test data below, validate does not catch that a browse product references a LID that doesn't exist in this bundle.

🕵️ Expected behavior

Validate should catch the invalid reference.

📚 Version of Software Used

🩺 Test Data / Additional context

https://github.com/NASA-PDS/validate/files/7414791/val308b.zip

🏞Screenshots

🖥 System Info

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

🦄 Related requirements

🦄 #308

⚙️ Engineering Details

@jordanpadams jordanpadams added bug Something isn't working I&T needs:triage labels Oct 26, 2021
@jordanpadams jordanpadams self-assigned this Oct 26, 2021
@jordanpadams jordanpadams added this to the 16.Tyson.Gay milestone Oct 26, 2021
qchaupds pushed a commit that referenced this issue Nov 9, 2021
…n the bundle

1. Add test resources : src/test/resources/github432
2. Add debug statements : src/main/java/gov/nasa/pds/tools/util/ReferentialIntegrityUtil.java
3. Modify uppercase 'ERROR' to lowercase 'error' so as not to confuse the log file : src/main/java/gov/nasa/pds/tools/validate/rule/pds4/LabelValidationRule.java
4. Add debug statements and printing of warning message when the exceptions.get(location) call returns null : src/main/java/gov/nasa/pds/validate/ValidateLauncher.java
5. Add debug statements and checking for null-ness before calling toString() to avoid NullPointerException : src/main/java/gov/nasa/pds/validate/report/Report.java

Refs:

#432 Missing referential integrity checks from browse products to others in the bundle
@jordanpadams jordanpadams reopened this Nov 12, 2021
@jordanpadams
Member Author

jordanpadams commented Nov 12, 2021

@qchaupds merging the PR #442 also merged #440, but that did not fix this issue.

@jordanpadams
Member Author

see comments here for more details: #440 (review)

@mace-space

Here are bugs I've encountered.

I ran validate using --rule pds4.bundle but no referential checks were performed (even though with that rule option it should check references):

Referential Integrity Check Summary:
0 check(s) passed
0 check(s) failed
0 check(s) skipped

I also tried running it on the specific collection where I had spotted LID errors:

% validate --rule pds4.collection --report-file rav1ciun_validate_browse_collection.log --verbose 2 --target ./wenkert_pdart16_vgr_rav1ciun/browse

Here's an example browse label from that collection:

1 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
2
3 <?xml-model href="https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1G00.sch"
4 schematypens="http://purl.oclc.org/dsdl/schematron"?>
5
6 <Product_Browse xmlns="http://pds.nasa.gov/pds4/pds/v1"
7 xmlns:pds="http://pds.nasa.gov/pds4/pds/v1"
8 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
9 xsi:schemaLocation="http://pds.nasa.gov/pds4/pds/v1 https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1G00.xsd">
10 <Identification_Area>
11 <logical_identifier>urn:nasa:pds:wenkert_pdart16_vgr_rav1ciun:browse_qedr:vgr_1201-mamqtv-001010-data-001010.001.png</logical_identifier>
12 <version_id>1.0</version_id>
13 <title>RAV1CIUN DATA Browse Product - vgr_1201-mamqtv-001010-data-001010.001.png</title>
14 <information_model_version>1.16.0.0</information_model_version>
15 <product_class>Product_Browse</product_class>
16 </Identification_Area>
17 <Reference_List>
18 <Internal_Reference>
19 <lid_reference>urn:nasa:pds:wenkert_pdart16_vgr_rav1ciun:browse_qedr:vgr_1201-mamqtv-001010-data-001010.001</lid_reference>
20 <reference_type>browse_to_data</reference_type>
21 This is a reference to the full resolution data file corresponding to this browse image.
22 </Internal_Reference>
23 </Reference_List>
24 <File_Area_Browse>
25
26 <file_name>VGR_1201-MAMQTV-001010-DATA-001010.001.png</file_name>
27 <local_identifier>BROWSE_FILE</local_identifier>
28 <creation_date_time>2023-08-18</creation_date_time>
29
30 <Encoded_Image>
31 <local_identifier>BROWSE_IMAGE</local_identifier>
32 0
33 <encoding_standard_id>PNG</encoding_standard_id>
34 </Encoded_Image>
35 </File_Area_Browse>
36 </Product_Browse>

Line 19 points to an incorrect LID, but Validate does not report any of these:

183531 Referential Integrity Check Summary:
183532 30582 check(s) passed
183533 1 check(s) failed
183534 0 check(s) skipped

It passed all of the browse labels (the one failure refers to a .DS_Store file). So, unlike the -R pds4.bundle option, the -R pds4.collection option does report referential integrity checks as passing, but it does not catch incorrect LIDs.

(The LID urn:nasa:pds:wenkert_pdart16_vgr_rav1ciun:browse_qedr:vgr_1201-mamqtv-001010-data-001010.001 does not exist (the browse LIDs have .png suffixes), although it shouldn't even be self-referencing the browse_qedr collection but rather the data_qedr collection).
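
For anyone who wants to double-check a collection independently of validate, the check described above (every lid_reference should match some logical_identifier among the labels) can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not validate's implementation: the class `LidReferenceCheck` and its methods are hypothetical names, labels are passed in as strings, and references to products that legitimately live outside the supplied labels (for example context products) would also be reported.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of validate. */
final class LidReferenceCheck {

  /** Parses one label held in a string, with namespace awareness enabled. */
  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Label is not well-formed XML", e);
    }
  }

  /** Collects the trimmed text of every element with the given local name, in any namespace. */
  static List<String> textOf(Document doc, String localName) {
    NodeList nodes = doc.getElementsByTagNameNS("*", localName);
    List<String> values = new ArrayList<>();
    for (int i = 0; i < nodes.getLength(); i++) {
      values.add(nodes.item(i).getTextContent().trim());
    }
    return values;
  }

  /** Returns each lid_reference that matches no logical_identifier among the given labels. */
  static List<String> danglingLidReferences(List<String> labels) {
    List<Document> docs = new ArrayList<>();
    Set<String> knownLids = new HashSet<>();
    for (String xml : labels) {
      Document doc = parse(xml);
      docs.add(doc);
      knownLids.addAll(textOf(doc, "logical_identifier"));
    }
    List<String> dangling = new ArrayList<>();
    for (Document doc : docs) {
      for (String reference : textOf(doc, "lid_reference")) {
        if (!knownLids.contains(reference)) {
          dangling.add(reference);
        }
      }
    }
    return dangling;
  }
}
```

Applied to the browse label above, the reference on line 19 would come back as dangling, because the only matching logical_identifier carries the .png suffix.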

@al-niessner
Contributor

@jordanpadams

The check being requested here was purposely turned off by another user request #368 and here is the code that got me there:

// https://github.com/NASA-PDS/validate/issues/368 Product referential integrity
// check throws invalid WARNINGs
// Per request of user, we will disable the reporting until further
// instructions.
// Set the reportFlag to true if desire to do the reporting of this warning.
boolean reportFlag = false;

How do you want to resolve conflicting user requests here?

@al-niessner
Contributor

@rchenatjpl @jordanpadams

Will someone who understands PDS well enough please tell me whether there should be 1 or 2 warning messages for the tiny test data set attached to this ticket (see val308b.zip at the top of this issue)? I found what look like 2 using grep, but validate only reports one. The other does not look as though it should be part of the bundle, and thus it is not reported here.

@jordanpadams
Member Author

@al-niessner you are correct. In an ideal world, we would have some AI algorithm say "these bundle LIDs look kind of alike, but are different. Did you want them to be the same?", but until OpenAI does our validation, we are stuck with explicit checks.

@jordanpadams
Member Author

@mace-space can you try out the latest SNAPSHOT with this fix in place to see if it catches the error?

https://github.com/NASA-PDS/validate/releases/tag/v3.4.0-SNAPSHOT

@rchenatjpl
Contributor

@al-niessner @jordanpadams: validate (Release Date: 2023-11-17 21:11:37) caught that browse-ion-moments/ION_MOM.xml incorrectly referenced
urn:nasa:pds:vg1-pls-sat:data-ion-moments-96sec:nonexistentxxx
but missed that data-ion-moments-96sec/ION_MOM.xml incorrectly referenced
urn:nasa:pds:mess-mag-kt17-model-residuals:document:xxxnoexistent

@al-niessner
Contributor

al-niessner commented Nov 18, 2023 via email

@jordanpadams
Member Author

@al-niessner looks like we have success. Thanks!

@mace-space

Sorry for my delay in responding, in the middle of a move. I will look into this more early next week. Thanks for all your help.

@mace-space

@jordanpadams I'm seeing exceptions of the following kind when I run validate-3.4.0-SNAPSHOT:

java.lang.reflect.InvocationTargetException
	at jdk.internal.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at gov.nasa.pds.tools.validate.rule.AbstractValidationRule.execute(AbstractValidationRule.java:64)
	at org.apache.commons.chain.impl.ChainBase.execute(ChainBase.java:191)
	at gov.nasa.pds.tools.validate.rule.pds4.LabelInFolderRule$1.run(LabelInFolderRule.java:149)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException
	at gov.nasa.pds.tools.validate.TargetExaminer.isTargetBundleType(TargetExaminer.java:59)
	at gov.nasa.pds.tools.util.Utility.getTargetType(Utility.java:301)
	at gov.nasa.pds.tools.validate.ValidationTarget.<init>(ValidationTarget.java:47)
	at gov.nasa.pds.tools.validate.ValidationTarget.<init>(ValidationTarget.java:41)
	at gov.nasa.pds.tools.util.Utility.getValidationTarget(Utility.java:88)
	at gov.nasa.pds.tools.validate.ValidationProblem.<init>(ValidationProblem.java:69)
	at gov.nasa.pds.tools.label.CachedLSResourceResolver.resolveResource(CachedLSResourceResolver.java:207)
	at gov.nasa.pds.tools.label.XMLCatalogResolver.resolveResource(XMLCatalogResolver.java:381)
	at gov.nasa.pds.tools.validate.rule.pds4.LabelValidationRule.loadSingleSchemaIntoSources(LabelValidationRule.java:397)
	at gov.nasa.pds.tools.validate.rule.pds4.LabelValidationRule.validateLabelSchemas(LabelValidationRule.java:572)
	at gov.nasa.pds.tools.validate.rule.pds4.LabelValidationRule.validateLabel(LabelValidationRule.java:250)
	... 11 more
...etc 

Does this indicate I need to update java?

% java -version
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment Temurin-11.0.15+10 (build 11.0.15+10)
OpenJDK 64-Bit Server VM Temurin-11.0.15+10 (build 11.0.15+10, mixed mode)

@jordanpadams
Member Author

@mace-space hmmm. not sure what is going on here. let me check some things out.

@jordanpadams
Member Author

@mace-space how are you executing validate? I can't seem to replicate on our test data sets.

@mace-space

I'm running:
% validate-3.4.0-SNAPSHOT/bin/validate --rule pds4.bundle --report-file ./log/rav1ciun_validate_snapshot3.4.0.log --verbose 2 --target ./wenkert_pdart16_vgr_rav1ciun

@mace-space

Hi @jordanpadams

I've been validating Voyager Jupiter raw radio science bundles and encountered a related problem with referential integrity checking, so I thought I would add details to this issue. (If you would prefer that I open another issue, I'm happy to do so.)

In the bundle XMLs, it states:

<Target_Identification>
...
        <reference_type>collection_to_target</reference_type>  
    </Internal_Reference>
</Target_Identification>

This should be bundle_to_target but Validate (v3.1.1) did not flag any warnings or errors.
Unlike my comment last month, I have since been able to run the snapshot you provided (v3.4.0) successfully, however it did not flag any errors/warnings either.

@jordanpadams
Member Author

@mace-space thanks for the inputs. we actually have never validated reference_types, so this would be a new requirement. In the end, the most important thing is the lid(vid)_reference, but agreed, this is not correct.

Per the IM Spec, it looks like bundle_to_target is not being checked by the schematron in Target_Identification. Only in Reference_List.
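
As a rough illustration of what such a check might look like if it were added, here is a minimal Java sketch that compares a label's product_class with the prefix of a reference_type. Everything in it is an assumption made for illustration: `ReferenceTypeCheck` is a hypothetical class, and the prefix table only covers the product classes that appear in this thread. The PDS4 Information Model remains the authority on which reference_type values are permitted where.

```java
import java.util.Map;

/** Hypothetical consistency check for illustration; not part of validate or the schematron. */
final class ReferenceTypeCheck {

  // Assumed prefixes, limited to product classes mentioned in this thread.
  private static final Map<String, String> EXPECTED_PREFIX = Map.of(
      "Product_Bundle", "bundle_to_",
      "Product_Collection", "collection_to_",
      "Product_Browse", "browse_to_");

  /**
   * Returns false only when the product class has an assumed prefix and the
   * reference_type does not start with it. Unknown classes are not judged.
   */
  static boolean isConsistent(String productClass, String referenceType) {
    String prefix = EXPECTED_PREFIX.get(productClass.trim());
    return prefix == null || referenceType.trim().startsWith(prefix);
  }
}
```

With this table, `isConsistent("Product_Bundle", "collection_to_target")` is false, which is the case reported above, while `isConsistent("Product_Bundle", "bundle_to_target")` is true.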

@jordanpadams
Member Author

Created new SCR: https://github.com/NASA-PDS/PDS4-CCB/issues/7

@mace-space

Thanks very much. Is NASA-PDS/PDS4-CCB a private repo? I can't access your link.

@jordanpadams
Member Author

@mace-space yes. I can add you to the DDWG Team.

@matthewtiscareno

> @mace-space yes. I can add you to the DDWG Team.

I would think that all CCB reps (including @mace-space) should be added to that repo.

@tbarnes4

@jordanpadams Does the referential checking cover every lid(vid)_reference found in a label, cross-checking at least within the scope (-R or --rule) that validate is running on? I have an instance where an SB dictionary reference is not flagged as an error when I run validate on a bundle that contains two collections, where processed products reference the raw products they were processed from. The LIDVID below does not exist.

                <sb:Raw_Data_Product>
                    <Internal_Reference>
                        <lidvid_reference>urn:nasa:pds:nh_alice:kem1_raw:ali_0408629333_0x4b5_eng::1.0</lidvid_reference>
                        <reference_type>processed_data_to_raw_data</reference_type>
                    </Internal_Reference>
                </sb:Raw_Data_Product>
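
To make the question concrete, here is a small Java sketch of how a lidvid_reference like the one above could be resolved against the products found in a bundle. It is an assumption-laden illustration rather than validate's logic: `LidVid` and `resolves` are hypothetical names, and the map of known versions per LID is assumed to have been built beforehand from the logical_identifier and version_id of each label. The only format fact it relies on is that a LIDVID joins the LID and the version with `::`, as in the example.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical value type for illustration; not part of validate. */
final class LidVid {
  final String lid;
  final String vid; // null when the reference is a bare LID

  private LidVid(String lid, String vid) {
    this.lid = lid;
    this.vid = vid;
  }

  /** Splits "lid::vid" into its parts; a reference without "::" is treated as a bare LID. */
  static LidVid parse(String reference) {
    String trimmed = reference.trim();
    int separator = trimmed.indexOf("::");
    if (separator < 0) {
      return new LidVid(trimmed, null);
    }
    return new LidVid(trimmed.substring(0, separator), trimmed.substring(separator + 2));
  }

  /**
   * True when the LID is known and, if a version was given, that version is known too.
   * knownVersionsByLid maps each LID found in the bundle to its version_id values.
   */
  static boolean resolves(String reference, Map<String, Set<String>> knownVersionsByLid) {
    LidVid parsed = parse(reference);
    Set<String> versions = knownVersionsByLid.get(parsed.lid);
    if (versions == null) {
      return false;
    }
    return parsed.vid == null || versions.contains(parsed.vid);
  }
}
```

Because the lookup keys on the text of the reference alone, it does not matter whether the Internal_Reference sits directly under Reference_List or inside a discipline dictionary class such as sb:Raw_Data_Product.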

@jordanpadams
Member Author

it should... would you mind creating a ticket for this with the example you are referring to?

@jordanpadams
Member Author

@tbarnes4 ☝️

@mace-space

It looks like the issue I described above is still happening. @jordanpadams shall I open a new issue?

@jordanpadams
Member Author

@mace-space yes please. If you wouldn't mind providing a specific data set example, that would be really helpful. thanks!
