Validate incorrectly assumes that the first object has an object length == file_size
#781
This is not a problem with validate itself but with what you want validate to do, and it completely contradicts #684. The label does not define object_length. For #684 it was determined that when object_length is missing, the file size should be used. Because object_length is missing, there is no way to tell whether table A in the file overlaps table B. validate thinks that it does because #684 said to use file_size when object_length is missing, which forces the assumption that there is only one table. The question is: given that many tables exist in the file, must they use object_length? If not, do we assume it just works out? Anyway, the user in #684 wanted validate to use file_size when object_length was missing. This user does not want us doing that. You are the arbiter.
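The conflict described above can be illustrated with a small sketch. This is not validate's actual code: the `TableObject` class and the `effective_length` and `find_overlaps` functions are hypothetical names, and the offsets and sizes are invented for illustration. It only shows how substituting file_size for a missing object_length makes a second table appear to overlap the first.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TableObject:
    name: str
    offset: int                   # byte offset declared in the label
    object_length: Optional[int]  # None when the label omits it


def effective_length(obj: TableObject, file_size: int) -> int:
    # Fallback discussed for #684: a missing object_length becomes file_size.
    return obj.object_length if obj.object_length is not None else file_size


def find_overlaps(objs: List[TableObject], file_size: int) -> List[Tuple[str, str]]:
    # Compare each object with the next one in offset order.
    ordered = sorted(objs, key=lambda o: o.offset)
    overlaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev.offset + effective_length(prev, file_size) > nxt.offset:
            overlaps.append((prev.name, nxt.name))
    return overlaps


# Two tables without object_length in a 1000-byte file (made-up numbers).
tables = [TableObject("A", 0, None), TableObject("B", 400, None)]
print(find_overlaps(tables, file_size=1000))
```

With explicit lengths (for example 400 and 600 bytes) the same check reports no overlap, which is why the fallback, not the data, produces the error in this scenario.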
Unfortunately, in this case, I think we have to just assume the tables do not overlap and hope one of the other checks we are doing for the content triggers some other error. I think we can still use @rchenatjpl if this is in a peer review, can we recommend
@al-niessner Would it be possible for us to "guess" the So from the example, for the first table, we read 1 record, and then calculate the bytes from the expected offset (0) to where we encounter the end of that 1 record (?) and that is our guess for the
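The "guess" proposed here could be sketched roughly as follows. This is an assumption-laden illustration, not validate's implementation: `estimate_object_length` is a hypothetical helper, the record delimiter is assumed to be carriage return plus line feed, and quoted fields that contain the delimiter are not handled. It measures only where the declared number of records ends, so it cannot detect padding between tables.

```python
from typing import Optional


def estimate_object_length(data: bytes, offset: int, records: int,
                           record_delimiter: bytes = b"\r\n") -> Optional[int]:
    """Estimate a table's length by scanning `records` delimited records.

    Returns the number of bytes from `offset` to the end of the last record
    (delimiter included), or None if the file ends before that many records.
    """
    pos = offset
    for _ in range(records):
        end = data.find(record_delimiter, pos)
        if end == -1:
            return None  # fewer records than the label declares
        pos = end + len(record_delimiter)
    return pos - offset


# Invented content: a 2-record table followed by a 1-record table.
sample = b"a,b,1\r\nc,d,22\r\nx,y\r\n"
first = estimate_object_length(sample, offset=0, records=2)
second = estimate_object_length(sample, offset=first, records=1)
print(first, second)
```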
If by "guess" you mean estimate, then maybe. In the case of the data provided for this ticket, no. The file consists of multiple comma-delimited tables where nothing is known about the field sizes, only their position and that they are of type string or ASCII integer. Obviously an ASCII integer is constrained to a maximum size, but it can also be as small as 1 byte. Here is an example of what is given:
From validate's perspective, the interesting part is that for any offset to ever work, one of two things has to be true. One, all sizes are fixed and therefore could be provided in a fixed rather than a delimited table. Two, sizes are variable but constrained by an upper maximum to which they must always be padded. If a fixed table were used, then all checks would simply have worked. If a variable delimited table is used, then validate will verify the padding indirectly when the file content is checked against the label. It seems that if a file area is given with delimited tables that do not define an object length, then validate should skip this check and let the content checker find problems. PS: for completeness, if padded, then the object length is known a priori and could easily be defined in the label.
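The suggestion to skip the check could look something like the sketch below. It is only an illustration under stated assumptions: `DataObject` and `length_for_overlap_check` are hypothetical names, and keeping the file_size fallback from #684 for every non-delimited object is an assumption made here, not something the thread settles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataObject:
    kind: str                     # e.g. "Table_Delimited", "Table_Character"
    offset: int
    object_length: Optional[int]  # None when the label omits it


def length_for_overlap_check(obj: DataObject, file_size: int) -> Optional[int]:
    """Return the length to use in the overlap check, or None to skip it."""
    if obj.object_length is not None:
        return obj.object_length
    if obj.kind == "Table_Delimited":
        # No declared length: skip and leave problems to the content checker.
        return None
    # Assumed here: other objects keep the #684 fallback to file_size.
    return file_size


print(length_for_overlap_check(DataObject("Table_Delimited", 0, None), 1000))
```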
Sorry, I just now noticed I got tagged. So for me or whomever: recommend object_length for Table_Delimited.
Checked for duplicates
Yes - I've already checked
Describe the bug
When I ran validate against a product with numerous objects and a file_size defined, it seems that validate expects the first object to be that length.
Expected behavior
I expected file_size to mean all objects combined.
To Reproduce
See #684 (comment)
Environment Info
Mac OSx
Version of Software Used
gov.nasa.pds:validate
Version 3.4.0-SNAPSHOT
Release Date: 2023-12-04 07:19:11
Test Data / Additional context
See #684 (comment)
Related requirements
No response
Engineering Details
No response
I&T
TestRail Test ID: T8681196