-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
registry-api does not respect VID when a LIDVID is used as an id, instead returns latest version #234
Comments
This appears to be caused by the default value of URIParameters.selector which is always My gut feeling is that the value of |
arose from registry-api #223 |
No, this VID problem is a sorting problem as stated in yet another ticket that I cannot find but I think @alexdunnjpl implemented a more complex set of classes to be able to sort to fix it. It has nothing to do with URIParameters. |
@al-niessner This is probably the ticket in question (hidden because it's closed) That was a sorting issue, but if I understand what I still need to return to this issue for further investigation but it's not that I think the default is wrong, more that as far as I could see it I just took a look, and what I'm seeing is ostensibly that in Based on what I'm seeing, it seems reasonable to rename |
So after the most-recent batch of changes (see branch That remains true having ensured that provenance is populated per the following:
Given {
"bool" : {
"must" : [
{
"term" : {
"lid" : {
"value" : "urn:nasa:pds:insight_rad:data_derived",
"boost" : 1.0
}
}
},
{
"terms" : {
"ops:Tracking_Meta/ops:archive_status" : [
"archived",
"certified"
],
"boost" : 1.0
}
}
],
"must_not" : [
{
"exists" : {
"field" : "ops:Provenance/ops:superseded_by",
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
} |
Ohh, I see it now...
@al-niessner which is correct? I'm guessing the former? EDIT: Now I'm totally confused - I thought the latter must be the result of a typo in bulk.append (json.dumps({'doc':{'ops:Provenance/ops:superseded_by':supersede}})) and a global string search in |
Checking the OpenSearch document itself, the string is correct, so I have no idea why the API call for v7.0 is returning it with a period instead of a forward-slash (or, in redux, why the query is returning both v7.0 and v14.0)
|
Ah, / vs . is some good stuff. So, in opensearch ops:Provenance.ops:subperseded means something that either or both @jordanpadams or @tloubrieu-jpl do not like. Therefore we use / in the database. However, the general user community does not / and they want to use . which caused the creation and use of this utility. Therefore, . is correct in the URI while / is correct in the database itself. |
Okay, I awoke in the middle of the night and realized that my glib answer that it was a sorting problems is not correct. You are more or less on the right track but because I had coffee in the middle of the night, drove into the office, and now short tempered, jittery, and foggy I am just going to outline it pedantically so that I do not get wrong yet again. So the URI request comes in at registry,.controller (skipping all the empty prelude package names) and is used to build the URIParameters, which still have nothing to do with the problem. Note that in URIParameters the input value is considered an identifier. It the URIParameters are then first processed to determine if the identifier is a lid or lidvid or gibberish or whatever have you. Since URIParameters is hidden by a "get only" interface registry.UserContext, this is the only place that the lidvid actually gets set. However, the work is done in two places. First, the way to look up the lidvid is done in URIParameters.setLidVid() where it tries to resolve it as typed. Second, we are now in registry.model.LidVidUtils.resolve(). Since URIParameters is checking as typed and not latest, it returns either the latest if a lid is given or blindly accepts the identifier as a lidvid. This get us back to the transmuter where we have to verify our class type and lidvid type. In other words, when the URL is of the form /classes/bundles/:lidvid then we have to check that the given lidvid is a bundle. Since everything is a product, including bundles and collections, then we do not have to do this check when the URL is of the form /products/:lidvid as is this case. That takes us handler for this type of request which should be registry.controller.Standard. The handler for the endpoint is particularly thin and gets you to the beast registry.model.RequestAndResponseContext. There are several factory methods that allow for values to be defaulted to make the constructor happy. The registry.model.Standard.handler() uses the factory method . The factory methods move the ignore value logic from constructing classes to this class where it is needed. Still, it is the constructor that does all of the work. It is during the construction of this class, the user provided lidvid is further refined with output group contraints being the same as the resolution group constraints. They really only differ when you want parents, grandparents, children, or grandchildren. In these cases the output constraints like bundle is different than the resolution constraints (given lidvid group). The registry.controller.EndPointHander registry.controller.Member uses the other factory method for this reason. Once again we are back at registry.model.LidVidUtils,resolve() with outPreset and resPreset being the same. Historically, the group constraints where called preset. Thus the horrible naming. Anyway, resolving the lidvid is done again but latest or as typed but is not resolved correct. An error in translation maybe at this commit 8fb68e9 might have introduced the bug. No matter where it was introduced, the fix would be to change
As a side note: One last note, if VID 14.0 is not archived or certified you will never be able to reach it via the registry-api. The dream is to have roles (logins) where it is not so hard coded as it is today. Not likely the case since superseded_by is set for 7.0 unless provenance.py is now broken too. |
@al-niessner @jordanpadams so given
is the intention that registry API should
|
yes.
I think we should just throw an error if we have 2+ versions of a LID with no |
@jordanpadams should I modify the error message (to paraphrase) "Error: More than one hit where there should only be one" to add "Is provenance metadata present and up-to-date?" This seem likely to be a common dev error for a short while (because no-one's used to running the provenance script and it's not in the docker deployment yet afaik) so the extra information should save some time. |
Tradition has it that it should end up as a 50x error that tells the user something useful to give to us that we can then go find in the log file. Telling them about hits is not going to help them. Pretend that they do no know nor want to know that oracle - no - that elasticsearch - no - that opensearch is backing registry-api. The translation of exceptions to http errors can be extended if you do not see something you can abuse now. |
Okay, I'll modify it to be more informative (and unique). |
@alexdunnjpl , @al-niessner , @jordanpadams Should we manually, in the API java code, select the 'latest' VID when there are multiple lid without superseded by and add a WARNING or ERROR in the log, so that the user is not annoyed by that message which is for PDS EN operators/admins. Another comment, I am wondering why superseded by is at stake when we just want to get a lidvid ? The user should be able to get any superseded version if he is requesting it, right ? |
@tloubrieu-jpl re the first point, I think that's what Jordan's getting at with "we are better off providing no data to the user instead of bad data." (though per my understanding, it would be correct data for 99.9% of products which don't have multi-LID provenance chains). Your second point is correct and matches the current behaviour, it's only relevant to the extent that I was manually testing queries by LID/partial-LIDVID after making some extensive changes that affect the whole API (see #236). |
All queries must default to latest according to @jordanpadams like q= etc unless specifically told to find all of them (rarity supposedly). So, all searches are amended with a filter to only look at those that do not contain superseded_by. Tada, always the latest. Occasionally, need to find all or specific non-latest lidvids so need to turn it off in very few cases. Just weeding those out now (second point). |
or unless told to find a single, specific LIDVID (like some:product::7.0), right? |
Unless somebody sends up a flare, nobody will notice the bad database until performance fails because nobody sees the errors in the log like the ones there now. Give the user an error so that they send up a flare and good message in the log a developer can find and use to understand and debug the problem, Yes, specific lidvid is one of the weeds. |
+1 |
@tloubrieu-jpl @al-niessner ☝️ I am thinking the same thing as Al. Let's send up a flare and figure out what is going wrong before it becomes pervasive. |
With provenance metadata ironed out, LID and partial-LIDVID requests work, but requesting a specific, superseded LIDVID returns zero results. Error shows that the superseded_by check is still applied when it should not be.
That constraint is the direct result of checking the context (i.e. its selector) and finding That value is populated from the So @al-niessner as far as I can see* the missing element here is still that URIParameters.setLidVid() doesn't set public URIParameters setProductIdentifier(ControlContext control) throws IOException, LidVidNotFoundException
{
this.productIdentifier = LidVidUtils.resolve(this.getIdentifier(), ProductVersionSelector.SPECIFIC,
control, RequestBuildContextFactory.empty());
if (this.productIdentifier instanceof PdsLidVid) {
this.selector = ProductVersionSelector.SPECIFIC;
}
return this;
} * I still need to re-re-read a couple of paragraphs from your previous reply to check that this isn't addressed. |
@al-niessner just to confirm I didn't misunderstand your earlier email - currently, the provenance script doesn't yet support mixed-LID provenance histories (so I don't need to test whether I've broken that), correct? |
@alexdunnjpl - correct |
Not correct at all. To determine if it is to be added or not it is done, or was done, in one and only one place: registry-api/service/src/main/java/gov/nasa/pds/api/registry/search/SearchRequestFactory.java Lines 91 to 95 in 7433c9c
Note that context.justLatest() is NOT of UserContext but of RequestBuildContext. If you have the call stack or if we roughly trace it to LidVidUtils it is caused because the justLatest flag is being sent as true. I do not see what branch this is being pushed on so I cannot see your current code. Point is, it is NOT from the user input. Push your code on a branch so I can look at LidVidUtils and I will tell you what is wrong still. Did you flesh out resolve()? |
This is still true, and I can't see an inconsistency with what I asserted in the two paragraphs following the OpenSearch query dump (with all the code links). My latest working state is on branch The selector isn't set in EDIT: I see what you're saying now about the classes and I might have messed up the source of the selector value - will confirm now. |
Your problem is here: registry-api/service/src/main/java/gov/nasa/pds/api/registry/model/identifiers/LidVidUtils.java Lines 128 to 130 in 22a0970
You need to check that what you have is not already a valid lidvid (see previous post with some maybe working code suggestion) that says if what I got is a lidvid and it exists as is in all of history then use it otherwise find latest. If neither it nor its lid version exists then throw a lidvid non-existent exception (resolve() already does). |
If a user has specified a non-latest full VID, why is execution making it to that case instead of Still gotta let that percolate, but @al-niessner @jordanpadams is the intended response to a request for A::2.0, given versions A::1.0, A::3.0, a 404 or an autoresolution to A::3.0? I would think that any time a user specifies a full LIDVID, if it's not there it should tell the user it's not there, otherwise you risk giving them something they didn't want, but that conflicts with "if what I got is a lidvid and it exists as is in all of history then use it otherwise find latest". |
+1 for 404 no such lidvid for a partial/bad VID because it is just gibberish |
So we want to remove that fuzzy lidvi support too? I thought that was an intentional choice (per "We don't punish users for [that]") |
It was an intentional choice but we should punish them. It was more of a slop in the code that made life easy and less a good design/implementation practice. As long as we are clear that the given lidvi could not be fully translated as is then they can correct it. My understanding from @jordanpadams is that almost all user will give us a lid expecting the latest. The database is almost entirely VID 1.0 (millions of just 1.0 and couple thousand not 1.0) so the slop lived out of convenience. I am voting to kill the slop but it is not a democracy. |
👍
right. we shouldn't give the user back something they didn't ask for. if that returns 404, then they can try asking for just the LID. and then if they want all versions, then they can ask for the LID/all. If they really want to do fuzzier lidvid support (e.g. using the intent of those API endpoints is to really be a lid/lidvid resolver for the overall system. I don't think it really needs to support too much in terms of fuzziness at the moment. if we start seeing use cases where that is needed, we can figure it out. |
Current state: need to fix local dev deployment to add superseded_by to OpenSearch index, then re-test, then if it's still broken loop back to looking at a bugfix like in #234 (comment) |
@alexdunnjpl that may have something to do with this? NASA-PDS/registry#140 |
Tested - Pass (as of 31b6919):
Still need to manually test |
Ready for merge, since classes endpoints are either invalid or separately-bugged, per #239 |
Add additional lidvid classes fixes #234
Hi @alexdunnjpl , can you add an estimate for this bug fix ? Thanks |
@tloubrieu-jpl done, though I'm curious what value it provides after the fact? |
🐛 Describe the bug
Given multiple ingested versions of a product, requesting
<someclass>/<someLIDVID>
returns the document for the most-recent version of the product matching the LID, irrespective of which VID is specified📜 To Reproduce
Steps to reproduce the behavior:
int-registry-batch-loader
profile (which includesurn:nasa:pds:insight_rad:data_derived::7.0
harvest
against the latest version ofurn:nasa:pds:insight_rad
bundleregistry-mgr
against the LIDVID of the bundle ordata_derived
collectioncurl --get 'http://localhost:8080/collections/urn:nasa:pds:insight_rad:data_derived::7.0' | json_pp
, observe v14.0 metadata returned🕵️ Expected behavior
Expect return of requested version, not latest
📚 Version of Software Used
Docker image
nasapds/registry-api-service:1.1.11
registry-api version which includes commit 252bfbe and its dependencies
The text was updated successfully, but these errors were encountered: