-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JCL-417: Validate LDP containment data in SolidContainer::getResources #571
Conversation
}).collect(Collectors.collectingAndThen(Collectors.toSet(), Collections::unmodifiableSet)); | ||
final String container = getIdentifier().toString(); | ||
// As defined by the Solid Protocol, containers always end with a slash. | ||
if (container.endsWith("/")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could have a dedicated error thrown here and upon creation too, if the URI does not end in a slash.
And while this is a precondition, I find it in general strange to be checked here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not be inclined to throw an exception here. Generally, I'd prefer the client libraries to be generous in the data they accept. In this particular case, if the resource is a SolidContainer, then the getResources() method must follow the rules of Solid, but if the resource isn't strictly a SolidContainer, it shouldn't cause the consuming application to throw an exception (and thus the getResources method would just return an empty collection).
This condition is specifically to prevent against such cases as this:
A resource is https://example.com/resource
and it includes the triple:
<> ldp:contains <https://example.com/resourcechild> .
That triple should not cause the getResources method to include the sibling resource from being included in that list (since it doesn't conform to the Solid definition of containment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to note is that these checks are put in place to prevent a malicious or misbehaving server from causing an application to perform unexpected operations on non-contained resources. For well-behaved servers, this will never be an issue and there is no need to add these constraints, but we can't expect all servers to be well-behaved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, your last comment was the reason for which we decided to throw an error in the JS implementation of this feature: it may be desirable to let the consuming app know that there is something wrong with the target resource so that it takes action and removes the offending triple, so that other clients who don't implement this check don't perform the unintended operations we are preventing against here. I would prefer the DX to be convergent between JS and Java, so I'm happy to filter out the resources that do not follow slash semantics from the children list, but isn't there value in communicating this inconsistency to the app/user, especially when they can act on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My rationale is basically the "robustness principle" https://en.wikipedia.org/wiki/Robustness_principle
Be conservative in what you do and liberal in what you accept from others.
There may be cases where an RDFSource is used to model an example Container's RDF or where an app is intending to write certain triples or where a server is simply doing things incorrectly. There will be a lot of "slightly off" data in the wild, and I think it is important that client libraries don't just blow up in those cases. As long as the data is syntactically correct, the semantics may vary from domain to domain, and it would be unfortunate if we imposed constraints that prevented certain interesting use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a compromise (actually, I think it's a better solution overall)
-
Leave the current get resources() method as-is
-
Also implement the validate() method and have it return an invalid result if there are illegal containment triples or if it doesn't end in a slash, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see both sides of the coin but I like the validate() option a lot! Validate should by default already have this check in then.
And we need to highlight it in the docs then I would be fine with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a distinction between a read method that just ignores triples we deem invalid and a validation method where the intent is to be aware of any issues sounds like a good pattern. I'll think about other cases when it would be applicable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timea-solid and @NSeydoux I'll wait on merging this until you have a chance to see my latest additions to this in d3fd371
fde54c7
to
d3fd371
Compare
child.getTypes().forEach(builder::type); | ||
return new SolidResourceReference(URI.create(child.getIRIString()), builder.build()); | ||
}).collect(Collectors.collectingAndThen(Collectors.toSet(), Collections::unmodifiableSet)); | ||
final String container = normalize(getIdentifier()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe now you can call validate() here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not validate()
directly, as that performs a number of checks that are irrelevant here. You will see that in https://github.com/inrupt/solid-client-java/pull/571/files#diff-2c9b192110a924fa828ba14eaa2e7359afa8b3b811c8ad12fd6563e9161d7ba0R90, we re-use some of the same code as validate in this pipeline
At present, the
SolidContainer::getResources
method includes references to every resource that fits the pattern:Where
?container
is the target container resource. This means that a non-compliant server could return a set of (apparent) child resources that are not really part of the Solid containment hierarchy. For applications that are built with the Solid specification in mind, it could lead to unexpected behavior.