Implement a new SIB component #32

Open
6 of 10 tasks
balessan opened this issue Jan 5, 2024 · 6 comments

Comments

@balessan
Collaborator

balessan commented Jan 5, 2024

As proposed in #3, we are moving forward with the implementation of a new component that supports our chosen indexing strategy (based on the agent scenario, to be validated after syncing with the INRIA team).

Features:

  • Being able to specify a list of fields (properties)
  • Being able to specify a list of sources
  • Being able to choose a strategy (will trigger a specific engine)
  • Execute the search and return the results as a BindingStream
  • Decide how to handle the results

Attribute dependencies:

  • Fields (the filtered properties): the expected value type depends on the field, e.g. a URI for skills, a string for first_name, etc.
  • Sources = stringified list of data-src values as URIs, e.g. ['uri.com', 'uri2.com']
  • Strategy = documented string, which allows us to generate the corresponding SPARQL query and pick the proper engine
  • Optionally: pass "comunica" as a backend attribute to be able to switch between search backends
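As a rough illustration of how these attributes could be read, here is a minimal sketch. The `SearchConfig` type, the function names, and the default values (`"distributed"` for the strategy, `"comunica"` for the backend) are assumptions made for the example and are not part of sib-core.

```typescript
// Hypothetical shape for the parsed attributes; not part of sib-core.
type Strategy = "distributed" | "centralized";

interface SearchConfig {
  fields: string[];
  sources: string[];
  strategy: Strategy;
  backend: string;
}

// Parse a comma-separated attribute such as "skills, first_name, city".
function parseFields(raw: string): string[] {
  return raw
    .split(",")
    .map((field) => field.trim())
    .filter((field) => field.length > 0);
}

// Parse a stringified list such as "['uri.com', 'uri2.com']".
// Single quotes are not valid JSON, so they are normalised first
// (assumes the URIs themselves contain no quote characters).
function parseSources(raw: string): string[] {
  const parsed: unknown = JSON.parse(raw.replace(/'/g, '"'));
  if (!Array.isArray(parsed) || !parsed.every((s) => typeof s === "string")) {
    throw new Error("data-src must be a list of URI strings");
  }
  return parsed as string[];
}

function parseStrategy(raw: string): Strategy {
  if (raw === "distributed" || raw === "centralized") return raw;
  throw new Error(`Unknown strategy: ${raw}`);
}

// Defaults below are assumptions for the sketch, not documented behaviour.
function parseSearchConfig(attrs: Record<string, string | undefined>): SearchConfig {
  return {
    fields: parseFields(attrs["fields"] ?? ""),
    sources: parseSources(attrs["data-src"] ?? "[]"),
    strategy: parseStrategy(attrs["strategy"] ?? "distributed"),
    backend: attrs["backend"] ?? "comunica",
  };
}
```

Rejecting unknown strategy values early keeps the later engine selection simple, since it only ever sees one of the documented strings.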

Requirements/dependencies:

  • Data:
    • Available indexes on the PODs side conforming to the PublicTypeIndex or the new version proposed by @lecoqlibre
    • Two options:
      • one external indexing agent (a later step, as it requires development)
      • custom views on one branch of djangoldp to automatically provide indexes for users and skills (~1d of work)
  • Libraries:
    • @comunica/query-sparql-link-traversal

Technical specifications:

<!-- strategy is either "distributed" or "centralized" -->
<solid-traversal-search
   fields="skills, first_name, city"
   strategy="distributed"
   data-src="['https://first-server.com', 'https://second-server.com']"
/>

Implementation steps:

Not sure yet about the impact on the store, if any, or whether it does everything in parallel.

@balessan
Collaborator Author

balessan commented Jan 5, 2024

For the demo we also need to generate data including:

  • Skills
  • Users having those skills
  • Associated indexes (see with @lecoqlibre for the shape of those indexes)

On 12 demonstration instances.

@balessan
Collaborator Author

balessan commented Jan 9, 2024

First steps for @lecoqlibre:

  • Clone: https://git.startinblox.com/framework/sib-core
  • Check out the beta branch
  • Create a new branch from there: feature/solid-traversal-search
  • Run npm install
  • Run npm run watch in one terminal
  • Run npm run serve in another
  • You should then be able to access http://localhost:3000/
  • Create a new example file: examples/solid-traversal-search.html
  • Implement the component (the hard part)

@balessan
Collaborator Author

balessan commented Jan 17, 2024

Next step for the component:

Switch from our stand-alone solid-traversal-search component to something directly integrated into the <solid-display> component:

  • Add a dedicated attribute like filtered-strategy=traversal, or a specific value for filtered-on, as in FilterMixin
  • Modify the FilterMixin to add support for a new strategy (beyond the default client search and the isFilteredOnServer check) in the attached method
  • Add a specific way to refresh the content of the resources list and pass it properly to the other listPostProcessors
  • This will probably have an impact on the solid-display.ts and/or listMixin renderDOM methods, which are responsible for building the results DOM from an array of resources, or maybe we should just override them directly in the FilterMixin?
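To make the idea of a third filtering strategy more concrete, here is a minimal sketch of how the choice could be resolved and then handed over to the next post-processor. Every name here (`FilterAttributes`, `resolveFilterStrategy`, `applyFilter`, the `"server"` value) is hypothetical and does not mirror the actual FilterMixin code; it only illustrates the dispatch.

```typescript
// Hypothetical names throughout; this does not mirror the real FilterMixin.
type FilterStrategy = "client" | "server" | "traversal";

interface FilterAttributes {
  filteredOn?: string; // assumed value "server" for server-side filtering
  filteredStrategy?: string; // the proposed filtered-strategy attribute
}

// Decide which strategy applies, defaulting to the client-side search.
function resolveFilterStrategy(attrs: FilterAttributes): FilterStrategy {
  if (attrs.filteredStrategy === "traversal" || attrs.filteredOn === "traversal") {
    return "traversal";
  }
  if (attrs.filteredOn === "server") return "server";
  return "client";
}

interface Resource {
  "@id": string;
}

// Replace the resource list with the traversal results when that strategy
// is selected, then pass the list on to the next post-processor.
async function applyFilter(
  resources: Resource[],
  attrs: FilterAttributes,
  traversalSearch: () => Promise<Resource[]>,
  nextProcessor: (resources: Resource[]) => Promise<void>,
): Promise<void> {
  const strategy = resolveFilterStrategy(attrs);
  // The client and server branches are left as pass-through in this sketch.
  const result = strategy === "traversal" ? await traversalSearch() : resources;
  await nextProcessor(result);
}
```

Keeping the strategy resolution in one small function would let both options from the list above (a dedicated attribute or a new filtered-on value) coexist while the final attribute name is being decided.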

@lecoqlibre
Collaborator

To be more generic, couldn't we use SPARQL directly instead?

The component would take/produce a SPARQL query like SELECT ?user WHERE { ?user a sib:User; sib:hasSkill <skill>. } or SELECT ?user WHERE { ?user a sib:User; sib:hasSkill <skill>; sib:liveIn "Paris". }.

Then a query planner would use the appropriate indexes. It would parse the query, look up the available indexes, select an appropriate strategy, and run it to answer the query.
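The planning step described above can be sketched roughly as follows. Parsing is assumed to have already happened, so the planner receives plain triple patterns. The `IndexDescription` shape (one predicate per index) and all function and type names are assumptions made for the example.

```typescript
// All names below are hypothetical; they only illustrate the planning steps.
interface TriplePattern {
  subject: string;
  predicate: string;
  object: string;
}

// Assumed description of an index: the predicate it can answer lookups for.
interface IndexDescription {
  url: string;
  predicate: string;
}

type Step =
  | { kind: "index"; pattern: TriplePattern; indexUrl: string }
  | { kind: "traversal"; pattern: TriplePattern };

// For each triple pattern, use a matching index when one exists and
// fall back to plain link traversal otherwise.
function planQuery(patterns: TriplePattern[], indexes: IndexDescription[]): Step[] {
  return patterns.map((pattern): Step => {
    const index = indexes.find((candidate) => candidate.predicate === pattern.predicate);
    return index
      ? { kind: "index", pattern, indexUrl: index.url }
      : { kind: "traversal", pattern };
  });
}

// Usage: two patterns are covered by an index, the third is not.
const plan = planQuery(
  [
    { subject: "?user", predicate: "rdf:type", object: "sib:User" },
    { subject: "?user", predicate: "sib:hasSkill", object: "<skill>" },
    { subject: "?user", predicate: "sib:liveIn", object: '"Paris"' },
  ],
  [
    { url: "https://example.org/typeIndex", predicate: "rdf:type" },
    { url: "https://example.org/skillIndex", predicate: "sib:hasSkill" },
  ],
);
```

A real planner would also need to order the steps and join their results; this sketch only covers the index selection.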

We should provide some indexes to the components so they know where to search, for example:

<!-- It will show a list of users having the skill RDF -->
<solid-display
   query="SELECT ?user WHERE { ?user a sib:User; sib:hasSkill <https://example.org/skillRDF>. }"
   indexes="https://example.org/typeIndex, https://example.org/skillIndex"
/>
<!-- 
It will display a search form to find users by skill, name or city.
Once the different parameters are selected, the SPARQL query will be
recomputed before being passed to the query planner for execution.
-->
<solid-search
   fields="skills, first_name, city"
   indexes="https://example.org/typeIndex, https://example.org/skillIndex, https://example.org/cityIndex"
/>

The provided indexes should be self-described so the query planner will know how to use them. For instance, it will know that the https://example.org/typeIndex is a solid:TypeIndex because of the triple <> a solid:TypeIndex.
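As a small illustration of this self-description check, the sketch below looks for the `<> a solid:TypeIndex` triple among already-parsed triples. The `Triple` shape and the function names are assumptions for the example; a real implementation would rely on an RDF library to fetch and parse the index document.

```typescript
// Minimal triple shape for illustration only.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const SOLID_TYPE_INDEX = "http://www.w3.org/ns/solid/terms#TypeIndex";

// Return the types an index document declares about itself,
// i.e. the objects of the `<> a ...` triples.
function describeIndex(indexUrl: string, triples: Triple[]): string[] {
  return triples
    .filter((t) => t.subject === indexUrl && t.predicate === RDF_TYPE)
    .map((t) => t.object);
}

// True when the document declares itself as a solid:TypeIndex.
function isTypeIndex(indexUrl: string, triples: Triple[]): boolean {
  return describeIndex(indexUrl, triples).some((type) => type === SOLID_TYPE_INDEX);
}
```

Here `<>` is taken to resolve to the URL of the index document itself, which is why the subject is compared against `indexUrl`.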

@balessan
Collaborator Author

balessan commented Feb 6, 2024

> To be more generic, couldn't we use SPARQL directly instead?

@lecoqlibre No, because the intended audience of the framework is considered to be at the HTML integrator level: they do not know anything about RDF or SPARQL, they just know HTML attributes.

We can open up advanced features targeting a developer audience, but that is not supposed to be the primary target.

@lecoqlibre
Collaborator

> No, because the intended audience of the framework is considered to be at the HTML integrator level: they do not know anything about RDF or SPARQL, they just know HTML attributes.

OK, right @balessan, it makes sense, we should be nice to integrators :D But this is not a problem: the component could create the SPARQL query behind the scenes and pass it to the query planner internally. The HTML integrators will still use fields.
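To show what "behind the scenes" could look like, here is a minimal sketch that turns field values into a query of the same shape as the examples earlier in this thread. The `FIELD_PREDICATES` mapping and the function names are hypothetical: only `sib:hasSkill` and `sib:liveIn` appear in this discussion, and `sib:firstName` is a placeholder. As in the examples above, the `sib:` prefix declaration is left out.

```typescript
// Hypothetical mapping from field names to predicates. Only sib:hasSkill and
// sib:liveIn appear in this thread; sib:firstName is a placeholder.
const FIELD_PREDICATES: Record<string, { predicate: string; isUri: boolean }> = {
  skills: { predicate: "sib:hasSkill", isUri: true },
  city: { predicate: "sib:liveIn", isUri: false },
  first_name: { predicate: "sib:firstName", isUri: false },
};

// Render a value as a URI reference or as an escaped string literal.
// Assumes URI values contain no angle brackets or whitespace.
function toTerm(value: string, isUri: boolean): string {
  if (isUri) return `<${value}>`;
  return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

// Build a SELECT query listing the users that match every given field value.
function buildUserQuery(values: Record<string, string>): string {
  const clauses = ["?user a sib:User"];
  for (const field of Object.keys(values)) {
    const mapping = FIELD_PREDICATES[field];
    const value = values[field];
    if (mapping === undefined || value === undefined) {
      throw new Error(`Unsupported field: ${field}`);
    }
    clauses.push(`${mapping.predicate} ${toTerm(value, mapping.isUri)}`);
  }
  return `SELECT ?user WHERE { ${clauses.join("; ")}. }`;
}
```

With this kind of helper, integrators keep writing `fields="skills, first_name, city"` while the generated query string is what gets passed to the planner.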

I think we should use SPARQL internally. I could use sparqljs to parse the query and write the query planning logic on top of it. This way, the query planner will be generic and work for many other use cases.
