Skip to content

Latest commit

 

History

History
115 lines (86 loc) · 8.62 KB

M11.md

File metadata and controls

115 lines (86 loc) · 8.62 KB

Milestone 11

Summary of Activity from March 2022.

A long time has passed which requires some explanation.

Somewhat off topic context

From March 20th onwards I was given a part time consulting position at Imec.be to work on semantic web tooling for big data. They contacted me for my work on banana-rdf - the older version, not the Scala3 version developed as part of this project - and are using Scala and Akka in their big data spaces. This sounded like there could be a lot of promising ways to integrate the work done on the solid-control project. I started off working full time until August to get my finances back in order, and there was a lot of technology to learn too.

After the summer I integrated with the VSDS: Vlaamsee Sensor Data Space project, with members from the IDLab and Ghent University (such as Pierter Colpaert). The VSDS project (and others projects imec is involved in) need to Authorization for Distributed Data Spaces to work and are also interested in the Solid architecture.

This allowed me starting September to get back to the Solid-Control project, by working part time for IMEC as allowed by the contract. This included the VSDS project and consulting for IMEC on Solid and access control. Having the nlnet Solid Control project has been very helpful, give me the freedom to develop the right architecture.

On Topic

The previous section was not quite off topic because this milestone is about getting Solid-Control to be deployment ready. And for that it helps to have people interested in deploying it and to have a number of use cases that can be implemented and tested.

One part of the VSDS project is the publication of sensor data as LDES: Linked Data Event Stream. This requires clients to be able to follow linked data and fetch updates of resoures. If the stream is access controlled, and potentially published across servers, this means the client also needs to authenticate efficiently to retrieve the resources. This gives us data and access control together in a situation that needs to be efficient at every level, which is a perfect use case to test the solid-control architecture.

Work

Client Authentication for 1 resource

The first stage is to write a client that can request one resource on a server. Such a client requires a basic wallet that can take a private key credential, and some basic logic to read and search through access control rules on receiving a 401 in order to select the key that can satisfy the WAC rules. Finally it has to sign the request and send it.

This is part of Solid Ctrl App PR1. One can see it in use in the small script Fetch.sc whose main line once everything is setup is

ioStr(uri"http://localhost:8080/protected/README").unsafeRunSync()

When run the client follows the full HTTP Sig Sequence Diagram:

Client                          keyid                            Resource
App                          Document                            Server
|                                |                                   |
|-(1) request URL -------------------------------------------------->|
|<-----------(1a) 40x + WWW-Authenticate: HttpSig + (Link) to ACR  --|
|                                |                                   |
|                                |                                   |
|-(2) request Access Control Resource (ACR)------------------------->|
|<-------------------------------------(2a) 200 with ACR content ----|
| (choose key)                   |                                   |
|                                |                                   |
|-(3)- sign headers+keyid------------------------------------------->|
|                                |                       initial auth|
|                                |                       verification|
|                                |                                   |
|                                |<-----------------(4) GET keyid----|
|                                |-(4a) return keyid doc------------>|
|                                                       - verify sig |
|                                                       - verify ACL |
|                                                                    |
|<--------------------------------------------(3a) answer resource---|

Building the client using the http4s library, uncovered a problem with the httpSig implementation, whose cause could be traced back to a simplification in the abstraction used there. This was fixed in Track F[_] in Http.Message subclasses.

That update required updating the Reactive-Solid Server too of course. It also uncoved a few problems that were also fixed in PR 22.

A client needs to read rdf sent to it so this required adding scala 3 IO to banana rdf. While doing that, I also added support for relative URIs, relative Triples, and relative Graphs, since that will be needed when posting graphs to remote servers such as ACLs, or LDES content...

More advanced clients

Next in order to have clients follow more interesting linked data I decided to produce LDES streams as RDF files that can be published by a Solid server in Scala. This stemmed from work over the summer at IMEC analysing some large CSV files describing City Flows data describing how many people were in certain parts of Belgium. The summer work used RML for describing the structure of CSV data and transforming it. But the tools to transform it to LDES files were not so obvious.

To do this correctly I needed to add back the banana-rdf DSL library which we called Diesel to the scala3 branch in PR 3: Add diesel support. This allows us to write RDF like this:

   def obsMtoGr(obsm: Obs[Measurement]): PointedSubjRGraph[R] = (  
         rURI("#"+obsm.id).a(sosa.Observation) 
            -- sosa.hasSimpleResult ->- ops.Literal(
               obsm.values.count.map(_.toString).getOrElse("0"), 
               xsd.float
            )
            -- wgs84.location ->- locationId(obsm.values.subjArea) 
            -- sosa.madeBySensor ->- deviceId(obsm.source, obsm.who) 
            -- sosa.observedProperty ->- observedProp(obsm.source, obsm.values.modality)
            -- sosa.resultTime ->- dateLit(obsm.date)
      )

which makes it much nicer to write and read RDF code in scala.

This transformation is now inside an imec github repository. I am asking if it would be possible to make it publically available. Some data of that form will be very useful to test access control in a larger environment. Adding Diesel support we ran into what I think is a bug in the scala compiler which I was able to reproduce DottyIssue16247...

So now we haved some data that is more complex, we can write a client that reads this LDES stream, following links, and then start adding more complex access control rules. At each cycle we can then improve both the client and the server, uncover efficiency improvements in the protocol, etc..

One improvement that is already abvious is that for ACLS that are the default on a container, the server should be able to hand out a cookie to to allow the client to re-authenticate efficently for all subdirectories without neededing to go through the 401 rigmarole... (On the other hand the client could also just sign headers in advance, knowing what the rules are likely to be...)

The above work led to an improvement to the Http Sig Spec on the sigUpdate branch. These improvements will become much more frequent and precise in the up-coming updating of the Scala implementation of Http Message Signatures RFC version 13 which is now in last call, as discussed on the W3C Solid Community Group meeting of Oct 19.