Using all flavours of Solid

The Solid project can sometimes feel like a diverse experimentation ground. This makes interoperability harder, of course, but at the same time it is a great strength of our project! Here is a small navigation guide to help you find your way as you think about how the ideas and technology we develop in the Solid project can be useful to your organization.

How it usually starts

Usually, when an organization explores how they could adopt "something like Solid", their first approach is to restructure the organization's web application in such a way that user data is stored in per-user data stores. This is a good start! But it still doesn't enable data portability. The vision we really want to move towards is a "Bring your own Data" world, which focuses on the data a user may import into your application, rather than on the data they may export from your application.

Bring your own Data

The next big step is to see how you can use data that a user may bring from their personal data store to your application, and how you can rely on this data. A lot of proof-of-concept projects we have done so far end up using the Solid pod mostly to store verifiable credentials, where for instance a university puts a cryptographically signed diploma in the user's pod, which the user can then bring to other applications. Other examples may be user preferences and interests (the ones advertisers traditionally try to collect about us behind our backs). But if we go a step further, even personal archives, notes, and unfinished work could be ported from one application to another using Solid.

For instance, I could start editing a presentation using an online text editor that saves the document text to my pod. Then I could use a different online application to create a drawing, and insert that into the presentation. Yet another app could be used to edit and include photos that were automatically uploaded to my pod from my smartphone.

More than a Personal Cloud Server

In that sense, a Personal Data Store like the ones proposed by the Locker project, the Unhosted project, the Solid project, and all the other Personal Data Store projects, is very similar to a Personal Cloud Server like Nextcloud or Cozy, in that it gives the user control over their data.

But the Personal Data Store vision goes a step further, in that it also decouples itself from the applications you can bring this data into. Of course, bringing your own data to an application only works if:

this application has a way to negotiate which data it wants,
you grant it the access it needs (and not to the rest of your data),
the application knows how to interact with your personal data store to fetch or store the data,
the data is in a format (serialization, ontology) that the application understands, and
the application respects the context, business logic, and implicit assumptions around the data.

All five are hard. And it's worth noting that protocol specifications like Solid, WebDAV, SQL, and remoteStorage generally address only the third one. For instance, the Authorization layer of the Solid spec allows a user to hand-tune access to specific documents, but it doesn't provide an easy-to-use consent dialog that a user will understand. All these things will have to be built on top of the various personal data store protocols.

For some use cases, read access is enough. For other use cases, you will also want the data in your personal data store to be updated with the work you do in the application (read/write access). And maybe you even want to use the application to edit the permissions and share things with other humans and other organizations (control access).
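
The three levels of access described above can be pictured as flags that an application requests and a user grants. What follows is a minimal Python sketch of that idea, not the Solid authorization API: the names `Access` and `is_sufficient`, and the example grant, are hypothetical and only illustrate asking for no more than a use case needs.

```python
from enum import Flag, auto


class Access(Flag):
    """Hypothetical access levels, following the read / write / control split."""
    READ = auto()
    WRITE = auto()
    CONTROL = auto()


def is_sufficient(granted: Access, needed: Access) -> bool:
    """Return True when every needed mode is included in the granted modes."""
    return (granted & needed) == needed


# A viewer-style app only reads; an editor also writes back;
# an app that manages sharing additionally needs control access.
viewer_needs = Access.READ
editor_needs = Access.READ | Access.WRITE
sharing_needs = Access.READ | Access.WRITE | Access.CONTROL

# What the user agreed to in this example (an assumption for illustration).
granted = Access.READ | Access.WRITE

for name, needs in [("viewer", viewer_needs),
                    ("editor", editor_needs),
                    ("sharing", sharing_needs)]:
    print(name, is_sufficient(granted, needs))
```

In this sketch the viewer and the editor can proceed, while the sharing feature would have to ask the user for more access first.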

In all cases, the imported data as a stream of ones and zeroes needs to be interpreted with knowledge and understanding of its Context, Language, Origin, and Grounding. And the fact that this data was shared in the first place can in itself also convey information and user intent as a Speech act. Together, these important aspects spell "CLOGS".

Flavour One: Solid Version 0.10

Last week, the Solid project produced version 0.10 of its specification. It describes how data can be stored and retrieved in a Solid pod and how a Solid Identity Provider can issue proof of identity. It also gives two options for implementing access control, and at least six options for implementing notifications. This does put a little bit of burden on your application, in that it will have to use a "polyglot" approach in order to be able to "speak" with all different kinds of Solid pods.

Polyglot clients

The WAC vs ACP dichotomy (Web Access Control versus Access Control Policy, the two access control options) may not be as big an issue as it seems, since most apps will not need to edit access control rules anyway. Using the Solid Application Interoperability Spec, your application can express its data needs, and be given access and pointers to all the data it needs. This data will adhere to Shape Trees, which should be enough to meaningfully interpret and correctly edit the data.

The Webhooks, WebSockets, and other channel types of the Solid Notification spec will probably cost you some extra work, but bear in mind that this is a relatively new part of the specification and I hope this will still consolidate some more in the future.

In any case, it is a good idea in general to make your application import data in a polyglot way. Maybe users come to your application with a Solid pod, but in other cases they come with data they want to import from (and edit on) their Dropbox account, their Google Drive, their remoteStorage account, their WebDAV server, etcetera. It makes sense for your application to be future-proof and independent of any particular protocol version. We are also planning to do some more development work on polyglot bring-your-own-data client libraries like remoteStorage.js in the near future, which would help with that.
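
One way to picture such a polyglot client is as a thin layer that hides each storage protocol behind the same small interface. The Python sketch below is only an illustration under that assumption: `Backend`, `InMemoryBackend`, and `PolyglotClient` are hypothetical names, and the in-memory backend stands in for real Solid, WebDAV, or remoteStorage code, which would be considerably more involved.

```python
from typing import Dict, Protocol


class Backend(Protocol):
    """Hypothetical minimal interface every storage backend has to offer."""

    def read(self, path: str) -> str: ...

    def write(self, path: str, body: str) -> None: ...


class InMemoryBackend:
    """Stand-in for a real backend (Solid, WebDAV, remoteStorage, ...)."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def read(self, path: str) -> str:
        return self._docs[path]

    def write(self, path: str, body: str) -> None:
        self._docs[path] = body


class PolyglotClient:
    """Routes each request to the backend registered for a storage type."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def register(self, kind: str, backend: Backend) -> None:
        self._backends[kind] = backend

    def read(self, kind: str, path: str) -> str:
        return self._backends[kind].read(path)

    def write(self, kind: str, path: str, body: str) -> None:
        self._backends[kind].write(path, body)


client = PolyglotClient()
client.register("solid", InMemoryBackend())
client.register("webdav", InMemoryBackend())

# The application code stays the same, whichever storage the user brings.
client.write("solid", "/notes/todo.txt", "buy milk")
client.write("webdav", "/notes/todo.txt", "call mum")
print(client.read("solid", "/notes/todo.txt"))
```

The design choice here is that the application only ever talks to `PolyglotClient`, so supporting a new protocol or protocol version means adding one backend rather than touching the rest of the app.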

However, the different access protocols are not even the "flavours" I was referring to in the title of this blogpost.

Flavour Two: Multiple pods

Apart from the layers you could put on top of Solid (e.g. end-to-end encryption, or verifiable credentials), there is a fundamentally different approach to the architecture of Solid. Whereas classically we think of the situation where a user interacts with one identity provider, one storage provider, and multiple applications, recent experiments in practice have moved more towards also using multiple storage providers linked to a single identity.

The advantage of this is that the organization that wants to write into your pod can also do the hosting of that section of your personal data store for you. They might even restrict you to read-only access. For instance, if a university just writes "Master of Science" into a document on your pod, it would be too easy for you to edit that; to fix that, they could give you a signed version of that document. Or they could give you a link (and eternal read access) to a document that they host. Of course, this erodes the sovereignty of the user quite a bit, but it also mitigates the risk of people accidentally deleting their university diploma.
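
To illustrate why a signed document resists editing, here is a small Python sketch. Note the swap: real verifiable credentials rely on public-key signatures so that anyone can verify them, whereas this sketch uses a shared-secret HMAC from the Python standard library purely to show tamper detection. The key, the field names, and the `sign` and `verify` helpers are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical issuer key, for illustration only.
ISSUER_KEY = b"university-secret-key"


def sign(document: dict, key: bytes) -> str:
    """Return a hex MAC over a canonical JSON encoding of the document."""
    payload = json.dumps(document, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(document: dict, signature: str, key: bytes) -> bool:
    """Check that the document still matches the signature it was issued with."""
    return hmac.compare_digest(sign(document, key), signature)


diploma = {"holder": "Alice", "degree": "Bachelor of Science"}
signature = sign(diploma, ISSUER_KEY)

# The holder edits the copy in their pod; the signature no longer matches.
tampered = dict(diploma, degree="Master of Science")

print(verify(diploma, signature, ISSUER_KEY))
print(verify(tampered, signature, ISSUER_KEY))
```

The original diploma verifies and the edited copy does not, which is the property that lets the user keep the document in their own pod without the issuer having to host it.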

Luckily, when you want your application to support Bring-your-own-data use patterns, it doesn't matter much whether that data is on one or on multiple pods or domain names; through hyperlinks, all data can still be found in much the same way. There are some complications around editing your profile document, which is tied directly to your WebID, but as long as all profile editor applications are aware that some users have one pod tied to their identity and other users may have several, these can be resolved.
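
The point that hyperlinks make the number of pods largely irrelevant can be sketched as a simple link-following loop. In the Python sketch below, the `WEB` dictionary is a hypothetical stand-in for fetching documents over HTTP, and all URLs are made-up examples; a real client would also have to parse the documents and handle authentication.

```python
from typing import Dict, List, Set

# Hypothetical in-memory "web": each URL maps to the links found in that document.
WEB: Dict[str, List[str]] = {
    "https://alice.example/profile": [
        "https://pod-a.example/alice/notes",
        "https://university.example/alice/diploma",
    ],
    "https://pod-a.example/alice/notes": [
        "https://pod-a.example/alice/notes/todo",
    ],
    "https://pod-a.example/alice/notes/todo": [],
    "https://university.example/alice/diploma": [],
}


def discover(start: str, web: Dict[str, List[str]]) -> List[str]:
    """Follow links breadth-first from the start document, visiting each URL once."""
    seen: Set[str] = {start}
    queue: List[str] = [start]
    order: List[str] = []
    while queue:
        url = queue.pop(0)
        order.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


found = discover("https://alice.example/profile", WEB)

# The host is the third "/"-separated part of an absolute URL.
hosts = {url.split("/")[2] for url in found}
print(found)
print(sorted(hosts))
```

The traversal never needs to know which host a document lives on, so data spread over several storage providers is discovered in the same way as data in a single pod.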

Flavour Three: Graph Stores

A third flavour of Solid, which I think was never accepted as part of the Solid specification, but has always had some traction in semantic web academic circles, is the view of the Solid pod not as a file storage or "document box", but more as a database that allows server-side search across multiple documents.

In itself, server-side query execution feels like a natural feature to add, especially if you compare the Solid stack with traditional web development stacks like Ruby on Rails, the LAMP stack (Linux, Apache, MySQL, PHP), etcetera.

Even though a database server will, at some level, end up storing files in folders, it handles the creation of indexes on the server side. In the Solid architecture, by contrast, client-side applications are responsible for creating index files beyond the basic file system layout. Importing data from a graph store is not very different from importing that same data from a document store. If anything, it will be easier, not harder, because multiple views on the same data could be generated on the fly. But therein also lies the risk.

Interoperability between Solid applications is already hard when the Solid spec is simple. The more complex the servers become, the harder it will be to agree on a single definition of that more complex behaviour, the more different protocols clients will need to implement, and the more can go wrong with it. That is why for instance in the remoteStorage protocol (which predates the Solid protocol), we chose to make the protocol as simple as possible. To maximize interoperability, we want dumb servers and smart applications!

So, because Solid servers are so simple, applications that write data into your personal data store will also sometimes need to update index files that make this data findable. And although this burden of collective maintenance of index files can be challenging, it is not very different from the burden of collective maintenance of the integrity of the main (source) documents in a personal data store.
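
The client-side index maintenance described above can be sketched as follows. This is a minimal Python illustration, not how any particular Solid application stores its indexes: the `Pod` class, its `index` attribute (standing in for an index document in the pod), and the helper names are hypothetical.

```python
from typing import Dict, List


class Pod:
    """Hypothetical "dumb" store: it keeps documents and does no indexing itself."""

    def __init__(self) -> None:
        self.documents: Dict[str, str] = {}
        # Stands in for an index document that applications keep up to date.
        self.index: Dict[str, List[str]] = {}


def save_with_index(pod: Pod, path: str, body: str, kind: str) -> None:
    """Store a document and record its path in the index under its type."""
    pod.documents[path] = body
    entries = pod.index.setdefault(kind, [])
    if path not in entries:  # avoid duplicate entries when a document is rewritten
        entries.append(path)


def find(pod: Pod, kind: str) -> List[str]:
    """Look up documents of a given type through the index, without scanning the pod."""
    return list(pod.index.get(kind, []))


pod = Pod()
save_with_index(pod, "/issues/1", "Fix login page", "issue")
save_with_index(pod, "/issues/2", "Update docs", "issue")
save_with_index(pod, "/issues/1", "Fix login page (urgent)", "issue")
save_with_index(pod, "/photos/cat.jpg", "<binary data>", "photo")

print(find(pod, "issue"))
```

Every application that writes to the pod has to follow the same convention for this to work, which is exactly the collective maintenance burden mentioned above.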

Flavour `i`: Client-client specs

Once you have a polyglot client that allows your users to bring their own data in a way that is easy for them, the rest is all about the client-client specs, which at some level are the central part of the Solid project, and at another level are maybe also its most under-developed part.

We have Jackson's ShapeRepo that allows you to, for instance, express which attributes a Chilean astronomer has, in a machine-readable way. And we have PDS Interop Conventions, which document how SolidOS and other Solid applications store data on a pod, as developer documentation and by example.

The Solid Application Interoperability spec is probably the closest thing we have to making Solid applications interoperable with each other. It requires application developers to describe their data models with ShapeTrees, which will be a bit of extra work, but may at some point be the only way to allow other application developers to import the data your app exported to the user's Solid pod.
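
To give a feel for what describing a data model buys you, here is a deliberately simplified Python sketch. It does not use ShapeTrees or any real shape language: the `ISSUE_SHAPE` dictionary, its field names, and the `violations` helper are hypothetical, and only show how a machine-readable description lets one app check data written by another.

```python
from typing import Dict, List

# Hypothetical shape: required field name -> expected Python type.
ISSUE_SHAPE: Dict[str, type] = {"title": str, "state": str, "project": str}


def violations(record: dict, shape: Dict[str, type]) -> List[str]:
    """List the ways in which a record fails to match the shape."""
    problems: List[str] = []
    for field, expected in shape.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


# An issue written by one app, and a malformed one written by another.
good_issue = {"title": "Fix login page", "state": "open", "project": "website"}
bad_issue = {"title": "Update docs", "state": 3}

print(violations(good_issue, ISSUE_SHAPE))
print(violations(bad_issue, ISSUE_SHAPE))
```

An importing application could run a check like this before trusting data it finds in a pod, and refuse or repair records that do not conform.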

Just like the "Add to home screen" functionality of Chrome on Android only works for progressive web apps that have a service worker, and service workers only work when an app is served over HTTPS, we could require apps to describe their data formats (at least with developer documentation and examples, but preferably also with shape trees) before we can call an app a Solid app.

There is an example app called Projectron which is mentioned throughout the Solid Application Interoperability spec and which can be used to track projects and issues; a first step would be to make this app compatible with the SolidOS issue tracker, so that issues created through SolidOS will "magically" show up when you open Projectron.

That is where the Solid project will start to show off its magic!