Natively Support In-Browser RDF #538

Open
jeff-zucker opened this issue Dec 16, 2021 · 10 comments

@jeff-zucker
Contributor

rdflib cannot currently be used directly on RDF data stored in a browser's local storage because 1) the default fetches all barf on any URL that isn't http(s) or file, 2) rdflib needs to see a content-type header to know what to do with documents it fetches, and 3) rdflib needs a way to know if/how a document is editable before it can write to it.

Simple usage can get around some of these problems by using a 'MachineEditableDocument' triple, or by using getItem + parse instead of load, and serialize + setItem instead of putBack. @NoelDeMartin's apps use this approach (get+parse/serialize+put), though not with rdflib.

But even with these work-arounds, update cannot work: it calls load and putBack with no possibility to replace them. Similarly, the forms system cannot work: it calls update internally with no possibility to replace it.

I have created a simple fetch that provides what rdflib needs to see in order to work directly on the local storage (it sends a full Response object with wac-allow and content-type headers). This means that rdflib methods including load, putBack, update, and webOperation, as well as the forms system, can be used against documents in the local storage.

A user or app would need to allow this storage by setting inBrowserStorage to true when creating a fetcher. Once they had done that, their http and file fetches will work as previously but fetches in the form "ls://path" will read or write path in the browser's Local Storage.

const fetcher = $rdf.fetcher(kb,{inBrowserStorage:true});
await fetcher.load('https://example.com/foo.ttl'); // loads remote file
await fetcher.load('ls://foo.ttl');                // loads in-browser file
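
To make the idea concrete, here is a minimal sketch of such a fetch. It is not the actual implementation: `createStorageFetch`, `memoryStorage`, the hard-coded wac-allow value, and the extension-based content-type guess are all assumptions made for illustration, and it assumes a runtime with a global Response (modern browsers, Node 18+).

```javascript
// Hypothetical sketch: wrap a Storage-like object (getItem/setItem) in a
// fetch-like function that answers with a full Response, including the
// content-type and wac-allow headers described above.
function createStorageFetch(storage) {
  return async function storageFetch(url, options = {}) {
    const method = (options.method || 'GET').toUpperCase();
    const key = url.replace(/^ls:\/\//, ''); // "ls://foo.ttl" -> "foo.ttl"
    // Assumption: everything in local storage is readable and writable.
    const headers = { 'wac-allow': 'user="read write append control"' };

    if (method === 'GET') {
      const body = storage.getItem(key);
      if (body === null) return new Response(null, { status: 404, headers });
      // Assumption: guess the media type from the file extension.
      const type = key.endsWith('.ttl') ? 'text/turtle' : 'text/plain';
      return new Response(body, {
        status: 200,
        headers: { ...headers, 'content-type': type },
      });
    }
    if (method === 'PUT') {
      storage.setItem(key, String(options.body ?? ''));
      return new Response(null, { status: 201, headers });
    }
    return new Response(null, { status: 405, headers });
  };
}

// In-memory stand-in for window.localStorage so the sketch also runs in Node.
const memoryStorage = {
  data: new Map(),
  getItem(key) { return this.data.has(key) ? this.data.get(key) : null; },
  setItem(key, value) { this.data.set(key, String(value)); },
};
```

In a browser, `createStorageFetch(window.localStorage)` would give the same behaviour against the real Local Storage.
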
@josephguillaume

I like it! It's an elegant solution for rdflib to work with documents in local storage without having to use solid-rest and browserfs.

In terms of use cases, it would be nice if this could play well with a local-first strategy.
With the proposed solution, an app would need to maintain two different sets of URIs. Presumably it could use a convention to replace ls by http/https to remember where it wanted to save the triples.

An alternative would be to maintain two separate stores, with one exclusively using this localStorage fetch. The advantage would be that the original why URIs could be kept.

Rather than committing to the use of ls:// now, perhaps we can first work out what the next step would look like.

Using keep/revert conflict resolution (I'm not an expert), updater.updateLocalFirst (or some other name) would need to know about the ls and http storages. At initial http load, the ls storage would need to be updated, i.e. making the data available offline. When an update is made, ls is updated immediately. If http is offline, then the two are now out of sync. When we try to sync, if the remote etag is out of date then the app needs to decide whether to do an explicit revert of the remote version, do some custom merge, or discard the local changes.

https://github.com/remotestorage/remotestorage.js/blob/0e6ef757e6fd2d5c067207cb07b7d62e820a58ec/doc/contributing/internals/cache-data-format.rst#keeprevert-conflict-resolution
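
A rough sketch of the decision step described above, assuming the app remembers the etag from its last successful sync and a flag for unsynced local edits. `decideSync` and its field names are hypothetical, not part of rdflib or remotestorage.js.

```javascript
// Hypothetical helper: decide what a sync attempt should do for one document.
//   dirty      - true if the local copy has changes not yet sent to the server
//   baseEtag   - etag the server reported at the last successful sync
//   remoteEtag - etag the server reports now
function decideSync({ dirty, baseEtag, remoteEtag }) {
  const remoteChanged = remoteEtag !== baseEtag;
  if (!dirty && !remoteChanged) return 'in-sync';
  if (!dirty && remoteChanged) return 'pull';  // refresh the local copy
  if (dirty && !remoteChanged) return 'push';  // safe to upload local edits
  // Both sides changed: the app must keep, revert, or merge.
  return 'conflict';
}
```

Only the last case needs app-level input; the other three can be handled mechanically.
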

With the single store solution, we'd need to ensure that every http triple (meeting some criteria) has an ls equivalent. With a two store solution we'd be checking that the contents of the two stores are identical.
Not sure what will be easiest but it seems like it might inform the API choice here on how ls should be accessed?

We might also run into limitations of local storage in this use case, in which case we probably want to make sure the API choice has a clean equivalent with IndexedDB too.

@jeff-zucker
Contributor Author

jeff-zucker commented Dec 17, 2021

This is rather elaborate, but I believe it handles the local-first use case as well as supporting any kind of alternate fetch, now or in the future, using any URL scheme.

By default, rdflib will handle http(s): fetches with cross-fetch and [Edit: if we want this to be on by default] will handle browser: fetches with its new built-in rdflib.fetcher.inBrowserFetch. Users/apps can override those fetches and support other fetches by passing a flag on fetcher creation that would look like this:

  const fetcher = $rdf.fetcher(kb,{schemeHandlers:{
    http       : solidClientAuthn.fetch,  // includes both http & https
    file       : solidRestFile.fetch,
    browser    : solidRestBrowser.fetch,
    all        : yourLocalFirstOrOtherFetchHere
  }});

If the scheme "all" is not defined, the scheme of the URL determines which fetch is used. If "all" is defined, all schemes are handled by the named all-fetch. That method can make use of the other kinds of fetches by calling fetcher.schemeFetch(scheme,uri,options). For example, this would look for the named URL in the browser local storage even though it is in the http: scheme.

  fetcher.schemeFetch('browser://localStorage','http://example.com/foo.ttl');

So a local-first app could address the local with schemeFetch('browser',...) and the remote with schemeFetch('http',...) but use the same http: URL for both. Something like this:

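  const allFetch = async (uri,options)=>{
     // same http: URL, looked up once in the browser and once remotely
     const localContent = await fetcher.schemeFetch('browser://indexedDb/',uri,options);
     const remoteContent = await fetcher.schemeFetch('http:',uri,options);
     // do sync stuff
  }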
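
The dispatch rule described above (an "all" handler wins, otherwise the URL scheme picks the handler, with "http" covering https too) could be sketched roughly like this. `pickHandler` is a hypothetical helper written for illustration, not an existing rdflib function, and the stub fetches stand in for real ones.

```javascript
// Hypothetical helper: choose a fetch function for a URI from a
// schemeHandlers map shaped like the one proposed above.
function pickHandler(schemeHandlers, uri) {
  // An "all" handler, when present, takes over every scheme.
  if (typeof schemeHandlers.all === 'function') return schemeHandlers.all;

  // "https:" -> "https"; the "http" entry covers both http and https.
  let scheme = new URL(uri).protocol.replace(/:$/, '');
  if (scheme === 'https') scheme = 'http';

  const known = Object.prototype.hasOwnProperty.call(schemeHandlers, scheme);
  if (!known || typeof schemeHandlers[scheme] !== 'function') {
    throw new Error(`No fetch registered for scheme "${scheme}"`);
  }
  return schemeHandlers[scheme];
}

// Stub fetches, for illustration only.
const httpFetch = async () => 'remote';
const browserFetch = async () => 'local';
const handlers = { http: httpFetch, browser: browserFetch };

const forRemote = pickHandler(handlers, 'https://example.com/foo.ttl');
const forLocal = pickHandler(handlers, 'browser://localStorage/foo.ttl');
```

Here `forRemote` is `httpFetch` and `forLocal` is `browserFetch`; adding an `all` entry to `handlers` would make both calls return that entry instead.
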

@jeff-zucker
Contributor Author

I implemented something like the above in Solid Rest Browser.

@NoelDeMartin
Contributor

NoelDeMartin commented Dec 23, 2021

> I implemented something like the above in Solid Rest Browser.

Looking at that example, I have some doubts. It handles the basic use-case of storing data locally and remotely, but:

  • If you're offline, the data will only be written locally. How do you retry uploading the local data once you're back online?
  • If another app changes data, how do you update your local copy?
  • If the remote write fails because of some invalid RDF (think about shape trees and such once those are supported, for example), how do you roll back the local data?
  • How do you handle conflict resolution?

I understand all of these have to be handled by the application, right? In that case, I'm not sure it's correct to say that the library is "local-first"; rather, it allows you to write data locally. In order to have a true local-first app, the app developer still needs to do some work.

@josephguillaume

josephguillaume commented Dec 23, 2021

I would say it's a work in progress. This is the foundation of what we'll need for functionality that is at least as user friendly as remotestorage.js. The intention is not for the application to have to do all of this itself.

@jeff-zucker
Contributor Author

jeff-zucker commented Dec 23, 2021

The purpose of SolidRestBrowser is to provide a Solid interface to in-browser storage. It can be used to create a local-first system, but it is not in itself a local-first system. In order to write local-first you need a fetch that can write both locally and remotely, hopefully using the same syntax, and on top of those fetches you need syncing logic: changes, conflicts, offline states, etc. SolidRestBrowser supplies the fetches, and it is up to an app or other library to add syncing logic on top of that. I don't believe there is anywhere where I claim that SolidRestBrowser is a local-first system on its own.

I have added container support (returns a simple Turtle representation of folders) as well as intermediate folder creation (put(/foo/bar/baz.txt) creates /foo/ and /foo/bar/ if they don't exist). I am about to add full N3 PATCH support. There are etags. The major things missing are POST and some way to handle web-sockets. Those types of things belong in SolidRestBrowser; specific local-first logic does not.
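
As an illustration of the intermediate folder creation, here is a small sketch of how the containers to create could be derived from a document path. `parentContainers` is a hypothetical helper, not the code SolidRestBrowser actually uses.

```javascript
// Hypothetical helper: list every container that must exist before a
// document at `path` can be written, outermost first.
function parentContainers(path) {
  const segments = path.split('/').filter(Boolean);
  segments.pop(); // drop the document name itself
  const containers = [];
  let current = '/';
  for (const segment of segments) {
    current += segment + '/';
    containers.push(current);
  }
  return containers;
}

const needed = parentContainers('/foo/bar/baz.txt'); // ['/foo/', '/foo/bar/']
```

A put would then create each entry of `needed` that is not already in storage before writing the document itself.
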

@jeff-zucker
Contributor Author

I could imagine a library that uses SolidRestBrowser as part of a local-first system. I'm glad to help with that effort and to make any needed changes to SolidRestBrowser but I am not going to write that library. I think keeping the fetches/storage itself separate from the syncing logic is good practice so having two libraries makes sense to me.

@jeff-zucker
Contributor Author

However, if either of you thinks the local-first logic bits belong in SolidRestBrowser, I will gladly accept PRs. :-)

@NoelDeMartin
Contributor

> SolidRestBrowser supplies the fetches and it is up to an app or other library to add syncing logic on top of that. I don't believe there is anywhere where I claim that SolidRestBrowser is a local-first system on its own.

Ok, that's fine then, I think something simple can work and apps can have their own custom logic on top. I was confused because the title of the example in the README is "A Local-First Example", maybe it should be called something else to avoid confusing it with a full-fledged local-first solution.

@jeff-zucker
Contributor Author

I just changed the title of that example to "Dual Fetches - a basis for Local-First" and added some explanation about the relationship between the example and a full local-first system.

3 participants