Offline mode #351

akindyakov · 2022-12-04T13:10:17Z

Implementation plan: - ~50 hours total

- truthsayer -> archaeologist comms experiment - ~6 hours total
  - Chrome: message-based comms - ~2 hours (add POC msg-based truthsayer->archaeologist comms #381)
  - Firefox: event-based comms - ~4 hours
- [direction abandoned] ~~Add db to archaeologist - around 8 hours total~~
  - ~~Plug in SQLite with a trivial proof that it works - around 4 hours (add POC sqlite integration to background #387)~~
  - ~~Mirror smuggler's db table layout - around 4 hours~~
- Re-implement smuggler-api - ~20 hours total
  - [direction abandoned] ~~for background, using SQL directly - around 16 hours~~
  - for background, using browser.storage.local - around 16 hours (implement StorageApi for browser.storage.StorageArea #396)
  - for truthsayer, using truthsayer -> archaeologist - ~4 hours (add StorageApi impl which works via msgs #401)
- Amend truthsayer/archaeologist UI workflows if needed (e.g. hide "login" buttons etc) - ~16 hours
  - expose local hosting option with bare bones UX (add bare-bones storage type toggle #404)


SergNikitin · 2022-12-07T09:14:17Z

Notes from brief research on IndexedDb:
- is a mature, stable, widely supported way to store data persistently in a web browser
- is supposedly very slow - without even looking for it, I stumbled into multiple articles about it
- doesn't have native SQL support, but has libraries that build it on top (although they supposedly are even slower)

akindyakov · 2022-12-08T08:16:45Z

> doesn't have native SQL support

We have very limited needs for SQL in smuggler today: only access-by-key and iteration over entries for a search. Is this supported by IndexedDB at least?

SergNikitin · 2022-12-08T09:11:57Z

> Does IndexedDB support access-by-key and iteration over entries for a search?

Yes it does! Setup of "tables", indexes and query APIs are all completely alien to what we are used to, but the capabilities are there.

More notes:

⚠️⚠️ data stored inside a browser is persistent, but not really (see "Browser storage is not really persistent" section)
- absurd-sql + sql.js looks like the most "bare-bones" solution to this issue, but it relies on some changes that haven't been merged to upstream for more than a year; usage of patched sql.js fork is required
- supposedly (1, 2) we should expect many more persistent solutions that are both simple to integrate and performant once browsers implement "The Origin Private File System" part of File System Access API (on desktop it seems Firefox is the only one left out ATM)
if an app wants offline support AND replication to backend then there are some sophisticated out-of-the-box solutions available
- apparently, out-of-the-box solutions are all NoSQL (see "There is no relational data" section) because

> creating replication for an SQL offline first database is way more work than just adding some network protocols on top of PostgreSQL

⚠️using IndexedDb directly (instead of using some abstraction on top of it) means a future rewrite if support for a mobile application is needed

akindyakov · 2022-12-09T06:11:50Z

This is concerning. Are the limitations on the size and persistence of IndexedDB equally strict on Chrome and Edge?

Do we know how much disk space a Mazed node takes on average?

As far as I remember we do not use relations between tables in smuggler, so we should be safe in that regard. Am I missing anything?

SergNikitin · 2022-12-10T19:19:10Z

> Are the limitations on the size and persistence of IndexedDB equally strict on Chrome and Edge?

persistence part: if I understood your question correctly, see "on automatic data purging" note below!
size part: I haven't found any alarming restrictions about size, only statements that IndexedDb becomes slower as the size grows. There are good solutions for this however -- see the note on "OPFS" at the bottom!

> Do we know how much disk space a Mazed node takes on average?

No, we do not. At the moment performance and size are secondary concerns from my perspective - but see the note about them at the bottom, search for "OPFS"!

> As far as I remember we do not use relations between tables in smuggler, so we should be safe in that regard. Am I missing anything?

I believe you are correct.

More notes

on automatic data purging:
- Safari: as mentioned in an article linked above, purges all data (including IndexedDb) after 7 days of inactivity
- Firefox & Chromium-based: the behaviour is less aggressive
  - a call to StorageManager.persist() marks data of an active website as "persistent" and after that the browser won't touch it unless explicitly requested by a user; shaky 🤔
  - I haven't found any info on the persist() API from within background, will have to experiment 🥼 See below why background's APIs are likely to be the most important ones
⚠️⚠️ IndexedDb adheres to "same-origin policy", which seemingly means that it is NOT possible to share a database between archaeologist and truthsayer natively.
- looks like there are ways around this, but if we are to keep truthsayer in offline mode at all then it'll at the very least mean increased implementation cost and complexity.
- in all our workflows we can get access to two databases: 1) archaeologist's database, accessed from background and 2) mazed.se's database, natively available to truthsayer's code AND to content script
- since truthsayer is not expected to be open all the time, only one option is viable: archaeologist's database, accessed from background
- the above means
  - either truthsayer itself has to become a mostly empty shell, with all UI elements that work with data (like indexed cards, search box etc) injected into it by content (like we do for browser history import in Import fragments page #297)
  - or, given that content can't see the JavaScript of the page itself, if the opposite is possible then perhaps content can inject a database accessor into truthsayer, for truthsayer to use
⭐ In the previous notes I said that a new set of web APIs is expected to unleash much better storage solutions. Turns out they are already here -- we can embed sqlite into our web app that uses these APIs under the hood (search for "OPFS"), released only a month ago! This
- supposedly solves the issues with size and performance
- retains the same persistence issues and "same-origin policy" complications, just like IndexedDb and all other storage options
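
To make the StorageManager.persist() note above more concrete, here is a minimal TypeScript sketch of how a persistence request could look. `StorageManagerLike`, `PersistenceOutcome` and `requestPersistence` are hypothetical names introduced only for illustration; the interface mirrors the `persisted()` and `persist()` methods of the web `StorageManager` (normally reached via `navigator.storage`). Whether this API is reachable from background is still the open question mentioned above, so the sketch treats a missing manager as a valid outcome.

```typescript
// Minimal subset of the web StorageManager API (navigator.storage).
// Declared locally so the sketch does not depend on DOM typings.
interface StorageManagerLike {
  persisted(): Promise<boolean>;
  persist(): Promise<boolean>;
}

// Hypothetical outcome type, not part of any browser API.
type PersistenceOutcome =
  | "already-persistent"
  | "granted"
  | "denied"
  | "unsupported";

// Asks the browser to exempt this origin's data from automatic purging.
async function requestPersistence(
  storage: StorageManagerLike | undefined
): Promise<PersistenceOutcome> {
  if (storage === undefined) {
    // E.g. an environment where navigator.storage is not exposed.
    return "unsupported";
  }
  if (await storage.persisted()) {
    return "already-persistent";
  }
  // The browser may grant or deny this, possibly after prompting the user.
  return (await storage.persist()) ? "granted" : "denied";
}
```

In a web page this could be called as `requestPersistence(navigator.storage)`; a "denied" outcome would mean the data stays subject to the browser's purging rules.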

Next steps

Issues with data persistence look manageable on Chrome and Firefox. On Safari they appear unacceptable long-term, but short-term we have no Safari support plans anyway. So persistence doesn't block our immediate goals.

"Same-origin policy" complications on the other hand are less clear to me. It seems bad, but docs across the web are not very clear on what can and can't be done, so as a next step 🦵 I'll need to run some experiments.

SergNikitin · 2022-12-14T09:16:11Z

On sharing a data store between truthsayer and archaeologist:

extensions don't have an "origin" and APIs like IndexedDb and OPFS don't expose anything that could allow archaeologist to say "fetch me a store for mazed.se" (that's sort of what we rely on for sharing cookies between the two)
archaeologist can't inject a "callable" into truthsayer because extensions live in an isolated world
⭐ but luckily for us there are two other approaches
- on Chromium-based browsers it's dead easy for truthsayer to send requests to archaeologist and receive responses - there is a special externally_connectable manifest permission for that (if we knew about it earlier we might have implemented browser history import controls very differently)
- on Firefox externally_connectable is not supported, but there is an escape hatch - although webpages and content scripts don't share callables/variables/etc they do share events, so truthsayer can post an event saying "query a database" and archaeologist can post an event with a response
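
Both approaches come down to request/response messaging between the page and the extension. Below is a minimal TypeScript sketch of the event-based variant, under stated assumptions: `EventBus` is an in-memory stand-in for the event channel that a page and a content script share (in a browser, `window.postMessage` could play that role), and the message shapes plus the names `BusMessage`, `PageSideClient` and `serveFromExtension` are hypothetical, not taken from the codebase. The point it illustrates is that one-way events can carry a correlation id so that each response is matched to the request that caused it.

```typescript
// Hypothetical message shapes; the real ones may differ.
type BusMessage =
  | { kind: "request"; id: number; key: string }
  | { kind: "response"; id: number; value: string | null };

type BusListener = (msg: BusMessage) => void;

// In-memory stand-in for an event channel shared by a page and a content script.
class EventBus {
  private listeners: BusListener[] = [];

  subscribe(listener: BusListener): void {
    this.listeners.push(listener);
  }

  post(msg: BusMessage): void {
    // Deliver asynchronously, the way real events are delivered.
    for (const listener of this.listeners.slice()) {
      Promise.resolve().then(() => listener(msg));
    }
  }
}

// Page side (truthsayer's role): posts requests, resolves them when a
// response with the same id arrives.
class PageSideClient {
  private bus: EventBus;
  private nextId = 0;
  private pending = new Map<number, (value: string | null) => void>();

  constructor(bus: EventBus) {
    this.bus = bus;
    bus.subscribe((msg) => {
      if (msg.kind !== "response") return;
      const resolve = this.pending.get(msg.id);
      if (resolve !== undefined) {
        this.pending.delete(msg.id);
        resolve(msg.value);
      }
    });
  }

  get(key: string): Promise<string | null> {
    const id = this.nextId++;
    return new Promise<string | null>((resolve) => {
      this.pending.set(id, resolve);
      this.bus.post({ kind: "request", id, key });
    });
  }
}

// Extension side (archaeologist's role): answers requests from its own database.
function serveFromExtension(bus: EventBus, db: Map<string, string>): void {
  bus.subscribe((msg) => {
    if (msg.kind !== "request") return;
    const value = db.get(msg.key);
    bus.post({ kind: "response", id: msg.id, value: value === undefined ? null : value });
  });
}
```

Because both sides see every event on the shared channel, each listener filters by `kind`; a real implementation would also need to validate the sender, which this sketch leaves out.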

SergNikitin · 2022-12-15T07:32:42Z

<estimates moved to the top>

akindyakov · 2022-12-18T11:52:41Z

Thanks a lot for the detailed write-up!

A few questions:

Is "Plug in SQLite with trivial proof that it works" the biggest risk of failure? If so, can we start from it then?
With this setup, offline mode will be unavailable in Truthsayer when Archaeologist is not installed, is this correct?
What corners can we cut to speed it up?

akindyakov · 2022-12-19T08:47:11Z

> What corners can we cut to speed it up?

We can push implementation for Firefox to the end of this list. The goal is to make it work for engineers in tech companies with a special policy on installed software, and I know of none who use Firefox. We can reach that goal without Firefox at first.

akindyakov · 2022-12-19T08:49:46Z

> Plug in SQLite

As far as I remember, you wanted to use WASM to do it, didn't you? Can we add some small piece of code with WASM first and try to release an extension to the Chrome web store? Just to make sure they won't block/ban us for this.

SergNikitin · 2022-12-20T18:01:51Z

> Is "Plug in SQLite with trivial proof that it works" the biggest risk of failure? If so, can we start from it then?

The biggest in my opinion was communication between truthsayer and archaeologist, but I made it work. SQLite comes second and will be my next step!

> With this setup, offline mode will be unavailable in Truthsayer when Archaeologist is not installed, is this correct?

Yes, archaeologist will become a mandatory requirement to use it.

> What corners can we cut to speed it up?

The most obvious one is to "push implementation for Firefox to the end of this list", I agree that we should use it. Just thinking at the level of big chunks of work estimated above, I don't see any obvious ones that can be removed. But once we start working on them we'll probably see sub-parts that aren't needed.

> We can push implementation for Firefox to the end of this list.

Yes, good one to cut! (see the previous answer)

> As far as I remember, you wanted to use WASM to do it, didn't you? Can we add some small piece of code with WASM first and try to release an extension to the Chrome web store?

Yes, sqlite itself is WASM-only in JS as far as I understand. It's a good idea to trial a WASM release, let's use the outcome of "Plug in SQLite with a trivial proof that it works" step for that.

SergNikitin · 2022-12-20T20:16:54Z

Minor update on how to pull sqlite into our code: prior to the current "official" (as in "by maintainers of sqlite itself") effort to provide WASM builds there have been a number of projects that experimented in a similar territory (see the "Attribution" section). All of them have been published to npm and can be consumed easily:

- sql.js (in our case +absurd-sql) - more notes on why this combo is not very desirable in earlier comments
- wa-sqlite - apparently the first unofficial project to experiment with WASM sqlite + OPFS. Probably would work for our case, but with the recent release of the "official" WASM sqlite the maintainer posted this discussion with the reasons why wa-sqlite is unlikely to stay attractive

So based on the above, it's appealing to use the "official" offerings. I was surprised to find that although WASM is officially supported, there are currently no official NPM packages - we are expected to download the binaries manually from their website. This is obviously inconvenient, and luckily some folks have created a paper-thin NPM-compatible wrapper over the official build -- sqlite-wasm-esm; they also have an ongoing conversation with sqlite maintainers to help them eventually offer the same convenience out of the box. That's the package I'll experiment with first ⭐

akindyakov · 2022-12-21T09:29:44Z

> let's use the outcome of "Plug in SQLite with a trivial proof that it works" step for that.

Let's experiment with something smaller than sqlite and less "experimental", something that works 💯. Any stable package with wasm inside would do. My concern is that Chrome Web Store might block Archaeologist entirely for using WASM - for some reason they are afraid of it

SergNikitin · 2022-12-26T16:27:48Z

I managed to make our build process pick up WASM dependencies properly and, as we expected, after that their usage is the same as regular JS - just do an import. The PR showcases use of a default DB of type 'memory' - that's an in-memory database, its contents get lost as soon as the browser gets closed. So WASM-wise we are good, we can write familiar SQL without a problem. 🎉

We are, however, not good on the persistence front and as a result will not proceed with SQLite at this time. See the second part of the comment for more context if curious.

What's next?

In summary,

messaging experiment - success ✅, will be used in final solution ✅
wasm experiment - success ✅, won't be used in final solution ❌
sqlite experiment - failure ❌

Next - implementation of something with direct usage of IndexedDb or browser.storage.

SQLite storage options

Aside from the in-memory DB, SQLite comes with two more types of storage:

Storage option 1 - OPFS

This is the one I was hoping to use, but it won't work right now. It does work "within workers", but it turns out there are multiple kinds of workers. There are

- "web workers" (that's what you get when you run new Worker()). These are general-purpose, can be used to do whatever the application needs. They are further divided into
  1. "dedicated workers"
  2. "shared workers"
- "service workers" - these are workers with a "specific purpose", they are not general

OPFS docs say:

> The createSyncAccessHandle() [...] Note that it is only usable inside dedicated Web Workers.

SQLite's OPFS implementation requires this API and their docs say

> [OPFS via sqlite3_vfs] support is only available when sqlite3.js is loaded from a [...] dedicated worker [...].

As you might have guessed by this point, background.js is a "service worker" in Manifest V3, so the API is unavailable and by extension OPFS support is unavailable. 👎

Is it possible to spawn a "dedicated worker" from background?

In Manifest V2 - yes. In Manifest V3 - no, but it has been identified as an undesired regression. Chrome expressed intent to fix this, that's being tracked here. So at some point we'll be able to use OPFS in our usecase 👍, but that day is not today.

Storage option 2 - `localStorage`/`sessionStorage` (nicknamed kvvfs)

That's what is available via window.localStorage and such. These specific globals are only available in a web page environment, so background is out of luck again.

What's interesting is that background has access to a very similar API -- browser.storage -- and, unlike window.localStorage, there is no limit on how much data can be stored (if "unlimitedStorage" permission is requested). It's also a key-value VFS, it just has a differently named getter and setter. We probably don't have a chance to get past the OPFS blocker, but this one I believe we could overcome with manageable effort if we really wanted. The changes would be:

- the glue code which uses localStorage/sessionStorage API's getItem()/setItem() would have to change to browser.storage's get()/set()
- a couple of other decision-making places

Although the prospect of contributing to sqlite itself to make localStorage work in background is very exciting, we'll keep this treat for "future us" if and when we get dissatisfied with hand-written persistence. In parallel, I have submitted a question to sqlite maintainers to better understand if this is possible at all.

akindyakov · 2022-12-26T21:07:02Z

Wow, you are digging really deep! It reads like an adventure story 😲

Just as an idea, is there some polyfill package or dependency injection trick to replace 'window.localStorage' with 'browser.storage' without changing the code of SQLite at all?

akindyakov · 2022-12-26T21:07:18Z

So what's your plan now then?

SergNikitin · 2022-12-27T13:35:19Z

It's as described in "What's next?" section: instead of smuggler-api backed by sqlite I'm working on

> implementation of smuggler-api with direct usage of IndexedDb or browser.storage.local

SergNikitin · 2022-12-27T15:16:05Z

> is there some polyfill package or dependency injection trick to replace 'window.localStorage' with 'browser.storage' without changing the code of SQLite at all?

Probably not -- these APIs have one fundamental difference which I missed originally: one is synchronous and the other is asynchronous. Maintainers of sqlite were kind enough to look into this very quickly with the intention to patch their code in the next release, but identified this as a severe complication.

This means we can do our hand-written implementation without sqlite with peace of mind -- we have ruled out all the possibilities where it wouldn't cost us an arm and a leg to get sqlite to work.
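
The synchronous/asynchronous mismatch can be illustrated with a small TypeScript sketch. Everything in it is an assumption made for illustration: `SyncKv` imitates the shape of `window.localStorage`, `AsyncKv` is a simplified, Promise-based imitation of `browser.storage` (the real `get()` resolves to an object of key/value pairs rather than a single value), and `MemorySyncKv`, `asAsync` and `readWithoutAwaiting` are hypothetical helpers. Wrapping a synchronous store into an asynchronous interface is trivial, while a synchronous caller of an asynchronous store has no value in hand by the time it must return.

```typescript
// localStorage-style access: the value is returned immediately.
interface SyncKv {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Simplified browser.storage-style access: the value arrives via a Promise.
interface AsyncKv {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// In-memory synchronous store used only to drive the example.
class MemorySyncKv implements SyncKv {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Easy direction: a synchronous store can always be exposed asynchronously.
function asAsync(sync: SyncKv): AsyncKv {
  return {
    get: async (key) => sync.getItem(key),
    set: async (key, value) => sync.setItem(key, value),
  };
}

// Hard direction: code that must return synchronously cannot wait for the
// Promise. The callback below only runs after this function has returned,
// so the snapshot is always "not ready".
function readWithoutAwaiting(
  kv: AsyncKv,
  key: string
): { ready: boolean; value: string | null } {
  const outcome = { ready: false, value: null as string | null };
  kv.get(key).then((value) => {
    outcome.ready = true;
    outcome.value = value;
  });
  return { ready: outcome.ready, value: outcome.value };
}
```

Code written against `getItem()`/`setItem()` is in the position of `readWithoutAwaiting`, which is why swapping in an asynchronous backend is more than a rename of the getter and setter.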

SergNikitin · 2023-01-11T09:14:44Z

First pass of a background implementation of StorageApi is close to done in #396, at least for most of the core endpoints. The one that presented a challenge that is difficult to address just within the bounds of StorageApi is node search - currently the only endpoint available is node.slice which is used in two different contexts:

- to iterate over nodes (called like node.slice({}))
- to look up a very specific list of nodes (used only in steroid.node.lookup)

The API is inherently time-based and yet neither of the usecases actually cares about time-related parameters of GetNodeSliceArgs - first one just ignores them, second specifies parameters that say "give me a single range from 0 to now".

Since time-based lookups are very unattractive in a KV-storage I think I may need to split these into two distinct APIs, otherwise I'll either make steroid.node.lookup very slow or node.slice({}) very slow -- #397
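
A minimal TypeScript sketch of what such a split could look like on top of a key-value store is below. The method names `getByOrigin` and `iterate` follow the title of #397; everything else is an assumption for illustration: the `StoredNode` shape, the key layout (`node/<nid>`, `origin/<origin>`, `all-nids`), the `MemoryKv` stand-in for browser.storage.local and the `NodeStore` class are all hypothetical. The idea is that the targeted lookup reads a small secondary index instead of scanning, while iteration walks a list of ids without any time-range arguments.

```typescript
type Nid = string;

// Hypothetical node shape, much smaller than the real one.
interface StoredNode {
  nid: Nid;
  origin: string;
  createdAt: number;
}

// Cursor-style iterator: next() resolves to null when exhausted.
interface StoredNodeCursor {
  next(): Promise<StoredNode | null>;
}

// In-memory stand-in for an asynchronous key-value store.
class MemoryKv {
  private data = new Map<string, unknown>();

  async get<T>(key: string): Promise<T | undefined> {
    return this.data.get(key) as T | undefined;
  }

  async set(key: string, value: unknown): Promise<void> {
    this.data.set(key, value);
  }
}

class NodeStore {
  private kv: MemoryKv;

  constructor(kv: MemoryKv) {
    this.kv = kv;
  }

  private async readList(key: string): Promise<Nid[]> {
    const list = await this.kv.get<Nid[]>(key);
    return list === undefined ? [] : list;
  }

  async create(node: StoredNode): Promise<void> {
    await this.kv.set("node/" + node.nid, node);
    // Secondary index: origin -> nids, so lookups by origin need no scan.
    const originKey = "origin/" + node.origin;
    await this.kv.set(originKey, (await this.readList(originKey)).concat(node.nid));
    // Flat list of every nid, used only for iteration.
    await this.kv.set("all-nids", (await this.readList("all-nids")).concat(node.nid));
  }

  // Targeted lookup: one index read plus one read per matching node.
  async getByOrigin(origin: string): Promise<StoredNode[]> {
    const found: StoredNode[] = [];
    for (const nid of await this.readList("origin/" + origin)) {
      const node = await this.kv.get<StoredNode>("node/" + nid);
      if (node !== undefined) found.push(node);
    }
    return found;
  }

  // Iteration over everything, in insertion order, with no time parameters.
  iterate(): StoredNodeCursor {
    let position = 0;
    return {
      next: async () => {
        const all = await this.readList("all-nids");
        while (position < all.length) {
          const node = await this.kv.get<StoredNode>("node/" + all[position++]);
          if (node !== undefined) return node;
        }
        return null;
      },
    };
  }
}
```

With this layout neither call pays for the other: `getByOrigin` never touches unrelated nodes, and `iterate` never evaluates time ranges.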

SergNikitin · 2023-01-21T18:38:59Z

Core functionality is in place for Chromium-based browsers. Bugs to fix (mostly extracted from #404 comments):

- cards get shown in reverse order, oldest first, newest last - local storage: iterate nodes from newest to oldest #410
- edge-related functionality is mostly broken - local storage: fix creation of edges #412
  - quotes and bookmarks don't connect properly
  - triptych connections don't render properly
  - most dropdown menu buttons which connect a card to a new card don't work
- ownership of nodes and edges doesn't work - reorg account init in truthsayer to make usage deterministic #419, local storage: fix node & edge ownership #422
  - prevents deletion of cards
  - makes three-dots button disappear from every card
- image upload doesn't work - propagate errors from archaeologist to truthsayer #435, local storage: descriptive error for image upload #436
- browser history bootstrap is irreversible (bulk card removal doesn't work) - local storage: fix history import & implement bulk node removal #423
- single-node delete doesn't work - local storage: implement single-node deletion #426
- associations don't work - local storage: implement storage of associations #434
- only half of attention tracking works - "total seconds of attention" works, "attention event timestamp" doesn't

akindyakov · 2023-01-21T20:32:22Z

> file upload doesn't work

Only image upload doesn't work; text file upload works fine.

akindyakov · 2023-02-05T08:53:25Z

🐛 Found one more bug: 🐛 Local mode: Node iterator created by storage.node.iterate() occasionally returns same node more than once #428

akindyakov · 2023-02-05T09:24:20Z

> cards get shown in reverse order, oldest first newest last #410

Sort nodes by "crated-at" time in node iterator, latest first #430

Update: I looked closer and it looks like I found the issue. @SergNikitin, if you are not looking at it already, I'll take it.
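
For illustration only, the two iterator requirements discussed in this thread (latest first, and no node yielded more than once) can be sketched in TypeScript as below. This is not the code from #430 or #439, whose contents are not shown here; `StoredNode` and `newestFirstUnique` are hypothetical names, and the sketch assumes the candidate nodes are already loaded into an array.

```typescript
// Hypothetical node shape: just an id and a creation timestamp.
interface StoredNode {
  nid: string;
  createdAt: number;
}

// Returns nodes ordered from newest to oldest, dropping repeated nids.
function newestFirstUnique(nodes: StoredNode[]): StoredNode[] {
  const seen = new Set<string>();
  return nodes
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => b.createdAt - a.createdAt)
    .filter((node) => {
      if (seen.has(node.nid)) return false;
      seen.add(node.nid);
      return true;
    });
}
```

Deduplicating by `nid` after sorting keeps the first occurrence of each node in the newest-first order, so a node listed twice in an index still shows up once.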

SergNikitin · 2023-02-16T19:32:21Z

Done!

akindyakov assigned SergNikitin Dec 4, 2022

akindyakov added this to the Release v0.2.0 milestone Dec 4, 2022

SergNikitin mentioned this issue Dec 26, 2022

add POC sqlite integration to background #387

Closed

This was referenced Dec 27, 2022

rm unused parts of smuggler-api #389

Merged

smuggler-api/types.ts: split types & helpers #390

Merged

split smuggler-api interface into distinct storage & authn APIs #391

Merged

This was referenced Jan 3, 2023

cleanup remains of ShareModal #392

Merged

switch direct smuggler access to StorageApi #393

Merged

replace StorageApi global var with a make fn #394

Merged

mv non-trivial StorageApi methods to steroid #395

Merged

SergNikitin mentioned this issue Jan 11, 2023

split node.slice into node.getByOrigin and node.iterate #397

Merged

This was referenced Jan 14, 2023

implement StorageApi for browser.storage.StorageArea #396

Merged

add StorageApi impl which works via msgs #401

Merged

rewrite MzdGlobal as functional component #403

Merged

SergNikitin mentioned this issue Jan 21, 2023

local storage: iterate nodes from newest to oldest #410

Merged

This was referenced Jan 23, 2023

local storage: fix creation of edges #412

Merged

split account creation during context init #418

Closed

reorg account init in truthsayer to make usage deterministic #419

Merged

This was referenced Jan 30, 2023

local storage: fix node & edge ownership #422

Merged

local storage: fix history import & implement bulk node removal #423

Merged

local storage: implement single-node deletion #426

Merged

akindyakov mentioned this issue Feb 7, 2023

Sort nodes by "crated-at" time in node iterator, latest first #430

Merged

This was referenced Feb 8, 2023

local storage: all array lavs use nontrivial types #432

Merged

local storage: implement storage of associations #434

Merged

local storage: descriptive error for image upload #436

Merged

Prevent yielding duplicated nodes from iterator #439

Merged

akindyakov mentioned this issue Feb 14, 2023

🪲 Image uploading not working #327

Closed

SergNikitin closed this as completed Feb 16, 2023

SergNikitin mentioned this issue Mar 1, 2023

local storage: fix attention tracking #472

Merged
