Fifthtry

Transparent Offline Feature

Ludicrous mode for web.

Release Date: 4-Mar-2020

One advantage of “standardised requests” is that we can intercept all GET requests centrally in Realm.js and make them offline aware. This needs zero support from the client app: every Realm app can be offline enabled automatically.

What will service worker do?

We are going to use the service worker minimally, for one purpose only. If it detects that we are offline, it checks whether the requested page is in cache: if so it renders it, else it shows a default page. If we are online, it lets the request pass and does nothing.
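A minimal sketch of that decision, written as a plain function so it can be read outside a service worker. The name chooseResponse and the action labels are hypothetical, and a Map stands in for the browser cache (the real Cache API is promise based).

```javascript
// Hypothetical helper: what should the service worker do with this request?
// `cache` is any Map-like object keyed by URL.
function chooseResponse(isOnline, url, cache) {
    if (isOnline) {
        // Online: let the request pass and do nothing.
        return { action: "passthrough" };
    }
    if (cache.has(url)) {
        // Offline and the page is cached: render it from cache.
        return { action: "render", data: cache.get(url) };
    }
    // Offline and nothing cached: show the default page.
    return { action: "default-page" };
}
```

In a real worker the three outcomes would map to not calling respondWith, responding with the rendered page, and responding with the default page.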

Caching in loadNow()

Realm.js:loadNow() is primarily responsible for caching. It will look at each DATA.url, and cache it.

If offline, navigate() will serve data from cache, or serve Pages.Realm.Offline.

Realm.In: .isCached & .isOnline

In Realm.In we will have two properties: isOnline, which tells if the user is currently online, and isCached, which tells if the current page was loaded from cache.

There would also be an online status subscription, to be notified when the net comes online or goes offline (on a best guess basis: we may poll navigator.onLine every second).
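The polling idea could look roughly like this. createOnlineWatcher and its methods are hypothetical names; the probe is injected so the sketch runs anywhere, and in a browser it would be () => navigator.onLine.

```javascript
// Hypothetical helper: polls an "is online?" probe and notifies subscribers
// only when the answer changes.
function createOnlineWatcher(probe) {
    let last = probe();
    const listeners = new Set();
    return {
        isOnline: () => last,
        subscribe(fn) {
            listeners.add(fn);
            return () => listeners.delete(fn); // call this to unsubscribe
        },
        // One polling step; a caller would run it from setInterval(..., 1000).
        poll() {
            const now = probe();
            if (now !== last) {
                last = now;
                listeners.forEach((fn) => fn(now));
            }
        },
    };
}
```

Usage in a page would be something like setInterval(watcher.poll, 1000), with the value of isOnline() copied into Realm.In.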

Page Template

In realm, HTTP requests are made either in web mode or in realm mode. In web mode HTML is returned, in realm mode JSON.

We are going to need both JSON and HTML versions in cache.

Why: if a page is loaded in the browser by URL, say from a bookmark or because the user entered the URL, the HTML version is needed. During page navigation, the JSON version is needed.

But we shouldn’t cache both, as that will double the cache size.

What we will do is always store the JSON version in cache. loadNow() gets the JSON by extracting it from the HTML that is served on first page load.

The service worker will have access to the page template, and will read the JSON from cache (the template itself may be in cache). The service worker will embed the JSON in the HTML template and return the HTML.

In the HTML, along with the JSON, realm will also embed the HTML template, and Realm.js will store the template in cache so that it can be used by the service worker.
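A sketch of the embedding step. The placeholder string and both function names are assumptions made for illustration; the real template format is not specified here.

```javascript
// Hypothetical placeholder that the page template is assumed to contain.
const PLACEHOLDER = "__REALM_DATA__";

// Serialise JSON so that it is safe inside an inline <script> tag: "<" is
// escaped so a "</script>" inside the data cannot end the element early.
function safeJson(data) {
    return JSON.stringify(data).replace(/</g, "\\u003c");
}

// Build the HTML version of a page from the cached template and cached JSON.
function renderFromCache(template, data) {
    // A replacer function is used so that "$" sequences in the data are not
    // treated as special replacement patterns.
    return template.replace(PLACEHOLDER, () => safeJson(data));
}
```

This way only the JSON and one template are stored, and the HTML version is rebuilt on demand.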

On Logout: Purge Cache

We may have sensitive information in cache, which we will purge when user logs out.

This will be indicated by DATA.cache.purge_entire_cache: if it is true, Realm.js will purge the cache.
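Roughly, the check could be sketched as below. maybePurge is a hypothetical name, and a Map of named caches stands in for the browser's CacheStorage (whose keys() and delete() are promise based).

```javascript
// Hypothetical helper: purge every cache if the backend asked for it.
// Returns the names of the caches that were deleted.
function maybePurge(DATA, store) {
    const purge = Boolean(DATA && DATA.cache && DATA.cache.purge_entire_cache);
    if (!purge) {
        return [];
    }
    const names = Array.from(store.keys());
    names.forEach((name) => store.delete(name));
    return names;
}
```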

Cache ID & Storage Management

NOTE: ignore most of this section for now. We cannot find out which cache a given URL is stored in; we could store a mapping of that somewhere, but it is too complicated, with minimal return. We will have two caches: “realm” for framework/static stuff etc, and “app” for app stuff.

Cache API supports more than one cache per domain. Each cache has an ID.

For Realm we will use the following:

realm:elm-<hash>: This is the cache that will store /static/* URLs. The hash changes on every “deploy”, so on every reload after a deploy we will purge the old caches and repopulate all currently loaded URLs.

app:<cache-id>: This is where loadNow() will store all JSONs by default. DATA.cache.id will be consulted. Default value of .cache_id is default. It can be set to null to opt out of caching.

In fifthtry for example, each project will get its own cache_id.
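A sketch of how the cache name could be derived. It assumes the id lives at DATA.cache.id (the text above also calls it .cache_id), and appCacheName is a hypothetical name.

```javascript
// Hypothetical helper: which app cache should loadNow() store this page in?
// Returns null when the page has opted out of caching.
function appCacheName(DATA) {
    const cache = (DATA && DATA.cache) || {};
    // Missing id means the default cache; an explicit null means opt out.
    const id = cache.id === undefined ? "default" : cache.id;
    return id === null ? null : "app:" + id;
}
```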

We will also have a Realm API (using ports) to fetch the cache estimate for the entire app, and an API to delete cache_ids (or a key inside a cache_id).

For cache management, Realm.Cache will come with helpers to visualise cache estimate data, delete caches, etc.

When do we purge cache?

Let’s say we have two kinds of cache, “framework cache” and “app cache”. The framework cache stores elm generated JS etc, and the app cache stores all application JSONs.

We store elm generated JS and all static files in the framework cache. This should be cached afresh on every deploy.

The format of JSON generated by the Realm backend can change, so we should potentially purge all realm cache when the realm version number changes. The version number in realm can have a field to indicate a caching related version number, so that not every change in realm leads to a cache purge.

Stale While Revalidate

What should the behaviour be when we have something in cache: should we show it, in the interest of speed, or fetch the latest content, in the interest of accuracy?

Of course, both: if DATA.cache.etag is set, Realm.js will load the page with cached data, and make an HTTP request to the backend. If the .etags match, then perfect. If the .etags differ, loadNow() can update the page, but should it?

NOTE: we are not doing what’s discussed in previous para.

Off-topic: window.onbeforeunload

A page change may be triggered from within Realm (using Realm.navigate), or by a reload, by entering something in the browser location bar, etc. This can be a problem if the current page has a form with dirty data, which the user may lose and which we should warn the user about.

window.onbeforeunload is the standard solution for that.

How would an app indicate to Realm that this feature is needed? It sends some commands to indicate when it is dirty and when it is back to clean.
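One possible shape for this, as a sketch only: the app's dirty and clean commands flip a flag, and the beforeunload handler asks the browser to confirm only while the flag is set. createDirtyTracker and its method names are hypothetical.

```javascript
// Hypothetical helper: tracks the dirty flag and provides a handler that can
// be assigned to window.onbeforeunload.
function createDirtyTracker() {
    let dirty = false;
    return {
        markDirty() { dirty = true; },
        markClean() { dirty = false; },
        isDirty: () => dirty,
        onBeforeUnload(event) {
            if (!dirty) {
                return undefined; // nothing to lose, no prompt
            }
            event.preventDefault();
            event.returnValue = ""; // older browsers need this to be set
            return "";
        },
    };
}
```

In a page this would be wired up with window.onbeforeunload = tracker.onBeforeUnload, and Realm.navigate could consult tracker.isDirty() before changing pages.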

Back to .etag mismatch

If the app has indicated that it is in clean state, then we can update the page without losing anything. In case the app is in dirty state, we can send it a message saying so, and ask the user to trigger a reload.
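Put together with the etag check, the decision could be sketched like this. The function name and the returned labels are hypothetical, and the missing etag case is deliberately left undecided.

```javascript
// Hypothetical helper: what to do once the backend response has arrived for
// a page that was first rendered from cache.
function onRevalidated(cachedEtag, freshEtag, isDirty) {
    if (!cachedEtag || !freshEtag) {
        return "no-etag"; // not decided yet what to do here
    }
    if (cachedEtag === freshEtag) {
        return "keep"; // cached page is still current
    }
    // Content changed: update silently only if no user input would be lost.
    return isDirty ? "ask-user-to-reload" : "update-page";
}
```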

If .etag is missing, should we compare JSONs?

Not so sure. JSON keys are not guaranteed to be ordered, unless we use sorted JSON on the rust side.
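If we did want to compare on the JS side, one option is to serialise both JSONs with keys sorted and compare the strings. This is only a sketch; canonicalJson is a hypothetical name and it assumes values that came out of JSON.parse (no undefined, no functions).

```javascript
// Serialise a JSON value with object keys in sorted order, so that two
// values with the same content always give the same string.
function canonicalJson(value) {
    if (Array.isArray(value)) {
        return "[" + value.map(canonicalJson).join(",") + "]";
    }
    if (value !== null && typeof value === "object") {
        const parts = Object.keys(value)
            .sort()
            .map((key) => JSON.stringify(key) + ":" + canonicalJson(value[key]));
        return "{" + parts.join(",") + "}";
    }
    return JSON.stringify(value);
}

function sameJson(a, b) {
    return canonicalJson(a) === canonicalJson(b);
}
```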

What is BackgroundSync?

BackgroundSync, an experimental API:

… outbox can only be processed while the site is displayed in a browsing context. This is particularly problematic on mobile, where browsing contexts are frequently shut down to free memory.

We can store submit()s in local store and sync them when a page is viewed after the net comes online, but they won’t sync unless the site is opened at least once after the net comes back online. This API is a solution to that.
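The non BackgroundSync fallback could be sketched as a small outbox. createOutbox is a hypothetical name, send is whatever actually performs the submit, and an in-memory array stands in for the local store.

```javascript
// Hypothetical outbox: submits made while offline are queued, and flushed
// the next time a page is viewed with the net back online.
function createOutbox(send) {
    const pending = [];
    return {
        submit(request, isOnline) {
            if (isOnline) {
                send(request);
            } else {
                pending.push(request);
            }
        },
        // Send queued requests in order; returns how many were sent.
        flush() {
            let sent = 0;
            while (pending.length > 0) {
                send(pending[0]); // if send throws, the request stays queued
                pending.shift();
                sent += 1;
            }
            return sent;
        },
        size: () => pending.length,
    };
}
```

The limitation described above is visible here: flush() only runs if some page of the site is open to call it.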

CDN vs Service Worker - Crawler vs Browser

When thinking of website performance we have to take CDNs into consideration also. For the purpose of this discussion we will pretend CDN = Cloudflare. For optimal caching, we need some “edge processing”, as the HTTP caching header story is utterly fucked up (TODO: write a blog post explaining this; short story: number of caches per URL = product of the cardinality of each header in Vary, which can be phenomenally huge, effectively killing caching).

Let’s also assume we are talking about an HTTPS website.

For every URL, when it comes to caching, we have to understand how many versions are being cached (and where).

First comes encoding. Compression is huge. Different browsers support different encodings; brotli (at 92% as of 2020-02-21) and gzip are the two we consider.

Then come realm’s two modes: JSON vs HTML. Within HTML we have two possible scenarios: elm-JS inlined (for first time browsers; let’s call it HTML-js, it also includes the “pure content JSON”) vs SSR with no JS (neither inline nor as a script tag; this is for crawlers, let’s call it HTML-ssr).

So for every URL of a site, we want to have 2 (gzip vs brotli) * 3 (JSON vs HTML-js vs HTML-ssr) = 6 versions in CDN cache.
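The count is just a product over the dimensions, which a small sketch can enumerate. cacheVariants is a hypothetical helper; the labels are the ones used above.

```javascript
// Every combination of one choice per dimension (a cartesian product).
function cacheVariants(dimensions) {
    return dimensions.reduce(
        (combos, choices) =>
            combos.flatMap((combo) => choices.map((c) => combo.concat([c]))),
        [[]]
    );
}

const variants = cacheVariants([
    ["gzip", "brotli"],
    ["JSON", "HTML-js", "HTML-ssr"],
]);
console.log(variants.length); // 6
console.log(variants.map((v) => v.join("+")).join(", "));
```

Adding one more dimension multiplies the count again, which is the same arithmetic that makes a long Vary header so expensive.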

Now comes the browser. In the browser, we want precisely one version in browser cache (the JSON one). The browser sees two versions: HTML-js of the first URL visited, before the service worker was installed, and the JSON version subsequently. We could call it a day, as it is close to one, but we can make it precisely one (assuming cache never expires) by letting our app tell Cloudflare to cache, while our edge worker instructs the browser not to cache (realm will extract the JSON version from the HTML-js version, and explicitly cache it).

Anatomy of a Page

Any page, say a status message page on Facebook, can be said to have two parts: the content part and the user part. The content part can have two versions: the pure version, and the personalised (ised from now onwards) version.

pure version reflects underlying content, and is only updated when the content itself changes.

The ised version reflects customisation of the content based on who is viewing: e.g. if you have liked the status message, or if your friends have liked it.

The user part is usually shown in the header etc. It could have been called session in the interest of clarity, but I am going with 4 letter names here.

Let’s say this is what the config looks like:

{
    "user": {/* this is where user part is grafted  */},
    // .. ised and pure versions have same shape ..
}

Lifecycle of a Page

Let’s see what happens on the first page view of /foo/. The browser makes an HTTP request, and the CDN, hopefully a few ms away from you, serves the pure HTML-js version. Here we do not care who is looking at /foo/, logged in or not etc: everyone gets the same page. It contains pure JSON, meaning the user and ised fields are null. The HTML also contains the javascript to render the page.

Javascript renders the page, and installs the service worker. Every subsequent request will be handled by the service worker.

Javascript knows that the JSON extracted from HTML-js is missing the user and ised keys. So after rendering the pure version, so that the user has something to see, it makes an HTTP request to /foo/?realm_mode=ised. As soon as either response is received, the elm app is re-initialised if the app was in “clean” state; if the app was in dirty state, we send an update request to the existing page.

The CDN will never cache /foo/?realm_mode=ised (the server returns a no-cache header for it). But Realm.js will cache it in the browser’s cache.

On a request to /bar/, we will plug in the stale value of user from cache, so that the first page render has the right stuff in the header. Requests to /bar/?realm_mode=ised and /bar/?realm_mode=pure would be sent. /bar/?realm_mode=pure would (hopefully) be served by the CDN. The ised version will come from the backend app.

user data management

Realm.js caches user data in memory, and in cache. DATA.config.user is assumed to be user data, and is always plucked from every /bar/?realm_mode=ised response, and stored in cache. When /bar/?realm_mode=pure is loaded, cached user data is attached as DATA.config.user before passing to elm.
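A sketch of the pluck and attach steps. createUserStore and its method names are hypothetical, and only the in-memory half is shown; a real version would also write the user data to the browser cache.

```javascript
// Hypothetical in-memory store for the user part of DATA.config.
function createUserStore() {
    let user = null;
    return {
        // Pluck DATA.config.user from an ised response and remember it.
        pluck(isedData) {
            if (isedData && isedData.config && "user" in isedData.config) {
                user = isedData.config.user;
            }
            return user;
        },
        // Return a copy of a pure response with the cached user attached,
        // ready to be passed to elm. The input is not modified.
        attach(pureData) {
            return {
                ...pureData,
                config: { ...(pureData.config || {}), user: user },
            };
        },
    };
}
```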

It is recommended to further use server push technology to update user data when it changes on the server.

Cloudflare Worker

With the Cloudflare worker (exclude: /static/*, exclude: *?realm_mode=ised*) we capture requests to all URLs, and convert them to realm_mode=pure requests. We further embed the latest JS version and the JSON schema_version (also make the JSON schema_version part of REALM_DATA). The app answers PURE requests with indefinite caching (only if the app has a facility to trigger a CF cache purge when the URL content changes), but the worker strips out the cache header, so that the browser always consults our CF worker. The worker also has the HTML template embedded, so it returns either HTML or JSON as needed. The HTML template no longer embeds the JS version; the JS version is embedded in the Cloudflare worker (meaning the worker is redeployed on every JS version change).

When the browser gets the first page, it gets the pure version (hopefully already cached) from CF, along with JS (either embedded in the HTML or via a script src).
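The URL rewriting part of the worker could be sketched as below. toPureUrl is a hypothetical name, and the two exclude patterns are approximated with a path prefix check and an exact query parameter check.

```javascript
// Hypothetical helper: turn an incoming URL into its realm_mode=pure form.
// Returns null for URLs the worker is supposed to leave alone.
function toPureUrl(rawUrl) {
    const url = new URL(rawUrl);
    if (url.pathname.startsWith("/static/")) {
        return null; // exclude: /static/*
    }
    if (url.searchParams.get("realm_mode") === "ised") {
        return null; // exclude: *?realm_mode=ised*
    }
    url.searchParams.set("realm_mode", "pure");
    return url.toString();
}
```

The same conversion would apply in the browser side service worker, which also turns page loads into ?realm_mode=pure requests.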

If the browser does not have service worker, all subsequent page loads (browser UI triggered, not navigate from within app), hit CF (if no net, and no service worker, we can’t do better: for now trying to rely on the douchbag).

Subsequent requests, if the service worker is installed, would be converted by it to ?realm_mode=pure requests, which return JSON that the service worker will convert to HTML and pass to the browser.

On page load, Realm.js sees that it is loading a pure response, and makes another realm_mode=json request to get data from the server, after showing the stale version in the browser (after plugging in the latest user data).

JS Versioning 1: Deployment

On every deployment, the hash of our JS file changes. We have to ensure that we are always using the latest known JS.

With CF edge worker/backend servers, from Realm.js’s point of view, we can assume that latest REALM_DATA.hash will be present in all pure and ised versions.

Realm.js needs to know its own hash, and this is tricky: we want to purge the cache only when the JS has actually changed. Creating a guid based file name on every deploy would defeat that, as certain deploys may be backend only deploys, or even deploys that only change elm types, say, which do not lead to JS changes. One way would be to keep the last hash, compute the new hash, and only if the hash has changed create a new guid, embed the guid in the JS, and create a guid based URL. Since we are doing content checks, and only generating a new guid when the non guid portion of the content has changed, we get optimal caching of JS files.

So on every deploy we have a hash, which is embedded in every cached JSON and in Realm.js. Next we need the hash to be comparable (we want to know which hash is newer if we have more than one hash). Easy enough.
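A sketch of that build step, under a few assumptions: FNV-1a is used as a stand-in for whatever content hash the build really uses, a counter (seq) is one way to make releases comparable, and the URL scheme, nextRelease and isNewer are all hypothetical.

```javascript
// Non-cryptographic FNV-1a hash over UTF-16 code units, as a stand-in.
function fnv1a(text) {
    let h = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        h ^= text.charCodeAt(i);
        h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h.toString(16).padStart(8, "0");
}

// `previous` is the last release record (or null), `js` is the generated JS
// without any guid in it, and `newGuid` creates a fresh identifier.
function nextRelease(previous, js, newGuid) {
    const hash = fnv1a(js);
    if (previous && previous.hash === hash) {
        return previous; // JS unchanged: keep guid, URL and caches as they are
    }
    const guid = newGuid();
    return {
        hash: hash,
        guid: guid,
        seq: previous ? previous.seq + 1 : 1, // higher seq means newer
        url: "/static/elm." + guid + ".js", // hypothetical URL scheme
    };
}

function isNewer(a, b) {
    return a.seq > b.seq;
}
```

A backend only deploy then returns the previous record untouched, so nothing cached against the old guid needs to be purged.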

So what should Realm.js do when it encounters a new hash?

First let’s look at the world without a browser side service worker: all pages in pure have cache control set to immutable, so we need to reload the page. Does document.location.reload() invalidate the cache? Let’s assume yes for now.

JS Versioning 2: Realm/App Breaking Changes
