
Postgres WAL / Logical Decoding / Subscribing To DB Changes

I was wondering how realtime Postgres (e.g. Supabase, which streams all changes via websocket) could be implemented; these are my notes on what I have learnt so far.

AWS Tutorial: Send DB Changes To Kinesis

Possible Use Cases

Reliably Bootstrapping An S3 Backup, Say

Slots give us all changes since the moment they were created, but how do we get the exact state of the database at that precise moment? If we can snapshot the DB at exactly that moment and then apply all the changes from the slot, we have a guarantee that our S3 backup has 100% of the changes.

Clue 1: Postgres has a function, pg_export_snapshot():
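A minimal sketch of the two-session dance (the snapshot id shown is illustrative):

-- Session 1: export the current snapshot. It stays valid only
-- while this transaction remains open.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();
 pg_export_snapshot
---------------------
 00000003-0000001B-1
(1 row)

-- Session 2: import it, and see exactly the same state of the database.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';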

Clue 2:

When a new replication slot is created using the streaming replication interface, a snapshot is exported.

But how do we get the id of this snapshot? This is how we create the slot:

SELECT * 
FROM pg_create_logical_replication_slot(
  'regression_slot', 'test_decoding'
);
    slot_name    | xlog_position
-----------------+---------------
 regression_slot | 0/16B1970
(1 row)

Is xlog_position the snapshot id?

According to someone, it's defined as:

xlog_position corresponds to the XLOG position where logical decoding starts.

Doesn’t sound like it.
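It isn't. As far as I can tell, the snapshot name is only handed out when the slot is created over the streaming replication protocol (a connection opened with replication=database), not via the SQL function above. A sketch, with illustrative output values:

-- psql "dbname=test replication=database"
CREATE_REPLICATION_SLOT regression_slot LOGICAL test_decoding;
    slot_name    | consistent_point |    snapshot_name    | output_plugin
-----------------+------------------+---------------------+---------------
 regression_slot | 0/16B1970        | 00000003-0000001B-1 | test_decoding
(1 row)

On a normal connection you can then SET TRANSACTION SNAPSHOT '00000003-0000001B-1' (pg_dump even takes a --snapshot flag), dump exactly the state the slot starts from, ship that to S3, and apply the slot's changes on top: 100% of changes, no gap, no overlap.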

Another question: is there a way to not actually consume the message on get() until the client has written it safely? Otherwise we can have data loss.
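At least for the SQL interface, yes: there is a peek variant that returns changes without consuming them, so the client can ack only after a safe write. A sketch:

-- Returns changes but does NOT advance the slot; safe to call repeatedly.
SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);

-- After the client has safely persisted them, consume (this advances
-- the slot, letting the server discard the corresponding WAL).
SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);

Over the streaming replication protocol the same effect comes from the client only acknowledging LSNs it has actually flushed, via its standby status update messages.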

Output Plugin Design Mistake

Seems to me PostgreSQL made a mistake in its design.

The format in which those changes are streamed is determined by the output plugin used.

This is bad because it requires you to install an output plugin on the server. Installing a plugin on the server has two problems.

  1. Software on server: We don’t want to, and sometimes cannot (eg RDS), install stuff on the server. You can have only one version of any plugin, and so on. RDS, for example, only supports wal2json, which is a suboptimal choice to begin with: it should have been some binary format. Plugins can be insecure, are written in C, etc. Just a complete mess in my opinion.
  2. Data volume: The plugin's output is stored on the server and transferred over the network, and we want that volume to be as small as possible. But if you want JSON output and use a JSON plugin, then JSON is what sits on the server's hard disc and travels over the network. It's very bad.

What they should have done is pick one streaming-compressible binary format and allow each client to convert it to whatever it wanted.
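To make this concrete, here is roughly what a single INSERT comes out as with the test_decoding plugin (values illustrative); with wal2json the same change becomes a much more verbose JSON document, and that verbose form is exactly what sits on disc and crosses the network:

SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
    lsn    | xid |                          data
-----------+-----+----------------------------------------------------------
 0/16D30F8 | 691 | BEGIN 691
 0/16D32A0 | 691 | table public.data: INSERT: id[integer]:3 data[text]:'3'
 0/16D32D0 | 691 | COMMIT 691
(3 rows)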

To Research: Correlating DB Changes With Request/Scripts

It would be helpful to have a way to know what changed and why, say which web request/script/cron job deleted something. Can we do it if every client co-operated (and can we investigate changes from non-cooperating clients)?
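One co-operative approach that might work: pg_logical_emit_message() writes an arbitrary message into the WAL, and it shows up in the decoded stream (if the output plugin supports messages). Each co-operating client emits its request/script id inside the transaction; any change in the stream without such a marker came from a non-cooperating client and can be flagged for investigation. A sketch, where the 'audit' prefix and the payload format are made-up conventions:

BEGIN;
-- transactional = true: the message is decoded in commit order,
-- alongside this transaction's own changes.
SELECT pg_logical_emit_message(true, 'audit', 'request_id=abc123 source=cleanup_cron');
DELETE FROM users WHERE last_seen < now() - interval '2 years';
COMMIT;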
