Ably’s LiveSync product is now available as a public alpha release. LiveSync allows you to automatically keep data in sync across servers and clients, in realtime.
This is a hard distributed systems problem. In this post, we’ll look at four existing solutions to realtime data synchronization, their technical designs, and the tradeoffs they make. Then, at the end of the post, I’ll describe the design choices we’ve made at Ably for our LiveSync product.
But first, what are others up to?
1. Supabase
Supabase has a feature called Postgres Changes, which is part of their realtime product offering. It allows you to listen to changes that happen in a single table, and send those changes to clients that are allowed to receive them.
As you INSERT, UPDATE, and DELETE records in your Postgres instance, Supabase receives these changes via the Postgres streaming replication protocol and sends them to subscribed clients over WebSockets.
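For example, subscribing to changes with the supabase-js client looks roughly like this (the `todos` table and channel name are illustrative):

```ts
import { createClient } from '@supabase/supabase-js'

// Assumes your Supabase project URL and anon key are in the environment.
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!)

supabase
  .channel('todos-changes')
  .on(
    'postgres_changes',
    { event: '*', schema: 'public', table: 'todos' },
    (payload) => {
      // payload.eventType is 'INSERT' | 'UPDATE' | 'DELETE'
      console.log('change:', payload)
    }
  )
  .subscribe()
```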
Supabase also bundles access control through Row Level Security (RLS). Clients get individual tokens, and Postgres row security policies determine which rows in a table each token can access. The Postgres Changes feature integrates with RLS.
Things to consider:
- Supabase WebSockets do not have history or guaranteed delivery: you’re not guaranteed to receive every change made to your database, and you can’t ‘rewind’ to recover changes if you notice you’ve missed them. Even a brief client disconnection can drop messages, and protecting against this takes a significant amount of work.
- Mis-managed replication slots can tank your whole database: The architecture behind Supabase Postgres Changes involves two components: your Postgres server and a Realtime server. The Realtime server consumes database changes from a Postgres replication slot. The slot retains every change until its consumer has processed it, at which point Postgres can delete that data. If the Realtime server can’t consume from the slot as fast as Postgres writes to it, the data in the slot grows without bound. The replication slot shares disk space with your database, so it can eventually consume all of the available disk. Once this happens, Postgres will reject new connections and new writes to the database.
- RLS adds a lot of queries to your database when using Postgres Changes for fan-out: The Supabase RLS feature delegates access control to Postgres’s built-in row security policies. In a typical request-response query, a single user queries the database and Postgres checks whether that user’s auth token permits access to the specific rows involved. However, one of the common uses of the Postgres Changes feature is to fan out database changes to many clients over WebSockets. To enforce RLS on these fan-out changes, Supabase re-queries the changed row using each client’s auth token. That means if you have 1000 clients listening for changes on a table, a single INSERT, UPDATE, or DELETE will cause Supabase to issue 1000 queries against the database (one per client) to fetch the changed row and check whether access is allowed.
- Postgres Changes only works for a single table at a time: That is, you can subscribe to multiple tables, but events in different tables are distinct from one another. The data your clients need is probably spread across a set of normalized tables, and probably relies on joins to produce the entity the client uses. However, Supabase will only send the changes happening to individual tables. One option is to implement logic in the client that materializes the result of a query from the stream of per-table changes, which is exceedingly difficult to get right. A simpler solution, sometimes known as a poke/pull pattern, is to use Postgres Changes to fan out a change notification to clients, who then individually re-query the database for the new data (sketched below). This causes all your clients to request data at the same time, creating a spike in request load.
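Continuing with the client from the snippet above, a minimal poke/pull sketch might look like this (the `projects`/`issues` schema is illustrative):

```ts
// Illustrative stub: replace with your own UI update logic.
function render(data: unknown) { /* update your UI */ }

// Poke: treat the change event purely as a signal that something changed...
supabase
  .channel('project-pokes')
  .on(
    'postgres_changes',
    { event: '*', schema: 'public', table: 'issues' },
    async () => {
      // ...pull: re-query the joined entity the client actually renders.
      // Every subscribed client issues this query at roughly the same time,
      // which is what produces the spike in request load.
      const { data } = await supabase
        .from('projects')
        .select('id, name, issues ( id, title, status )')
      render(data)
    }
  )
  .subscribe()
```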
2. Liveblocks
Liveblocks’ products are based on a series of CRDT building blocks (LiveObject, LiveMap, LiveList). These objects are backed by some opaque storage managed by Liveblocks. The changes are shared between the server and clients over WebSockets, which handle automatic reconnection and fetching older data. Thanks to the properties of the underlying CRDT, the order of message delivery doesn’t matter.
By building on CRDTs, Liveblocks ensures a high level of data integrity for multiplayer, collaborative applications where multiple users make changes to a shared object in realtime.
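As a rough sketch of what working with these building blocks looks like with @liveblocks/client (API details vary between versions; the room name, key, and storage shape are illustrative):

```ts
import { createClient, LiveObject, LiveList } from '@liveblocks/client'

const client = createClient({ publicApiKey: 'pk_...' }) // illustrative key

// Enter a room with CRDT-backed initial storage.
const { room } = client.enterRoom('my-room', {
  initialPresence: {},
  initialStorage: { scene: new LiveObject({ title: 'Untitled', shapes: new LiveList([]) }) },
})

const { root } = await room.getStorage()
const scene = root.get('scene') as LiveObject<{ title: string }>

// Concurrent set() calls from different clients converge without
// coordination, regardless of the order messages are delivered in.
scene.set('title', 'My drawing')
```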
Things to consider:
- Liveblocks stores and controls your data, but getting a copy to your server is hard: There is no mechanism by which Liveblocks can push changes into your datastore. To reflect a copy of the Liveblocks data into your own server or database, you can pull the state of the object via the Storage API, in conjunction with a Storage Updated Event webhook that fires at most once every 5 seconds when changes are made (see the sketch after this list). Due to the additional network round trips and throttled change notifications, there may be significant latency in reflecting changes from Liveblocks to your own servers.
- Source-of-truth: To implement the CRDT building blocks, Liveblocks maintains control over the source-of-truth copy of your data. Many applications expect their own database to be the source of truth in order to support the rest of their functionality, so Liveblocks is best suited to the truly collaborative parts of your application state, where eventual consistency with your database is acceptable.
- Limited cross-platform support: Liveblocks ships a set of client libraries that allow you to interact with the underlying CRDT storage primitives. The libraries are open-source, but don’t cover a broad set of languages and currently only target the JavaScript ecosystem.
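Returning to the storage pull-out problem above, a webhook handler that pulls the latest storage document might look like this (the endpoint path and event shape are assumptions based on Liveblocks’ v2 REST API; webhook signature verification is omitted for brevity):

```ts
import express from 'express'

const app = express()
app.use(express.json())

// Illustrative persistence stub: replace with your own database write.
async function saveCopy(roomId: string, doc: unknown) { /* persist a copy */ }

app.post('/liveblocks-webhook', async (req, res) => {
  // Fires at most once every 5 seconds per room while storage is changing.
  if (req.body.type === 'storageUpdated') {
    const { roomId } = req.body.data
    const resp = await fetch(`https://api.liveblocks.io/v2/rooms/${roomId}/storage`, {
      headers: { Authorization: `Bearer ${process.env.LIVEBLOCKS_SECRET_KEY}` },
    })
    await saveCopy(roomId, await resp.json())
  }
  res.sendStatus(200)
})

app.listen(3000)
```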
3. ElectricSQL
ElectricSQL provides bi-directional database sync between a Postgres server and SQLite databases running on clients. Changes made in your Postgres database are fanned out to the clients, and changes made in each client’s SQLite database are reflected back into the Postgres server. ElectricSQL makes this work with a set of client libraries and a backend component, the Electric Sync Service, that runs against your Postgres server. Like Supabase Postgres Changes, the Sync Service uses logical replication to receive outbound changes from the database; in addition, the database consumes inbound changes from the Sync Service. The SQLite databases on the clients are kept in sync with the Sync Service over WebSockets.
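A rough sketch of the client side, based on my reading of ElectricSQL’s client API at the time of writing (entrypoints and signatures may differ between versions; the `projects`/`issues` schema and Sync Service URL are illustrative):

```ts
import Database from 'better-sqlite3'
import { electrify } from 'electric-sql/node'
import { schema } from './generated/client' // generated from your Postgres schema

// Open a local SQLite database and "electrify" it so the Sync Service
// can replicate changes in and out.
const conn = new Database('local.db')
const { db } = await electrify(conn, schema, {
  url: 'http://localhost:5133', // your Electric Sync Service (illustrative)
})

// Sync a shape: projects plus their one-to-many issues relation.
const shape = await db.projects.sync({ include: { issues: true } })
await shape.synced // resolves once the initial shape data has arrived

// Reads and writes go against local SQLite; writes replicate back to Postgres.
const projects = await db.projects.findMany({ where: { status: 'active' } })
```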
ElectricSQL offers a novel approach to data synchronization and provides some powerful tooling for building local-first applications.
Things to consider:
- Tight coupling between the Sync Service and your database: ElectricSQL’s design means that the Sync Service is quite tightly coupled to your database. An implication of this, for example, is that migrations have to be applied to your database via the Migrations Proxy.
- Limited query language expressivity: ElectricSQL queries are table-scoped, which limits the complexity of the queries you can write.
- Sync lags for one-to-many relations: ElectricSQL queries allow you to request data from related tables, but only many-to-one relations are synced at the same time as the parent row. Other relations can be ‘included’, but will be synced some time later; ElectricSQL calls this ‘move-in lag’. For example, in an app with projects and issues, if you sync ‘active’ projects and a project becomes active, the project row (and many-to-one relations like the project author) will be synced immediately, but the issues (one-to-many) won’t immediately appear. Your app would show a project with zero issues until the issues are synced by ElectricSQL.
- Conflict resolution semantics: Local writes are merged according to the semantics of the CRDTs used under the hood, so concurrent updates may be overridden by the conflict resolution logic in ways that don’t preserve the original intent of the user’s operation.
- There’s no permissions control: At the time of writing, ElectricSQL doesn’t have an access control or permissions system, although it is on the roadmap.
4. Replicache
Replicache is a client library that can be integrated with your own backend to synchronize database changes to your frontend clients. It operates on the principle of Server Reconciliation, an approach which delegates the resolution of conflicting concurrent changes to the database (or server) such that all clients reflect the authoritative state on the server, rather than relying on the properties of CRDTs as Liveblocks and ElectricSQL do.
To make Server Reconciliation work, clients send the server the change they want to make, rather than the result of a change they have already made (as you might in a REST API). The request to the server could be “increment by 1” rather than the result of the increment operation the client applied to its own local state. The server applies that change, then sends a change notification (or “poke”) to clients over a WebSocket. On receipt of the poke, each client must request any changes it hasn’t yet received from the server. To fulfill this request, the server must compute the set of changes on a per-client basis so that each client receives only the data it is missing.
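For example, with the Replicache client library a mutation expresses intent (the endpoint paths, license key, and counter schema here are placeholders):

```ts
import { Replicache, WriteTransaction } from 'replicache'

const rep = new Replicache({
  name: 'user-42',                 // one local cache per user
  licenseKey: 'l-...',             // placeholder
  pushURL: '/api/replicache/push', // your backend endpoints
  pullURL: '/api/replicache/pull',
  mutators: {
    // The client sends the intent ("increment by delta"), not the result.
    // This mutator runs optimistically here; your server implements the
    // same named mutation authoritatively when the push arrives.
    async increment(tx: WriteTransaction, args: { key: string; delta: number }) {
      const current = ((await tx.get(args.key)) as number | undefined) ?? 0
      await tx.put(args.key, current + args.delta)
    },
  },
})

await rep.mutate.increment({ key: 'counter', delta: 1 })
```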
Replicache has a hosted solution called Reflect that takes away the significant complexity of building and managing the associated backend.
Things to consider:
- Backend complexity: There is significant complexity in correctly implementing your backend in order to satisfy the requirements of the client library. Depending on the needs of your application, Replicache offers guidance for the different strategies you can implement to enable server-client synchronization.
- Storing the complete change history: To send each client the specific set of changes it hasn’t received, the backend needs to store the full change history. There’s no guarantee how up to date any specific client is, so the full history is required to catch lagging clients back up. For any reasonably fast-changing entity, the server will very quickly accumulate a lot of changes it has to store. Additionally, some strategies require soft deletes, meaning entities can never be fully removed from your database.
- Storing per-client state: Depending on the backend strategy used, the backend may need to store a pointer for each client representing where in the change history that client has got to, which is used to catch up clients that have fallen behind. The result is that Replicache’s design requires you to store a significant amount of extra state in your database.
- Thundering herd: On delivery of a poke to connected clients, all clients must request their missing changes from the backend. This creates a spike in load on your database and server which cannot be cached as each client requires a customized response based on where it is in the change history.
In each of these designs there are a series of technology and design trade-offs. Each one will be better or worse depending on the use case it’s applied to. With LiveSync, we’re trying to make the best choices for the most use cases, minimizing developer effort and offering a high degree of flexibility and data consistency.
Ably LiveSync
LiveSync allows you to reliably synchronize changes that occur in your database to frontend clients. LiveSync is a flexible solution designed to work with your database of choice and supports incremental adoption in existing applications.
LiveSync borrows concepts from event-driven architectures to offer a flexible and reliable approach to data synchronization. It is built on top of Ably Pub/Sub Channels, our highly available, low latency edge messaging product, so it benefits from automatic connection recovery as well as features like rewind and history that make sure you don’t miss messages. You can also leverage Ably capabilities to control which data changes clients can access.
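To make the underlying channel mechanics concrete, here is a minimal ably-js sketch that attaches to a channel with rewind so recent messages are replayed on attach (the channel name is illustrative):

```ts
import * as Ably from 'ably'

const ably = new Ably.Realtime({ key: process.env.ABLY_API_KEY! })

// Request that the last 10 messages be replayed when we attach.
const channel = ably.channels.get('my-model:changes', { params: { rewind: '10' } })

await channel.subscribe((message) => {
  console.log('change event:', message.name, message.data)
})
```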
LiveSync provides a client library called the Models SDK, which makes it easy to define data models in your frontend that are kept up to date with changes made on your backend. These changes are distributed to clients via the Database Connector, which is available as a hosted solution on the Ably platform or as a container image you can operate yourself. The Models SDK helps you merge those changes and provides hooks for optimistic updates, allowing for a snappy user interface.
Synchronizing changes
LiveSync uses Ably Pub/Sub Channels to deliver the stream of changes from the database to connected clients. Each change is assigned a number that describes its position in the sequence of changes, called a sequenceId. Ably stores a history of the stream of changes on the channel so your backend doesn’t need to. The Models SDK tracks the sequence of changes the client has received and automatically rewinds on the channel to ensure that it doesn’t miss any changes.
Applications built on LiveSync make use of the transactional outbox pattern to reliably deliver an ordered sequence of changes from your database to clients. When you make a change in your database, you transactionally write an event to the outbox representing that change. You choose the content of the event, and the change will be reliably fanned out in-order to all clients and merged with your frontend application’s state via the Models SDK. Once the event is successfully published to a channel, the Database Connector will clean up the data it no longer needs from the outbox table, so your database doesn’t have to store change history or extraneous data.
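For instance, a write in a Node backend using pg might look like this (the table and column names are illustrative; use the outbox schema expected by your Database Connector configuration):

```ts
import { Pool } from 'pg'

const pool = new Pool() // connection settings from the standard PG* env vars

async function completeTask(taskId: string) {
  const client = await pool.connect()
  try {
    await client.query('BEGIN')
    // 1. The actual change to your own tables.
    await client.query('UPDATE tasks SET status = $1 WHERE id = $2', ['done', taskId])
    // 2. The change event, written in the same transaction. Its serial
    //    primary key becomes the sequenceId.
    await client.query(
      'INSERT INTO outbox (channel, name, data) VALUES ($1, $2, $3)',
      ['tasks', 'taskCompleted', JSON.stringify({ id: taskId, status: 'done' })]
    )
    await client.query('COMMIT')
  } catch (err) {
    await client.query('ROLLBACK')
    throw err
  } finally {
    client.release()
  }
}
```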
The outbox table uses a monotonic serial number to assign each outbox event a sequenceId. Since the event published to the outbox is written transactionally with the changes to your own tables, all the changes end up with an exact, total order defined by the sequenceId. The Database Connector and Models SDK ensure that your frontend clients process these changes exactly once, in the correct order.
You can integrate LiveSync with your existing backend API by extending your database writes to transactionally write change events to the outbox table. Since you determine what data is included on the change event and how it is interpreted by the client, you can make changes to models that span multiple tables and represent that data however you like on the client.
Optimistic updates
In LiveSync, your database is the source-of-truth: in effect, LiveSync is implementing a form of Server Reconciliation in which the server decides the resulting state of some updates and fans those changes out to clients.
There can be a short delay between a client making a change and receiving the associated change event from the backend. To make this easy to work with, the Models SDK supports optimistic updates: the frontend client can apply an update optimistically, and the result is immediately reflected on that client only. When the update is confirmed by the backend, the optimistic update is replaced with the confirmed result delivered to the client. If the backend rejects the update, the optimistic update is automatically rolled back on the client that requested it.
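Conceptually, the flow looks like this (a simplified sketch of the pattern, not the Models SDK API):

```ts
type ChangeEvent = { name: string; data: unknown }

// applyLocal applies the event to local state and returns a rollback function.
// callBackend performs the authoritative write, which also writes the outbox
// event that comes back over the channel as confirmation.
async function updateOptimistically(
  applyLocal: (event: ChangeEvent) => () => void,
  callBackend: () => Promise<void>,
  event: ChangeEvent,
): Promise<void> {
  const rollback = applyLocal(event) // reflected immediately, on this client only
  try {
    await callBackend() // the confirmed event replaces the optimistic one
  } catch {
    rollback() // backend rejected the update: undo the optimistic change
  }
}
```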
Bootstrap
The final piece of the puzzle is bootstrapping a brand new client from scratch. You probably already have a REST API that you use to fetch an entity from your backend. You can use your existing endpoint to fetch the data used to initialize the model in the frontend. This endpoint can be extended to return one additional piece of information: the sequenceId. The Models SDK automatically uses the sequenceId to rewind to the correct position in the change stream on the Ably channel, so that your frontend receives exactly the changes it’s missing and no more.
You don’t have to manage the sequenceId yourself, because it’s assigned automatically by the monotonic serial on the outbox table. You can obtain it trivially by transactionally reading the largest sequenceId currently in the outbox table and returning it along with your model data.
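An illustrative Express endpoint (the table and column names are assumptions):

```ts
import express from 'express'
import { Pool } from 'pg'

const app = express()
const pool = new Pool()

app.get('/api/projects/:id', async (req, res) => {
  const client = await pool.connect()
  try {
    // REPEATABLE READ gives a consistent snapshot, so the returned
    // sequenceId matches the returned model state exactly.
    await client.query('BEGIN ISOLATION LEVEL REPEATABLE READ')
    const { rows } = await client.query('SELECT * FROM projects WHERE id = $1', [req.params.id])
    const { rows: [seq] } = await client.query(
      'SELECT COALESCE(MAX(sequence_id), 0) AS seq FROM outbox'
    )
    await client.query('COMMIT')
    res.json({ data: rows[0], sequenceId: seq.seq })
  } finally {
    client.release()
  }
})

app.listen(3000)
```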
Conclusion
LiveSync delivers changes in your database to clients in realtime over WebSockets, reliably and in-order, over Ably’s global edge messaging infrastructure. It can be integrated with your existing application and database. Ably channel history allows you to avoid storing change history or per-client state in your database. Your database remains the source-of-truth for your data.
LiveSync is now available in public alpha. If you would like to try it out, please check out the documentation. If you have any feedback, please get in touch.