I love Litestream and use it in every app I build now.
They advertise it as costing "pennies per day," but it's even less expensive than that. It obviously varies depending on how much storage you need, but I had a real app in production, and Litestream replication to S3 only cost me 2-3 cents ($0.02-$0.03) per month.[0]
[0] https://mtlynch.io/litestream/#using-logpaste-in-production
> In Litestream, we’re solving the problem a different way. Modern object stores like S3 and Tigris solve this problem for us: they now offer conditional write support. With conditional writes, we can implement a time-based lease. We get essentially the same constraint Consul gave us, but without having to think about it or set up a dependency.
Reading this blog post though, I couldn't see any reference to this. Is this supported in Litestream v0.5.0, or will it be for a future release?
https://fly.io/blog/litestream-revamped/ https://news.ycombinator.com/item?id=44045292
So far I’ve stuck with the SFTP solution, since I don’t use any of the cloud object storage services that are hardcoded into the tool.[^2]
Big thanks to the developers.
How does Litestream handle backing up over a spotty connection, and can we consolidate the backups into a central DB and query against it?
Will Litestream freak out about the database being replaced underneath it?
Will I still be able to restore old versions of the DB?
I wish they'd put a bit more effort into the DX here, but it probably doesn't make much sense from a biz PoV, since big customers aren't going to be running these kinds of workloads.
Curious if anybody here is deploying SQLite apps to production and what host they're using?
> But the market has spoken! Users prefer Litestream. And honestly, we get it: Litestream is easier to run and to reason about. So we’ve shifted our focus back to it.
I once was responsible for migrating a legacy business app to Azure, and the app had a local MSSQL server co-running with the app (the same pattern that Litestream is using).
As has been mentioned below, the app had been developed assuming local access (and thus <1ms latency), so it had a ton of N+1 queries everywhere.
This made it almost impossible to migrate/transition to another configuration.
So, if this style of app hosting doesn't take off and you're at all worried about it being a storage dead end once you reach a certain scale, I'd recommend not doing this; otherwise your options will be very limited.
Then again - I bet you could get very very far on a single box, so maybe it'd be a non factor! :)
Such a system would also require a side channel propagating WAL updates (over Kafka or similar) to replicas, so that the running replicas can update themselves incrementally and stay fresh without loading anything from object storage.
The NATS Jetstream use case is something I'm curious about.
Cheers and keep up the great work on Litestream.
Every time I deploy something, it spins up 2 instances that are in some suspended state. I have to restart them like 3 times before they actually boot? And why can I never just pick one instance when launching an app.
Apps will randomly go into a suspended state, without explaining why. Contacting support says they ran out of capacity, but didn't automatically start them back when capacity was available?! That's the whole point of Apps (vs Machines), you keep my app running...
Fly is set up to be the best compute provider, but there are too many reliability and ergonomics issues.
Please stop updating flyctl every time I go to deploy an app.
Only fuss I remember encountering was fighting with Rails to migrate Solid Queue properly, but this seemed like a Rails issue, and I don't remember it being a Fly issue.
I’ve been contemplating migrating my pg primary to SQLite too. Anyways don’t have much else to offer other than an anecdote that I’m happily using fly with partial SQLite.
So I would treat sqlite3_rsync as more of a demo than a stable product right now.
Litestream provides near-real-time offsite replication and point in time recovery, which sqlite3_rsync won't do on its own. You could probably build a litestream-like product based on sqlite3_rsync but it probably won't be as fast or as efficient on storage.
https://litestream.io/guides/s3/
I think this is also roughly what Turso is, although it's becoming a SQLite-compatible db rather than vanilla
https://docs.turso.tech/features/embedded-replicas/introduct...
Litestream is working on that now - the code is already in https://github.com/benbjohnson/litestream/blob/v0.5.0/vfs.go but it's not yet a working, documented feature.
They already have a prototype, and... it's pretty rough around the edges.
I'm porting it to my Go SQLite driver and already ran into a bunch of issues. But it seems at least feasible to get it into a working shape.
https://github.com/benbjohnson/litestream/issues/772
https://github.com/ncruces/go-sqlite3/compare/main...litestr...
https://developers.cloudflare.com/d1/best-practices/read-rep...
What's the Fly.io issue here? Aren't the issues you're describing in Rails, not Fly.io?
I run several Go apps in production on Fly.io[0, 1, 2] and I've never had an issue with the Fly.io + SQLite part of it.
SQLite + Litestream makes things slightly more complicated, but again, I've never had issues with Fly.io specifically making that harder. I use a custom launch script in my Dockerfile that starts litestream with the -exec flag to start my app as a child process.[3]
[0] https://github.com/mtlynch/logpaste
[1] https://github.com/mtlynch/picoshare
[2] https://github.com/mtlynch/screenjournal
[3] https://github.com/mtlynch/logpaste/blob/0.3.1/docker-entryp...
Their managed Postgres has gotten better, but it's still a little sparse, so after about 6 months using it I am going to just take my DB to either Supabase or PlanetScale.
With single instances topping out at 20+ TB of RAM and hundreds of cores, I think this is likely very under-explored as an option.
Even more if you combine this with cell-based architecture, splitting on users / tenants instead of splitting the service itself.
It was a Rails app, therefore easy to get into the N+1 but also somewhat easy to fix.
It’s certainly not intuitive. It would be awesome if they sweat these details, but their deal is “here’s a bag of sharp knives”, which is good for some use cases.
It's not its fault. :)
I would probably run both litestream and full backups, to get extra safety
Turso looks cool and is a project I will keep an eye on, but its replica mode seems designed for apps rather than as a mechanism to rapidly scale out read replicas in a server cluster. Also, the website warns that it's not ready for production use.
So this fills that gap by giving you a database as a service level of QOL without needing to provision a database as a service backend. Otherwise you're dicking about maintaining a service with all that comes with that (provisioning, updating, etc) when really all you need is a file that is automagically backed up or placed somewhere on the web to avoid the drawbacks of the local file system.
At the end of the day, with Litestream, when you respond to a client with a successful write, you are only guaranteeing a replication factor of 1.
This is why I prefer to handle backups in a side container, eg: https://github.com/atrakic/gin-sqlite/blob/main/compose.yml
As a side note, you might consider revisiting your dockerfiles and skip litestream build steps, eg. in your final stage just add line like this:
COPY --from=litestream/litestream /usr/local/bin/litestream /bin/litestream
But most apps should just use a classic n-tier database architecture like Postgres. We mostly do too (though Litestream does back some stuff here like our token system).
If you have access to a database that is well managed on your behalf I would definitely still go with that for many usecases.
But I'd love to hear more from someone better versed in the use cases for reliable SQLite.
In every case where I had a SQLite vertical that required resilience, the customer simply configured the block storage device for periodic snapshots. Litestream is approximately the same idea, except you get block device snapshots implicitly as part of being in the cloud. There is no extra machinery to worry about and you won't forget about a path/file/etc.
Also, streaming replication to S3 is not that valuable an idea to me when we consider the recovery story. All other solutions support hot & ready replicas within seconds.
stories = get_stories(query)
which results in a SQL query like SELECT id FROM stories WHERE author = ?
with the '?' being bound to some concrete value like "Jim". Then, the framework will be used to do something like this:
for id in stories {
story = get_story_by_id(id)
// do something with story
}
which results in N SQL queries with SELECT title, author, date, content FROM stories WHERE id = ?
and there's your N+1. It's usually a big problem for database performance because each query carries additional overhead for the network round trip to the database server.
SQLite queries are effectively a C function call accessing data on local disk so this is much less of an issue - there's an article about that in the SQLite docs here: https://www.sqlite.org/np1queryprob.html
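To make the pattern concrete, here's a minimal sketch using Python's built-in sqlite3 module with a hypothetical `stories` schema matching the example above. Both approaches return the same rows, but the loop issues one query per id:

```python
import sqlite3

# Hypothetical schema matching the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (
        id INTEGER PRIMARY KEY,
        title TEXT, author TEXT, date TEXT, content TEXT
    );
    INSERT INTO stories (title, author, date, content) VALUES
        ('First',  'Jim', '2024-01-01', '...'),
        ('Second', 'Jim', '2024-01-02', '...'),
        ('Third',  'Ann', '2024-01-03', '...');
""")

# N+1: one query for the ids, then one query per story.
ids = [r[0] for r in conn.execute(
    "SELECT id FROM stories WHERE author = ? ORDER BY id", ("Jim",))]
n_plus_1 = [conn.execute(
    "SELECT title, author, date, content FROM stories WHERE id = ?",
    (i,)).fetchone() for i in ids]          # 1 + N queries

# Fix: fetch everything in a single query.
single = conn.execute(
    "SELECT title, author, date, content FROM stories "
    "WHERE author = ? ORDER BY id", ("Jim",)).fetchall()  # 1 query
```

Over a network, each of those N round trips costs real latency; with SQLite's in-process API the per-query overhead is tiny, which is the point the linked article makes.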
We ran into this while building, funnily enough, a database management app called DB Pro (https://dbpro.app). At first we were doing exactly that: query for all schemas, then for each schema query its tables, and then for each table query its columns. On a database with hundreds of tables it took ~3.8s.
We fixed it by flipping the approach: query all the schemas, then all the tables, then all the columns in one go, and join them in memory. That dropped the load time to ~180ms.
N+1 is one of those things you only really “get” when you hit it in practice.
A naive ORM setup will often end up doing a 1 query to get a list of object it needs, and then perform N queries, one per object, usually fetching each object individually by ID or key.
So for example, if you wanted to see “all TVs by Samsung” on a consumer site, it would do 1 query to figure out the set of items that match, and then if say 200 items matched, it would do 200 queries to get those individual items.
ORMs are better at avoiding it these days, depending on the ORM or language, but it still can happen.
In my opinion, if you have an easy way to run Postgres, MySQL, etc., just run that.
There are usually a lot of quirks in the details of DB usage, even when it doesn't immediately seem like it (got bitten by this a few times): features not supported, different semantics, ...
IMO every project has an "experimental stuff" budget, and if you go over it, the project is too broken to recover; for most projects there's just not that much to win by spending that budget on a new database thing.
What exactly are you trying to port?
>But the market has spoken! Users prefer Litestream.
>Yeah, we didn’t like those movies much either.
Where are you seeing a 31 MB binary? Latest releases are 10-11 MB.[0]
>This why I prefer to take backup stuff in a side container, eg: https://github.com/atrakic/gin-sqlite/blob/main/compose.yml
Yeah, I agree that's cleaner, but once you're dealing with multiple containers, it's a big step up in complexity, and you can't do the simple install on places like Fly.io and Lightsail.
>As a side note, you might consider revisiting your dockerfiles and skip litestream build steps, eg. in your final stage just add line like this:
>COPY --from=litestream/litestream /usr/local/bin/litestream /bin/litestream
Ah, thanks! The litestream Docker image is new as of 0.5.0.
def test_homepage_queries(django_assert_max_num_queries, client):
with django_assert_max_num_queries(10):
assert client.get("/").status_code == 200
Or django_assert_num_queries to assert an exact number. Probably some of the most valuable code I've ever written on a per-LOC basis, lol.
But anyhow, merging that into a new project was always a fun day. But on the other side of the cleanup the app stops falling down due to memory leaks.
You can install MySQL/PostgreSQL on the application server, connect over a unix socket and get the same benefits as if you'd used SQLite on the application server (no network latency, fast queries). Plus the other benefits that come from using these database servers (Postgres extensions, remote connections, standard tooling etc). I'm guessing more RAM is required on the application server than if you used SQLite but I haven't benchmarked it.
If you’re building something as a hobby project and you know it will always fit on one server, sqlite is perfect.
If it’s meant to be a startup and grow quickly, you don’t want to have to change your database to horizontally scale.
Deploying without downtime is also much easier with multiple servers. So again, it depends whether you’re doing something serious enough that you can’t tolerate dropping any requests during deploys.
Now if I'm building a local app then absolutely sqlite makes the most sense but I don't see it otherwise.
This is an interesting take; why do you see recent hype around the most boring and stone-age of technologies, SQLite?
The advantage of sqlite3_rsync is that, if on the other end is an SSH server with sqlite3_rsync, you only transfer the changed pages, and still get a “perfect” copy of the file on the other end.
The advantage of Litestream is that on the other end does not need to live a “server,” and still only changes are uploaded. If you do it continuously, you get many points to recover from, and many of those cross reference each other, saving storage space too. On the flip side, you need the tool to restore.
In particular, JSON aggregations mean you can have a single query that does things like fetch a blog entry and the earliest 10 comments in a single go. I wrote up some patterns for doing that in SQLite and PostgreSQL here: https://github.com/simonw/til/blob/main/sqlite/related-rows-...
Here's an example PostgreSQL query that does this to fetch tags for posts: https://simonwillison.net/dashboard/json-agg-example
select
blog_entry.id,
title,
slug,
created,
coalesce(json_agg(json_build_object(blog_tag.id, blog_tag.tag)) filter (
where
blog_tag.tag is not null
), json_build_array()) as tags
from
blog_entry
left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
left join blog_tag on blog_entry_tags.tag_id = blog_tag.id
group by
blog_entry.id
order by
blog_entry.id desc
The alternative, more common path is the pattern that Django calls "prefetch_related". It effectively looks like this:
select id, title, created from posts order by created desc limit 20
-- Now extract the id values from that and run:
select
blog_entry.id,
blog_tag.tag
from
blog_entry
join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
join blog_tag on blog_entry_tags.tag_id = blog_tag.id
where
blog_entry.id in (?, ?, ?, ...)
-- Now you can re-assemble the list of tags for
-- each entry in your application logic
Once you have a list of e.g. 20 IDs you can run a bunch of cheap additional queries to fetch extra data about all 20 of those items. More formally, the number of queries should be constant, not scaling linearly with the number of rows you're processing.
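The in-application re-assembly step can be sketched in a few lines of Python (stdlib sqlite3, toy data in a schema mirroring the query above): two queries total, regardless of how many entries are on the page.

```python
import sqlite3
from collections import defaultdict

# Toy schema mirroring the blog_entry / blog_tag example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_entry (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE blog_tag (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE blog_entry_tags (entry_id INTEGER, tag_id INTEGER);
    INSERT INTO blog_entry VALUES (1, 'Hello'), (2, 'World');
    INSERT INTO blog_tag VALUES (1, 'sqlite'), (2, 'python');
    INSERT INTO blog_entry_tags VALUES (1, 1), (1, 2), (2, 1);
""")

# Query 1: the page of entries.
entries = conn.execute(
    "SELECT id, title FROM blog_entry ORDER BY id DESC LIMIT 20").fetchall()
ids = [e[0] for e in entries]

# Query 2: all tags for those ids, in one IN (...) query.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(f"""
    SELECT blog_entry_tags.entry_id, blog_tag.tag
    FROM blog_entry_tags
    JOIN blog_tag ON blog_entry_tags.tag_id = blog_tag.id
    WHERE blog_entry_tags.entry_id IN ({placeholders})
""", ids).fetchall()

# Re-assemble in application logic: constant query count, not N+1.
tags_by_entry = defaultdict(list)
for entry_id, tag in rows:
    tags_by_entry[entry_id].append(tag)
```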
> If there are many client programs sending SQL to the same database over a network, then use a client/server database engine instead of SQLite.
Alternately, if you can separate query issuance from result parsing, you can make N+1 palatable. For example: do your query to get the ids, wait for the results, loop and issue the N queries, then loop and wait for the results in order. That will be two-ish round trips rather than N+1 round trips. But you have to search to find database APIs that allow that kind of separation.
[1] You can almost always express the query you want in SQL, but that doesn't mean it will have a reasonable runtime. Sometimes server side join and client side join have about the same runtime... if it's significant and you have the usual case where clients are easier to scale than database servers, it might be worthwhile to have the join run on the client to reduce server load.
In general, you want to ask the remote server once for all the data you need, then read all the results. It applies to databases as well as APIs.
Pipelined requests also solve the problem and can be more flexible.
Also, joins can be optimised in different ways. Sometimes the optimal way to do a join isn't to query each row one-by-one, but to do something like (when the rows you want are a large fraction of the rows that exist) making a bloom filter of the rows you want and then sequentially reading all the rows in the table.
For the latter, it's along the lines of `select * from posts where ...` and `select * from authors where id in {posts.map(author_id)}`. And then once it's in memory you manually work out the associations (or rely on your ORM to do it).
I love Postgres, but it is in no way as simple to run as SQLite (pretty sure even the Postgres core team would agree that Postgres is more complex than SQLite).
No, JOINs should be orders of magnitude faster.
> What are you really supposed to do if you need to, e.g. fetch a list of posts along with the number of comments on each post?
You're really supposed to do a JOIN, together with a GROUP BY and a COUNT(). This is elementary SQL.
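For the posts-with-comment-counts case, a minimal sketch against a toy schema (Python stdlib sqlite3); the LEFT JOIN keeps posts that have no comments, with a count of zero:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO posts VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO comments (post_id) VALUES (1), (1), (2);
""")

# One query: each post with its comment count.
rows = conn.execute("""
    SELECT posts.id, posts.title, COUNT(comments.id) AS n_comments
    FROM posts
    LEFT JOIN comments ON comments.post_id = posts.id
    GROUP BY posts.id
    ORDER BY posts.id
""").fetchall()
# rows == [(1, 'A', 2), (2, 'B', 1), (3, 'C', 0)]
```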
This is essentially what GraphQL does: instead of crafting super-tailored API endpoints for each of your screens, you use its query language to ask for the data you want, and it queries the DB for you and gets the data back in a single network round trip from the user's perspective.
(Not an expert, so I trust comments to correct what I got wrong)
For sure, zero-downtime deploys are easier with Kubernetes etc., but again, that's overkill for 99.99% of apps.
dpkg -i litestream.deb
systemctl enable litestream
systemctl start litestream
The fact it's so simple is my favourite thing about it. Not super sure who followed whom, but there was all of a sudden a lot of excitement.
It used to be a very common pitfall - and often not at all obvious. You’d grab a collection of objects from the ORM, process them in a loop, and everything looked fine because the objects were already rehydrated in memory.
Then later, someone would access a property on a child object inside that loop. What looked like a simple property access would silently trigger a database query. The kicker was that this could be far removed from any obvious database access, so the person causing the issue often had no idea they were generating dozens (or hundreds) of extra queries.
of course, it's all possible with custom SQL but it gets complicated quick.
It's definitely really nice though that if you do choose sqlite initially to keep things as small and simple as possible, you don't immediately need to switch databases if you want to scale.
People care about things like long-term support. Postgres 13, from 2020, is still officially supported. Litestream 0.1.0 was the first release, also from 2020, but I can't tell if it is still supported. Worrying about the maturity, stability, and support of an application database is very reasonable in risk-averse projects.
More than once I've started a project with sqlite and then had to migrate to postgres. In my experience it's because of the way sqlite has to lock the whole database file while writing to it, blocking concurrent reads - this isn't a problem in postgres. (There's WAL mode, but it still serialises all writes, and requires periodic explicit checkpointing IME)
You may also find you really want a feature postgres has, for example more advanced data types (arrays, JSON), more advanced indices (GIN inverted index on array members), replication...
SQLite is designed for one local client at a time. Client-server relational databases are designed for many clients at a time.
SQLite's "buzz" isn't new, type "sqlite" into my https://tools.simonwillison.net/hacker-news-histogram tool and you'll see interest (on HN at least) has been pretty stable since 2021.
When I saw the v0.5.0 tag, I dived into just porting it. It's just over a couple hundred lines, and I have more experience with SQLite VFSes than most, so why not.
But it's still pretty shaky.
After some testing, I expect it to be working correctly in the vast majority of cases, but the performance may disappoint. The original still needs work.
* Very flexible, but rigid deployments (can build anywhere, deploy from anywhere, and roll out deployments safely with zero downtime)
* Images don't randomly disappear (ran into this all the time with dokku and caprover)
* If something goes wrong, it heals itself as best it can
* Structured observability (i.e. logs, metrics, etc. are easy to capture, unify, and ship to places)
* Very easy to setup replicas to reduce load on services or have safe failovers
* Custom resource usage (I can give some pods more or less CPU/memory depending on scale and priority)
* Easy to self-host FOSS services (queues, dbs, observability, apps, etc.)
* Total flexibility when customizing ingress/routing. I can keep private services private and only expose public services
* Certbot can issue ssl certs instantly (always ran into issues with other self-hosting platforms)
* Tailscale Operator makes accessing services a breeze (can opt-in services one by one)
* Everything is yaml, so easy to manipulate
* Adding new services is a cake-walk - as easy as creating a new yaml file, building an image and pushing it. I'm no longer disincentivized to spin up a new codebase for something small but worthwhile, because it's easy to ship it.
All-in-all I spent many years trying "lightweight" deployment solutions (dokku, elastic beanstalk, caprover, coolify, etc.) that all came with the promise of "simple" but ended up being infinitely more of a headache to manage when things went wrong. Even something like heroku falls short because it's harder to just spin up "anything" like a stateful service or random FOSS application. Dokku was probably the best, but it always felt somewhat brittle. Caprover was okay. And coolify never got off the ground for me. Don't even get me started on elastic beanstalk. I would say the biggest downside is that managing databases is less rigid than using something like RDS, but the flip side is that my DB is far more performant and far cheaper (I own the CPU cycles! no noisy neighbors.), and I still run daily backups to external object storage.
Once you get k8s running, it kind of just works. And when I want to do something funky or experimental (like splitting AI bots to separate pods), I can go ahead and do that with ease.
I run two separate k8s "clusters" (both single node) and I kind of love it. k9s (obs. tool) is amazing. I built my own logging platform because I hated all the other ones, might release that into its own product one day (email in my profile if you're interested).
If you need to deploy it elsewhere, you just install k3s/k8s or whatever and apply the yamls (except for stateful things like db).
It also handles name resolution with service names, restarts, etc.
It's amazing.
I really wish there was a way to compose SQL so you can actually write the dumb/obvious thing and it will run a single query. I talked with a dev once who seemed to have the beginnings of a system that could do this. It leveraged async, put composable query-ish objects into a queue, kept track of which callers needed which results, merged and executed the single query, and then returned the results. Obviously far from generalizable for arbitrary queries, but it did seem to work.

Image by Annie Ruygt
I’m Ben Johnson, and I work on Litestream at Fly.io. Litestream makes it easy to build SQLite-backed full-stack applications with resilience to server failure. It’s open source, runs anywhere, and it’s easy to get started.
Litestream is the missing backup/restore system for SQLite. It runs as a sidecar process in the background, alongside unmodified SQLite applications, intercepting WAL checkpoints and streaming them to object storage in real time. Your application doesn’t even know it’s there. But if your server crashes, Litestream lets you quickly restore the database to your new hardware.
The result: you can safely build whole full-stack applications on top of SQLite.
A few months back, we announced plans for a major update to Litestream. I’m psyched to announce that the first batch of those changes are now “shipping”. Litestream is faster and now supports efficient point-in-time recovery (PITR).
I’m going to take a beat to recap Litestream and how we got here, then talk about how these changes work and what you can expect to see with them.
Litestream is one of two big SQLite things I’ve built. The other one, originally intended as a sort of sequel to Litestream, is LiteFS.
Boiled down to a sentence: LiteFS uses a FUSE filesystem to crawl further up into SQLite’s innards, using that access to perform live replication, for unmodified SQLite-backed apps.
The big deal about LiteFS for us is that it lets you do the multiregion primary/read-replica deployment people love Postgres for: reads are fast everywhere, and writes are sane and predictable. We were excited to make this possible for SQLite, too.
But the market has spoken! Users prefer Litestream. And honestly, we get it: Litestream is easier to run and to reason about. So we’ve shifted our focus back to it. First order of business: take what we learned building LiteFS and stick as much of it as we can back into Litestream.
Consider this basic SQL table:
CREATE TABLE sandwiches (
id INTEGER PRIMARY KEY AUTOINCREMENT,
description TEXT NOT NULL,
star_rating INTEGER,
reviewer_id INTEGER NOT NULL
);
In our hypothetical, this table backs a wildly popular sandwich-reviewing app that we keep trying to get someone to write. People eat a lot of sandwiches and this table gets a lot of writes. Because it makes my point even better and it’s funny, assume people dither a lot about their sandwich review for the first couple minutes after they leave it. This Quiznos sub… is it ⭐ or ⭐⭐?
Underneath SQLite is a B-tree. Like databases everywhere, SQLite divides storage up into disk-aligned pages, working hard to read as few pages as possible for any task while treating work done within a page as more or less free. SQLite always reads and writes in page-sized chunks.
Our sandwiches table includes a feature that’s really painful for a tool like Litestream that thinks in pages: an automatically updating primary key. That key dictates that every insert into the table hits the rightmost leaf page in the underlying table B-tree. For SQLite itself, that’s no problem. But Litestream has less information to go on: it sees only a feed of whole pages it needs to archive.
Worse still, when it comes time to restore the database – something you tend to want to happen quickly – you have to individually apply those small changes, as whole pages. Your app is down, PagerDuty is freaking out, and you’re sitting there watching Litestream reconstruct your Quiznos uncertainty a page (and an S3 fetch) at a time.
So, LTX. Let me explain. We needed LiteFS to be transaction-aware. It relies on finer-grained information than just raw dirty pages (that’s why it needs the FUSE filesystem). To ship transactions, rather than pages, we invented a file format we call LTX.
LTX was designed as an interchange format for transactions, but for our purposes in Litestream, all we care about is that LTX files represent ordered ranges of pages, and that it supports compaction.
Compaction is straightforward. You’ve stored a bunch of LTX files that collect numbered pages. Now you want to restore a coherent picture of the database. Just replay them newest to oldest, skipping duplicate pages (newer wins), until all changed pages are accounted for.
Importantly, LTX isn’t limited to whole database backups. We can use LTX compaction to compress a bunch of LTX files into a single file with no duplicated pages. And Litestream now uses this capability to create a hierarchy of compactions:
Net result: we can restore a SQLite database to any point in time, using only a dozen or so files on average.
Litestream performs this compaction itself. It doesn’t rely on SQLite to process the WAL file. Performance is limited only by I/O throughput.
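The newest-wins replay rule can be sketched in a few lines. This models only the rule described above, not the actual on-disk LTX format; each "LTX file" here is just a dict of page numbers to page contents, listed oldest to newest:

```python
def compact(ltx_files):
    """Replay LTX files newest to oldest, keeping the first
    (i.e. newest) version of each page we encounter."""
    restored = {}
    for pages in reversed(ltx_files):
        for page_no, data in pages.items():
            if page_no not in restored:
                restored[page_no] = data
    return restored

snapshot = {1: "a0", 2: "b0", 3: "c0"}   # full snapshot
txn1 = {2: "b1"}                          # later transactions touch
txn2 = {2: "b2", 3: "c1"}                 # only the pages they change

# Page 2 comes from txn2, page 3 from txn2, page 1 from the snapshot.
result = compact([snapshot, txn1, txn2])
```

The same function describes both restore (replay everything) and compaction (merge a subset of LTX files into one file with no duplicated pages).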
What people like about Litestream is that it’s just an ordinary Unix program. But like any Unix program, Litestream can crash. It’s not supernatural, so when it’s not running, it’s not seeing database pages change. When it misses changes, it falls out of sync with the database.
Lucky for us, that’s easy to detect. When it notices a gap between the database and our running “shadow-WAL” backup, Litestream resynchronizes from scratch.
The only time this gets complicated is if you have multiple Litestreams backing up to the same destination. To keep multiple Litestreams from stepping on each other, Litestream divides backups into “generations”, creating a new one any time it resyncs. You can think of generations as Marvel Cinematic Universe parallel dimensions in which your database might be simultaneously living.
Yeah, we didn’t like those movies much either.
LTX-backed Litestream does away with the concept entirely. Instead, when we detect a break in WAL file continuity, we re-snapshot with the next LTX file. Now we have a monotonically incrementing transaction ID. We can use it to look up database state at any point in time, without searching across generations.
Due to the file format changes, the new version of Litestream can’t restore from old v0.3.x WAL segment files.
That’s OK though! The upgrade process is simple: just start using the new version. It’ll leave your old WAL files intact, in case you ever need to revert to the older version.The new LTX files are stored cleanly in an ltx directory on your replica.
The configuration file is fully backwards compatible.
There’s one small catch. We added a new constraint. You only get a single replica destination per database. This probably won’t affect you, since it’s how most people use Litestream already. We’ve made it official.
The rationale: having a single source of truth simplifies development for us, and makes the tool easier to reason about. Multiple replicas can diverge and are sensitive to network availability. Conflict resolution is brain surgery.
Litestream commands still work the same. But you’ll see references to “transaction IDs” (TXID) for LTX files, rather than the generation/index/offset we used previously with WAL segments.
We’ve also changed litestream wal to litestream ltx.
We’ve beefed up the underlying LTX file format library. It used to be an LTX file was just a sorted list of pages, all compressed together. Now we compress per-page, and keep an index at the end of the LTX file to pluck individual pages out.
You’re not seeing it yet, but we’re excited about this change: we can operate page-granularly even dealing with large LTX files. This allows for more features. A good example: we can build features that query from any point in time, without downloading the whole database.
We’ve also gone back through old issues & PRs to improve quality-of-life. CGO is now gone. We’ve settled the age-old contest between mattn/go-sqlite3 and modernc.org/sqlite in favor of modernc.org. This is super handy for people with automated build systems that want to run from a MacBook but deploy on an x64 server, since it lets the cross-compiler work.
We’ve also added a replica type for NATS JetStream. Users that already have JetStream running can get Litestream going without adding an object storage dependency.
And finally, we’ve upgraded all our clients (S3, Google Storage, & Azure Blob Storage) to their latest versions. We’ve also moved our code to support newer S3 APIs.
The next major feature we’re building out is a Litestream VFS for read replicas. This will let you instantly spin up a copy of the database and immediately read pages from S3 while the rest of the database is hydrating in the background.
We already have a proof of concept working and we’re excited to show it off when it’s ready!
Configure the init flags to disable all controllers and other doodads, deploy them yourself with Helm. Helm sucks to work with but someone has already gone through the pain for you.
AI is GREAT at K8s, since K8s has GREAT docs which it has been trained on.
A good mental model helps: it's an API with a bunch of control loops.
Then you're off to the races. You can add more nodes etc. later to give it a try.
Use k9s (not a misspelling) and headlamp to observe your cluster if you need a gui.
e.g. for ActiveRecord there's ar_lazy_preloader[0] or goldiloader[1] which fix many N+1s by keeping track of a context: you load a set of User in one go, and when you do user.posts it will do a single query for all, and when you then access post.likes it will load all likes for those and so on. Or, if you get the records some other way, you add them to a shared context and then it works.
Doesn't solve everything, but helps quite a bit.
Systemctl's only in there to restart it if it crashes; litestream itself is (iirc) a single cli binary.
While I don't have stats about every conference talk for the last decade, my experience has been that SQLite has been featured more in Rails conference talks. There's a new book titled "SQLite on Rails: The Workbook" that I don't think would have had an audience five years ago. And I've noticed more blog posts and more discussion in Rails-related discussion platforms. Moreover, I expect we'll see SQLite gain even more in popularity as it simplifies multi-agent development with multiple git worktrees.
The filtering you describe is trivial with COUNT(*) FILTER (WHERE flag IN (...) AND state = ...) and the like.
If you want to retrieve data on the last comment, as opposed to an aggregate function of all comments, you can do that with window functions (or with JOIN LATERAL for something idiomatic specifically to Postgres).
Learning how to do JOIN's in SQL is like learning pointers in C -- in that it's a basic building block of the language. Learning window functions is like learning loops. These are just programming basics.
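For the "last comment per post" case mentioned above, a window-function sketch against a toy schema (stdlib sqlite3; SQLite has supported window functions since 3.25): ROW_NUMBER() ranks each post's comments newest-first, and rn = 1 picks the most recent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER,
        created TEXT,
        body TEXT
    );
    INSERT INTO comments (post_id, created, body) VALUES
        (1, '2024-01-01', 'first on post 1'),
        (1, '2024-02-01', 'latest on post 1'),
        (2, '2024-01-15', 'only on post 2');
""")

# One query: the newest comment for every post.
latest = conn.execute("""
    SELECT post_id, body FROM (
        SELECT post_id, body,
               ROW_NUMBER() OVER (
                   PARTITION BY post_id ORDER BY created DESC
               ) AS rn
        FROM comments
    ) WHERE rn = 1
    ORDER BY post_id
""").fetchall()
# latest == [(1, 'latest on post 1'), (2, 'only on post 2')]
```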
Besides, the question wasn't Litestream vs. Postgres backup apps. It was SQLite vs. Postgres.
If you have lots of concurrent writes SQLite isn't the right solution. For concurrent reads it's fine.
SQLite also isn't a network database out-of-the-box. If you want to be able to access it over the network you need to solve that separately.
(Don't try and use NFS. https://sqlite.org/howtocorrupt.html#_filesystems_with_broke... )
If you have a read-heavy app (99% of saas) that runs on one server and dont have millions of users then sqlite is a great option.
On the other side, why not just store everything in memory and flush to a local json file if you won't have any users? sqlite is overkill!
Usually you want to be able to run multiple webservers against a single database though, since that's the first thing you'll usually need to scale.
For anything that isn't just a basic row lookup from a single table, you should really just be writing the SQL yourself.
That becomes an instant problem if users ever write to your database. You can't duplicate the environment unless it's read-only.
And even if the database is read-only for users, the fact that every time you update it you need to redeploy the database to every client, is pretty annoying.
That's why it's usually better to start with Postgres or MySQL. A single source of truth for data makes everything vastly easier.
Bit more stretched out than I thought it had been