So yeah, stick with what worked for decades if you don't see a reason not to. Also, I remember reading that StackOverflow runs on a bunch of super powerful root servers?
One note: you can absolutely use Python or Node just as well as Go. Hetzner offers machines with 2 CPUs, 4GB RAM, and 10TB of network traffic (then $1/TB egress) for $5.
Two disclaimers for VPS:
If you're using a dedicated server instead of a cloud server, don't forget to back up the DB to a Storage Box often ($3/mo for 1TB; use rsync). It's good practice either way, but cloud instances seem more resilient to hardware faults. Also, avoid their object store.
You are responsible for security. I've seen good devs skip basic SSH hardening and get infected by bots in under an hour. My go-to move when I spin up servers is a two-stage Terraform setup: first I set up SSH with only my IP allowed, then I set up Tailscale and shut down the public SSH entrypoint completely.
Take care and have fun!
I don't want to diss SQLite because it is awesome and more than adequate for many/most web apps but you can connect to Postgres (or any DB really) on localhost over a Unix domain socket and avoid nearly all of the overhead.
It's not much harder to use than SQLite, you get all of the Postgres features, it's easier to run reports or whatever on the live db from a different box, and much easier if it comes time to setup a read replica, HA, or run the DB on a different box from the app.
I don't think running Postgres on the same box as your app is the same class of optimistic over provisioning as setting up a kubernetes cluster.
Saying "you can just run things on a cheap VPS" sounds amateurish: people are immediately out with "Yeah but scaling", "Yeah but high availability", "Yeah but backups", "Yeah but now you have to maintain it" arguments that are basically regurgitated sales pitches for various cloud platforms. It's learned helplessness.
He's mainly talking about the tech implementation which is the easy part.
The hard part of creating a business is finding a problem valuable enough to solve and reaching the users who need that problem solved. That's where the real value is.
If you get one dedicated server for multiple separate projects, you can still keep the costs down but relax those constraints.
For example, look at the Hetzner server auction: https://www.hetzner.com/sb/
I pay about 40 EUR a month for this:
Disk: 736G / 7.3T (11%)
CPU: Intel Core i7-7700 @ 8x 4.2GHz [42.0°C]
RAM: 18004MiB / 64088MiB
I put Proxmox on it and can have as many VMs as the IO pressure of the OSes will permit: https://www.proxmox.com/en/ (I cared mostly about storage, so I got HDDs in RAID 0; others might just get a server with SSDs.)
You could have 15 VMs, each with 4 GB of RAM, and it would still come out to around 2.66 EUR per month per VM. It's just way more cost-efficient at any sort of scale (number of projects) compared to regular VPSes, and as long as you don't put any trash on it, Proxmox itself is fairly stable, it being a single point of failure aside.
Of course, with refurbished gear you'd want backups, but you really need those anyways.
Aside from that, Hetzner and Contabo (opinions vary about that one though) are going to be more affordable even when it comes to regular VPS hosting. I think Scaleway also had those small Stardust instances if you want something really cheap, but they go out of stock pretty quickly as well.
Thinking about how to fit everything on a $5 VPS does not help your business.
> Here is the trick that you might have missed: somehow, Microsoft is able to charge per request, not per token. And a "request" is simply what I type into the chat box. Even if the agent spends the next 30 minutes chewing through my entire codebase, mapping dependencies, and changing hundreds of files, I still pay roughly $0.04.
> The optimal strategy is simple: write brutally detailed prompts with strict success criteria (which is best practice anyway), tell the agent to "keep going until all errors are fixed," hit enter, and go make a coffee while Satya Nadella subsidizes your compute costs.
Wow. I'll definitely be investigating this!
Initially from the title, I thought it would be about brainstorming and launching a successful idea, and that sort of thing.
It starts with cutting costs through the choice of infrastructure, then moves on to less resource-hungry tools and cheaper services. But it never compares the costs of these things. Do I actually save the upgrade to a bigger server by using Go and SQLite over, say, Python and Postgres? Or does it not even matter when you only have n users? I also don't understand why, at one point, the convenience of OpenRouter is preferred over managing multiple API keys, when the latter should be cheaper, and API usage is a cost that could grow faster than your infrastructure costs.
There are some more points, but I do not want to write a long comment.
And, I'm not sure I'm correct, but I felt PostgreSQL has more optimized storage for large text data than SQLite. At least for me, storage filled up with SQLite, but the same application on PostgreSQL never had this issue.
Scaling to zero with database persistence using Litestream has cut my bill down to $0.10 per month for my backend + database.
Granted I still don't have that many users, and they get 200ms of extra latency if the backend needs to wake up. But it's nice to never have to worry about accidental costs!
Something to remind many of the tech folks on HN.
But, actually you can run Kubernetes and Postgres etc on a VPS.
See https://stack-cli.com/ where you can specify a Supabase style infra on a low cost VPS on top of K3s.
The invented “people start with a k8s cluster for 5 users” doesn’t really exist. This is just a story repeated ad nauseam to fit a narrative that helps them justify their choices. This position is just as dogmatic, if not more so, than the alleged dogma it attempts to disrupt.
Smart technical leaders know that technical decisions only matter in context, never in absolutes. The right answer is always “it depends”.
I can agree that there is a tendency to prematurely optimize infra, as a direct consequence of a lack of measuring, especially in young, busy startups. One could argue that premature optimization might be the smart choice when you don’t have enough data: in the best-case scenario (your startup does well) you’ve saved some time, and in the worst case you’ve lost some money, which, depending on the situation, might be less valuable than the time spent maintaining, and later refactoring, infra.
No regrets. Infrastructure isn't the problem I'm trying to solve. The problem is: who's actually going to pay for this?
Optimizing infrastructure before you have customers is like designing a kitchen before you've written the menu. I launched within 72 hours of starting development and went straight to customer validation. The market feedback started coming in immediately.
Infrastructure costs show up in your bill. The cost of slow customer validation doesn't show up anywhere - until it's too late. That's the number I watch.
If you’re backing up to a third party, losing your account isn’t a disaster: bring up a VM somewhere else, restore from backups, redirect DNS, and you’re up and running again. If the backups are on a disk you can’t access anymore, then a minor issue has just escalated to an existential threat to your company.
Personally I use Backblaze B2 for my offsite backups because they’re ridiculously cheap, but other options exist and Restic will write to all of them near identically.
Applications each have their own FreeBSD jails, so they're isolated.
ZFS incremental replication on top of regular app backups provides a quick recovery process should the hardware of that machine fail.
Moving those apps to the cloud would cost orders of magnitude more, for benefits I don't need.
How do you market them?
Is customer support an issue?
Do you see risk since ai makes it so easy to build/copy?
100% agreed.
https://www.toontales.net/short/lumber-jerks/
Acme Toothpicks
Note that you don't need all of that to keep your SSH server secure. Just having a good password (ideally on a non-root account) is more than enough.
Once I had a PostgreSQL DB with a default password on a new VPS, and forgot to disable password-based login, on a server with no domain pointing at it. It got hacked within a day and was being used as a bot server. And that was 10 years ago.
On a recently deployed server, I was getting SSH login attempts within an hour, and it didn't have a domain either. Fortunately, I've learned my lesson and turned off password-based login as soon as the server was up and running.
Similar attempts once bogged my desktop down to a halt.
Having a machine open to the world is very scary now. Thank God services like Tailscale exist.
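For reference, locking SSH down to keys only comes down to two lines in /etc/ssh/sshd_config (then restart sshd):

```
PasswordAuthentication no
PermitRootLogin prohibit-password
```

Only apply this after confirming key-based login works in a second session, or you can lock yourself out.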
Funny you should say that. I migrated an old Django web site to a slightly more modern architecture (Docker Compose with uvicorn instead of bare-metal uWSGI) the other day, and while doing that I noticed that it doesn't need PostgreSQL at all. The old server already had it installed, so it was the lazy choice.
I just dumped all data and loaded it into an SQLite database with WAL and it's much easier to maintain and back up now.
Not everybody says so... So, can anyone explain what's the right way to think about WAL?
Curious as to why you say this. I’m using litestream to backup to Hetzner object storage, and it’s been working well so far.
I guess it’s probably more expensive than just a storage box?
Not sure but I also don’t have to set up cron jobs and the like.
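For anyone curious what that setup looks like, a minimal litestream.yml pointing at an S3-compatible store is roughly the following; the DB path, bucket name, and endpoint are placeholders you'd swap for your own:

```yaml
dbs:
  - path: /var/lib/myapp/app.db      # hypothetical database path
    replicas:
      - type: s3
        bucket: my-backups           # placeholder bucket
        path: app.db
        endpoint: https://fsn1.your-objectstorage.com  # example S3-compatible endpoint
```

Litestream then streams WAL frames continuously, which is what removes the need for cron-driven dump jobs.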
What features does Postgres offer over SQLite in the context of running on a single machine with a monolithic app? Application functions [2] mean you can extend SQLite however you need with the same language you use to build your application. It also has a much better backup and replication story thanks to Litestream [3].
- [1] https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...
- [2] https://sqlite.org/appfunc.html
- [3] https://litestream.io/
The main problem with SQLite is that the defaults are not great, and you should really use it with separate read and write connections, where the application manages the write queue rather than letting SQLite handle it.
Running 100,000 `SELECT 1` queries:
PostgreSQL (localhost): 2.77 seconds
SQLite (in-memory): 0.07 seconds
(https://gist.github.com/leifkb/1ad16a741fd061216f074aedf1eca...)
The one place I'd push back on SQLite: if your app has any write concurrency from external processes (cron jobs, webhooks), WAL mode helps but you still hit lock contention. I have data collection scripts running every 30 minutes that write to the same DB the web app reads from. Postgres handled that cleanly from day one. Neon's free tier is 512MB with connection pooling; more than enough for a side project with real data.
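The SQLite half of a benchmark like that is easy to reproduce with nothing but the standard library (the Postgres side needs a third-party driver such as psycopg2, so it's omitted here). This is a sketch, not the linked gist itself:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

start = time.perf_counter()
for _ in range(100_000):
    # Each call stays entirely in-process: no socket, no wire protocol.
    cur.execute("SELECT 1").fetchone()
elapsed = time.perf_counter() - start

print(f"100,000 SELECT 1 queries: {elapsed:.2f}s")
```

The in-process call path skipping the network round-trip is where most of the gap against localhost Postgres comes from, which is also why the comparison says little about real workloads.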
I have been doing this kind of thing with Cursor and Codex subscriptions, but they do have annoying rate limits, and Cursor on the Auto model seems to perform poorly if you ask it to do too much work, so I am keen to try out laconic on my local GPU.
EDIT:
Having tried it out, this may be a false economy.
The way it works is it has a bunch of different prompts for the LLMs (Planner, Synthesizer, Finalizer).
The "Planner" is given your input question and the "scratchpad" and has to come up with DuckDuckGo search terms.
Then the harness runs the DuckDuckGo search and gives the question, results, and scratchpad to the Synthesizer. The Synthesizer updates the scratchpad with new information that is learnt.
This continues in a loop, with the Planner coming up with new search queries and the Synthesizer updating the scratchpad, until eventually the Planner decides to give a final answer, at which point the Finalizer summarises the information in a user-friendly final answer.
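I don't know laconic's internals beyond this description, but the loop it describes can be sketched in a few lines. Here `planner`, `synthesize`, `finalize`, and `search` are hypothetical stand-ins for the LLM prompts and the DuckDuckGo call:

```python
def research(question, planner, synthesize, finalize, search, max_steps=10):
    """Planner/Synthesizer/Finalizer loop over a bounded scratchpad."""
    scratchpad = ""
    for _ in range(max_steps):
        action = planner(question, scratchpad)  # next query, or a stop signal
        if action["done"]:
            break
        results = search(action["query"])       # e.g. DuckDuckGo results
        # The Synthesizer rewrites the scratchpad, keeping only the key
        # facts, so total context stays within a small window (e.g. 8K).
        scratchpad = synthesize(question, results, scratchpad)
    return finalize(question, scratchpad)
```

The bounded scratchpad is the whole trick: no matter how long the research runs, each LLM call only ever sees the question, the latest results, and a compressed summary.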
That is a pretty clever design! It allows you to do relatively complex research with only a very small amount of context window. So I love that.
However I have found that the Synthesizer step is extremely slow on my RTX3060, and also I think it would cost me about £1/day extra to run the RTX3060 flat out vs idle. For the amount of work laconic can do in a day (not a lot!), I think I am better off just sending the money to OpenAI and getting the results more quickly.
But I still love the design, this is a very creative way to use a very small context window. And has the obvious privacy and freedom advantages over depending on OpenAI.
The thing is, once you learn the technology, everything else seems like more work than the "easy way".
I currently work in a small b2c startup with 200 active users (and targeting 5000 by the end of the year) and we're already paying AWS $1000/month on infra and it drives me crazy…
And the deployment process is also over-engineered in a way that makes it hard to change anything (if you want to release without changing things too much that's fine, but changing the deployment process is already a nightmare).
“But best practices”, “but scalability”, “but 99.999% uptime” …
In my case I'm seeing it a lot on the front-end side. My clients end up with single-page apps that install Shadcn, Tailwind, React, React Router, Axios, Zod, React Form and Vite, all to center some input elements and perform a few in-browser API calls. It's a huge maintenance burden even before they start getting value out of it.
These large setups are often a correct answer, but not the right one for the situation.
Isn't this idea to spend a bit more effort and overhead to get YAGNI features exactly what TFA argues against?
It’s very interesting how people rent large servers and run their own hypervisor on them. I’m wondering if VPS licenses have any clauses preventing this at commercial scale.
I recall running LAMP stacks on something like 128MB about 20 years ago and not really having problems with memory. Most current website backends are not really much more complicated than they were back then if you don't haul in bloat.
That's 17 million hits per day at about 3.9 MiB/sec sustained disk IO, before factoring in the parallelism that almost any bargain-bucket NVMe drive already offers (allowing you to at least 4x these numbers). But already you're talking about quadrupling the infrastructure spend before serving a single request, which is the entire point of the article.
There is a good reason: teaching yourself not to over-engineer, over-provision, or overthink, and instead to focus on generating business value for customers and getting more paying customers. I think it’s what many engineers are keen to overlook behind fun technical details.
The Macbook Neo with 8GB RAM is a showcase of how people underestimated its capabilities due to the low amount of RAM before launch; yet after release, all the reviewers pointed to a larger set of capabilities, with none of the issues people predicted pre-launch.
Even their $5 plan gives 4GB.
What I like about sqlite is that it's simply one file
More features is a net negative if you don't need those features. Ideally you want your DB to support exactly what you need and nothing more. Not typically realistic but the closer you can get the better.
That being said, I'd much rather read a few ideas for good recurring passive income. Instead, the author kind of flexes on that, then says "I get refused VC money because they don't see how their money would be useful for me" -- which is one more flex -- and moves on to the technical bits.
It's coming across as bragging to me.
$20/month. Yeah. Great, but why? You get a lot of peace of mind with "real" HA setup with real backups and real recovery, for not much more than $20, if you are careful.
The other half of the article is about running "free, unlimited" local AI on a GPU (Santa brought it) with, apparently, free electricity (Santa pays for it).
Observation #1: You can also solve the tech stack problem with Heroku. I think the author's stack probably has a steeper learning curve, but is a cheaper option. I think it's a bit of an odd comparison (I won't say straw-man, as I don't doubt some people do this) to go from a fully-controlled simple setup to using AWS with a pile of extra crap. You can also, for example, run something similar to what he or she is describing on AWS, Heroku etc. (I.e. without the things in the AWS diagram he indicated like kubernetes and load balancers.)
Observation #2: I have not found WAL mode is an antidote to SQLite locks during multiple concurrent writes. (This is anecdotal)
I think regarding Go vs Python/Ruby etc. I completely get that. I would now like to check out Go on web. I use Rust for most of my software writing, but am still on Python for web servers, because there is nothing I can use for Rust that is as powerful and easy as Django.
What do I get as an advantage being on AWS? S3 (literally like $1 a month), SQS (free tier) and Lambda (async jobs; free tier). Capacity if needed; just scale up t4g instances.
Why care so much about such small operating costs when you're earning so much?
Which obviously works, it's not like there aren't tons of multi-million startups ultimately doing the exact same thing, and yet. It feels a bit... trite?
The moral of the story is: Don’t be (another) fool, your tech stack is not your priority.
Most people in the BiP these days barely know how to deploy a database or host something using nginx. It's all Vercel, Supabase, AWS, Clerk, yada yada. Cost aside, I think that people are addicted to complexity.
It seems to really help if you can put a term to it.
You get such a large performance malus and increase in complexity right from the start with The Cloud that it starts at a serious deficit, and only eventually, maybe, overcomes that to be overall beneficial with the right workload, people, and processes. Most companies are lacking a minimum of two of those to justify “the cloud”.
And that’s without even considering the cost.
What I think it actually is, is a way for companies that can’t competently (I mean at an organizational/managerial level) maintain and adequately make-available computing resources, to pay someone else to do it. They’re so bad at that, that they’re willing to pay large costs in money, performance, and maybe uptime to get it.
Too many have forgotten what it means to administrate a single system. You can do a lot with very simple tooling.
I just have a few large VMs, each a different environment, with slightly different ways of treating them: the prod ones get more due diligence and care, whereas on all of the dev ones (including where I host Gitea, Woodpecker CI, Nextcloud, Kanboard, Uptime Kuma, etc.) I mess around with the configuration and do restarts more often. I personally used to run a Docker Swarm cluster, but now I just use Docker Compose with Ansible directly, still with multiple stacks per server. Dead simple.
So my setup ended up being:
* VPS / VMs - an environment, since don't really need replication/distributed systems at my scale
* container stack (Compose/Swarm) - a project, with all its dependencies, though ingress is a shared web server container per environment
* single container - the applications I build, my own are built on top of a common Ubuntu LTS base more often than not, external ones (like Nextcloud and tbh most DBs) are just run directly
Works very well, plus containers allow me to easily have consistent configuration management, networking, resource limits and persistent storage.
You should always use a swap file/partition, even if you don't want any swapping. That's because there are always cold pages, and if you have no swap space that memory cannot be used for apps or buffers; it's just wasted.
Dead giveaway
"What do you even need funding for?"
I agree. The author claims to have multiple $10K MRR websites running on $20 costs. I also don't understand what he needs money for — shouldn't the $x0,000 be able to fund the $20 for the next project? It doesn't make any sense at all.
Then the author trails off and tells us how he runs on $20/month.
Well, why did you apply for funding? Hello?
But 10k MRR sounds to me like travelling to Mars. I have 0 ideas and 0 initiative to push them ahead.
I guess it’s all about knowing when to re-engineer the solution for scale. And the answer is rarely ”up front”.
Can confirm it exists, especially with founders self-coding with LLMs now.
It's like not having syphilis or cancer: it's a good thing.
Honestly, yes. I'm on HN for tech content, I don't really care about startups and the business side of things, even though sometimes there are interesting reads on this side as well. Also, it may very well be the case that I rediscover the meaning of MRR for the second or third time in sixteen years :).
But, in all honesty, all RR numbers are estimates. MRR is also a "made up number" from a certain point of view: it is not equivalent to cash received every month, because of annual subscriptions, cancelations, etc.
It's all of five minutes to write a deployment yaml and ingress and have literally anything on the web for a handful of dollars a month.
I've written rust services doing 5k QPS on DO's cheapest kube setup.
It's not rocket science.
Serverless node buns with vite reacts are more complicated than this.
Ten lines of static, repeatable, versioned yaml config vs a web based click by click deploy installer with JavaScript build pipelines and magical well wishes that the pathing and vendor specific config are correct.
And don't tell me VPS FTP PHP or sshing into a box to special snowflake your own process runner are better than simple vanilla managed kube.
You can be live on the web from zero in 5 minutes with Digital Ocean kube, and that's counting their onboarding.
You don’t need backups until you have customers.
I do like this: cron to run the backup, then rsync to https://www.rsync.net, then an after-script that checks it ran and posts the analysis to my Telegram.
That is.
In my head, I call this the 'doubling algorithm'.
If there's anything that's both relatively cheap and useful, but where "more" (either in quality or quantity) has additional utility, 2x it.
Then 2x it again.
Repeat until either: the price change becomes noticeable or utility stops being gained.
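The heuristic above can be sketched as a small function; `price` and `utility` here are whatever hypothetical cost and benefit estimates you'd plug in:

```python
def doubled(amount, price, utility, budget_step):
    """Keep doubling while the price bump stays unnoticeable and utility still grows."""
    while True:
        nxt = amount * 2
        price_ok = price(nxt) - price(amount) < budget_step  # change not "noticeable"
        gains = utility(nxt) > utility(amount)               # still gaining utility
        if not (price_ok and gains):
            return amount
        amount = nxt
```

For example, with linear price and utility that plateaus at 16, doubling from 1 stops at 16: the next doubling would cost more than the threshold and add nothing.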
Tl;dr -- saving order-of single dollars is rarely worth the tradeoffs.
https://old.reddit.com/r/GithubCopilot/comments/1r0wimi/if_y...
It's easier to add a small config to Terraform to make your config at least key-based.
Now, this is a more controversial take, and you should always benchmark against your own traffic projections, but:
consider that if you don't have a ton of indexes, the raw throughput of SQLite is so good that for many access patterns you'd already have to shard a Postgres instance anyway before you pass the point where SQLite's single-writer limitation becomes the bottleneck.
[0] https://www.sqlite.org/src/doc/begin-concurrent/doc/begin_co...
At least with Storage Box you know it's just a dumb storage box. And you can SSH, SFTP, Samba and rsync to it reliably.
[0] https://docs.hetzner.com/storage/object-storage/supported-ac...
The same thing SQL itself buys you: flexibility for unforeseen use cases and growth.
Your SQLite benchmark is based on having just one write connection for SQLite but all eight writable connections for Postgres. Even in the context of a single app, not everyone wants to be tied down that way, particularly when thinking about how it might evolve.
If we know our app would not need to evolve we could really maximize performance and use a bespoke database instead of an rdbms.
It seems a little aggressive for you to jump on a comment about how it’s reasonable to run Postgres sometimes with “SQLite smokes it in performance.” That’s true, when you can accept its serious constraints.
As a wise man once said, “Postgres is great and there's nothing wrong with using it!”
Also:
> PostgreSQL (localhost): (. .) SQLite (in-memory):
This is a rather silly example. What do you expect to happen to your data when your node restarts?
Your example makes as much sense as comparing Valkey with Postgres and proceeding to proclaim that the performance difference is not insignificant.
Last night, I was rejected from yet another pitch night. It was just the pre-interview, and the problem wasn't my product. I already have MRR. I already have users who depend on it every day.
The feedback was simply: "What do you even need funding for?"
I hear this time and time again when I try to grow my ideas. Running lean is in my DNA. I've built tools you might have used, like websequencediagrams.com, and niche products you probably haven't, like eh-trade.ca. That obsession with efficiency leads to successful bootstrapping, and honestly, a lot of VCs hate that.
Keeping costs near zero gives you the exact same runway as getting a million dollars in funding with a massive burn rate. It's less stressful, it keeps your architecture incredibly simple, and it gives you adequate time to find product-market fit without the pressure of a board breathing down your neck.
If you are tired of the modern "Enterprise" boilerplate, here is the exact playbook of how I build my companies to run on nearly nothing.
The naive way to launch a web app in 2026 is to fire up AWS, provision an EKS cluster, set up an RDS instance, configure a NAT Gateway, and accidentally spend $300 a month before a single user has even looked at your landing page.
The smart way is to rent a single Virtual Private Server (VPS).
First thing I do is get a cheap, reliable box. Forget AWS. You aren't going to need it, and their control panel is a labyrinth designed to extract billing upgrades. I use Linode or DigitalOcean. Pay no more than $5 to $10 a month.
1GB of RAM sounds terrifying to modern web developers, but it is plenty if you know what you are doing. If you need a little breathing room, just use a swapfile.

The goal is to serve requests, not to maintain infrastructure. When you have one server, you know exactly where the logs are, exactly why it crashed, and exactly how to restart it.
Now you have constraints. You only have a gigabyte of memory. You could run Python or Ruby as your main backend language—but why would you? You'll spend half your RAM just booting the interpreter and managing gunicorn workers.
I write my backends in Go.
Go is infinitely more performant for web tasks, it's strictly typed, and—crucially for 2026—it is incredibly easy for LLMs to reason about. But the real magic of Go is the deployment process. There is no pip install dependency hell. There is no virtual environment. You compile your entire application into a single, statically linked binary on your laptop, scp it to your $5 server, and run it.
Here is what a complete, production-ready web server looks like in Go. No bloated frameworks required:
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello, your MRR is safe here.")
	})
	// This will comfortably handle 10,000s of requests per second
	// on a potato.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
If you have a graphics card sitting somewhere in your house, you already have unlimited AI credits.
When I was building eh-trade.ca, I had a specific problem: I needed to perform deep, qualitative stock market research on thousands of companies, summarizing massive quarterly reports. The naive solution is to throw all of this at the OpenAI API. I could have paid hundreds of dollars in API credits, only to find a logic bug in my prompt loop that required me to run the whole batch over again.
Instead, I'm running VLLM on a dusty $900 graphics card (an RTX 3090 with 24GB of VRAM) I bought off Facebook Marketplace. It’s an upfront investment, sure, but I never have to pay a toll to an AI provider for batch processing again.
For local AI, you have a distinct upgrade path:
Ollama is the easiest starting point: a single command (ollama run qwen3:32b) gets a model running and lets you try out dozens of models instantly. It's the perfect environment for iterating on prompts.
To manage all this, I built laconic, an agentic researcher specifically optimized for running in a constrained 8K context window. It manages the LLM context like an operating system's virtual memory manager—it "pages out" the irrelevant baggage of a conversation, keeping only the absolute most critical facts in the active LLM context window.
I also use llmhub, which abstracts any LLM into a simple provider/endpoint/apikey combo, gracefully handling both text and image IO whether the model is running under my desk or in the cloud.
You can't do everything locally. Sometimes you need the absolute cutting-edge reasoning of Claude 3.5 Sonnet or GPT-4o for user-facing, low-latency chat interactions.
Instead of juggling billing accounts, API keys, and rate limits for Anthropic, Google, and OpenAI, I just use OpenRouter. You write one OpenAI-compatible integration in your code, and you instantly get access to every major frontier model.
More importantly, it allows for seamless fallback routing. If Anthropic's API goes down on a Tuesday afternoon (which happens), my app automatically falls back to an equivalent OpenAI model. My users never see an error screen, and I don't have to write complex retry logic.
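For illustration, OpenRouter's fallback routing is just an extra models list in an otherwise OpenAI-shaped request body. The model IDs below are examples; check OpenRouter's catalog for current ones:

```python
import json

def chat_request(prompt, primary, fallbacks):
    """Build an OpenAI-compatible payload with OpenRouter fallback models."""
    return {
        "model": primary,
        # OpenRouter tries these in order if the primary provider errors out.
        "models": fallbacks,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = chat_request(
    "Summarize this quarterly report.",
    primary="anthropic/claude-3.5-sonnet",
    fallbacks=["openai/gpt-4o"],
)
# POST json.dumps(payload) to https://openrouter.ai/api/v1/chat/completions
# with an "Authorization: Bearer <OPENROUTER_API_KEY>" header.
print(json.dumps(payload, indent=2))
```

Because the payload shape is OpenAI-compatible, the same code works unchanged if you later point it at a different gateway.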
New, insanely expensive models are being released every week. I constantly hear about developers dropping hundreds of dollars a month on Cursor subscriptions and Anthropic API keys just to have an AI write their boilerplate.
Meanwhile, I'm using Claude Opus 4.6 all day and my bill barely touches $60 a month. My secret? I exploit Microsoft's pricing model.
I bought a GitHub Copilot subscription in 2023, plugged it into standard VS Code, and never left. I tried Cursor and the other fancy forks when they briefly surpassed it with agentic coding, but Copilot Chat always catches up.
Here is the trick that you might have missed: somehow, Microsoft is able to charge per request, not per token. And a "request" is simply what I type into the chat box. Even if the agent spends the next 30 minutes chewing through my entire codebase, mapping dependencies, and changing hundreds of files, I still pay roughly $0.04.
The optimal strategy is simple: write brutally detailed prompts with strict success criteria (which is best practice anyway), tell the agent to "keep going until all errors are fixed," hit enter, and go make a coffee while Satya Nadella subsidizes your compute costs.
I always start a new venture using sqlite3 as the main database. Hear me out, this is not as insane as you think.
The enterprise mindset dictates that you need an out-of-process database server. But the truth is, a local SQLite file communicating over the C-interface or memory is orders of magnitude faster than making a TCP network hop to a remote Postgres server.
"But what about concurrency?" you ask. Many people think SQLite locks the whole database on every write. They are wrong. You just need to turn on Write-Ahead Logging (WAL). Execute this pragma once when you open the database:
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
Boom. Readers no longer block writers. Writers no longer block readers. You can now easily handle thousands of concurrent users off a single .db file on an NVMe drive.
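A quick way to convince yourself the pragmas took effect, shown here with Python's stdlib sqlite3 for brevity (the same pragmas apply from Go or anything else):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path, isolation_level=None)  # autocommit; we manage BEGIN/COMMIT
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
conn.execute("PRAGMA synchronous=NORMAL;")
print(mode)  # wal -- the pragma returns the mode actually in effect

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("BEGIN IMMEDIATE")  # grab the write lock
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# A second connection can still read: under WAL it sees the last
# committed snapshot instead of blocking on the in-flight writer.
reader = sqlite3.connect(path)
count = reader.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 0
conn.execute("COMMIT")
```

Note WAL mode is persistent: it's recorded in the database file, so it only needs to be set once rather than on every connection.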
Since implementing user authentication is usually the most annoying part of starting a new SQLite-based project, I built a library: smhanov/auth. It integrates directly with whatever database you are using and manages user signups, sessions, and password resets. It even lets users sign in with Google, Facebook, X, or their own company-specific SAML provider. No bloated dependencies, just simple, auditable code.
The tech industry wants you to believe that building a real business requires complex orchestration, massive monthly AWS bills, and millions in venture capital.
It doesn't.
By utilizing a single VPS, statically compiled binaries, local GPU hardware for batch AI tasks, and the raw speed of SQLite, you can bootstrap a highly scalable startup that costs less than the price of a few coffees a month. You add infinite runway to your project, giving yourself the time to actually solve your users' problems instead of sweating your burn rate.
If you are interested in running lean, check out my auth library and agent implementations on my GitHub. I’ll be hanging around the comments—let me know how you keep your server costs down, or tell me why I'm completely wrong.
Sometimes that crashing is what I want: a dedicated server running one (micro)service in a system that spins up new servers on such crashes (e.g. Kubernetes-alike). I'd rather have it crash immediately than chug along in a degraded state.
But on a shared setup like OP shows, or the old LAMP-on-a-VPS, I'd prefer the system to start swapping and have a chance to recover. IME it quite often does. It will take a few minutes (of near downtime) but avoids data corruption or crash loops much more easily.
Basically, letting Linux handle recovery vs letting a monitoring system handle recovery
for inserts only, into a single table with no indexes.
Also, I didn't get why sqlite was allowed to do batching and pgsql was not.
SQLite on the same machine is akin to calling fwrite. That's fine. It's also a system constraint, as it forces a one-database-per-instance design with no data shared across nodes. That's fine if you're putting together a site for your neighborhood's mom-and-pop shop, but once you need to handle a request baseline beyond a few hundred TPS and serve traffic beyond your local region, you have no alternative but to run more than one instance of your service in parallel. You can continue to shoehorn your one-database-per-service pattern onto the design, but you're now compelled to find "clever" strategies to sync state across nodes.
Those who know better to not do "clever" simply slap a Postgres node and call it a day.
From the article:
>To manage all this, I built laconic, an agentic researcher specifically optimized for running in a constrained 8K context window. It manages the LLM context like an operating system's virtual memory manager—it "pages out" the irrelevant baggage of a conversation, keeping only the absolute most critical facts in the active LLM context window.
The 8K part is the most startling to me. Is that still a thing? I worked under that constraint in 2023 in the early GPT-4 days. I believe Ollama still has the default context window set to 8K for some reason. But the model mentioned on laconic GitHub (Qwen3:4B) should support 32K. (Still pretty small, but.. ;)
I'll have to take a proper look at the architecture, extreme context engineering is a special interest of mine :) Back when Auto-GPT was a thing (think OpenClaw but in 2023), I realized that what most people were using it for was just internet research, and that you could get better results, cheaper, faster, and deterministically, by just writing a 30 line Python script.
Google search (or DDG) -> Scrape top N results -> Shove into LLM for summarization (with optional user query) -> Meta-summary.
In such straightforward, specialized scenarios, letting the LLM drive was, and still is, "swatting a fly with a plasma cannon."
(The analog these days would be that many people would be better off asking Claw to write a scraper for them, than having it drive Chromium 24/7...)
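That 30-line script is roughly this shape — a sketch with stubbed-out functions, where `web_search`, `fetch_page`, and `llm` stand in for whichever search API, HTTP client, and model call you actually use:

```python
# Search -> scrape -> summarize -> meta-summary pipeline.
# All three helpers below are placeholders, not real APIs.

def web_search(query: str, n: int = 5) -> list[str]:
    """Return the top-n result URLs (stub)."""
    return [f"https://example.com/result/{i}" for i in range(n)]

def fetch_page(url: str) -> str:
    """Download a page and strip it to plain text (stub)."""
    return f"text of {url}"

def llm(prompt: str) -> str:
    """Call whatever model you like (stub)."""
    return f"summary of: {prompt[:40]}..."

def research(query: str, n: int = 5) -> str:
    urls = web_search(query, n)
    # Summarize each page independently (cheap, parallelizable).
    summaries = [llm(f"Summarize for '{query}': {fetch_page(u)}") for u in urls]
    # Then one meta-summary over the per-page summaries.
    return llm("Combine these summaries: " + " | ".join(summaries))

report = research("sqlite wal mode tradeoffs")
```

The control flow is fixed and deterministic; the LLM only ever does summarization, never planning.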
Unfortunately, this isn’t something that shows up on spec sheets when you’re choosing a service. :-/
Since I only needed about 3 VMs (though each being a bit beefier, running containers on them, a web server sitting in front of those with vhosts as ingress), I could give each VM its own IPv4 address and it didn’t end up being too expensive for my use case. Would be a bit different for someone who wants many small VMs.
I assign few VMs public IPs and use them as ingress / SSL termination / load balancer for my workloads running on VMs with only internal IPs.
I personally use kvm with libvirt and manage all these with Ansible.
If you have a plan from the start and you know what you'll need and you're pretty confident it won't change, then sure.
If you want a box that you can slice and dice however you want (VMs, containers, etc) then something like Proxmox might be worth it.
Building a $10K MRR website is hard. Building multiple (assuming "multiple" here means >= 3) $10K MRR websites is extremely hard.
I don't know which investors they pitched to, but most investors seeing that number will write a 100-200K check to invest in THE PERSON pretty immediately, unless there were strong red flags in their business model (porn, drugs, gambling, etc.)
Since I'm in finance I would say: turnover is vanity, positive cashflow is sanity... but it's not nearly as catchy
Worth noting, however, that they are starting to introduce rate limits lately, so you might struggle to run multiple concurrent sessions, though this is very inconsistent for me. Some days I can run 3-4 sessions concurrently all day; other times I get rate limited running just one non-stop.
It's a static blog that renders markdown... there's literally nothing to code, let alone vibe code.
In other words, what gets you to $10k MRR isn’t the same thing(s) for 2x, 5x, or 10x that.
I can build whatever, I just have zero clue whatsoever what to build. Never have.
I was thinking more of
Running multiple websites, i.e. 1 application per namespace; tooling, i.e. k9s for looking at logs; upgrading applications; etc.
And one simple mistake, and we're screwed
Actually, there are no inserts in this example: each transaction is 2 updates inside a logical transaction that can be rolled back (a savepoint). So in raw terms you are talking 200k updates per second and 600k reads per second (there's a 75%/25% read/write mix in that example). Also worth keeping in mind that updates are slower than inserts.
> no indexes.
The tables have an index on the primary key with a billion rows. More indexes would add write amplification which would affect both databases negatively (likely PG more).
> Also, I didn't get why sqlite was allowed to do batching and pgsql was not.
Interactive transactions [1] are very hard to batch over a network. To get the same effect you'd have to limit PG to a single connection (defeating the point of MVCC).
- [1] An interactive transaction is a transaction where you intermingle database queries and application logic (running on the application).
Actually 35% faster than fwrite [1].
> This is also a system constraint as it forces a one-database-per-instance design
You can scale incredibly far on a single node and have much better uptime than GitHub or Anthropic. At this rate, maybe even AWS/Cloudflare.
> you need to serve traffic beyond your local region
Postgres still has a single node that can write, so most of the time you end up region-sharding anyway. Sharding SQLite is straightforward.
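Sharding SQLite can be as simple as hashing a key to one of N database files. A minimal sketch (shard count and schema are illustrative, not from the comment above):

```python
import sqlite3
import zlib

N_SHARDS = 4

def shard_for(user_id: str) -> sqlite3.Connection:
    # A stable hash of the shard key picks one of N independent .db files.
    idx = zlib.crc32(user_id.encode()) % N_SHARDS
    conn = sqlite3.connect(f"shard_{idx}.db")
    conn.execute("CREATE TABLE IF NOT EXISTS events (user_id TEXT, payload TEXT)")
    return conn

# All of alice's traffic deterministically lands on the same shard.
conn = shard_for("alice")
conn.execute("INSERT INTO events VALUES (?, ?)", ("alice", "signup"))
conn.commit()
```

Because every shard is just a file, moving a shard to another box is an rsync, not a cluster operation.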
> This is fine if you're putting together a site for your neighborhood's mom and pop shop, but once you need to handle a request baseline beyond a few hundreds TPS
It's actually pretty good for running a real-time multiplayer app with a billion datapoints on a $5 VPS [2]. There's nothing clever going on here: all the state is on the server and the backend is fast.
> but you're now compelled to find "clever" strategies to sync state across nodes.
That's the neat part: you don't. For most things that are not uplink-limited (being a CDN, Netflix, Dropbox), a single node is all you need.
Exactly. Back in the real world, anyone faced with that sort of use case will simply add a memory cache and not bother with the persistence layer.
Running 100,000 `SELECT 1` queries:
PostgreSQL (localhost): 2.71 seconds
SQLite (in-memory): 0.07 seconds
SQLite (tempfile): 0.07 seconds
(https://gist.github.com/leifkb/d8778422d450d9a3f103ed43258cc...)

Possibly. But possibly you have a very long tail of sites that you hardly ever look at, and that change more frequently than you use them, and maintaining the scraper is harder work than just using Chromium.
The dream is that the Claw would judge for itself whether to write a scraper or hand-drive the browser.
That might happen more easily if LLMs were a bit lazier. If they didn't like doing drudgery they would be motivated to automate it away. Unfortunately they are much too willing to do long, boring, repetitive tasks.
Neither is "apt install caddy".
DevOps engineers who didn't know cable management 101 or even what a cage nut is, amazed to see a small office running 3 used Dell servers bought dirt cheap, shocked when it sounded like an air raid siren when they booted up, and convinced hot swapping was just magic.
It has always been this way. Back in the 80s and 90s, programmers shook their heads when people stopped learning assembly and fully trusted compilers.
This is nothing new and hardly shocking. New skills are learned only if they're valuable; otherwise, the layer below looks like magic.
This is specious reasoning. You don't prevent anything by adding artificial constraints. To put things in perspective, Hetzner's cheapest vCPU plan comes with 4GB of RAM.
Sure, but I would expect you to have at least one data point, or at least be near one, before making any estimates on that timescale. I don't see many people make MRR projections based on 2 days of sales; it's just something I've noticed with startups and ARR.
# ioping -R /dev/sda
--- /dev/sda (block device 38.1 GiB) ioping statistics ---
22.7 k requests completed in 2.96 s, 88.8 MiB read, 7.68 k iops, 30.0 MiB/s
generated 22.7 k requests in 3.00 s, 88.8 MiB, 7.58 k iops, 29.6 MiB/s
min/avg/max/mdev = 72.2 us / 130.2 us / 2.53 ms / 75.6 us

It seems to me that I get far more good ideas than I can act on.
$20 x 1000 => $20,000 // not more than what they make a month even if "multiple" here means 2
You can view application logs with anything that can read a text file, or journalctl if your distro is using that.
There are many methods of performing application upgrades with minimal downtime.
0: https://www.man7.org/linux/man-pages/man7/namespaces.7.html
If you were seeing errors due to concurrent writes, you need to adjust `PRAGMA busy_timeout`.
I feel like the advice from people with your experience is worth way way way way more than what you'd hear from big tech. Like what you said yourself, big tech tends to recommend extremely complicated systems that only seem worth maintaining if you have a trillion dollar monopoly behind it.
> - [1] An interactive transaction is a transaction where you intermingle database queries and application logic (running on the application).
could you give specific example why do you think SQlite can do batching and PG not?
Nonsense. You can't outrun physics. The latency across the Atlantic is already ~100ms, and from the US to Asia Pacific can be ~300ms. If you are interested in performance and you need to shave off ~200ms in latency, you deploy an instance closer to your users. It makes absolutely no sense to frame the rationale around performance if your systems architecture imposes a massive performance penalty in networking just to shave a couple of ms in roundtrips to a data store. Absurd.
extremely lazy, large model
+
extremely diligent Ralph

Not sure the top model should be the biggest one, though. I hear opposite opinions there: a small model that delegates coding to bigger models, vs. a big model that delegates coding to small models.
The issue is you don't want the main driver to be big, but it needs to be big enough to have common sense w.r.t. delegating both up[0] and down...
[0] i.e. "too hard for me, I will ping Opus ..." :) Do models have that level of self-awareness? I want to say they can after a failed attempt, but my failure mode is that the model "succeeds" and the solution is total ass.
I hope you understand that your claim boils down to stating that SQLite is faster at doing nothing at all, which is a silly case to make.
You probably won't see this unless both the following are true for your situation:
1) You have a workload that makes this issue noticeable. Long-lived connections and large transfer sizes make it more likely you'll notice. Loading 20kb of static html over the connection likely won't seem to have any problems (unless you run repeated trials and network analysis tools). Of course, modern websites can be pretty large...
2) Your users are long-term enough and in communication with you so these issues can even be noticed in the first place. Also helps if they're technical. If you're not hearing the story and aware of the situation on the other end of the line, all you see is a slow connection, could be anything causing it, and there are plenty of them for reasons that have to do with things closer to the client's end.
So all e.g. an e-commerce site might see is a somewhat higher bounce rate than necessary (due to some fraction of their users experiencing the site like it's on a somewhat-jittery ISDN line) without even knowing they're leaving money on the table because they likely have no way of even being aware of the problem.
[EDIT] Yes, we tried shifting around a bunch of ways on DO's side trying all kinds of ways to fix this, I'm quite sure it wasn't that we were unlucky with our hardware draw there or just one of their datacenters had this problem. It was something past the edge of their network.
And since it tends to reach for the most web-represented solution, that means infinite redis caches doing the same thing, k8s, and/or Vercel.
Best mental model: imagine something that produces great tactical architecture, with zero strategic architecture, running in a loop.
Consistency is key for the grindset.
A year or so after I left they ran out of money. They would've lasted longer if the infra guy would've just stayed the backend guy and helped get projects done more quickly instead of shiny k8s setups for projects with a dozen end-users per day. Recently I saw that the CTO has started a new startup - and ironically the only guy who he took with him onto the new team looks to have been the infra guy!
I don't blame infra guy, he genuinely believed he was doing the right thing.
$100 is peanuts to most businesses, of course. But even so, I'd rather spend it on fixing an actual bottleneck.
For example: Ticketmaster makes a ton of money and their site is complete dogshit.
if the scalability is in the number of "zero cost" projects to start, then 5 vs 15 is a 3x factor.
Same as 95+% of people.
He does not say what kind of funding he has been trying to get, but if my presumption is right, then some kind of Y-Combinator style hypergrowth.
I think the response he got is sensible if he was approaching "Excel investors" who are risk averse, not targeting hypergrowth.
So comfortable that lately I have declined offers for interesting and much much better paid work, because I can no longer be bothered to take any risks or alter my lifestyle.
But sometimes I wish I could have been the guy managing to get $10k MRR using knowledge I've got in spades.
beginTx
// query to get some data (network hop)
result = exec(query1)
// application code that needs to run in the application
safeResult = transformAndValidate(result)
// query to write the data (network hop)
exec(query2, safeResult)
endTx

How would you batch this in Postgres and get any value? You can nest them all in a single transaction, but because they are interactive transactions, that doesn't reduce your number of network hops.
The only thing you can batch in postgres to avoid network hops is bulk inserts/updates.
But, the minute you have interactive transactions you cannot batch and gain anything when there is a network.
Your best bet is to not have an interactive transaction and port all of that application code to a stored procedure.
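For contrast, here is the interactive-transaction pseudocode above realized in-process, where every "network hop" collapses into a function call (a sketch using Python's sqlite3; the accounts schema is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")

def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    with conn:  # BEGIN ... COMMIT, or ROLLBACK if an exception escapes
        # Query to get some data (a function call, not a network hop).
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id=?", (src,)).fetchone()
        # Application logic running between the queries.
        if balance < amount:
            raise ValueError("insufficient funds")
        # Writes, still inside the same transaction.
        conn.execute("UPDATE accounts SET balance=balance-? WHERE id=?", (amount, src))
        conn.execute("UPDATE accounts SET balance=balance+? WHERE id=?", (amount, dst))

transfer(conn, 1, 2, 30)
```

The round trips that make this pattern expensive over a network simply don't exist here, which is the whole argument.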
Your reading/learning material can spin out of those constraints.
So for me my recent constraints were:
1. Multiplayer/collaborative web apps built by small teams.
2. Single box.
3. I like writing lisp.
So single box pushes me towards a faster language, and something that's easy to deploy. Go would be the natural choice here, but I want a lisp so Clojure is probably the best option here (helps that I already know it). JVM is fast enough and has a pretty good deployment story. Multiplayer web apps, pushed me to explore distributed state vs streaming with centralised state. This became a whole journey which ended with Datastar [1]. Thing is immediate mode streaming HTML needs your database queries to be fast and that's how I ended up on SQLite (I was already a fan, and had used it in production before), but the constraints of streaming HTML forced me to revisit it in anger.
Your constraints could be completely different. They could be:
1. Fast to market.
2. Minimise risk.
3. Mobile + Web
4. Try something new.
Fast to market might mean you go with something like Rails/Django. Minimise risk might mean you go with Rails because you have a load of experience with it. Mobile + web means you read up on Hotwire. Try something new might mean you push more logic into stored procedures and SQL queries so you can get the most out of Postgres and make your Rails app faster. So you read The Art of Postgresql [2] (great book). Or maybe you try hosting rails on a VPS and set up/manage your own postgres instance.
A few companies back mine were:
1. JVM but with a more ruby/rails like development experience.
2. Mobile but not separate iOS/Android projects.
3. Avoid the pain of app store releases.
4. You can't innovate everywhere.
That meant Clojure. React native. Minimal clients with as much driven from the backend as possible. Sticking to postgres and Heroku because it's what we knew and worked well enough.
- [1] https://data-star.dev
- [2] https://theartofpostgresql.com
There's no right answer. Hope that's helpful.
Not everyone needs monopolistic tech to do their work. There's probably less than 10,000 companies on earth that truly need to write 240k rows/second. For everyone else, we can focus on better things.
I don't know what you value your time or opportunity cost as... but the $10/mo doesn't need to save very many minutes of your time deferring dealing with a resource constraint or add too much reliability to pay off.
If resource limitations end up upsetting one end user, that costs more than $10.
The author's stack left me thinking about how he will restart the app if it crashes, versioning, containers, infra as code.
I've seen these articles before... the Ruby on Rails guys had the same idea and built https://kamal-deploy.org/
Which starts to look more and more like K3s as time goes on.
So, if you have a network server that does BEGIN TRANSACTION (process 1000 requests) COMMIT (send 1000 acks to clients), with sqlite, your rollback rate from conflicts will be zero.
For PG with multiple clients, it’ll tend to 100% rollbacks if the transactions can conflict at all.
You could configure PG to only allow one network connection at a time, and get a similar effect, but then you’re paying for MVCC, and a bunch of other stuff that you don’t need.
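A sketch of that group-commit pattern (function and table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, body TEXT)")

def handle_batch(conn: sqlite3.Connection, bodies: list[str]) -> int:
    # One COMMIT (one fsync) covers the whole batch; with a single
    # writer there are no conflicting transactions to roll back.
    with conn:  # BEGIN ... COMMIT
        conn.executemany(
            "INSERT INTO requests (body) VALUES (?)",
            [(b,) for b in bodies],
        )
    # Ack all clients only after the single COMMIT has succeeded.
    return len(bodies)

acked = handle_batch(conn, [f"req-{i}" for i in range(1000)])
```

The server accumulates requests, commits them in one shot, then acks — amortizing the commit cost over the whole batch.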
Deployment, caddy holds open incoming connections whilst your app drains the current request queue and restarts. This is all sub second and imperceptible. You can do fancier things than this with two version of the app running on the same box if that's your thing. In my case I can also hot patch the running app as it's the JVM.
Server hard drive failing etc you have a few options:
1. Spin up a new server/VPS and litestream the backup (the application automatically does this on start).
2. If your data is truly colossal have a warm backup VPS with a snapshot of the data so litestream has to stream less data.
Pretty easy to have 3 to 4 9s of availability this way (which is more than github, anthropic etc).
- When AWS/GCP goes down, how do most handle HA?
- When a database server goes down, how do most handle HA?
- When Cloudflare goes down, how do most handle HA?
The downtime here is the server crashing, routing failing, or some other issue with the host. You wait.
One may run pingdom or something to alert you.
Running 100,000 `SELECT 1` queries:
PostgreSQL (localhost): 2.84 seconds
PostgreSQL (Unix socket): 1.93 seconds
SQLite (in-memory): 0.07 seconds
SQLite (tempfile): 0.06 seconds
(https://gist.github.com/leifkb/b940b8cdd8e0432cc58670bbc0c33...)

It is specious reasoning. Self-imposed arbitrary constraints don't make you write good, performant code. At most they make your apps run slower, because they will needlessly hit your self-imposed arbitrary limits.
If you put any value on performant code, you just write performance-oriented code, regardless of your constraints. It's silly to pile on absurd constraints and expect performance to be the outcome. It's like going to the gym and working out with one hand tied behind your back, expecting that silly constraint to somehow improve the outcome of your workout. Complete nonsense.
And to drive the point home, this whole concern is even more perplexing as you are somehow targeting computational resources that fall below free tiers of some cloud providers. Sheer lunacy.
And most VPSs allow increasing memory with a click of a button and a reboot.
Worrying about HA when you don't have customers that need it is one thing, but I wouldn't want to be in a place where I have to put a banner on the website asking users to please make a new account because we had an oopsie.
I think your analogy is flawed; a more apt one would be training with deliberately reduced oxygen levels, which trains your body to perform with fewer resources. Once you lift that constraint, you’ll perform better.
You’re correct that you can write performant code without being required to do so, but in practice, that is a rare trait.
You seem terribly confused. Backups don't buy you high availability. At best, they buy you disaster recovery. If your node goes down in flames, your users don't continue to get service because you have an external HD with last week's db snapshots.
This is a disingenuous scenario. SQLite doesn't buy you uptime if you deploy your app to AWS/GCP, and you can just as easily deploy a proper RDBMS such as postgres to a small provider/self-host.
Do you actually have any concrete scenario that supports your belief?
Streaming replication lets you spin up new nodes quickly with sub second dataloss in the event of anything happening to your server. It makes having a warm standby/failover trivial (if your dataset is large enough to warrant it).
If your backups are a week old snapshots, you have bigger problems to worry about than HA.
This is... not true of many hyperscaler outages? Frequently, outages leave individual VMs running and affect only the higher-order services typically used in more complex architectures. Folks running SQLite on an EC2 instance often will not be affected.
And obviously, don't use us-east-1. This One Simple Trick can improve your HA story.