Left unchecked, Claude is very happy to propose "robust, scalable and production-ready" solutions - you can try it for yourself. Tell it you want to handle new signups and perform some work, like sending an email, outside the lifecycle of the web request.
That is, imply you need some kind of background workload and watch it bring in Redis, workflow engines, multiple Docker deployment layouts so you can run with and without jobs, an obscene number of environment variables to configure all that, plus "fallbacks" and retries and all kinds of things that you will never spend time on during an MVP and will resist adding even later, just because of the complexity and maintenance they require.
All that while (as in the diagram of the post), there is an Erlang/Elixir app capable of doing all that in memory :).
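For contrast, here is a minimal sketch of the same in-memory idea in Python rather than Erlang/Elixir; send_welcome_email and the assumption that losing a queued job on a crash is acceptable at MVP stage are mine, not the post's:

```python
# Minimal in-process background worker: no Redis, no workflow engine.
# send_welcome_email() is a hypothetical stand-in for your real email call,
# and queued jobs are lost if the process dies - often fine for an MVP.
import queue
import threading

jobs: "queue.Queue[str]" = queue.Queue()

def send_welcome_email(address: str) -> None:
    print(f"sending welcome email to {address}")

def worker() -> None:
    while True:
        address = jobs.get()      # blocks until a signup is queued
        try:
            send_welcome_email(address)
        finally:
            jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_signup(address: str) -> None:
    # ... create the user row ...
    jobs.put(address)             # returns immediately; the email is sent off-request
```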
This does assume that said complexity can be added ad hoc later. Often earlier architecture choices make such additions complex too, or even prevent them entirely without a complete rewrite.
So while the overall message is true, there is some liberal use of simplification at play here too.
In some cases a compromise can make sense. E.g. use k8s but keep it simple within that - as vanilla as you can make it.
OTOH, If you are trying to sell the idea to investors and large companies that you are a serious player and have a plan and know-how to grow and scale your service quickly, maybe you do want to show that you have the design chops and ability to actually scale your product. Take a look and ask yourself, "Does my business model only work if it scales up dramatically, far beyond the capacity of a single database?" If the answer is "yes", start with a scalable architecture to save the 100+ person-years and endless gnashing of teeth it will take to untangle your monolith (been there.)
The problem is that people don't like hearing their ideas suck. I do this too, to be fair. So, yes, we spend endless hours architecting what we'd desperately hope will be the next Facebook because hearing "you are definitely not the next Facebook" sucks. But alas, that's what doing startups is: mostly building 1000 not-Facebooks.
The lesson here is that the faster you fail, the faster you can succeed.
I never worked at a FAANG-ish company, and in the course of my 10-year career I spent most of my efforts on stopping organizations from building the wrong thing in the first place, not on "making things scalable" from the get-go. My view is that if you have product-market fit, you can throw money at the problem for a very, very long time and do just fine, so everyone in the org should focus on achieving PMF as soon as possible.
The question of "How would you scale a Django service to 10M requests per day" came up, and my answer of just scaling components vertically and purchasing stronger servers obviously was not satisfactory.
Now every system design interview expects you to build some monstrous stack with layers of caching and databases for a hypothetical 1M DAU (daily active users) app.
Mess in the head.
Absolutely spot-on site. Love it.
There are things we don't want to do (talk to customers, investors, legal, etc.), so instead we do the fun things (fun for engineers).
It’s a convenient arrangement because we can easily convince ourselves and others that we’re actually being productive (we’re not, we’re just spinning wheels).
My argument is this: even if the system itself becomes more complex, it might be worth it to make it better partitioned for human reasoning. I tend to quickly get overwhelmed and my memory is getting worse by the minute. It's a blessing for me to have smaller services that I can reason about, predict consequences from, deeply understand. I can ignore everything else. When I have to deal with the infrastructure, I can focus on that alone. We also have better and more declarative tools for handling infrastructure compared to code. It's a blessing when 18 services don't use the same database, and it's a blessing when 17 services aren't colocated in the same repository with dependencies that most people don't even identify as dependencies. Think law of leaky abstractions.
Really that's going way too far - you do NOT need Redis for caching. Just put it in Postgres. Why go to this much trouble to put people in their place for over engineering then concede "maybe Redis for caching" when this is absolutely something you can do in Postgres. The author clearly cannot stop their own inner desire for overengineering.
I personally would suggest the vast majority of those startups do not need Kubernetes and certainly don't need to be paying a consultancy to then pay me to fix their issues for them.
Especially in an age where you can basically click a menu in GitHub and say "Hey, can I have a CI pipeline please?"
I totally get the point it makes. I remember many years ago we announced SocketStream at a HackerNews meet-up and it went straight to #1. The traffic was incredible but none of us were DevOps pros so I ended up restarting the Node.js process manually via SSH from a pub in London every time the Node.js process crashed.
If only I'd known about upstart on Ubuntu then I'd have saved some trouble for that night at least.
I think the other thing is worrying about SPOF and knowing how to respond if services go down for any reason (e.g. the server runs out of disk space - perhaps log rotation hasn't been set up - or has a hardware failure of some kind, or the data center has an outage - I remember Linode would have a few in their London datacenter that just happened to occur at the worst possible time).
If you're building a side project I can see the appeal of not going overboard and setting up a Kubernetes cluster from the get-go, but when it is things that are more serious and critical (like digital infrastructure for supporting car services like remotely turning on climate controls in a car), then you design the system like your life depends on it.
Yes, I also put Redis in that list. You can cache and serve data structures in many other ways, for example by replicating the individual features you need in your application instead of going the lazy route and adding another service to the mix. And don't get me started on Kafka... money thrown down the drain when a stupid grpc/whatever service would do.
Part of being an engineer is also selecting the minimum number of components for your architecture and not being afraid of implementing something on your own if you only need 1 of the 100s of features that an existing product offers.
Adding guardrails to protect your team from itself mandates some complexity, but just hand-waving that away as unnecessary is a bad answer. At least if you're not working as part of a team.
However, having any of those items from the left side of the deployment strategy on your CV is way cooler than mentioning n-tier architecture, RPC (regardless of how it goes over the wire), any 1990s programming language, and so forth.
A side effect of how badly hiring works in our industry: it isn't enough to know how to master a knife to be a chef, it must be a specific brand of knife, otherwise the chef is not good enough for the kitchen.
Or at least it's not engaging with the obvious counterargument at all: "You may not need the scale now, but you may need it later". For a startup, being a unicorn with a bajillion users is the only outcome that actually counts as success. It's the outcome they sell to their investors.
So sure, you can make an unscalable solution that works for the current moment. Most likely you won't need more. But that's only true b/c most startups don't end up unicorns. Most likely you burn through their VC funding and fold.
Okay, Stack Overflow allegedly runs on a toaster, but most products don't fit that mold - and now that they're tied to their toaster it probably severely constrains what SO can do in terms of evolving their service.
Except it's really a "What over-engineered monstrosity have you built?" in the theme of "choose boring technology"
p.s. MariaDB (MySQL fork) is technically older and more boring than PostgreSQL so they're both equally valid choices. Best choice is ultimately whatever you're most familiar with.
I recently had to work with SQL again after many years, and was appalled at the incidental complexity and ridiculous limitations. Who in this century would still voluntarily do in-band database commands mixed with user-supplied data? Also, the restrictions on column naming mean that you pretty much have to use some kind of ORM mapping, you can't just store your data. That means an entire layer of code that your application doesn't really need, just to adapt to a non-standard from the 70s.
"just use postgres" is not good advice.
Once you have a service that has users and costs actual money, while you don’t need to make it a spaghetti of 100 software products, you need a bit of redundancy at each layer — backend, frontend, databases, background jobs — so that you don’t end up in a catastrophic failure mode each time some piece of software decides to barf.
That diagram is just AWS, a programming language, a database. For some reason Hadoop, I guess. And Riak/OpenStack as redundant.
It just seems like pretty standard stuff with some seemingly small extra parts that make me think someone on the team was familiar with something like Ruby, so they used that instead of Java.
"Why is Redis talking to MongoDB" It isn't.
"Why do you even use MongoDB" Because that's the only database there, and nosql schemaless solutions are faster to get started... because you don't have to specify a schema. It's not something I would ever choose, but there is a reason for it.
"Let's talk about scale" Let's not, because other than hadoop, these are all valid solutions for projects that don't prioritize scale. Things like a distributed system aren't just about technology, but also data design that aren't that difficult to do and are useful for reasons other thant performance.
"Your deployment strategy" Honestly, even 15 microservices and 8 databases (assuming that it's really 2 databases across multiple envs) aren't that bad. If they are small and can be put on one single server, they can be reproduced for dev/testing purposes without all the networking cruft that devops can spend their time dealing with.
In almost any other scenario I feel the author is being intentionally obtuse about much of the reality surrounding technology decisions. An engineer operating a linux box running postgres & redis (or working in an environment with this approach) would become increasingly irrelevant & would certainly earn far less than the engineer operating the other. An engineering department following "complexity is not a virtue" would either struggle to hire or employ engineers considered up-to-date in 2006.
Management & EXCO would also have different incentives, in my limited observations I would say that middle and upper management are incentivised to increase the importance of thier respective verticals either in terms of headcount, budget or tech stack.
Both examples achieve a similar outcome except one is : scalable, fault tolerant, automated and the other is at best a VM at Hetzner that would be swiftly replaced should it have any importance to the org, the main argument here (and in the wild) seems to be "but its haaaard" or "I dont want to keep up with the tech"
KISS has a place and I certainly appreciate it in the software I use and operating systems I prefer but lets take a moment to consider the other folks in the industry who aren't happy to babysit a VM until they retire (or become redundant) before dispensing blanket advice like we are all at a 2018 ted talk . Thanks for coming to my ted talk
You have an app which runs, now you want to put it in a container somewhere. Great. How do you build that container? GitHub Actions. Great. How does that deploy your app to wherever it's running? Err... docker tag + docker push + ssh + docker pull + docker restart?
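For what it's worth, that "Err..." answer written down is only a handful of lines and is exactly the sequence named above; the registry, host and container names here are placeholders:

```sh
#!/usr/bin/env sh
# Naive but workable deploy: build and push the image, then pull and restart
# it on the target host. registry.example.com, app.example.com and "myapp"
# are placeholder names.
set -eu
TAG="registry.example.com/myapp:$(git rev-parse --short HEAD)"

docker build -t "$TAG" .
docker push "$TAG"

ssh deploy@app.example.com "
  docker pull '$TAG'
  docker rm -f myapp 2>/dev/null || true
  docker run -d --name myapp --restart=always -p 80:8000 '$TAG'
"
```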
You've hit scale. You want Redis now. How do you deploy that? Do you want Redis, your machine, and your db in three separate datacenters and to pay egress between all the services? Probably not, so you just want a little Redis sidecar container... How does the app get the connection string for it?
When you're into home-grown shim scripts which _are_ brittle and error prone, it's messy. K8s is a sledgehammer, but it's a sledgehammer that works. ECS is AWS-only, and has its own fair share of warts. Cloud Run/Azure Container Apps are _great_ but there's nothing like those to run on DigitalOcean/Hetzner/whatever. So your choices are to use a big cloud with a simpler orchestration, or use some sort of non-standard orchestration that you have to manage yourself, or just use k8s...
Consider WhatsApp could do 2M TCP connections on a single server 13 years ago, and Ford sells about 2M cars per year. Basic controls like changing the climate can definitely fit in one TCP packet, and aren't sent frequently, so with some hand-waving, it would be reasonable to expect a single server to handle all remote controls for a manufacturer for all cars from some year model.
Or maybe you could use wifi-direct and bypass the need for a server.
Or a button on the key fob. Perhaps the app can talk to the key fob over NFC or Bluetooth? Local/non-internet controls will probably be more reliable off-grid... can't have a server outage if there are no servers.
I guess my point is if you take a step back, there are often simple, good solutions possible.
That's fine, 6 of them are test accounts :-)
> It's sure a corny stance to hold if you're navigating an infrastructure nightmare daily, but in my opinion, much of the complexity addresses not technical, but organisational issues
If you have an entire organisation dedicated to 6 users, those users had better be ultra profitable.
> If the process crashes or your harddisk dies, you want redundancy so even those twelve customers can still access the application
Can be done simply by a sole company owner; no need for tools that make sense in an organisation (K8s, etc)
> You want a CI pipeline, so the junior developer can't just break prod because they forgot to run the tests before pushing.
A deployment script that includes test runners is fine for a focused product. You can even do it using a green/blue strategy if you can afford the extra $5-$10/m for an extra VPS.
> You want proper secret management, so the database credentials aren't just accessible to everyone.
Sure, but you don't need to deploy a full-on secrets-manager product for this.
> You want a caching layer, so you're not surprised by a rogue SQL query that takes way too long, or a surge of users that exhaust the database connections because you never bothered to add proper pooling.
Meh. The caching layer is not to protect you against rogue SQL queries taking too long; that's not what a cache is for, after all. As for proper pooling, what's wrong with using the pool that came with your tech stack? Do you really need to spend time setting up a different product for pooling?
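For instance, if your stack is SQLAlchemy, the engine already ships with a connection pool; tuning it is a couple of keyword arguments rather than a separate product (the connection string and numbers below are placeholders):

```python
from sqlalchemy import create_engine, text

# The engine's built-in QueuePool is usually all the pooling a small app needs.
engine = create_engine(
    "postgresql+psycopg://app:secret@localhost/app",  # placeholder credentials
    pool_size=5,         # steady-state connections kept open
    max_overflow=10,     # extra connections allowed under load
    pool_timeout=30,     # seconds to wait for a free connection before erroring
    pool_pre_ping=True,  # discard dead connections instead of handing them out
)

with engine.connect() as conn:
    print(conn.execute(text("select 1")).scalar())
```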
> Adding guardrails to protect your team from itself mandates some complexity, but just hand-waving that away as unnecessary is a bad answer.
I agree with that; the key is knowing when those things are needed, and TBH unless you're doing a B2C product, or have an extremely large B2B client, those things are unnecessary.
Whatever happened to "profile, then optimise"?
Yes, it's nonsense, stirring up a turbulent slurry of eventually consistent components for the sake of supporting hundreds of users per second. But it's also the nonsense that you're expected to say, so just do it.
lol, In the diagram, Redis is not even talking with MongoDB
Well put!
You can get all that with a monolith server and a Postgres backend.
Make them part of your build first. Tagging a release? Have a documented process (checklist) that says 'run this, do that'. Like how in a Java Maven build you would execute `mvn release:prepare` and `mvn release:perform`, which will execute all tests as well as do the git tagging and anything else that needs doing.
Scale up to a CI pipeline once that works. It is step one for doing that anyway.
(except the caching layer. Remember the three hard problems of computer science, of which cache invalidation is one.)
Still hoping for a good "steelman" demonstration of microservices for something that isn't FAANG-sized.
> Organizations which design systems... are constrained to produce designs which are copies of the communication structures of these organizations.
For example, in the recent "who's hiring" thread, I saw at least two places where they did that: Duckduckgo (they mention only algorithms and data structures and say "in case you're curious, we use Perl") and Stream (they offer a 10-week intro course to Go if you're not already familiar with it). If I remember correctly, Jane Street also doesn't require prior OCaml experience.
The place where I work (bevuta IT GmbH) also allowed me to learn Clojure on the job (but it certainly helped that I was already an expert in another Lisp dialect).
These hiring practices are a far cry from those old style job postings like "must have 10+ years of experience with Ruby on Rails" when the framework was only 5 years old.
You're making two assumptions - both wrong:
1) That this is an unscalable solution - A monolith app server backed by Postgres can take you very very far. You can vertically scale by throwing more hardware at it, and you can horizontally scale, by just duplicating your monolith server behind a load-balancer.
2) That you actually know where your bottlenecks will be when you actually hit your target scale. When (if) you go from 1000 users to 10,000,000 users, you WILL be re-designing and re-architecting your solution regardless what you started with because at that point, you're going to have a different team, different use-cases, and therefore a different business.
I've built pretty scalable things using nothing but Python, Celery and Postgres (that usually started as asyncio queues and sqlite).
And some of these guidelines have grown into status quo common recipes. Take your starting database, for example: the guideline is always "SQLite only for testing, but for production you want Postgres" - it's misleading and absolutely unnecessary. These defaults have also become embedded into PaaS services, e.g. the likes of Fly or Scaleway - having a disk attached to a VM instance where you can write data is never a default and is usually complicated or expensive to set up. All while there is nothing wrong with a disk that gets backed up - it can support most modern mid-sized apps out there before you need block storage and whatnot.
But I have to defend Tailwind, it's not a massive CSS framework, it just generates CSS utility classes. Only the utility classes you use end up in the output CSS.
It'll give you time to redesign and rebuild so Postgres is fast enough again. Then you can take Redis out, but once you've set it up you may as well keep it running just in case.
Redis/valkey is definitely overkill though. A slightly modified memcached config (only so it accepts larger items; server responses larger than 1MB aren't always avoidable) is a far simpler solution that provides 99% of what you need in practice. Unlike redis/valkey, it's also explicitly a volatile cache that can't do persistence, meaning you are disincentivized from bad software design patterns where the cache becomes state your application assumes any level of consistency of (including its existence). If you aren't serving millions of users, a stateful cache is a pattern best avoided.
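"Slightly modified config" here can be as small as raising the item size limit on the command line; the numbers are arbitrary examples:

```sh
# -m: memory budget in MB; -I: max item size (default 1 MB) raised to 4 MB
memcached -d -m 256 -I 4m
```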
DB caches aren't very good mostly because of speed; they have to read from the filesystem (and have network overhead), while a cache reads from memory and can often just live on the same server as the rest of the service.
In that scenario, the last thing you need is another layer between application and database.
Even in a distributed environment, you can scale pretty far with direct-to-database as you say.
Until you get to 100 test users. Then you need Kafka and k8s.
Sure, they aren't bad. They're horrible.
Clown fiesta.
As opposed to what? Not doing anything at all and participating in this insanity of complexity?
To be fair, it's hard to imagine economy and civilization crashing hard enough to force us to be more efficient. But who knows.
Unless you actively push yourself to do the uncomfortable work every day, you will always slowly deteriorate and you will run into huge issues in the future that could've been avoided.
And that doesn't just apply to software.
Or is it to satisfy the ideals of some CTO/VPE disconnected from the real world that wants architecture to be done a certain way?
I still remember doing systems design interviews a few years ago when microservices were in vogue, and my routine was probing if they were ok with a simpler monolith or if they wanted to go crazy on cloud-native, serverless and microservices shizzle.
It did backfire once on a cloud infrastructure company that had "microservices" plastered in their marketing, even though the people interviewing me actually hated it. They offered me an IC position (which I told them to fuck off), because they really hated how I did the exercise with microservices.
Before that, it almost backfired when I initially offered a monolith for a (unbeknownst to me) microservice-heavy company. Luckily I managed to read the room and pivot to microservice during the 1h systems design exercise.
EDIT: Point is, people in positions of power have very clear expectations/preferences of what they want, and it's not fun burning political capital to go against those preferences.
The caching abstractions your frameworks have are also likely designed with something like Redis in mind and work with it out of the box. And often you can just start with an in-memory cache and add Redis later, if you need it.
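In Django, for example, that swap is a settings change and nothing else; the call sites using the cache API stay untouched (the backend choice and Redis location below are illustrative):

```python
# settings.py -- start with the process-local in-memory cache:
CACHES = {
    "default": {"BACKEND": "django.core.cache.backends.locmem.LocMemCache"},
}

# Later, if a shared cache is actually needed, swap the backend (Django 4+
# ships a Redis backend) without touching code that does
# `from django.core.cache import cache; cache.get(...)`:
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    },
}
```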
I think that redis is a reasonable exception to the rule of ”don’t complicate things” because it’s so simple. Even if you have never used it before, it takes a few minutes to setup and it’s very easy to reason about, unlike mongodb or Kafka or k8s.
PostgreSQL from 1996, based on Postgres95 from 1995, based on POSTGRES from 1989, based on INGRES from 1974(?).
I wonder if any lines of 1970s or at least 1980s code still survive in some corner of the PostgreSQL code base, or if everything has been rewritten at least once by now? It must have started out in K&R C, if it was even C?
I mean it will happen regardless just from the side effects of complexity. With a simpler system you can at least save on maintenance and overhead.
Interesting term. Probably pretty on-point.
I’ve been shipping (as opposed to just “writing”) software for almost my entire adult life.
In my experience, there’s a lot of “not fun” stuff involved in shipping.
I should get off HN, close the editor where I'm dicking about with HTMX, and actually close some fucking tickets today.
Right after I make another pot of coffee.
...
No. Now. Two tickets, then coffee.
Thank you for the kick up the arse.
https://www.postgresql.org/docs/18/sql-createtable.html#SQL-...
The reason startups get to their super kubernetes 6 layers mega AWS powered ultra cached hyper pipelined ultra optimised web queued application with no users is because "but technology X has support for an eventually consistent in-memory caching layer!!"
What about when we launch and hit the front page of HN how will the site stay up without "an eventually consistent in-memory caching layer"?
Why?
Because in 1999 when I started using PHP3 to write websites, I couldn't get MySQL to work properly and Postgres was harder but had better documentation.
It's ridiculous spinning up something as "industrial strength" as Postgres for a daft wee blog, just as ridiculous as using a 500bhp Scania V8 for your lawnmower.
Now if you'll excuse me, I have to go and spend ten seconds cutting my lawn.
I just set one build agent up with a tag that both plans required. The simplest thing that could possibly work.
Or, consider redundancy: Your customers likely expect your service to not have an outage. That's a simple requirement, but very hard to get right, especially if you're using a single server that provides your application. Just introducing multiple copies of the app running in parallel comes with changes required in the app (you can't assume replica #1 will handle the first and second request—except if you jump through sticky session hoops, which is a rabbit hole on its own), in your networking (HTTP requests to the domain must be sent to multiple destinations), and your deployment process (artefacts must go to multiple places, restarts need to be choreographed).
Many teams (in my experience) that have a disdain for complex solutions will choose their own, bespoke way of solving these issues one by one, only to end up in a corner of their own making.
I guess what I'm saying is pretty mundane actually—solve the right problem at the right time, but no later.
Removing it, no matter whether I created it myself, sure, that can be a hard problem.
I've certainly been guilty of creating accidental complexity as a form of procrastination, I guess. But building a microservices architecture is not one of those cases.
FWIW, the alternative stack presented here for small web sites/apps seems infinitely more fun. Immediate feedback, easy to create something visible and change things, etc.
Ironically, it could also lead to complexity when in reality, there is (for example) an actual need for a message queue.
But setting up such stuff without a need sounds easier to avoid to me than, for example, overgeneralizing some code to handle more cases than the relevant ones.
When I feel there are customer or company requirements that I can't fulfill properly, but I should, that's a hard problem for me. Or when I feel unable to clarify achievable goals and communicate productively.
But procrastination via accidental complexity is mostly the opposite of fun to me.
It all comes back when trying to solve real problems and spending work time solving these problems is more fun than working on homemade problems.
Doing work that I am able to complete and achieving tangible results is more fun than getting tangled in a mess of unneeded complexity. I don't see how this is fun for engineers, maybe I'm not an engineer then.
Over-generalization, setting wrong priorities, that I can understand.
But setting up complex infra or a microservices architecture where it's unneeded, that doesn't seem fun to me at all :)
His reasoning was that all the big players use it, so we should too...
It was literally a solution looking for a problem. Which is completely arse backwards.
Probably should stop after this line - that was the point of the article. It will work at lower scales. Optimize later when you actually know what to optimize.
But for management, it's completely different. It's all about managing complexity on an organizational level. It's so much easier to think in terms "Team 1 is in charge of microservice A". And I know from experience that it works decently enough, at least in some orgs with competent management.
If your business can afford irregular downtime, by all means, go for it. Otherwise, you'll need to take precautions, and that will invariably make the system more complex than that.
(Sarcasm)
Oh, it absolutely does. You need some way to get your secrets into the application, at build- or at runtime, for one, without compromising security. There's a lot of subtle catches here that can be avoided by picking standard tooling instead of making it yourself, but doing so definitely shapes your architecture.
[1] all those examples check that box, but please let's not start a language war over this statement.
[2] for Jane Street I hear they do, DDG pays pretty well especially because it pay the same rate regardless where you are in the world, so it's a top-talent salary for many places outside SV.
Your solution is to basically do a re-write when scale becomes a problem. Which is the textbook example of something that sounds good but never works
On the other hand I can't think of a business that failed b/c it failed to scale :)
So I think now: Unless you have a really really simple model and app, you are just better off simply starting postgres or a postgres container.
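"Simply starting a postgres container" really is a one-liner for local development; the password and volume name are placeholders:

```sh
docker run -d --name dev-postgres \
  -e POSTGRES_PASSWORD=devpassword \
  -p 5432:5432 \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16
```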
Somewhere there's a CVS repository with some history from before the import into the current repository, but unfortunately there's a few years missing between that repository and the initial import. I've not done the work to analyze whether any lines from that historical repo still survive.
Normally the impetus to overcomplicate ends before devs become experienced enough to be able to even do such complex infra by themselves. It often manifests as complex code only.
Overengineered infra doesn't happen in a vacuum. There is always support from the entire company.
I believe only bad (inexperienced) programmers do.
EC2 was forbidden, it had to be ECS or EKS if Lambda was not possible.
We did the math and the AWS bill had the cost of about 15 developers.
Then suddenly one realises that techies can also be bad at management.
Management of a container environment not only requires deployment skills but also documentation and communication skills. Suddenly it's not management but rather the techie who can't manage their tech stack.
This pointing of fingers at management is rather repetitive and simplistic but also very common.
Postgres in isolation has no problem with 1000 RPS. But does your Postgres server have that ability? Your server is also handling more complex requests and maybe some writes and concurrent re-indexing.
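One cheap way to answer that for your own box is pgbench, which ships with Postgres; a rough sketch, with a hypothetical database name and arbitrary numbers:

```sh
createdb myapp_bench
pgbench -i -s 50 myapp_bench          # initialise sample tables (~800 MB at scale 50)
pgbench -c 32 -j 4 -T 60 myapp_bench  # 32 clients, 4 threads, 60-second mixed read/write run
# pgbench reports transactions per second; compare that against your 1000 RPS target.
```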
And if certain servers do get very important, you just run a backup server on a VPS and switch over DNS (even if you keep a high TTL, most resolvers update within minutes nowadays), or if you want to be fancy, throw a load balancer in front of it.
If you solve issues in a few minutes people are always thankful, and most don't notice. With complicated setups it tends to take much longer before figuring out what the issue is in the first place.
Okay no but seriously, if you're not being held back by how slow GitHub CI/GitLab runners are, great! For others they're slow as molasses, and people in different languages with different build systems can run an iteration of their build/REPL before git has even finished pushing, never mind waiting for a runner.
I was there more than 10 years ago. I remember the pain in the ass that was hosting your own web server and own hardware, dealing with networking issues with Cisco switches and thinking about getting a CCNA. I remember the days of trying to figure out PHP and random-ass modules, or how Python and WSGI fit together, on a slow-ass Windows machine, instead of just spinning up an app and doing network calls using a SPA.
Have you guys just forgotten all the enterprise crap that existed? Have you guys forgotten, before that, how things like compilers (ones you had to pay exorbitant amounts of money for) and different architectures were the headaches?
It's been two steps forward, one step back, but we're still way better off.
Yes, people bring in k8s because they want to build their resume and it goes poorly, but I've also used k8s in my personal setup and it was much easier than the poor man's version I had of it.
All of this is just rose-tinted glasses, and people throwing the baby out with the bathwater. Just because some people have bad experiences with microservices because people don't often do them right, people just write them off completely.
It's been a long time since I've done "normal" web development, but I've done a number of high-performance or high-reliability non-web applications, and I think people really underestimate vertical scaling. Even back in the early 2000s when it was slightly hard to get a machine with 128GB of RAM to run some chip design software, doing so was much easier than trying to design a distributed system to handle the problem.
(we had a distributed system of ccache/distcc to handle building the thing instead)
Do people have a good example of microservices they can point us to the source of? By definition it's not one of those things that makes much sense with toy-sized examples. Things like Amazon and Twitter have "micro" services that are very much not micro.
I certainly did for a number of years - I just had the luck that the cool things I happened to pick on in the early/mid 1990s turned out to be quite important (Web '92, Java '94).
Now my views have flipped almost completely the other way - technology as a means of delivering value.
Edit: Other cool technology that I loved like Common Lisp & CLOS, NeWS and PostScript turned out to be less useful...
For context, my current project is a monolith web app with services being part of the monolith and called with try/catch. I can understand perhaps faster, independent, less risky recovery in the micro services case but don’t quite understand the fault tolerance gain.
In those interviews (and in real work too) people still want you skewing towards certain answers. They wanna see you draw their pet architecture.
And it's the same thing in the workplace.
I know: it's ridiculous to have an architectural barrier for an organizational reason, and the cost of a bad slice multiplies. I still think that in some situations it is better than the gas-station-bathroom effect of shared codebases.
I think that's the first time I've heard any "techie" say we use containers because of reliability or zero-downtime deployments, those feel like they have nothing to do with each other, and we've been building reliable server-side software with zero-downtime deployments long before containers became the "go-to", and if anything it was easier before containers.
As your business needs grow, you can start layering complexity on top. The point is you don't start at 11 with an overly complex architecture.
In your example, if your server crashes, just make sure you have some sort of automatic restart. In practice that may mean a downtime of seconds for your 12 users. Is that more complexity? Sure - but not much. If you need to take your service down for maintenance, you notify your 12 users and schedule it for 2am ... etc.
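On Linux that "some sort of automatic restart" can just be the init system; a sketch of a systemd unit, with hypothetical paths and names:

```ini
# /etc/systemd/system/myapp.service -- enable with: systemctl enable --now myapp
[Unit]
Description=My monolith
After=network.target

[Service]
ExecStart=/usr/bin/python3 /opt/myapp/app.py
Restart=on-failure
RestartSec=2

[Install]
WantedBy=multi-user.target
```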
Later you could create a secondary cluster and stick a load-balancer in-front. You could also add a secondary replicated PostgreSQL instance. So the monolith/postgres architecture can actually take you far as your business grows.
Also, in 5 years of working on both microservicey systems and monoliths, not once have these things you describe been a problem for me. Everything I've hosted in Azure has been perfectly available pretty much all the time unless a developer messed up or Azure itself had downtime that would have taken down either kind of app anyway.
But sure let's make our app 100 times more complicated because maybe some time in the next 10 years the complexity might save us an hour of downtime. I'd say it's more likely the added complexity will cause more downtime than it saves.
I worked for a company that had done pretty much that - not fun at all (for extra fun, half the microservices were in a language only half the dev team had even passing familiarity with).
You need someone in charge with enough "taste" to not allow that to happen, or it will.
And best of all, you don't feel the need to keep chasing after the latest hype just to keep your CV relevant.
The same pattern repeats across multiple companies - it comes down to trust and delegation, if the people with the power are unwilling to delegate bad things happen.
The final statement rarely is that they over-engineered it and thus failed to build an interesting service.
The tooling in a lot of languages and frameworks expects you to use an ORM, so a lot of the time you will have to put in a fair bit of upfront effort to just use raw SQL (especially in .NET land).
On top of that, an ORM makes a lot of things that are super tedious, like mapping to models, extremely easy. The performance gain from writing SQL is very minor if the ORM is good.
I personally don't care for it and if I design something I make it so it avoids that stuff if I can at all help it. But I've come to see that it can have real value.
The thing is though, that then you really need someone to be that very competent ops person. If you're a grug like me, you don't get many shots to be good at something. I probably don't have the years in me to be good at both ops and "pure" programming.
So if you are a startup and you're not some kind of not only very smart but smart, fast and with taste, maybe pick your battles.
If you are great at the ops side, ok, maybe design it from that perspective and hire a bunch of not-made-of-unobtainium regular middle-of-the-road coders to fill in what the microservices and stuff should contain and manage those. This requires money for a regular hiring budget. (Or you are supersmart and productive and "play pretend enterprise" with all roles yourself. But I have never seen such a person.)
Or focus on a tight design which can run without any of that, if you come more from the "I'm making a single program" part of the world.
Tinkering syndrome can strike in any kind of design, so you need personal maturity whatever path you choose.
Which is the exact point the article is making. You don't have scale. You don't need to optimize for scale. Just use Postgres on its own, and it'll handle the scale you need fine.
If we can do this with nearly zero marketing, it stands to reason that some well thought out marketing would probably work.
But that doesn't warrant its use in smaller organizations, or for smaller deployments.
Organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations.
It is common for founding engineers to start with a preexisting way of working that they import from their previous more-scaled company, and that approach is refined and compounded over time
It does mean starting with more than is necessary at the start, but that doesn't mean it has to be particularly complex. It means you start with heaps of already-solved problems that you simply never have to deal with, allowing focus on the product goals and deep technical investments that need to be specific to the new company
I don't think I implied that microservices are the solution, really. You can have a replicated monolith, but that absolutely adds complexity of its own.
> But sure let's make our app 100 times more complicated because maybe some time in the next 10 years the complexity might save us an hour of downtime.
Adding replicas and load balancing doesn't have to be a hundred times more complex.
> I'd say it's more likely the added complexity will cause more downtime than it saves.
As I said before, this is an assessment you will need to make for your use case, and balance uptime requirements against your complexity budget; either answer is valid, as long as you feel confident with it. Only a Sith believes in absolutes.
I would argue it is not resilient enough for a web app.
No one wrote the rules in stone, but I assume server side you want the host to manage data recovery and availability. Client side it is the laptop owners problem. On a laptop, availability is almost entirely correlated with "has power source" and "works" and data recovery "made a backup somehow".
So I think we are both right?
> Unless you have a really really simple model and app
And this is the wrong conclusion. I have a really really complex model that works just fine with SQlite. So it’s not about how complex the model is, it’s about what you need. In the same way in the original post there were so many storage types, no doubt because of such “common knowledge guidelines”
I don't see how this has worse effects on recovery and availability. The data is still in a separate file, that you can backup and the modification still happens through a database layer which handles atomic transactions and file system interaction. The availability is also not worse, unless you would have hot code reloading without SQLite, which seems like an orthogonal issue.
Sarcasm doesn't work online, If I write something like "Donald Trump is the best president ever" you don't have any way of knowing whether I'm being sarcastic or I'm just really really stupid. Only people who know me can make that judgement, and basically nobody on here knows me. So I either have to avoid sarcasm or make it clear that I'm being sarcastic.
Maybe I've interacted with CIs too much and it's Stockholm syndrome, but they are there to help tame and offload complexity, not just complexity for complexity's sake.
Rather, cache invalidation is the process of determining which cache entries are stale and need to be replaced/removed.
It gets hairy when determining that depends on users, user group memberships AND per-user permissions, access TTL, multiple types of timestamps and/or revision numbering, and especially when the cache entries are composite as in contain data from multiple database entities, where some are e.g. representing a hierarchy and may not even have direct entity relationships with the cached data.
Also: if there's limited knowledge on the interviewer side, an incorrect answer to a question might throw off a more experienced candidate.
It's no big deal but it becomes more about reading the room and knowing the company/interviewers than being honest in what you would do. People don't want to hear that their pet solution is not the best. Of course you still need to know the tech and explain it all.
Deploys usually took minutes (unless something was broken), scaling worked the same as if you were using anything else, increase a number and redeploy, and no Kubernetes, Docker or even containers as far as the eye could see.
I guess you could say "use sqlite as long as it lends itself well to what you are doing", sure. But when do you switch? At the first inconvenience? Or do you wait a while, until N inconveniences have been put into the codebase? And not to forget the organizational resistance to things like changing the database. People not in the know (management, usually) might question your plan to switch the database, because this workaround for this small little inconvenience _right now_ seems much less work and less risky for production... Before you know it, you will have 10 workarounds in there, and sunk cost fallacy.
I may be exaggerating a little bit, but it's not like this is a crazy to imagine picture I am painting here.
That’s what I was referring to, sorry for the inaccurate adjective.
Most people try to split a monolith in domains, move code as libraries, or stuff like that - but IMO you rarely avoid a shared space importing the subdomains, with blurry/leaky boundaries, and with ownership falling between the cracks.
Micro services predispose better to avoid that shared space, as there is less expectation of an orchestrating common space. But as you say the cost is ridiculous.
I think there’s an unfilled space for an architectural design that somehow enforces boundaries and avoids common spaces as strongly as microservices do, without the physical separation.
As long as you're pragmatic and honest with what you need from your CI setup, it's okay that it makes your system more complex—you're getting something in return after all.
Theoretically. Practically, you're hunting for the reason why your GitHub token doesn't allow you to install a private package from another repository in your org during the build, then you learn you need a classic personal access token tied to an individual user account to interact with GitHub's own package registry, you decide that that sounds brittle and after some pondering, you figure that you can just create a GitHub app that you install in your org and write a small action that uses the GitHub API to create an on-demand token with the correct scopes, and you just need to bundle that so you can use it in your pipeline, but that requires a node_modules folder in your repository, and…
Oh! Could it be that you just added complexity for complexity's sake?
Changing the database can create friction, but at that moment you can also ask yourself: What is the cost of adding/learning this giant stateful component with maintenance needs (postgres) vs. say adapting our schema to be more compatible with what we have? (e.g. the lightweight and much cheaper sqlite, but the argument works for whatever you already have).
I'd much rather see folks thinking about that. Same for caching and CDNs and whatever Cloudflare is selling this week to hook people on their platform (e.g. DDoS/API gateway protections come in many variants, we're not all 1password and sometimes it's ok to just turn on the firewall from your hosting provider).
But on that point I agree: initial set-up can be extremely daunting due to the amount of different technologies that interact, and it requires a level of familiarity that most people don't want to have with these tools. Which is understandable; they're a means to an end and devs don't really enjoy playing with them (DevOps folks do though!). I've had to wear many hats in my career, and was the unofficial dedicated DevOps guy in a few teams, so for better or worse I had to grow familiar with them.
Often (not always) there's an easier way out, but spotting it through the bushes of documentation and overgrown configuration can be annoying.
But why is this natural? I’m not saying we shouldn’t have network RPC, but it’s not obvious to me that we should have only network RPC when there are cheap local IPC mechanisms.
Premature optimisation is the root of all evil. But premature pessimisation is not a good thing either. You should keep options open, unless you have a good reason not to do so.
If your IPC involves moving gigabytes of transient data between components, may be it's a good thing to use shared memory. But usually that's not required.
This is somewhat common in containerisation where e.g. Kubernetes lets you set up sidecars for logging and so on, but I suspect it could go a lot further. Many microservices aren't doing big fan-out calls and don't require much in the way of hardware.