Throwing away 18 months of code and starting over

"For the longest time, I would NOT allow people to write tests because I thought that culturally, we need to have a culture of shipping fast"

Tests are how you ship fast.

If you have good tests in place you can ship a new feature without fear that it will break some other feature that you haven't manually tested yet.

The article never actually says whether the rewrite was a net gain. I presume it was, or they wouldn't have written it, and the lead-in image paints a rosy picture, but the tone at the end suggests he's not happy with how things turned out.

But one thing that used to be a common design anti-pattern was the "version 2 problem". I think I first heard about it when Netscape were talking about how NN2 was a disaster, and they were finally happy with NN3 or NN4.

Often version 1 is a hastily thrown together mess of stuff, but it works and people like it. But there are lots of bad design decisions, and you reach a limit on how far you can keep pushing that bad design before it gets too brittle to change. So you start on version 2, a complete rewrite to fix all the problems, and you end up with something that's "technically perfect" but so overengineered that it's slow and everybody hates it. On top of that, there are probably so many workflow hoops to jump through to get things approved that you end up not making any progress, and possibly version 2 kills the product and/or the company.

The idea is that "version 3" is a pragmatic compromise - the worst design problems from version 1 are gone, but you forgo all the unnecessary stuff that you added in version 2, and finally have a product that customers like again (assuming you can convince them to come back and try v3 out) and that you can build on in future versions.

To a large degree I think this "version 2 problem" was a byproduct of waterfall design. It's certainly been less common since agile development became popular in the early 2000s and tooling made large-scale refactoring easier. Even so, I remember working somewhere with a v1 that the customers were using and a v2 that was a 3-year rewrite going on in parallel. None of the developers wanted to work on v1 even though that's what brought in the revenue, and v2 didn't have the benefit of any of the bug fixes accumulated over the years for very specific issues that were never captured in any of the scope documents.

Having a culture of not ever writing tests and actively disallowing them is so insane I can't even imagine why there's anything else in this post

This seems very post-hoc and like they're fortunate they happened to arrive at something better rather than worse.

The justification for K8s seems pretty thin. It makes me wonder if the author understands why they need it. I'm guessing it's because they've got substantial parallel, multi-tenant networking of stateful processes, which is a pretty defensible reason to use K8s. And easy to say. It seems strange to leave it out.

The argument against Temporal also seems invalid, but I'm not certain. It has been years since I used it, but wouldn't it be possible to poll for completion? It seems like you'd wind up with better observability/retryability tooling, and it's much simpler overall. Polling seems like a good compromise for what I'd consider much easier tooling to reason about.

I'd also posit that you could model a lot of this using your own serializable state machines. They're in the JS ecosystem, so XState is an excellent option. You'd get incredible visibility into your orchestration, deep access to testing the semantics and logic you care most about, and the ability to have your entire architecture be containers on the fly with no blackbox orchestration.
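
As a rough illustration of the serializable state machine idea, here is a minimal sketch in plain TypeScript. It deliberately does not use XState's API; `JobState`, `JobEvent`, `transitions`, `step`, `save`, and `load` are hypothetical names invented for this example, and the job lifecycle is an assumption rather than anything described in the article.

```typescript
// Hypothetical job lifecycle for an orchestrated task (an assumption for this sketch).
type JobState = "queued" | "running" | "succeeded" | "failed";
type JobEvent = "START" | "FINISH" | "ERROR" | "RETRY";

// The whole orchestration logic is plain data, so it can be inspected and tested directly.
const transitions: Record<JobState, Partial<Record<JobEvent, JobState>>> = {
  queued: { START: "running" },
  running: { FINISH: "succeeded", ERROR: "failed" },
  failed: { RETRY: "queued" },
  succeeded: {},
};

// A snapshot holds everything needed to resume: current state plus accepted events.
interface Snapshot {
  state: JobState;
  history: JobEvent[];
}

// Pure transition function: returns a new snapshot and never mutates its input.
function step(snapshot: Snapshot, event: JobEvent): Snapshot {
  const next = transitions[snapshot.state][event];
  if (next === undefined) return snapshot; // events invalid in this state are ignored
  return { state: next, history: [...snapshot.history, event] };
}

// Snapshots are plain JSON, so they can be persisted between steps and reloaded later.
const save = (snapshot: Snapshot): string => JSON.stringify(snapshot);
const load = (json: string): Snapshot => JSON.parse(json) as Snapshot;
```

Because a snapshot is plain JSON, it can be stored between steps, and the transition table can be unit tested without starting any containers.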

Of course, I'm speculating after browsing through their website a bit and thinking about the problems they described. I'm missing a lot of context. K8s could be the clear winner.

Still, after reading this I would never use this product. I don't mean to sound unkind. I'd never trust the decision-making of the people who followed this trajectory. If I were the author I'd take this down ASAP.

I can't imagine working as a developer at a place where the manager/founder "does NOT allow" tests to be written. This, combined with the four pivots mentioned in the article, makes it seem like they are just riding the hype and trying to brute-force a product without any basics or PMF.

Sorry, I still don't get "no tests" as an excuse to go faster. Obviously YMMV, but you will need to test your implementation somehow, and manual testing usually takes more time than running your automated tests. No need to over-test, but having tests definitely doesn't mean you'll be slowed down, unless you don't know how to test, in which case that's totally up to you.

Pearls.

> I would NOT allow people to write tests

> now [...] we started with tests from the ground up

It's a big move. But I understand it.

Sometimes your code is "just" a proof of concept, a way to test the idea. Very far from a decent product.

That is the time you ditch the code, keep the ideas (both good and bad) and start over.

Tests are most useful for regression detection, so it's a good instinct to not add them when you're primarily exploring. Once you've decided to switch to exploitation, though, regression will hurt. I think it's just a classic 0 to 0.1 not being the same thing as 0.1 to 1.

Nice use of the io domain there

I wouldn’t admit to this level of, frankly, incompetence.

Wildly swinging dogmatism on how to do software development that’s so wrong you have to throw it all away - then repeating this failure loop multiple times.

It doesn’t inspire any confidence in the person. I wouldn’t get them to lead a project.

Why would you be so loud and proud about all this.

So you started with 2023 theo.gg philosophy but now moved on to 2026 theo.gg philosophy

Next is such a dumpster fire. So much wasted effort due to the Node ecosystem never developing a universal batteries included framework like Rails or Django.

Exactly. OP seems to have very limited understanding of software development if that fact has eluded him.

The rewrite version of this that has gone best for me is to do it as a strangler, not a reset. Pick one ugly workflow, lock in current behavior with characterization tests, rebuild that slice behind a flag, repeat. You still get to fix the architecture, but you do not throw away years of weird production knowledge.
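
A minimal sketch of that loop (characterize, rebuild behind a flag, compare) might look like the following. Everything here is hypothetical: `legacyFormatPrice`, its trailing-minus quirk, `newFormatPrice`, and the `flags` object are invented for illustration and do not come from the article.

```typescript
// Hypothetical legacy workflow. Its quirks are production behavior we want to preserve.
function legacyFormatPrice(cents: number): string {
  // Quirk: negative amounts render with a trailing minus sign.
  const abs = Math.abs(cents);
  const text = `$${Math.floor(abs / 100)}.${String(abs % 100).padStart(2, "0")}`;
  return cents < 0 ? `${text}-` : text;
}

// Rebuilt slice: cleaner structure, same observable behavior.
function newFormatPrice(cents: number): string {
  const abs = Math.abs(cents);
  const dollars = Math.floor(abs / 100);
  const remainder = String(abs % 100).padStart(2, "0");
  const sign = cents < 0 ? "-" : "";
  return `$${dollars}.${remainder}${sign}`;
}

// Feature flag deciding which implementation serves this slice.
const flags = { useNewFormatPrice: false };

function formatPrice(cents: number): string {
  return flags.useNewFormatPrice ? newFormatPrice(cents) : legacyFormatPrice(cents);
}

// Characterization test, part 1: record what the legacy code does today.
const recorded = [0, 5, 199, -250, 100000].map(
  (cents) => [cents, legacyFormatPrice(cents)] as const,
);

// Part 2: list every input where the rebuilt slice diverges from the recording.
function mismatches(): number[] {
  return recorded
    .filter(([cents, expected]) => newFormatPrice(cents) !== expected)
    .map(([cents]) => cents);
}
```

The flag only flips once `mismatches()` comes back empty, so the odd trailing-minus behavior survives the rewrite until someone decides on purpose to change it.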

I think the more specific description would be that "not writing tests allows shipping fast today, writing tests allows shipping fast tomorrow and afterwards".

It wasn't too long ago that I wrote tests for something that was shipped years ago without any automated tests. Figured it was easier doing that than hoping we won't break it.

And particularly the “no tests go faster”.

I feel like we keep having to reestablish known facts every two years in this field.

I think your comment is just as "insane" as the practice you are railing against. Although I wouldn't use the word "insane" - it's hyperbole. What's the right word here? I'm not sure... "dogmatic" isn't quite right.

If you are a two man startup, burning through runway and pre-product-market fit... then spending a lot of time on tests is questionable (although the cost-benefit now with AI is changing very fast).

What I find "insane", "dogmatic"... about your comment is the complete elision of this process of cost-benefit analysis, as if there should never be such an analysis.

I've worked with a lot of people like you. When a discussion begins about a choice to be made, they just stampede in with "THIS IS THE RIGHT WAY". And the discussion can't even be had.

This sort of "dogmatism" is so rife in engineering culture, I wonder if this is why the c-suite is so ready to dump us all for AI centaurs that just fucking ship features. How many of them got burned listening to engineers who refused to perform even the most basic of cost-benefit analyses with the perspective of the business as a whole in mind and forced the most unnecessary, over-engineered bullshit.

I worked at one startup where the tech lead browbeat the founders into building this enormous microservice monster that took them years. They had ONE dev team, ONE customer, and the only feature actually being used was just a single form (which was built so badly it took seconds to type a single character in a field cause the React re-renders were crazy).

Now THAT's insanity.

Yea I stopped reading at this point

Two different stages of the project, not necessarily contradictory. I'm not saying this is great, but tests make a whole lot more sense when you know what you're building.

"The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. The result, as Ovid says, is a "big pile."

- Fred Brooks, 'The Mythical Man-Month' (1975)

I definitely encountered this second-system effect recently. I have an app that works well because it was written to target a specific use case. Users (and I) wanted some additional features, but the original architecture just couldn't handle these new features, so I had to do a rewrite from the ground up.

As I rewrote it, I started pulling in more "nice to haves" or else opening up the design for the potential to support more and more future features. I eventually got to a point where it became unwieldy as it had too many open-ended architectural decisions and a lot of bloat.

I ended up scrapping this v2 before releasing it and worked on a v3 but with a more focused architecture, having some things open-ended but choosing not to pursue them yet as I knew that would just introduce unneeded bloat.

I was quite aware of the second-system effect when doing all this, but I still succumbed to it. Thankfully, the v3 rewrite didn't take as long since I was able to incorporate a lot of the v2 design decisions but scaled some of them back.

My adaptation of the Version 2 Problem is “any idiot can ship version 1 of a product, but it takes skill to ship version 2”.

Usually levied at people who are so hyper focused on shipping a so-called MVP that is really demoware that they are driving us at a brick wall and commenting the entire way about what good time we are making.

This has been my experience exactly. V1 was custom built for a single client and they loved it. As we tried to expand to multiple clients the v1 was too narrowly scoped (both in UX and code architecture) so we did a full rewrite attempting to generalize the app across more workflows. V2 definitely expanded our client pool, but all our large v1 customers absolutely hated it.

We never did a full v3 rewrite, but it took about 4 years and many v3 redesigns of various features to get our legacy customers on board.

Which in turn were only invented because millennials would not be caught dead writing Java and JSP. We had all this shit figured out by the late nineties and 90% of what is accomplished on the web today was entirely possible and well integrated in Java app servers.

This whole business is a fashion industry.

I'm for one grateful for LLMs because for the first time in around 30 years there is actually genuine novelty to explore in software engineering. Ruby and nodejs weren't it.

"bugs were appearing everywhere out of the blue. The codebase was a huge mess of nulls, undefined behaviour, bad error handling. It was so bad that we actually lost a client over this."

Especially wild considering their product is literally an automated bug finder lol.

Same. Admitting to it is one thing, but still it takes a certain kind of attitude to outright forbid people to write tests.

I think there's a real possibility this is a "no such thing as bad publicity" stunt.

Yeah, but in my experience it really is a literal today vs tomorrow thing.

Your tests pay for themselves the moment you want to ship a second feature without fear of breaking the first.

> they just stampede in with "THIS IS THE RIGHT WAY". And the discussion can't even be had.

That's exactly what this person is railing against. They strictly forbid testing.

The truth is in the middle somewhere, regarding tests at least (yes, your microservices story is insane).

I think the author could have been happier with the no-test decision if they had treated the initial work as a prototype with the idea of throwing it away.

At the same time, writing some tests should not be seen as a waste of time, since if you're at all experienced with it, it's going to be faster than constantly reloading your browser or pressing up-up-up-up-up in a REPL to check progress (if you're doing the latter you are essentially doing a form of sorta reverse TDD).

So I dunno... I may be more in line with the idea that it's a bit insane to prevent people from writing tests, BUT so many people are so bad at writing tests that, ya, for a go-get-'em startup it could be the right call.

I certainly agree with your whole cost-benefit analysis paragraph.

> After we started hiring, it became a disaster.

When it stopped being two people he still forbade tests. In this decade. That is fucking nuts.

Fun fact: the guy I worked a 2 man project with and I had a rock solid build cycle, and when we got cancelled to put more wood behind fewer arrows, he and I built the entire CI pipeline. On cruisecontrol. And if you don’t know what that is, that is Stone Age CI. Literal sticks and rocks. Was I ahead of a very big curve? You bet your sweet bippy. But that was more than twenty years ago.

Did I say that my way was the right way? No: what I said was actively disallowing tests in every situation was the wrong way.

There is no ability here for the cost benefit analysis to change over time. There is only no tests

Not having ANY tests means tons of manual testing is needed every time you modify code, which will rapidly consume more time than writing the tests would.

I ran into some serious struggles when we got far enough into accepting most of the tenets of XP as standard practice that most jobs didn’t even debate half of them and then landed at places that still thought they were stupid. I’d taken for granted I wasn’t going to have to fight those fights and forgotten how to debate them. Because I Said So is not a great look.

Oh wow, it's from Mythical Man Month? I've been meaning to read that for years and still never have.

Mongodb is webscale.

It really wasn't.

MVC really changed web dev for the better, and Django/Rails trail-blazed it. It's one of the few paradigms I've seen in my career that was an unequivocal win for us.

Yes. TFA author could have gone into it with this mindset and treated the initial work as a prototype with the idea of throwing it away and would have been happier about it.

> but tests make a whole lot more sense when you know what you're building.

It's very true. This is a "gotcha" a lot of anti-TDDers always bring up, and yet some talk about "prototyping == good" without ever making the connection that you can do both.

in an age of generated tests, a mandate on no tests is just dumb

Again - that's a business decision that needs to be made in the context of that business. The fact that testing was forbidden isn't in itself good or bad. It depends on that business context. The post says nothing about how that decision was made, whether it was discussed, or if it was just his absolutist ideal he imposed without consideration of the broader cost-benefit.

And I still feel the original comment doesn't give this point enough weight.

Did anyone here actually look at the product they were actually building? It's an AI agent bug discovery product. Their whole culture is probably driven at a fundamental philosophical level about the problems of bug discovery. As he says: he wanted to rely on dogfooding - using their product as the way of spotting bugs.

That may have been spectacular naivete but it's not insanity.

The point I keep coming back to here that everyone is fighting me so hard on is that these blanket statements of: NO TESTS IS NUTS... absent of an understanding of the business context... is harmful.

Did you edit the wording of your original comment slightly to emphasise the "actively disallowing them" in every situation? Anyway... if that is what you meant, then ok. It's less awful a statement than what I felt I originally read.

I'd still push back on your hyperbole though. I don't think the author was insane - and we don't know what the broader business context was when they started growing the team and decided to persist without building out the test architecture at that point. They made a call that dogfooding was going to be enough to catch issues as they grew the team. There are a lot of scenarios where that is going to be true.

One scenario where it wouldn't - the most likely - is that the team isn't actually dogfooding because they personally don't find the product useful. Leadership lambasts them to use the product more... but no one does cause it sucks so much it impacts their own personal productivity.

Even there I wouldn't use the word insane... just poor leadership.

That and Brooks’ underrated “The Design of Design” are notable for having an almost impossible density of quotable aphorisms on every page. They’re all so relevant today that it’s hard to believe that he’s talking about problems he faced half a century ago.

Do you think it can handle 10 requests per hour? How many mongo instances will that require, and should I use micro services?

Forbidding tests is not a business decision, it's a software engineering decision, and it's a remarkably poor one at that.

What ends up happening is that your most fundamental features end up rotting because manual testing has biases. Chief among them is probably Recency Bias. It is in fact super easy to break a launch feature if it’s not gating any of the features you’re working on now. If you don’t automate those, yes, you’re nuts.

One of the worst ones I ever encountered was learning that someone broke the entire help system three months prior, and nobody noticed. Because developers don’t use the help system. I convinced a team of very skeptical people that E2E testing the help docs was a higher priority than automating testing of the authentication because every developer used that eight times a day or more. In fact on a previous project with trunk based builds, both times I broke login someone came to tell me so before the build finished.

Debugging is about doing cheap tests first to prune the problem space, and slower tests until you find the culprit. Testing often forgets that and will run expensive tests before fast ones. Particularly in the ice cream cone.
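
One way to picture the "cheap tests first" ordering is a runner that sorts checks by estimated cost and stops at the first failure. This is only a sketch: the `Check` shape, the `costMs` estimates, and `runCheapestFirst` are hypothetical and not taken from any particular test framework.

```typescript
// Hypothetical description of a check: a name, a rough cost estimate, and the check itself.
interface Check {
  name: string;
  costMs: number;
  run: () => boolean; // true means the check passed
}

// Run checks from cheapest to most expensive and stop at the first failure,
// so slow suites only run once the fast ones have pruned the problem space.
function runCheapestFirst(checks: Check[]): { ran: string[]; failed: string | null } {
  const ran: string[] = [];
  const ordered = [...checks].sort((a, b) => a.costMs - b.costMs);
  for (const check of ordered) {
    ran.push(check.name);
    if (!check.run()) return { ran, failed: check.name };
  }
  return { ran, failed: null };
}
```

With a failing unit check in the list, the end-to-end check never runs, which is the opposite of what an ice cream cone shaped suite tends to do.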

In short, if you declare an epic done with zero automation, you’re a fucking idiot.

> Did you edit the wording of your original comment slightly to emphasise the "actively disallowing them" in every situation?

I did not.

He did not edit, and you're misunderstanding the meaning behind his post. Not everything needs to be pedantic and accurate, language is flexible, this is about communicating, not being right.

What we really don't need is paragraphs of someone arguing because their own definitions differ slightly from the OP

Never heard of "The Design of Design" but I bought it off this comment chain.

I think our industry would do well to take a moment and breathe, to understand what we have collectively done since inception. I often wonder if, thousands of years into the future, we will look back on the highly corporatized influence our industry has had during our time as the dark ages. The idea that private enterprise should shape the direction of our industry is deeply problematic; there needs to be a public option, and I doubt many devs would disagree.

Hard disagree. It's both. Choosing one way or the other comes with potential risks and rewards to the business and it's up to business leadership to choose what risks they want to take. Your job as an engineer - if you are not part of leadership is to explain those risks / rewards, and then let them make the call.

>He did not edit

He edited his reply to me multiple times... which is what made me suspect an edit to the original comment. But whatever, I'm happy to acknowledge his original intent even if he did state it more harshly.

>What we really don't need is paragraphs of someone arguing because their own definitions differ slightly from the OP

This is unnecessary. OP came out with "AUTHOR IS INSANE" even on the most generous of interpretations. Even if we allow for the nuance OP is claiming, there is little constructive about his contribution. I feel fine about calling it out.

> He edited his reply to me multiple times...

I got the sense from your reply that some extra clarity would be beneficial.

> This is unnecessary. OP came out with "AUTHOR IS INSANE" even on the most generous of interpretations.

I did not actually call the author insane, I called their decision to explicitly disallow testing insane. It's an insane decision. I am not _literally_ calling the author insane.

> I did not actually call the author insane...

If you think this distinction really matters wrt the point I'm trying to make, then it's time for you and I to bug out conversationally. Sometimes two individuals have such different ways of communicating that the pain of exegesis isn't worth the squeeze. No hard feelings. I'm sure 50% responsibility is at least mine, but it's not going to be worth it for either of us figuring out exactly what.

We developed this product for over 1.5 years, closed clients left right and center, and now we're throwing everything away.

In case you don't know me (or Autonoma), we're no strangers to pivots. Funnily enough, we pivoted like 4 times already (enterprise search, documentation generation, coding agent, QA testing platform). The reasons are beyond the scope of this article. In all cases, we knew bugs were painful, we just didn't know what was the best way of solving the problem.

On our last pivot, we actually started getting customers and raised a round from one of the biggest names in the industry. We hired a team of 14 (as I'm writing this article) and started closing clients left right and center. So why would we take such a drastic decision over something that's working for many customers? Well, many reasons. That's why I'm writing this article.

The No-Tests Era (and Why I Regret It)

In my past, I've been guilty of overengineering (as many of you probably also did, and if you say you didn't you're lying). I was Uncle Bob-Pilled but I've been burnt so hard with this that in this startup, I went 180 deg in the opposite direction: TypeScript monorepo, no strict, no tests. Just ship and that's all that matters.

That worked great for our early customers when there were only two of us writing code and each of us owned a huge portion of it and knew everything. After we started hiring, it became a disaster. Bugs were appearing everywhere out of the blue. The codebase was a huge mess of nulls, undefined behaviour, bad error handling. It was so bad that we actually lost a client over this.

For the longest time, I would NOT allow people to write tests because I thought that culturally, we need to have a culture of shipping fast and we should be dogfooding our own product and that should be enough. At some point, I realized that I was affecting the quality of our product and productivity and I changed my mind. In some ways, it was too late. That's why now that we're rewriting everything, we started with tests from the ground up and the most strict TypeScript mode.

Why a Rewrite and Not a Refactor

Initially, when we set out to build this product, we wanted a fully agentic solution. At the time, the tech wasn't there. It was the GPT-4 era (not 4o, just 4). The models were so bad that we needed to build huge guardrails and give them very accurate information in order for them to kind of work. We built huge Playwright and Appium wrappers that would do very complex inspections over the code and extract a bunch of information. I'm proud to say that none of the open source solutions I've seen come close to the level of sophistication and replicability we built. We had like 7 clicking strategies that would self-heal on the fly if something changed, for example. That made running tests super fast and reliable.

Now, models have advanced so much that the sophisticated inspection is not necessary anymore. And we'd be carrying over a huge codebase, with crippling tech debt and vestiges of a worse, non-strict TypeScript past, with not much to gain. We discussed it with the team and decided the best way forward was to rewrite the whole thing (bringing the agentic stuff over).

Dropping Next.js and Server Actions

Some of the decisions we made were to change technologies outright, and I think those are the interesting ones.

We were super invested in Next.js and Server Actions. We didn't do fetches of any kind. Server Actions are a great idea with subpar execution, IMHO. The idea of having server functions that you can call directly from the frontend is amazing. But they come with some very bad caveats that make them a bad solution in almost all cases.

They're async. This might sound like a weird take, but remember we're using this mainly in React. Fetching data through async functions means either a useEffect block (or rendering that data on the server), or calling the async function yourself and handling the loading and error state manually.

They're hard to test. When you implement a function like this:

export async function getUserById(id: string): Promise<User | null> {
  // findFirst resolves to null when no row matches
  return prisma.user.findFirst({
    where: { id }
  })
}

You either have to create a Prisma object with an in-memory db, apply migrations, etc., or mock the Prisma object. It gives you no flexibility. It's impossible to do dependency injection because that'd mean passing the db from the frontend to the backend, and that clearly makes no sense. The feeling of defining a Server Action and calling it from the frontend is magical, but that's where the magic ends.
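For contrast, here is a minimal sketch of what dependency injection looks like for a plain backend function that is not a Server Action. The `UserStore` interface, `makeGetUserById`, and `fakeStore` are hypothetical names introduced for illustration; the point is only that the function receives its data source as an argument, so a test can pass an in-memory fake instead of a real database.

```typescript
interface User {
  id: string;
  orgId: string;
}

// Hypothetical minimal interface covering only what the function needs.
interface UserStore {
  findFirst(args: { where: { id: string } }): Promise<User | null>;
}

// The store is injected, so no real database is needed in tests.
function makeGetUserById(store: UserStore) {
  return (id: string): Promise<User | null> =>
    store.findFirst({ where: { id } });
}

// In-memory fake used in place of a real database client.
function fakeStore(rows: User[]): UserStore {
  return {
    async findFirst({ where }) {
      return rows.find((u) => u.id === where.id) ?? null;
    },
  };
}

const getUser = makeGetUserById(fakeStore([{ id: "u1", orgId: "o1" }]));
```

In production the same factory would be called once with the real client, and every caller would use the returned function.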

They execute sequentially. GLOBALLY. WTF is that decision. It's like a manufactured Python Global Interpreter Lock but in TypeScript. I understand that the reasoning is to keep results consistent and not have renders arrive out of order... but WTF. That makes them either unusable in any project or you have to develop FOR Server Actions. And when you have to adapt your practices to a framework and not the other way around, it's a sign of bad tech.
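To make the cost concrete, here is a sketch of what one-at-a-time dispatch implies. `SerialQueue` is a hypothetical stand-in written for this example, not Next.js code: each task starts only after the previous one has settled, so two independent calls that each take 100 ms need about 200 ms in total, where running them concurrently would need about 100 ms.

```typescript
// Hypothetical stand-in for one-at-a-time dispatch; not Next.js internals.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  // Each task starts only after every earlier task has settled.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(() => task());
    // Swallow failures on the tail so one failed task does not block the rest.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

const log: string[] = [];

// Simulated action that records when it starts and ends.
async function action(name: string, ms: number): Promise<string> {
  log.push(`start ${name}`);
  await sleep(ms);
  log.push(`end ${name}`);
  return name;
}

const queue = new SerialQueue();
const done = Promise.all([
  queue.enqueue(() => action("a", 20)),
  queue.enqueue(() => action("b", 1)),
]);
```

Even though `b` is much faster than `a`, it does not start until `a` has finished.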

They're not observable. This was really bad. We use Sentry and for many technologies they have automatic instrumentation. I'm a hater of manual instrumentation. I think these things should not clutter the code. The problem with Server Actions is that they all become a single POST. So in Sentry, you'd see every Server Action as a single POST to / with garbage unreadable data. They can't be traced. You can't add headers to them. Just a pain.

They're a security footgun. You could argue this is a skill issue. If you're like me and start using a technology before fully reading the documentation, you might easily miss this. Server Actions become an endpoint in practice. If you don't structure the action right, you could expose yourself to very obvious security vulnerabilities that are not apparent when writing the code. For example, the function that I wrote before is actually unsafe. This would let anyone get any user if they have the ID:

// Unsafe: nothing checks that the caller is allowed to see this user
export async function getUserById(id: string): Promise<User | null> {
  return prisma.user.findFirst({
    where: { id }
  })
}

This would be the correct implementation:

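import { cookies } from 'next/headers'

export async function getUserById(id: string): Promise<User | null> {
  const cookieStore = await cookies()
  // Simplified: in a real app, take orgId from a verified session,
  // not from a raw cookie value the client could forge
  const orgId = cookieStore.get('orgId')?.value
  if (orgId == null) {
    return null
  }

  return prisma.user.findFirst({
    where: { id, orgId }
  })
}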

The problem with this is that if you write it this way, you can't share the code with the API. So what you actually have to do is have a private directory where the implementations live. The Server Actions are wrappers around those implementations that add the auth check, and the same private functions can be used by the API.

/private/users.ts

export async function privateGetUserById(id: string, orgId: string): Promise<User | null> {
  return prisma.user.findFirst({
    where: { id, orgId }
  })
}

/lib/users-actions.ts

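import { cookies } from 'next/headers'

export async function getUserById(id: string): Promise<User | null> {
  const cookieStore = await cookies()
  // Simplified: in a real app, take orgId from a verified session
  const orgId = cookieStore.get('orgId')?.value
  if (orgId == null) {
    return null
  }

  return privateGetUserById(id, orgId)
}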

/api/users.ts

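import type { NextApiRequest, NextApiResponse } from 'next'
import { privateGetUserById } from '../private/users'

export default async function handler(
  req: NextApiRequest,
  res: NextApiResponse
) {
  // validate headers and body for auth
  // (authenticate is a placeholder for whatever resolves the caller's org)
  const orgId = await authenticate(req)
  if (orgId == null) {
    return res.status(401).json({ error: 'Unauthorized' })
  }

  const id = String(req.query.id)
  const result = await privateGetUserById(id, orgId)

  return res.status(200).json(result)
}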

They use errors as flow control. It would be weird to blame Next.js for the fact that error handling in JavaScript (and TypeScript) is bad; everybody knows it is. I usually prefer returning errors as values, so I know when something can fail and what to do about it without having to read the implementation. That mostly works with Next.js, except that Next.js itself uses errors as flow control. This is one of the worst practices in coding ever IMO, especially in JavaScript, where nothing in a function's signature tells you that it throws. Things like redirects are implemented as thrown errors of a special redirect type. So something like this won't work:

// Broken: the redirect gets caught
async function myAction() {
  try {
    await doSomething();
    redirect('/dashboard');
  } catch (e) {
    // This catches the redirect "error" too
    console.error(e);
  }
}

So I'm forced to do something ugly like this:

async function myAction() {
  let shouldRedirect = false;
  try {
    await doSomething();
    shouldRedirect = true;
  } catch (e) {
    // handle real errors
  }
  if (shouldRedirect) redirect('/dashboard');
}

Or, I'm forced to do try/catches on every line if I need to handle errors differently for each function:

async function myAction() {
  try {
    await doSomething();
  } catch (e) {
    console.error(e);
    return;
  }

  try {
    await doSomethingElse();
  } catch (e) {
    await undoSomething();
    console.error(e);
    return;
  }

  redirect('/dashboard');
}
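For reference, the "errors as values" style mentioned above can be sketched as follows. The `Result` type and the `parsePort` function are hypothetical examples, not part of any framework; the point is that the failure case is visible in the return type, so the caller cannot forget to handle it.

```typescript
// A minimal Result type: failure is part of the function signature.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Hypothetical example: parsing never throws, it returns a Result.
function parsePort(input: string): Result<number, string> {
  const port = Number(input);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return { ok: false, error: `invalid port: ${input}` };
  }
  return { ok: true, value: port };
}

// The compiler forces a check of `ok` before `value` can be read.
const result = parsePort("8080");
if (result.ok) {
  console.log(`listening on ${result.value}`);
} else {
  console.error(result.error);
}
```

With this style, a thrown error always means something unexpected happened, which is exactly the distinction that breaks down when a framework throws for ordinary control flow such as redirects.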

In all my years of software development I have never used a framework that was this slow in every sense. Slow to build. Slow to develop. Slow to start up. Slow Server Actions. I swear Angular 2 was faster across the board. I was so broken by this that when we changed to just React, I thought the server was not reloading because it was so fast.

What We Chose Instead (and How We Went From 8GB to Basically Free)

We went for React with tRPC (basically TanStack Start) and a Hono backend. I have to say I miss having everything bundled into a single package, but it just makes more sense given our deployment.

We have almost everything deployed in Kubernetes (we can discuss this, but we have very complex stateful workflows that we need to expose as temporary machines, which are very easy to deploy on Kubernetes and very hard on other platforms). We actually had the Next.js project running in a container. Literally, the container took like 8GB of RAM per instance. It was the biggest one we had. With React we just compile to static files and serve them from a CDN (basically free). And the Hono backend uses less than 100MB of RAM.

The Orchestration Problem

If you think Server Actions were painful, wait until you hear about orchestration.

We considered useworkflow.dev (whose ergonomics I love), but it was impossible to reconcile with the use cases we had for our jobs, and it would just create points of failure that I didn't need. E.g.:

// pseudocode
export async function startMobileJob() {
  "use step"

  // This call only waits for the job to be created on Kubernetes, not for it to finish
  await k8s.spawnJob()

  // Option 1: I could poll for the result?
  await pollForJob()

  // Option 2: I could use the webhook primitive and wait for the response?
  await webhook()
}

In both cases I'll be adding custom code that could fail. The webhook needs a sidecar container that checks whether the job has finished and sends a fetch request. You might say "well that sounds pretty easy" and it is. Until it fails because some dependency was missing, or the Kubernetes JSON was incorrectly formatted, or for any number of other reasons, and you get a job that never finishes in the UI. The polling is the same. Jobs can end in multiple different states and you have to account for all of them.
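To show how much there is to get right even in the "easy" polling variant, here is a minimal sketch. The `JobPhase` values and the `pollForJob` signature are assumptions made for this example (real Kubernetes job status is richer), and `getPhase` is injected so the loop can be exercised without a cluster.

```typescript
// Hypothetical job phases; real Kubernetes jobs expose more detail.
type JobPhase = "Pending" | "Running" | "Succeeded" | "Failed" | "Unknown";

const TERMINAL: ReadonlySet<JobPhase> = new Set<JobPhase>([
  "Succeeded",
  "Failed",
]);

// Poll until the job reaches a terminal phase or the attempt budget runs out.
async function pollForJob(
  getPhase: () => Promise<JobPhase>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<JobPhase | "TimedOut"> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const phase = await getPhase();
    if (TERMINAL.has(phase)) {
      return phase;
    }
    // Not finished yet (or status unknown): wait and ask again.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  // Without this branch, a stuck job would show as running forever in the UI.
  return "TimedOut";
}
```

Even this version ignores errors thrown by `getPhase` itself, jobs that get deleted mid-run, and restarts of the poller, which is the kind of hidden complexity described above.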

In the past I tried many solutions that seem simple but hide much bigger complexities (like building a small orchestrator in TypeScript that would listen for job changes). It's impossible to reconcile the information in a quick and dirty implementation. It's also very hard to test.

In the end we went with the technology we know and love, quirks, ugly UI and all: Argo Workflows. It's Kubernetes native. Each step is a job, and you're guaranteed that the steps run in the order you need. It scales pretty well to the thousands of tests we need to run reliably. And you can merge many workflows into a single one, so we can easily send a message at the end. Again, this is very hard to test, but we have already built abstractions to do this.
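As a rough picture of what such a workflow looks like, here is a sketch of an Argo Workflow manifest written as a plain TypeScript object. The template names and container images are hypothetical placeholders, and using an `onExit` handler for the final message is an assumption about one way to do it, not a description of our setup.

```typescript
// Sketch of an Argo Workflow manifest as a plain object.
// Template names and images are hypothetical placeholders.
const workflow = {
  apiVersion: "argoproj.io/v1alpha1",
  kind: "Workflow",
  metadata: { generateName: "mobile-test-" },
  spec: {
    entrypoint: "main",
    // Exit handler: runs when the workflow finishes, whether it succeeded or failed.
    onExit: "notify",
    templates: [
      {
        name: "main",
        // Outer list entries run sequentially; entries inside one inner list run in parallel.
        steps: [
          [{ name: "acquire-device", template: "acquire-device" }],
          [{ name: "run-tests", template: "run-tests" }],
        ],
      },
      { name: "acquire-device", container: { image: "example/lease:latest" } },
      { name: "run-tests", container: { image: "example/appium-runner:latest" } },
      {
        name: "notify",
        container: {
          image: "example/notifier:latest",
          // {{workflow.status}} is filled in by Argo inside exit handlers.
          args: ["--status", "{{workflow.status}}"],
        },
      },
    ],
  },
};

console.log(JSON.stringify(workflow, null, 2));
```

The serialized object can be submitted like any other Kubernetes resource, which is what keeps the orchestration logic out of the application code.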

Before you mention Temporal: it has the same problem useworkflow has. It can't wait for a job to finish without breaking the workflow abstraction.

"But Why Do You Need a Kubernetes Job?"

The reason is that the jobs have stateful parts, and in many cases we need to deploy stuff alongside the job. For example, for our mobile jobs, we need to acquire a lease for the device (a whole different implementation that's, again, outside the scope of this blog post), create the job, install the app, connect to the device, etc. All of those take time, and the Appium connection + install is costly to set up. So we can't have a stateless process passing information around and recreating all of this every time.

It could be a step in one of these workflow technologies, but that'd mean packing the dependencies for iOS, Android, web, and any other workflow I need into a single HUGE image. That could be a solution. I'm not saying it would be a bad one, but we just thought we'd go with the devil that we know.

Wrapping Up

I'm open to hearing about better solutions to any of this. I'd love to know if you agree with these decisions or if you'd do something different. It's been a very exciting journey for us and we'll be announcing this new product in a few weeks. We're just testing with some design partners right now. If you want early access or want to break it before we launch, DM me on Twitter. We'll have a "one more thing" in a couple of weeks as well, so stay tuned.
