An AI coding agent, used to write code, needs to reduce your maintenance costs

Insightful. Agree with this take.

Unfortunately, maintainability is simply bucketed as a "non-functional" requirement.

Maintainability (and similar NFRs) should actually be considered what preserves and enables the delivery of future functional requirements -- in contrast to framing non-functional requirements as simply "how" the software must do what it does vs. the "what"/functional requirements that "actually matter".

From that standpoint, if a steady flow of features/improvements is important for a project, maintainability isn't really a non-functional requirement at all, and amounts to being a functional requirement, in practice, over anything except the shortest of time horizons.

In my experience AI reduces maintenance costs. Though, context might matter here, I'm working on a multi decade set of projects, while there is a lot of greenfield feature development, the old code / older projects have suddenly become a lot easier to work with, modernize, and in a bunch of cases, eliminated. Dependency on old libraries, build tools, in some cases updated, in other cases just eliminated, builds are faster, easier for developers, etc. End to end testing has become a lot easier to setup and automate. DevOps have been improved a lot, diagnosing production issues drastically improved, we have a ton of logs and information, and while we have various consolidated dashboards / monitoring to capture critical things, now we can do a lot more analysis on our deployed system (~50 ish projects)

Two things I'd add

1. software doesn't only have tech maintenance - there is also user support and it increases as software grows.

2. I'm not convinced maintenance costs scale linearly. And even if it scales linearly, you will eventually get to a point where maintenance takes up all your time.

Same with code reviews.

I wonder if AI could make code reviews more presentable.

for example, with human code reviews, developers learn quickly not to visually change code like reflowing code or comments, changing indent (where the tools can't suppress it), moving functions around or removing lines or other spurious changes.

And don't refactor code needlessly.

also, could break reviews up into two reviews - functional changes and cosmetic changes.

The article's framing around the maintenance-to-feature ratio resonates with something I've been noticing in my own workflow.

One underappreciated aspect: the artifact surface area of an AI session grows much faster than the code surface area. For every hour of Claude Code output, you get not just code changes but screenshots, generated images, exported transcripts, spec drafts, downloaded model weights — all scattered across wherever Finder happened to drop them.

The maintenance cost argument applies here too. If you can't quickly navigate to the right artifact at the right moment, you end up re-generating things you already have, or worse, losing context between sessions. The "maintenance" of your working environment is a real tax on the ratio the article is describing.

I've been trying to address the file-side of this problem specifically, but the broader point stands: AI coding agents will only reduce net maintenance costs if the surrounding tooling (file management, context switching, artifact organization) keeps pace.

The bet that he misses, which a lot of companies are starting to make or at least think about, is that AI will get better at coding. So the model / harness / whatever is next takes care of the maintenance burden.

That's the theory anyway.

I feel like AI might let us model some of the things that we initially didn't scope that led to these problems (e.g. "Decided not to fix every bug, or upgrade every dependency") - being able to more easily ask a system that can dig into "how much time are we spending on stuff related to foo"

AI tooling can also be a place where we start building our view of what maintainable software practices look like so we don't make decisions that have these same tail effort profiles. That can be things like building out tooling to handle maintenance updates

I think the real thing that comes out of AI tooling is probably that the tooling needs to be trained (or steered) towards activities that enhance human attention management.

My low value comment. This feels directionally correct to me. The problems I've been struggling with in my dev job for the past 6 months have been 80% maintenance/legacy code interfering with new feature development.

Some of our developers are overly aggressive about using AI and I've started going down that path because I need to keep up and actually enjoy the flow of working with AI in my IDE.

I put a lot of work into keeping my area of the codebase understandable and coherent but I do not see that from the others on our team. I'm not perfect but I and extremely sensitive to incoherent, or un-grok-able at a glance.

Anyway, I like the novel (to me at least) framing of this article!

This could have been a good piece of writing if the author chose not to be so smugly overconfident in their belief and show real evidence to support their claim. Mentioning the front page of HN as your source is glib and immediately made me doubt the conclusions. I was interested to see what work the author put into researching this but apparently they didn’t do any work at all.

When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

For me, if I can make a kickass testing system that people love so much that they actually build features with it and it’s not an afterthought, then maintenance becomes much easier. It’s often called test driven development but I’ve rarely seen it done in such a way that the dev ex is good enough for it to work.

But say you have that. Then you have great profiling. At that point you can measure correctness and performance. Then implementation becomes less of a focal point. And that makes it a lot easier to concede coding to ai

Yeah, but to be honest, I sometimes just tell Claude to cleanup / refactor stuff; it finds a lot of things, discusses it with me and I approve the plan, and it churns away my tokens for some time. I do this once in a while, and I've been doing this for over 6 months and I don't feel like my development has significantly slowed down. Yeah my token usage is more for sure, but my codebase also is, so I'm not worried about that. To me AI seems to make maintenance very easy, like the rest. You just need to do it.

Edit: I make it sound a bit simple maybe. I do more extensive redactors also, where I'm more involved and opinionated. But I don't feel the need to do that very often very deeply. But yeah sometimes it's definitely necessary to prevent the project from going off rails.

This is what I've been preaching to my team. With 5.5 and 4.7 the coding agents are good enough know to almost never take any tech debt. Any new feature or fixes should come with a cleanup or refactor, on the same PR.

> Your crowd might tell you that, for each month you spend writing code, you’ll spend... 10 days on maintenance in the first year; and 5 days on maintenance each year after that

Someone is an optimist! I'd estimate those significantly higher, and even worse if you are in a field that has to do any sort of SOC/HIPAA/GDPR audit

I think AI is great for the soul destroying boring stuff that makes me want to quit my job like wrapping legacy code in test cases. Hey I’ll take on any idiot who’s willing to do that job, even if he’s artificial.

Great Article! I think ultimately we are heading towards a world where much better software will be created. This is the major roadblock we need to cross over before that can be true, but I think it is a very tractable problem!

I created a video that talks about this in more detail:

https://www.youtube.com/watch?v=G3Q7Y-nrUbk

The maintenance-cost framing is the useful constraint. I’d rather see agents default to smaller diffs, test scaffolding, and explicit assumptions than maximize lines changed per prompt.

https://www.laws-of-software.com/laws/kernighan/ relates here.

The incitives for remote LLMs are off with providing defaults which optimize for maintenable sound architecture though. Same way Claude is going to produce overview of the indexes of the summaries of comprehensive reports, no one is going to read. No doubt this feels like excellent KPI on how much output was generated.

So what are all of these agentic based strategies going to do once the infinite money spigot of investment into AI ends and they need to start charging prices that actually make a profit?

I get that most of the cost is in training and not inference, but I don’t see how models stay useful once the worlds software updates in a few months post training since the models can’t learn without said training.

Are we just going to have shops do the equivalent of old COBOL shops where everything is built to one years standards and the main language/framework is mostly set in stone?

Would be an interesting concept and read were it grounded in reality. Unfortunately, it's data and graphs pulled out of someone's imagination. Reality is nowadays with the right skillset you can take state of the art AI tools and get a complete language rewrite and or refactor and be done the same afternoon.

With AI, you can hypothesise what can potentially break with each new addition (which your regression tests do not even capture at present). Then, you can write tests for each of those hypotheses, ask AI to deploy a canary, ask AI to divert 5% of traffic to the canary. Ask AI to analyse the logs for any signs of regression in performance, ask AI to roll it out to 100% if everything is good. Congrats! At this point, you've become a slave to AI and cannot do without it. Even logging into a remote server now causes mental pain; having to do anything by hand causes pain. You just wait for your limit to be reset to return to slavery again. A master of a slave is as much of a slave to his salve as the slave is to the master itself.

Insightful. Agree with this take.

Unfortunately, maintainability is simply bucketed as a "non-functional" requirement.

I've found the first, and most important, step for any team or organisation to eliminate concerns with NFRs, "tech debt", and whatever else it may be called, is to stop giving it a name.

I'm being completely serious. By giving it some kind of distinct name, you are giving license to it being ring-fenced and de-prioritised by someone who doesn't (but, arguably, probably should) know better.

Quality matters. It hits your P&L very quickly and very hard if you don't maintain it. So it is as important as any other factor.

> amounts to being a functional requirement, in practice, over anything except the shortest of time horizons

Right! The unfortunate thing is that many software companies don't seem to think much further than a quarter ahead, not really.

Sure they might have a product roadmap that extends for a year or two into the future, but let's be honest. Often that roadmap is mostly for sales purposes, not engineering planning purposes. Product and engineering will pivot if sales slump. The earlier in the company's lifespan, the more likely this will happen often

However if companies get out of this startup mode then they should start to stabilize... But many don't. They continue this pattern of short sighted short term planning, which means product stability remains a low priority effort.

Ultimately I guess many companies just either do not have the resources to build good software or do not actually care to

I've found the first, and most important, step for any team or organisation to eliminate concerns with NFRs, "tech debt", and whatever else it may be called, is to stop giving it a name.

Quality matters. It hits your P&L very quickly and very hard if you don't maintain it. So it is as important as any other factor.

Name it "not done yet." But, yes, very keen observation here.

Two things I'd add

1. software doesn't only have tech maintenance - there is also user support and it increases as software grows.

2. I'm not convinced maintenance costs scale linearly. And even if it scales linearly, you will eventually get to a point where maintenance takes up all your time.

Some of our developers are overly aggressive about using AI and I've started going down that path because I need to keep up and actually enjoy the flow of working with AI in my IDE.

Anyway, I like the novel (to me at least) framing of this article!

The article's framing around the maintenance-to-feature ratio resonates with something I've been noticing in my own workflow.

> Your crowd might tell you that, for each month you spend writing code, you’ll spend... 10 days on maintenance in the first year; and 5 days on maintenance each year after that

Someone is an optimist! I'd estimate those significantly higher, and even worse if you are in a field that has to do any sort of SOC/HIPAA/GDPR audit

https://www.laws-of-software.com/laws/kernighan/ relates here.

I created a video that talks about this in more detail:

https://www.youtube.com/watch?v=G3Q7Y-nrUbk

Name it "not done yet." But, yes, very keen observation here.

> amounts to being a functional requirement, in practice, over anything except the shortest of time horizons

Right! The unfortunate thing is that many software companies don't seem to think much further than a quarter ahead, not really.

Ultimately I guess many companies just either do not have the resources to build good software or do not actually care to

This rings true for me too, but I don't think it counts if your just using AI to aid maintenance. The basic argument in the article is around how many hours of maintenance you have to do for each hour of "value-add" feature development. So A. your only measuring maintenance costs not the ratio and B. The "old code" whp wasn't written with AI in the first place.

That's the theory anyway.

Meanwhile many LLM users have seen generated code quality drop as prices and service are brought in line with costs. If graduate student level work costs many times the price of a student worker then why bother?

That's better than 99.99999% humans. Where do I put my credit card details?

When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

> When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

You draw made up lines on made up plots and call it evidence, obviously.

Same with code reviews.

I wonder if AI could make code reviews more presentable.

And don't refactor code needlessly.

also, could break reviews up into two reviews - functional changes and cosmetic changes.

Do any refactorings in separate reviews, and say things like "REFACTOR_ONLY:", with a rule that none of the code changes behavior.

That makes reviews a lot easier. The review starts from "nothing should be changing" and then reviewers can pattern match on that.

Otherwise, the reviewer is re-evaluating every line of code to make sure nothing has changed. That's really hard to do properly.

The version control systems I've worked with have allowed queues of changes, each one reviewed independently. As I'm developing, if I need a refactor, I go up a commit, refactor, send out for review, rebase my in progress work and continue.

I send out a continual stream of "CLEANUP:" "REFACTOR_ONLY:", and similar changes with the final change being a lot smaller than a big monster of a change.

Your reviewers will appreciate the effort.

Plays the metric game (if you're working in that type of org) without being evil too.

https://github.com/ReviewStage/stage-cli looks like an interesting start on that subject.

I think the real thing that comes out of AI tooling is probably that the tooling needs to be trained (or steered) towards activities that enhance human attention management.

> AI tooling can also be a place where we start building our view of what maintainable software practices look like so we don't make decisions that have these same tail effort profiles. That can be things like building out tooling to handle maintenance updates

This has been possible already but from my vantage point, it doesn't look like anyone really did it? Sure, there already exists tons of OSS that is built for this case, even before AI, yet it seems to me to always come back to incentives. IMO, there is no incentive to write maintainable software (and I'm not sure there ever will be one at this pace). Businesses are only incentivized to write enough software to accomplish the task within their own defined SLAs and nothing further. But even that doesn't seem to be a blocker at this point if Github is used as an example.

Good software comes from people who care deeply about solving the problems in way that they are invested in. If your employees don't care about your product, you're already starting on the wrong foot. AI isn't going to incentivize bad-average developers to write better software or a good developer to push back harder against their clueless manager. When they make the decision, AI might help (assuming it doesn't make a bigger mess) but it's not going to reduce technical debt in any meaningful way without a sea change of perspective from product managers around the world.

So far, I just don't see it happening in theory or in practice. I hope I'm proven wrong!

At least if you a test suite that doesn't have to be migrated. I too would like to migrate some services from Python to Rust but my test suite is written in Python so I would have to actually check if the test suite migration was correct manually (I can't event compile it!) before doing the rewrite.

So what are all of these agentic based strategies going to do once the infinite money spigot of investment into AI ends and they need to start charging prices that actually make a profit?

Are we just going to have shops do the equivalent of old COBOL shops where everything is built to one years standards and the main language/framework is mostly set in stone?

The maintenance-cost framing is the useful constraint. I’d rather see agents default to smaller diffs, test scaffolding, and explicit assumptions than maximize lines changed per prompt.

Glad you asked. AI empowers people who couldn't do a job before to do a job. With more supply of qualified workers, these workers compete with each other by lowering the salary they'll take.

So:

* You get paid less. * The company might pay a similar amount due to LLM costs. Although, it could be more or less as well, depending on how it works out.

A couple of years ago, I saw a story of a guy writing two articles for a website a day. The boss asked him if he wanted to transition to AI-assisted writer for less pay. He said, "No." After a couple of weeks, he got canned. He checked the website out, and it had a bunch of AI writing on it.

LLMs are there to reduce your salaries and increase the businessowner's profits. Bigger inequality in wealth, it's only going to grow more and more. Also, a ton of people fired across many different fields.

I think this is still the role of human oversight, these tools will forever be imperfect and the instructions we give them as prompts will always been prone to inaccuracies/misinterpretation. I find it useful to evaluate the code and often ask for simpler solutions and so far it has produced slightly more elegant solutions. The tendency to spawn helper functions to solve every problem or doing things in a slightly weird or at least unconvential way when there is an easier/standard way of doing it that would create less code. Your ideas if automated would definitely make things more maintainable but even code produced my machines require a human to be responsible for making sure/verifying it works.

This will probably be how things will work in future: devs will shift to specifying features which will be validate through tests.

The AI will then be middle layer that will iterate until tests pass.

Layer 1: Specs (Humans)

Layer 2: Code (AI mostly)

Layer 3: Tests (AI + human checks).

You can only type at 50WPM and read one file at a time, the LLM doesn't have the physical limits, use it at your advantage so you can actually focus on the work that matter

This is my experience exactly.

I have reduced our response time on our api to 30ms from 80ms and gotten a setup we can comfortably grow into.

I had not had time to track down these optimizations without Claude code.

I'm getting downvotes for this. Why exactly?

My local model humming next to me will always be available. Is it as good as a foundational model? No. But it'll work just fine for most pedestrian tasks and I don't need to keep now useless mechanical knowledge in my brain.

You can only type at 50WPM and read one file at a time, the LLM doesn't have the physical limits, use it at your advantage so you can actually focus on the work that matter

Glad you asked. AI empowers people who couldn't do a job before to do a job. With more supply of qualified workers, these workers compete with each other by lowering the salary they'll take.

So:

* You get paid less. * The company might pay a similar amount due to LLM costs. Although, it could be more or less as well, depending on how it works out.

That is one possibility (that is playing out). Another one worth contrasting is the idea of AI as leverage for the worker. If you can take a regular developer and augment their output by 25%, then they have become more valuable to you and you should pay them more. Why should you pay them more? Because the market rate will price in that they provide more value now and you'll lose those workers to competitors if you don't.

That's a pretty old economic idea, and it will be interesting to see if it holds up in this instance. I have no idea how this all plays out. I do think it won't be one size fits all though.

This is my experience exactly.

I have reduced our response time on our api to 30ms from 80ms and gotten a setup we can comfortably grow into.

I had not had time to track down these optimizations without Claude code.

I'm getting downvotes for this. Why exactly?

This will probably be how things will work in future: devs will shift to specifying features which will be validate through tests.

The AI will then be middle layer that will iterate until tests pass.

Layer 1: Specs (Humans)

Layer 2: Code (AI mostly)

Layer 3: Tests (AI + human checks).

Yes, that is how I see it too. What I would add is - intent testing - collect user messages, and check them against executed work from time to time. Every ask must be implemented and tested, every code must be justified by a user message.

What a boring fucking future.

Do any refactorings in separate reviews, and say things like "REFACTOR_ONLY:", with a rule that none of the code changes behavior.

That makes reviews a lot easier. The review starts from "nothing should be changing" and then reviewers can pattern match on that.

Otherwise, the reviewer is re-evaluating every line of code to make sure nothing has changed. That's really hard to do properly.

I send out a continual stream of "CLEANUP:" "REFACTOR_ONLY:", and similar changes with the final change being a lot smaller than a big monster of a change.

Your reviewers will appreciate the effort.

Plays the metric game (if you're working in that type of org) without being evil too.

That's better than 99.99999% humans. Where do I put my credit card details?

> When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

You draw made up lines on made up plots and call it evidence, obviously.

https://github.com/ReviewStage/stage-cli looks like an interesting start on that subject.

And nwave

https://github.com/nWave-ai/nWave

They have /nw-buddy to point you in the right direction

Very nifty

So far, I just don't see it happening in theory or in practice. I hope I'm proven wrong!

I think I have a different perspective on this because I've worked in places that do care about that sort of thing on tools that do focus on those sorts of things. I think the long term incentive for these tools to address tech debt as a goal comes from the AI eval benchmarks trending towards being saturated. The advantages of one tool over another will be in the longer context things. This naturally will tend to start to act as a forcing function for training to focus on the longer tail of software development. A good way of thinking about this is GPT 3.5 was good at dealing well with lines of code and functions, 4 was functions, small apps, 5 seems adept at delivering apps and systems, 6 will be systems and whole enterprised programs of work.

That's a pretty old economic idea, and it will be interesting to see if it holds up in this instance. I have no idea how this all plays out. I do think it won't be one size fits all though.

What a boring fucking future.

Which to me is why it’s so important to build cooler and crazier shit. A web app to facilitate some business process was always boring, but at least you got to code. Now it’s just boring. The thing I’m building right now is pretty wild, involving computer vision, robotics, and surgery. It’s super complex and without AI, the development would have bankrupted us. But because of ai, we did it and the product is going to FDA this year.

No kidding. AI does all the interesting problem solving and humans...

Write tests. The most boring activity on the planet

And nwave

https://github.com/nWave-ai/nWave

They have /nw-buddy to point you in the right direction

Very nifty

No kidding. AI does all the interesting problem solving and humans...

Write tests. The most boring activity on the planet

I wasn’t talking about unit tests. Was talking about tests that accelerate development, where you can setup everything and test a feature by just pressing enter vs clicking around or whatever. The tests are how you build features, so I don’t consider that boring

There is problem solving in coding, but the bigger problems exist at a higher level and that’s still on you to solve.

Also I’ve been messing with “ai-only” files recently. You make a markdown file that basically tells it what the file does, how it’s used, and point to an API contract in some other file. Then you can run async ai that will try things and only submit a PR of all the tests pass and the perf improves. The files become almost unreadable to be, but I decided to embrace it because they were already unreadable. But so is the output of, say, the protobuf code generator and I never had a problem accepting that

Well, humans specify the tests. The AI can probably write them better, too.

Probably more boring still, though.

There is problem solving in coding, but the bigger problems exist at a higher level and that’s still on you to solve.

Well, humans specify the tests. The AI can probably write them better, too.

Probably more boring still, though.

I’ll get straight to the point: your AI coding agent, the one you use to write code, needs to reduce your maintenance costs. Not by a little bit, either. You write code twice as quick now? Better hope you’ve halved your maintenance costs. Three times as productive? One third the maintenance costs. Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.

Oh, you want to know why? Sure. Let’s go for a drive. On a dark desert highway...

Productivity is Determined by Maintenance Costs

Every line of code you write has to be maintained: bug fixes, cleanup, dependency upgrades, and so forth. I’m not talking about new features or enhancements. Just maintenance. For every month you spend writing code, you’ll spend some amount of time in the following year maintaining that code, and some in each year after that, forever, as long as that code exists.

Let’s say you asked a crowd of, say, 50 developers what those maintenance costs were. Using a technique called Wisdom of the Crowd, you could get a reasonably accurate response.1

1You’re welcome to conduct your own wisdom-of-the-crowd survey! But it turns out that the specific numbers don’t matter for the overall point I’m making here.

Your crowd might tell you that, for each month you spend writing code, you’ll spend...

10 days on maintenance in the first year; and
5 days on maintenance each year after that.

If you were a particularly obsessive individual, you could spend hours making a spreadsheet modeling how those estimates affect productivity over time. A spreadsheet like this.

The first month of a new project is glorious. You spend all your time building fancy new features.

The next month is slightly less glorious. A fraction of your time—not much, but a smidge—goes to fixing bugs and cleaning up design mistakes from the first month. In the third month, a smidge more. And the fourth month, the fifth, the sixth...

Eventually, it’s not glorious at all. According to our crowd’s maintenance estimates, you’ll spend more than half your time on maintenance after 2½ years. After ten years, you can hardly do anything else.

Halving the crowd’s maintenance estimates gives you three more years before you hit the 50% mark. Doubling them sees you below 50% in less than a year.

The lesson is clear. If you want a productive team, you have to focus on their maintenance costs.

All Models Are Wrong

Do these numbers ring true to you? They do to me. In my career as a consultant, I specialized in late-stage startups, and they all had the exact problem shown in the graph above. About 5-9 years in, they’d notice their teams were no longer getting shit done, and then they’d call me.

Their teams weren’t quite as bad as the graph shows. Maybe their maintenance costs were lower. Or maybe... and this feels more likely to me... their maintenance costs were exactly that bad, and they papered over the problem instead. Maybe they:

Decided not to fix every bug, or upgrade every dependency
Added people when the team got slow... and then kept adding more, because it was never enough
Scrapped it all and started over with a rewrite

There’s room to debate the precise maintenance numbers, but overall, the model feels right. If you’ve been around the block, you know this graph is true. You’ve seen how productivity melts away over time. You have the scars.

What Does This Have to Do With AI?

Only everything.

Let’s say your team just started using Rock Lobster, the latest and greatest agentic coding framework, and it Doubles!! your code output! Woohoo! The code’s a bit harder to understand, though, and your team is drowning in pull requests, and you maybe kinda sorta teensy weensy don’t actually read the code before smashing the approve button. Like, at all. I mean, you skimmed it, during boring meetings, sometimes, and that’s gotta be good enough, right? LGTM, let’s get this shit done!

So now you’re producing two months of work in a month, and let’s say you’ve doubled how much each “month” of output costs to maintain. Next month’s maintenance costs quadruple.

Oh.

About five months after you start using Rock Lobster, your productivity is back down to where you started, and a few months after that, it’s worse than it would have been had you never touched Rock Lobster in the first place.

I’m not saying your AI doubles maintenance costs. Or productivity. This is an extreme example. But even if your AI produces code that’s just as easy to maintain as your human-written code, the productivity gains don’t last.

You Can Check Out Any Time You Like2

2But you can never leave.

Agents are expensive, and they’re only getting more so. Once your agent’s juice is no longer worth the squeeze, you might decide to save your pennies and go back to coding the old way. Like a caveman. With your fingers.

Ha! Joke’s on you! When you stop using the agent, all the productivity benefit goes away... but the added maintenance costs don’t! As long as that code’s still around, you’re stuck with lower productivity than if you had never touched the agent at all.

The Passage Back

The math only works if the LLM decreases your maintenance costs, and by exactly the inverse of the rate it adds code. If you double your output and your cost of maintaining that output, two times two means you’ve quadrupled your maintenance costs. If you double your output and hold your maintenance costs steady, two times one means you’ve still doubled your maintenance costs.

Instead, you have to invert your productivity. If you’re producing twice as much code, you need code that costs half as much to maintain. Three times as much code, one third the maintenance.

This is the secret to success. All the benefits, none of the lock-in.

Can We Kill the Beast?

I dunno. All my reading of the finest news sources says that coding agents increase maintenance costs. Some people do say they help them understand large systems better. But big decreases in costs, of the size we need to see? No. Just the opposite.

That’s a problem. The model isn’t a perfect representation of reality, but the overall message is right. You need AI that reduces your maintenance costs, and in proportion to the speed boost you get from new code. Without it, you’re screwed. You’re trading a temporary speed boost for permanent indenture.

So, yeah, go ahead, chase improvements to your coding speed. But spend just as much time chasing improvements to your maintenance costs. Or you, too, will be trapped in Hotel California.

Such a lovely place.

Such a lovely face.

As much as it might seem like it, this isn’t meant to be an anti-AI rant. There’s other levers to pull, such as AI that makes maintenance itself more productive, even if it doesn’t make the code more maintainable. I encourage you to copy the spreadsheet and play with all the levers in the model. See what happens when you change the assumptions to match your real-world situation.

Hacker Times