“Through more vigorous computer programming and more sophisticated scheduling, it was possible to reduce the changeover period from two weeks to two days.”
- Lee Iacocca (in his autobiography)
This is also why one of my instructions to coding agents is that they adhere to established coding and testing patterns, even where they appear to be sub-optimal.
Vibe-coded software is the equivalent of a Marvel green-screen movie.
This is why AI-assisted programming has not turned out to be the silver bullet we have been hoping for, at least not yet. Muddled prompting by humans gets you the Homer Simpson car you wished for, which will eventually collapse under its own weight.
I've been thinking a lot about Programming as Theory Building [0] as the missing piece in AI-assisted engineering. Perhaps there are approaches which naturally focus on the essence while ignoring the accidents, but I'm still looking for them. Right now the state of the art I see ignores accident and essence alike, and degrades the ability to make progress.
Please inform me if there are any approaches you know that work! And lest this sound pessimistic, far from it. This state of affairs is actually intoxicatingly motivating. Feels like we have found silver, and just need to start learning to mould bullets.
[0] Another classic required reading of the industry https://pages.cs.wisc.edu/~remzi/Naur.pdf
That was true for almost seventy years until roughly last year.
AI is the silver bullet - my output is genuinely 10x what it was before Claude Code existed.
Fred Brooks wrote that book when they were programming IBM operating systems in assembly language.
Times have really, really changed - don't pay attention to the messages of this book except for historical fun.
The last three times I read the book, everything held.
This time, I'm not so sure: AI does change things significantly. Perhaps not for all teams and not all scales of software, but in my case (solo developer, complex software system) I did measure a 12x productivity increase [1].
Also, some of the problems Brooks describes became much easier, if not borderline trivial with AI. For example, maintaining design documentation that stays consistent with the software being built. I do this and it is no longer a problem.
I still think most of what Brooks wrote is applicable today. I think the biggest difference is that AI enables smaller teams to work on larger systems, and the biggest benefit is for single-person teams (ahem) like me. I see it as another step that allows me to tackle larger systems: the previous one was Clojure which reduced incidental complexity so significantly that I was able to develop the system to the size it is today. AI is the next step: it allows me to build features that would have taken me years in a span of months. Not because of "vibe coding", but primarily because I can work on a set of design documents and turn my ideas into a coherent design.
[1] For the nitpickers: yes, measured, not guessed. Yes, the metric was reasonable. No, it wasn't "lines of code" or something equally silly; in fact one of my main goals is reducing code size as much as possible. Yes, I compared longer time periods: 2 months with AI against the average over the previous 12 months. No, the metric wasn't gamed: this is a solo business and I have no interest in gaming my own metrics. I earn a living from this work, so this is as objective as it gets.
These beautiful metaphors from Brooks, and from books like The Inmates Are Running the Asylum, made me romanticize an industry I knew nothing about.
Now, about 20 years later, I wonder to what extent they influenced me to take the path I've taken. Because when I began, it wasn't lucrative or cool in the slightest.
When I measure software dev, delivery of code isn't even a metric I care about. It is a key part of the process, to be sure, but I care about results - Did we ship? Did it work? Do we have happier customers and a smaller bug list?
In my experience, while I can answer "yes" to those questions for people who use AI assistance surgically, applying it where its strengths lie... I can answer an emphatic "no" for the teams I've worked with who are "AI-first", making AI usage itself part of their goals.
So increasing individual output by itself is not enough to affect the argument. It could, if you also reduce the number of people needed for a project, where "people" means everyone involved in the project, not just SWEs. But there are strong forces in large orgs pulling toward larger project sizes: budgeting overhead and similar "large orgs optimize for legibility" kinds of arguments.
IMO the only way this will change is when new companies challenge the existing big players. I think AI will help achieve this (e.g. agentic e-commerce challenging the incumbents), but it will take time.
At _this_ moment, AI is in the producing-things stage - with a factor of 10 or more, if you like. But what comes afterwards, when all this mush of code has to deliver _reliable_ results? Then we're talking not man-months but man-years or man-decades to fix these billions, maybe trillions, of lines of opaque, probabilistically generated code. You have to take the mean of these two stages, if nothing qualitative happens to the models.
What this article explains is why, despite your feelings of untouchable success, on average the experience of using software just keeps getting worse and worse and worse, making this the worst era for software quality that I've ever lived through.
Didn't we already do this with every company looking to hire "rockstar programmers"? I don't recall that ending well.
Clearly... it still wasn't a silver bullet, because output as a metric is a bad one. I thought it was a metric only managers valued, but apparently Anthropic has finally convinced devs to value it too? I guess it definitely hits that dopamine receptor hard.
I also do this.
e.g. after watching Claude burn tokens building and then deploying a docker image multiple times (and it taking extra time), I asked it to just create a build.and.deploy.sh script. I also then have a test.deploy.sh script that Claude can use to confirm everything worked.
Saves a ton of time/tokens AND has the added benefit of being usable by me or other humans when doing manual tests or debugging outages etc.
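(For flavor, here's roughly what that pair of scripts could look like; the script names are from the comment above, everything else, the image name, host, and health endpoint, is my assumption:)

    #!/usr/bin/env bash
    # build.and.deploy.sh -- hypothetical body; "myapp" and "myhost" are assumptions
    set -euo pipefail
    docker build -t myapp:latest .                    # one canonical build step
    docker save myapp:latest | gzip | ssh deploy@myhost \
      'gunzip | docker load && docker compose up -d'  # reload the service on the host

    #!/usr/bin/env bash
    # test.deploy.sh -- smoke test usable by Claude or a human after a deploy
    curl -fsS https://myhost/healthz                  # exits non-zero if deploy broke
    echo "deploy looks OK"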
When concrete things like that start to happen, then I will start to believe in the 10x claim.
I’m being glib, but there’s a whole class of software (e.g. simple CRUD apps) that just doesn’t have any marginal value anymore. So it doesn’t matter if it’s 10x faster or 100x faster. 100 x $0 is still $0.
But after people's expectations adjusted it was just back on the treadmill.
I don't think we've found a new steady-state yet, but I have some gut feeling guesses about where it's going to be.
1. I would not have attempted this without AI assistance because it's a big project.
2. I have built a functional program that I am able to use for real work in a handful of weeks, working part time on this (like literally a few hours per day prompting Claude and Kimi).
3. Had I decided to do this without AI assistance it would have been months of work.
We decided to integrate our SaaS into Microsoft Business Central and NetSuite as plugins into those systems. BC has its own programming language, called AL, which has a lot of idiosyncrasies compared with any other language I've worked with. And NetSuite plugins are written in SuiteScript, a custom JS runtime with a ton of APIs to learn.
In the "before", it would've taken 5 developers a year or more to build those integrations. I did both by myself in well under a year. Thank you Claude.
Complete frontend + backend + database.
Yes, it is an internal app, but it works and everyone loves it.
Does that count as an example?
(Also I absolutely expect him to need help at some point, but so far it has taken his project from absolutely impossible to 3 weeks of work in between work, renovating his house and being a dad for the first time so I was very impressed.)
Assuming 10x on the speed of dev, is the vscode repo a decent example? Recently they've been all-in on AI-augmented development, so I'm thinking they'd be a reasonable subject?
How do you isolate out what counts as the "development" part of their delivery cycle (is that the dev inner loop, does that show up in frequency of commits then?) to measure it and see if it's running 10x?
https://github.com/microsoft/vscode/graphs/contributors?from...
I don't think we'll see AAA game velocity change until asset generation progresses quite a bit, not to mention stuff like rigging. Even then, there's still a layer between code and engine where you have to wire everything together which an LLM will struggle with.
Replacing some old COBOL is probably more of a management decision based on appetite for change and politics rather than development speed.
Aren't there some measurable things like github repo creation, PRs, app store additions, etc. that can be correlated to LLM adoption? Didn't Show HN have to get throttled after LLMs arrived?
Direct github link: https://github.com/open-noodle/gallery
One of the latest things I made with Claude was a tool that allowed me to move a bunch of very low traffic Cloud Run services to a single VPS without losing any of the Cloud Run benefits such as easy Docker-based deployment and automatic certificate provisioning. I thought about making something like that for quite some time, and Claude finally made it possible, which makes me quite happy.
The fun thing here is that no other soul genuinely cares about it, or any other code I might publish. The code, especially AI generated, is so cheap that if anyone wants to repeat my steps to get rid of Cloud Run services, they will probably vibe-code their own tool instead of figuring out how to use mine, just like I did that instead of spending time on learning Dokku or similar solutions.
So, yes, 10x and more, but no one cares about the result, which makes the whole 10x measurement less useful.
I've always been a backend engineer, never front end. And almost every team I've been on has lacked any front end skills at all, so all our tools end up being a mash of scripts, maybe sometimes an API.
Now we are all front end engineers creating UIs for things we could never do before, and this pushes API-first development, so the CLI + UI are just calling APIs. Nothing new here, but this used to be what whole teams did; now a single person does it.
https://github.com/KeibiSoft/KeibiDrop
Two years ago, it took me around 2k hours to build a cross-platform FUSE vault, without using AI-assisted tools.
The pain was debugging through logs and system traces, and understanding how things work.
Now I managed to ship this one much faster, as an after-hours project. I started it in May 2025, and around the end of November 2025 started using Claude on it.
Just by dumping logs into Claude and explaining the attack vector for the problems, it saved me the FML moments of grinding through walls of syscalls on 3 platforms.
I would say it's much easier to make progress and ship with the same rigour while minimizing the time, focus, and brainpower involved, so I can put the energy somewhere else.
In my experience, stuff like Rails had negligible impact in my field because companies would always require solid backing from some big-name vendor (MS, Oracle, IBM, Sun back in the day, or even SAP).
So most if not all of the smaller silver bullets did not even make a blip on the radar... and stuff like Java or .NET, while definitely better than C or COBOL, did not really deliver in terms of productivity boost (in part because, as noted in the message I am answering, expectations kept growing at the same pace).
When I find Claude is using tools or approaches that I have replaced with more specific ones, I ask Claude to add a hook that prevents it from doing this in the future and points it to instructions for what to do instead.
And of course I wrapped all that up in a Skill so it knows what approaches to take to add things to hooks.
It becomes fairly trivial to incrementally stop it making repeated mistakes like this.
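(A minimal sketch of the kind of hook this describes, assuming Claude Code's documented hooks protocol: the pending tool call arrives as JSON on stdin, and exit code 2 blocks it and feeds stderr back to the model. The blocked command is an invented example:)

    #!/usr/bin/env bash
    # Hypothetical PreToolUse hook script, registered in .claude/settings.json
    # under "PreToolUse" with matcher "Bash".
    cmd=$(jq -r '.tool_input.command // empty')    # command Claude wants to run
    if echo "$cmd" | grep -q 'docker build'; then
      echo "Don't run docker build directly; use the project's build script." >&2
      exit 2    # exit code 2 blocks the call; stderr is fed back to Claude
    fi
    exit 0      # everything else passes through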
I remember when coding was free as in beer and freedom!
Then codify this behavior into a process that gets run through automatically.
I.e. keep $repo/origin as a bare repo, then prompt it to create a shell script which creates the worktree and cds into it, runs the script you mentioned, and instantiates pi in it. Potentially define explicit phases for your workflow and show the phase in the UI, with quality gates for transitions, e.g. force the implement-to-finalize transition to only happen if all tests succeeded. Potentially add multiple review phases here too, with different prompts. This progressively gets rid of more and more inconsistencies.
Still not a perfect solution, but on average I've had less and less to manually address with that workflow. Albeit at the cost of tokens (multiple review phases obviously ingest all changes multiple times).
Pi-agent's extensibility is just a lot better than the other harnesses', but you could obviously also introduce a different orchestrator to do the same. For me, pi-agent was just the least amount of effort necessary to get it going.
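(Roughly, the bootstrap script being described, assuming a bare clone at $repo/origin; the paths, the setup step, and the task name are my inventions:)

    #!/usr/bin/env bash
    # Hypothetical worktree-per-task bootstrap for the workflow above.
    set -euo pipefail
    REPO="$HOME/work/myrepo"     # assumed layout: bare clone lives at $REPO/origin
    TASK="$1"                    # e.g. ./newtask.sh fix-login-bug
    git -C "$REPO/origin" worktree add "$REPO/$TASK" -b "$TASK"
    cd "$REPO/$TASK"
    ./setup.sh                   # stand-in for "the script you mentioned"
    pi                           # start the agent inside the isolated worktree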
It seems like if you write the docs yourself, you're not leveraging the chance that the model itself knows the guard rail that best prevents it from grabbing for its average tool use.
When asked to fetch JIRA tickets, use the "fetch-jira" skill rather than reading via ACLI
Claude has gotten better about following CLAUDE.md over the last year (it was pretty laughably bad at it previously).
Also, how much more money do you make? Or are you working less?
Like a sibling comment - I'm also curious about what that 12x means for you and your business - same revenue at fewer hours? More revenue, fewer hours? Etc.
AI is not delivering 10x shareholder value anywhere. Software developers have quite the level of hubris about how important they are to companies. Yes, our work is very complex and takes a certain mindset to do well. It takes a lot of other roles to have a successful business; many of those roles will use AI to help draft slide decks, emails, etc., and that's the limit for them.
Look at recent companies doing layoffs claiming it's because of AI, like Cloudflare and Coinbase. Do their reported financials paint the picture that they are crushing it with AI? No, it's net losses running into the hundreds of millions.
That book isn't; it's built from humility, and it's a rare bright light in this godforsaken field.
Take LLMs out of that safe space and suddenly they are no silver bullet; in fact, they are useless.
So of course those making the 10x claim mean within the safe space, where LLMs can handle all the activities required. You can't have it both ways: 10x, and tasks that are difficult and confusing for LLMs.
Nothing wrong with forks though.
For side projects where you try out an idea, are you not finding that what was 10h of work now takes 1h?
Which is what I’m seeing at my job. All of these “afternoon vibe code” projects never actually get users because everyone just vibe-codes their own.
>I always look to staff up a project at the beginning as much as possible, looking for doing as much in parallel up-front as we can.
Ah, maybe this is what you think he would take issue with? Fair enough. Perhaps I should have said:
>I always look to staff up as much as is economically and organizationally optimal, to exploit all genuine parallelism opportunities, being careful not to overstaff.
Martin Fowler, the author of the blog, may be a bit different than that.
How many people are writing CRUD apps in mainstream languages vs COBOL, though? You don't need a 100% silver bullet that one-shots everything, just to recognize the signals that for many use cases there's a significant shift happening. The safe space is expanding and velocity is increasing.
Trying to fix syntax errors in string interpolation on a 5-minute-delay loop is hell.
If I had to write the code myself, it would take around 8 hours of constant writing to get around 1k LoC. For tricky FUSE-level stuff, I might need to spend 3 weeks for 10 LoC. Very easy to burn out and build up pain.
1) It tried the tool, but for some reason it behaved unexpectedly, and Claude is VERY good at working around problems; it won't just stop.
2) Context got too long so those rules were "forgotten"
AI requires a larger amount of fragile resources to work as opposed to an editor, keyboard and a human.
In some sense it's a bit like the bitcoin revolution, which slowed down once transaction times ballooned. And blockchains didn't replace databases as expected, probably for very good reasons: resources required vs. results delivered.
I personally agree that AI is a great technology for some great new tools. But we still haven't found its limits: cost vs. results. That reckoning happened with bitcoin and blockchains; for AI it is still outstanding.
It's when they practically ignore the rabbit holes that it's suspect. I'm definitely seeing speed-ups. I troubleshot a Linux system yesterday with minimal effort using a local LLM. It likely would have taken me a few hours to locate all the docs & testing procedures; the LLM did it with only a few prompts. To ensure it did it correctly, I had to interrogate it a few times before letting it proceed.
Humans make really bad scientists, and it takes a lot of effort to properly catalog and provide statistics for these things.
There is an improvement, but I doubt any random dev can give a real estimate, since before LLMs they couldn't really give you a real estimate anyway. I do know that when I encounter a bug now, debugging is almost immediately possible.
I build things I never would have. My tooling is better and more robust than ever. I verify and test my work better than ever. I fix more bugs than I used to simply because no one needs to care if it fits into a cycle. I explore and solve more problems in more parts of the application, even if I don’t write code. I take better care of our infrastructure. Performance goes up, bugs go down, AWS resources scale back, costs go down. I’ve paid for my AI usage in scaled back resources several times over at this point.
It might not be 10x but it’s a significant multiple.
It might tend to deviate and waste time; it needs guiding once in a while, and you have to check what it is spewing out and point it in the correct direction.
The things that the software does might have value, but the marginal utility of your software is effectively 0.
So my agent just listens for green checks and no PR comments and loops until those conditions are met.
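(As a sketch, and assuming the gh CLI, that condition could look like this; the PR number is invented:)

    #!/usr/bin/env bash
    # Hypothetical "green checks, no comments" loop around the gh CLI.
    PR=123                       # invented PR number
    while true; do
      if gh pr checks "$PR" >/dev/null 2>&1; then    # exits 0 only when all checks pass
        n=$(gh pr view "$PR" --json comments --jq '.comments | length')
        [ "$n" -eq 0 ] && break  # green and quiet: conditions met
      fi
      sleep 60                   # otherwise wait and re-check
    done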
In the early 1960s, Fred Brooks managed the development of IBM's System/360 computer systems. After it was done, he penned his thoughts in the book The Mythical Man-Month, which became one of the most influential books on software development after its publication in 1975. Reading it in 2026, we'll find some of it outdated, but it also retains many lessons that are still relevant today.
The book contains Brooks's law: “Adding manpower to a late software project makes it later.” The issue here is communication: as the number of people grows, the number of communication paths between those people grows quadratically, n(n-1)/2 paths for n people, so 10 people have 45 paths and 50 people have 1,225. Unless these paths are skillfully designed, work quickly falls apart.
Perhaps my most enduring lesson from this book is the importance of conceptual integrity:
>I will contend that conceptual integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.
He argues that conceptual integrity comes from both simplicity and straightforwardness - the latter being how easily we can compose elements. This point of view has been a strong influence upon my career; the pursuit of conceptual integrity underpins much of my work.
The anniversary edition of this book is the one to get, because it also includes his even more influential 1986 essay “No Silver Bullet”.