The 100 hour gap between a vibecoded prototype and a working product

I work as a DevOps/SRE and have been doing it in FinTech (bank, hedge funds, startups) and Crypto (L1 chain) for almost 20 years.

My thoughts on vibe coding vs production code:

- vibe coding can 100% get you to a PoC/MVP probably 10x faster than pre LLMs

- This is partly b/c it is good at things I'm not good at (e.g. front end design)

- But then I need to go in and double check performance, correctness, information flow, security etc

- The LLM makes this easier but the improvement drops to about 2-3x b/c there is a lot of back and forth + me reading the code to confirm etc (yes, another LLM could do some of this but then that needs to get set up correctly etc)

- The back and forth part can be faster if e.g. you have scripts/programs that deterministically check outputs

- Testing workloads that take hours to run still take hours to run with either a human or LLM testing them out (aka that is still the bottleneck)
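To make the "scripts/programs that deterministically check outputs" bullet concrete, here is a minimal sketch in Python. It assumes the pipeline output can be loaded as a list of dict records and that a known-good result exists to compare against. The function names and the sample records are made up for illustration.

```python
import hashlib
import json


def canonical_digest(records):
    """Return a stable SHA-256 digest of a list of dict records.

    Records are serialized with sorted keys and then sorted, so the digest
    does not depend on row order or key order.
    """
    normalized = sorted(json.dumps(r, sort_keys=True) for r in records)
    payload = "\n".join(normalized).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def check_output(actual, expected):
    """Compare pipeline output against a known-good result.

    Returns (ok, message) so the verdict can be handed straight back to a
    human or an agent without interpretation.
    """
    if len(actual) != len(expected):
        return False, f"row count mismatch: got {len(actual)}, expected {len(expected)}"
    if canonical_digest(actual) != canonical_digest(expected):
        return False, "row contents differ from the known-good output"
    return True, "output matches the known-good result"


if __name__ == "__main__":
    # Hypothetical sample data, invented for this sketch.
    expected = [{"id": 1, "amount": "10.00"}, {"id": 2, "amount": "5.50"}]
    actual = [{"amount": "5.50", "id": 2}, {"id": 1, "amount": "10.00"}]
    print(check_output(actual, expected))
```

The point of returning a plain pass/fail plus a message is that the back and forth with the LLM no longer depends on anyone eyeballing the output.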

So overall, this is why I think we're getting wildly different reports on how effective vibe coding is. If you've never built a data pipeline and an LLM can spin one up in a few minutes, you think it's magic. But if you've spent years debugging complicated trading or compliance data pipelines you realize that the LLM is saving you some time but not 10x time.

Everyone keeps saying 80/20 but that undersells what's going on. The last 20% isn't just hard. It's hard because of what happened during the first 80%.

When an agent takes a shortcut early on, the next step doesn't know it was a shortcut. It just builds on whatever it was handed. And then the step after that does the same thing. So by hour 80 you're sitting there trying to fix what looks like a UI bug and you realize the actual problem is three layers back. You're not doing the "hard 20%." You're paying interest on shortcuts you didn't even know were taken. (As I type this I'm having flashbacks to helping my kid build lego sets.)

The author figured this out by accident. He stopped prompting and opened Figma to design what he actually wanted. That's the move. He broke the chain before the next stage could build on it. The 100 hours is what it costs when you don't do that.

The 100 hours number feels about right for a solo project. What people underestimate is that the last 20% isn't just polish — it's the boring defensive stuff that makes an app not crash on someone else's phone.

I shipped a React Native app recently and probably 30% of the total dev time was wrapping every async call in try/catch with timeouts, handling permission denials gracefully, making sure corrupted AsyncStorage doesn't brick the app, and testing edge cases on old devices. None of that is the fun part. None of it shows up in a demo. But it's the difference between "works on my machine" and "works in production."
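A rough sketch of that defensive pattern, written in Python with asyncio rather than React Native, so it only illustrates the idea and not the commenter's actual code. The helper names `call_with_timeout` and `load_settings` are hypothetical.

```python
import asyncio
import json


async def call_with_timeout(make_call, timeout_s, fallback):
    """Await make_call(), but never hang or raise past this boundary."""
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The call took too long; degrade instead of freezing the app.
        return fallback
    except Exception:
        # Broad on purpose for this sketch: any failure becomes the fallback.
        return fallback


def load_settings(raw, defaults):
    """Parse stored JSON settings, falling back to defaults if corrupted."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # Missing or unparseable storage must not brick the app.
        return dict(defaults)
    if not isinstance(parsed, dict):
        return dict(defaults)
    # Stored values override defaults; missing keys keep their defaults.
    return {**defaults, **parsed}
```

A real app would log the swallowed errors somewhere; they are dropped here only to keep the sketch short.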

Vibecoding gets you to the demo. The gap is everything after that.

The gap is definitely real. But I think most of this thread is misdiagnosing why it exists. It's not that AI cannot produce production quality code, it's that the very mental model most people have of AI is leading them to use the wrong interaction model for closing that last 20% of complexity in production code bases.

The author accidentally proved it: the moment they stopped prompting and opened Figma to actually design what they wanted, Claude nailed the implementation. The bottleneck was NEVER the code generation, it was the thinking that had to happen BEFORE ever generating that code. It sounds like most of you offload the thinking to AFTER the complexity has arisen when the real pattern is frontloading the architectural thinking BEFORE a single line of code is generated.

Most of the 100-hour gap is architecture and design work that was always going to take time. AI is never going to eliminate that work if you want production grade software. But when harnessed correctly it can make you dramatically faster at the thinking itself, you just have to actually use it as a thinking partner and not just a code monkey.

100 hours? Try 500 hours at least if you want a competitive product, unless you are a wizard at marketing and can out-market the 80/20 guys.

They're... launching an NFT product in 2026...

I know it's not the point of this article, but really?

The more I evaluate Claude Code, the more it feels like the world's most inconsistent golfer. It can get within a few paces of the hole, often in a single stroke, and then it'll spend hours, days, weeks trying to nail the putt.

There's some 80-20:ness to all programming, but with current state of the art coding models, the distribution is the most extreme it's ever been.

I'm having somewhat good experiences with AI but I think that's because I'm only half-adopting it: instead of the full agentic / Ralphing / the-AI-can-do-anything way, I still do work in very small increments and review each commit. I'm not as fast as others, but I can catch issues earlier. I also can see when code is becoming a mess and stop to fix things. I mean, I don't fix them manually, I point Claude at the messy code and ask it to refactor it appropriately, but I do keep an eye to make sure Claude doesn't stray off course.

Honestly, seeing all the dumb code that it produces, calling this thing "intelligent" is rather generous...

"working" != "shipping."

When we start selling the software, and asking people to pay for/depend upon our product, the rules change, substantially.

Whenever we take a class or see a demo, they always use carefully curated examples, to make whatever they are teaching, seem absurdly simple. That's what you are seeing, when folks demonstrate how "easy" some new tech is.

A couple of days ago, I visited a friend's office. He runs an Internet Tech company, that builds sites, does SEO, does hosting, provides miscellaneous tech services, etc.

He was going absolutely nuts with OpenClaw. He was demonstrating basically rewiring his entire company, with it. He was really excited.

On my way out, I quietly dropped by the desk of his #2; a competent, sober young lady that I respect a lot, and whispered "Make sure you back things up."

With sufficiently advanced vibe coding the need for certain type of product just vanishes.

I needed it, I quickly build it myself for myself, and for myself only.

I’ve had a similar experience. I’ve been vibecoding a personal kanban app for myself. Claude practically one-shotted 90% of the core functionality (create boards, lanes, cards, etc.) in a single session. But after that I’ve now spent close to 30 hours planning and iterating on the remaining features and UI/UX tweaks to make the app actually work for me, and still, it doesn’t feel "ready" yet. That’s not to say it hasn’t sped up the process considerably; it would’ve taken me hours to achieve what Claude did in the first 10 minutes.

I think there's a lot to pick apart here but I think the core premise is full of truth. This gap is real contrary to what you might see influencers saying and I think it comes from a lot of places but the biggest one is writing code is very different than architecting a product.

I've always said, the easiest part of building software is "making something work." The hardest part is building software that can sustain many iterations of development. This requires abstracting things out appropriately, which LLMs are only moderately decent at and most vibe coders are horrible at. Great software engineers can architect a system and then prompt an LLM to build out various components of the system and create a sustainable codebase. This takes time and attention in a world of vibe coders that are less and less inclined to give their vibe coded products the attention they deserve.

My non-technical client has totally vibe coded a SaaS prototype with lots of features, a way bigger product than OP's, and it sort of works. They spent like 200 hours on it. I wonder how much time it would take to clean it up and confirm it is secure. I declined to work on it, as I was not sure if that's even possible or if it would be better to rewrite the entire thing from scratch with better prompts. I was not that sure about it given the cost and the fact that they had a product that sort of worked, so I let them go find someone else to clean it up. My reasoning is that if the client took 200h to develop this without stopping to check the code, it would take me 2-3x that to rewrite it with AI, but the right way, while the cleanup may be so painful that a rewrite from scratch would be way better value for money.

I'd also say for a lot of applications -- most applications perhaps -- outside of "consumer" ones, the number of features is quite a bit more important than the shape of a button or the animations during a page transition.

Even pretty massive companies like Databricks don't think about those things and basically have a UI template library that they then compose all their interfaces from. Nothing fancy. It's all about features, and LLMs create copious amounts of features.

The interesting part about vibe coding is the spectrum of experiences and attitudes. I have been playing with it for 2-3hrs a day for the last 4 months now. None of my friends who are using it are using it in the same way. Some people vibe and then refactor, some spec-everything and micro-prompt the solutions. Nobody is feeling like this thing can go unsupervised.

And then there is one guy, a friend of mine, who is planning to release a "submit a bug report, we will fix it immediately" feature (so, collect an error report from a user, possibly interview them, then assess whether it's a bug or not with a "product owner LLM", and then autonomously fix it, and if it passes the tests, merge and push to prod, all in under one hour). That's for a mid cap company, for their client-facing product. F*** hell! I have a full bag of bug reports ready for when this hits prod :->

I started working on one of my apps around a year ago. There was no ai CLI back then. My first prototype was done in Gemini chat. It took a week copy and pasting text between windows. But I was obsessed.

The result worked but that's just a hacked together prototype. I showed it to a few people back then and they said I should turn it into a real app.

To turn it into a full multi-user scalable product... I'm still at it a year later. Turns out it's really hard!

I look at the comments about weekend apps, and I have some of those too. But to create a real, actually valuable, bug-free MVP takes work no matter what you do.

Sure, I can build apps way faster now. I spent months learning how to use AI. I did a refactor back in May that was a disaster. The models back then were markedly worse and it rewrote my app, effectively destroying it. I sat at my desk for 12 hours a day for 2 weeks trying to unpick that mess.

Since December things have definitely gotten better. I can run an agent up to 8 hours unattended, testing every little thing and produce working code quite often.

But there is still a long way to go to produce quality.

Most of the reason it's taking this long is that the agent can't solve the design and infra problems on its own. I end up going down one path, realising there is another way and backtracking. If I accepted everything the ai wanted, then finishing would be impossible.

> Late in the night most problems were fixed and I wrote a script that found everyone whose payment got stuck. I sent them money back (+ extra $1 as a ‘thank you for your patience’ note), and let them know via DMs.

(emphasis added)

Not sure if it was actually written by hand or AI was glossed over, but as soon as giving away money was on the table, the author seems to have ditched AI.

> Now I'm pretty sure that people who say they "vibecoded an app in 30 minutes" are either building simple copies of existing projects, produce some buggy crap, or just farm engagement.

Some people seem to be better at it than others. I see a huge gulf in what people can do. Oddly, there is a correlation between having been a good engineer pre-AI and being able to vibe code well.

But I see one odd thing. A subset of those whom people would consider good or even amazing pre-AI struggle. As best I can tell at this stage, it's because they never had to get good results out of unskilled workers in the past and just relied on their own skills to carry the project.

AI coders can do some amazing things. But at this stage you have to be careful about how you guide it down a path, in the same way you did with junior engineers. I am not comparing AI to a junior; these models can by far code better than most senior engineers, and have access to knowledge at lightning speed.

Nobody is saying they're ready for production in 30 minutes, just that there is something real where an idea used to be.

Something much closer to production SDLC patterns than a Figma mockup.

If you ask for something complicated this headline is more than true. But why complicate things? Keep it simple and keep it fast.

Also this article uses 'pfp' like it's a word, I can't figure out what it means.

I'm able to vibe code simple apps in 30 minutes, polish it in four hours and now I've been enjoying it for 2 months.

I’m sure someone else has probably coined the term before me (or it’s just me being dumb, often the case) but I’ve started calling this phase of SWE ‘Ricky Bobby Development’.

So many people are just shouting ‘I wanna go fast’ and completely forgetting the lessons learned over the past few decades. Something is going to crash and burn, eventually.

I say this as a daily LLM user, albeit a user with a very skeptical view of anything the LLM puts in front of me.

This seems more like he is bad at describing what he wants and is prompting for “a UI” and then iterating “no, not like that” for 99 hours.

I have had the experience with creating https://swiftbook.dev/learn

Used Codex for the whole project. At first I used Claude to architect the backend, since that's where I usually work and have experience. The code runner and API endpoints were easy to create for the first prototype. But then it got to the UI and here's where sh1t got real. The first UI was in React though I had specifically told it to use Vue. The code editor and output window were a mess in terms of height, there was too much space between the editor and the output window, and no matter how much time I spent prompting it and explaining to it, it just never got it right. Got tired and opened Figma, used it to refine it to what I wanted. Shared the code it generated to GitHub, cloned the code locally, then told Codex to copy the design and finally it got it right.

Then came the hosting, where I wanted the code runner endpoint to be in a Docker container for security purposes, since someone could execute malicious code that took over the server if I just hosted it without some protection. Here it kept selecting out-of-date Docker images. Had to manually guide it again on what I needed. Finally deployed and got it working, especially with a domain name. Shared it with a few friends and they suggested some UI fixes which took some time.

For the runner security hardening I used Deepseek and Claude to generate a list of code that I could run to show potential issues and, despite Codex showing all was fine, was able to uncover a number of issues. Then here is where it got weird: it started arguing with me despite my showing all the issues present. So I compiled all the issues in one document, and shared the Dockerfile, the Linux seccomp config file, and the issues document with Claude. It gave me a list of fixes for the Dockerfile to help with security hardening, which I shared back with Codex, and that's when it fixed them.

Currently most of the issues are resolved, but the whole process took me a whole week of working most evenings, and I am still not done. So I agree that you cannot create a usable product used by lots of users in 30 minutes, unless it's some static website. It's too much work of constant testing and iteration.

if something like a popup appears that i didnt ask the page to do i snap close the page and never look at it again

What I really want to know is... as a software developer for 25+ years, when using these AI tools, is it still called "vibecoding"? Or is "vibecoding" reserved for people with little or no software development background who are building apps? Genuine question.

I came across the following yesterday: "The Great Way is not difficult for those who have no preferences," a famous Zen teaching from the Hsin Hsin Ming by Sengstan

As we move from tailors to big box stores I think we have to get used to getting what we get, rather than feeling we can nitpick every single detail.

I'd also be more interested in how his 3rd, 4th or 5th vibe coded app goes.

The bottleneck seems to have shifted.

Before LLMs the slow part was writing code. Now the slow part is validating whether the generated code is actually correct.

The 80/20 rule doesn’t go away. I am an AI true believer and I appreciate how fast we can get from nothing to 80% but the last “20%” still takes 80%+ of the time.

The old rules still apply mainly.

I have not been coding for a few years now. I was wondering if vibe coding could unstick some of my ideas. Here is my question, can I use TDD to write tests to specify what I want and then get the llm to write code to pass those tests?
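One way that workflow could look, as a minimal Python sketch: the human writes the test functions as the spec, and the implementation is the part handed to the LLM. The `slugify` function and its expected behaviour are made up for illustration, and the body shown is just a stand-in for whatever the model would produce.

```python
import re


def slugify(title):
    """Stand-in implementation; in the TDD flow this is what the LLM writes."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


# The spec, written by the human before any implementation exists.
def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"


def test_collapses_punctuation():
    assert slugify("A  --  B!!") == "a-b"


def test_empty_input():
    assert slugify("") == ""


SPEC = [test_lowercases_and_hyphenates, test_collapses_punctuation, test_empty_input]


def run_spec():
    """Run every spec function and return the names of the ones that fail."""
    failures = []
    for test in SPEC:
        try:
            test()
        except AssertionError:
            failures.append(test.__name__)
    return failures


if __name__ == "__main__":
    print(run_spec() or "all spec tests pass")
```

The list of failing test names is what gets pasted back into the prompt. The tests only pin down the cases you thought to write, so the code still needs reading.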

>> people who say they "vibecoded an app in 30 minutes" are either building simple copies of existing projects,

those are not copies, they aren't even features. usually part of a tiny feature that barely works only in demo.

with all vibe coding in the world today you still need at least 6 months full time to build a nice note taking app.

If we are talking something more difficult - it will be years - or you will need a team and it will still take a long time.

Everything less will result in an unusable product that works only for demo and has 80% churn.

What I found to be effective is to use multiple AI tools at once. I'm using Gemini's newest model, whose name I can't think of off the top of my head right now, and Claude's newest model. I have each for its purpose, with RustRover as the IDE to speed things up. RustRover is particularly helpful because of how Rust is worked with: the constant cargo CLI commands and database interactions right in the IDE. I know Visual Studio Code has this to a certain extent, but IMO I prefer RustRover. I use multiple models because I know what each one is good at and how my knowledge works with their output, which makes my life way easier and drives frustration down, which is needed when you need creativity at the forefront. That being said, it def helps to know what you are doing, if not 100% then at least 60% of the things you are asking the models to do for you. I have caught mistakes and know when a model might make a mistake, which I'm fine with. Sometimes I just want to see how something is done, like the structure for a certain function of a crate, as I'm reading the cargo.io doc constantly to learn what I'm doing.

There are plenty of ways to code and use code; whichever works for you is good, just improve on it and make it more effective. I have multiple screens on my computer. I don't like jumping back and forth opening tabs and browsers, so I have my setup the way that works best for me. As for the AI models, they are not going to be that helpful to you if you don't understand why they're doing what they're doing in a particular function or crate (in the case of Rust) or library. I imagine the over-the-top coder who has years of experience, knowledge of various languages, and in-depth knowledge of libraries could, using the same technique, technically replace a whole department by himself.

It seems like the entire "product" here is just a ChatGPT system prompt: "combine this image of a person with this image of a dinosaur".

The only thing he needed to code was an NFT wrapper, which presumably is just forking an existing NFT wholesale.

The interesting, user-facing part of the project isn't code at all! It's just an HTML front end on someone else's image generator and a "pay me" button.

Very disappointing.

The speed of prototyping right now is wild.

The interesting shift seems to be that building the first version is no longer the bottleneck — distribution, UX polish and reliability are.

Look at the screenshots to understand what the author means by 'product'.

Woodworking is an analogy that I like to use in deciding how to apply coding agents. The finished product needs to be built by me, but now I can make more, and more sophisticated, jigs with the coding agents, and that in turn lets me improve both quality and quantity.

It already starts with BS. Yes, there are apps you can build in 30 minutes and they are great, not buggy or crap as he says. And there are apps that need 1 hour or even weeks. It depends on what you want to build. To start off by saying that every app built in 30 minutes is crap simply shows that he did not want to think about it, is ignorant, or simply wanted to push himself higher up by putting others down. At this point, every programmer who claims that vibecoding doesn't make you at least 10 times more productive is simply lying or, worse, doesn't know how to vibe code.

this is why i use ai just for one file at a time, as an extension of my own programming. not so fast, but keeps control

> With AI, it’s easier to get the first 90 percent out there. This means we can spend more time on the remaining 10 percent, which means more time for craftsmanship and figuring out how to make your users happy.

EXCEPT... you've just vibe coded the first 90 percent of the product, so completing the remaining 10 percent will take WAY longer than normal because the developers have to work with spaghetti mess.

And right there this guy has shown exactly how little people who are not software developers with experience understand about building software.

I keep seeing things that were vibe coded and thinking, "That's really impressive for something that you only spent that much time on".

To have a polished software project, you must spend time somewhat menially iterating and refining (as each type of user).

To have a polished software project, you need to have started with tests and test coverage from the start for the UI, too.

Writing tests later is not as good.

I have taken a number of projects from a sloppy vibe coded prototype to 100% test coverage. Modern coding llm agents are good at writing just enough tests for 100% coverage.

But 100% test coverage doesn't mean that it's quality software, that it's fuzzed, or that it's formally verified.
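A tiny illustration of that point, as a Python sketch with made-up function names: the single test below executes every line of `average`, so a line-coverage tool would report 100%, yet an ordinary input still crashes it.

```python
def average(values):
    """Return the arithmetic mean of values."""
    return sum(values) / len(values)


def test_average():
    # This one test runs every line of average(), which is all that
    # line coverage measures.
    assert average([2, 4, 6]) == 4


def find_uncovered_failure():
    """Show an input that the fully 'covered' function still cannot handle."""
    try:
        average([])
    except ZeroDivisionError:
        return "average([]) raises ZeroDivisionError"
    return None
```

Coverage counts which lines ran, not which inputs were tried, which is why fuzzing and manual testing still matter.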

Quality software requires extensive manual testing, iteration, and revision.

I haven't even reviewed this specific project; it's possible that the author developed a quality (CLI?) UI without e2e tests in so much time?

Was the process for this more like "vibe coding" or "pair programming with an LLM"?

I can't say I'm impressed by this at all. 100+ hours to build a shitty NFT app that takes one picture and a predefined prompt, then mints you a dinosaur NFT. This is the kind of thing I would've seen college students slam out over a weekend for a coding jam with no experience and a few cans of red bull with more quality and effort. Have our standards really gotten so low? I don't see any craftsmanship at play here.

> The "remaining 10 percent" is a difference between slop and something people enjoy.

I would say the remaining 10% is about how robust your solution is. Anything associated with 'vibe' feels inherently insecure. If you can objectively prove it is not, that's 10% of the time well spent.

Instead of 10x devs you now have the super rare 100x devs. They are using AI how it should be used.

I can't take anyone seriously who says an AI edge will be a "superpower".

Which part of "commodity" is confusing???

Of course vibe coding is going to be a headache if you have very particular aesthetic constraints around both the code and UX, and you aren't capable of clearly and explicitly explaining those constraints (which is often hard to do for aesthetics).

There are some good points here to improve harnesses around development and deployment though, like a deployment agent should ask if there is an existing S3 bucket instead of assuming it has to set everything up. Deployment these days is unnecessarily complicated in general, IMO.

If you hear someone spouting off about how vibe coding allows for creation of killer apps in a fraction of the time/cost, just ask them if you can see what successful killer apps they’ve created with it. It’s always crickets at that point because it’s somewhere between wishful thinking and an outright lie.

Why did this crypto grifter AI app get traction on this site?

I'm a 20-year veteran of application development consulting. Contributor level... not talking head. I do more estimating than anyone you likely know. Consulting is cooked. I just AI-native built (not vibe coding...) an application with a buddy, another Principal-level engineer, and what would cost a client 500-750k and 8-12 weeks, we did for $200 and 1 sprint. It's a passion project but a highly complex mapping and navigation app with host/client multi-user sync'd state. Cooked.

I mean the worst part about this is the author also vibe coded their security. It could have been much more catastrophic if they built a crypto wallet or trading system. But because it was NFTs I guess the max damage was limited.

I have to say it's a little sad that so many devs think of security and cryptography in the same way as library frameworks, in that they see it as just some black box API to use for their projects rather than respecting that it's a fully developed, complex field that demands expertise to avoid mistakes.

I'm building a Java HFT engine and the amount of things AI gets wrong is eye opening. If I didn't benchmark everything I'd end up with much less optimized solution.

Examples: AI really wants to use Project Panama (FFM), and while that can be significantly faster than traditional OO approaches, it is almost never the best. And I'm not talking about using deprecated Unsafe calls; I'm talking about primitive arrays being better for Vector/SIMD options on large sets of data, and NIO being better than FFM + mmap for file reading.

You can use AI to build something that is sometimes better than what someone without domain specific knowledge would develop but the gap between that and the industry expected solution is much more than 100 hours.
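The "benchmark everything" habit can be sketched as a small harness. This is Python with the standard `timeit` module, not Java or JMH, so it says nothing about Panama vs. NIO. The two candidate functions are toy stand-ins, and the harness refuses to report timings if the candidates disagree on the answer.

```python
import timeit


def sum_squares_loop(n):
    """Straightforward candidate: add up i*i for i in 0..n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_formula(n):
    """Alternative candidate: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def benchmark(candidates, arg, repeat=5, number=100):
    """Time each candidate and return {name: best seconds per call}.

    Raises ValueError if the candidates return different results, since a
    fast wrong answer is not an optimization.
    """
    results = {name: fn(arg) for name, fn in candidates.items()}
    if len(set(results.values())) != 1:
        raise ValueError(f"candidates disagree: {results}")
    timings = {}
    for name, fn in candidates.items():
        runs = timeit.repeat(lambda: fn(arg), repeat=repeat, number=number)
        # Take the minimum: it is the run least disturbed by other load.
        timings[name] = min(runs) / number
    return timings


if __name__ == "__main__":
    candidates = {"loop": sum_squares_loop, "formula": sum_squares_formula}
    for name, seconds in benchmark(candidates, 10_000).items():
        print(f"{name}: {seconds:.2e} s per call")
```

The numbers only mean something on the target hardware with realistic data sizes, which is where the domain knowledge comes in.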

The magic is testing. Having locally available, high-throughput testing with a large number of test cases now unlocks more speed.

The test cases themselves become the focus: the LLM usually can't get them right.

They're... launching an NFT product in 2026...

I know it's not the point of this article, but really?

Yep. As much as the rest of it resonated with LLM coding experiences I'm having, the NFT thing is unfortunate.