Impressive. As a practical matter, one wonders what the point would be in creating a new programming language if the programmer no longer has to write or read code.
Programming languages are, after all, the interface that a human uses to give instructions to a computer. If you’re not writing or reading it, the language, by definition, doesn’t matter.
It’s missing all the heart, the soul, of deciding and trading off options to get something to work just for you. It’s like you bought a rat bike from your local junkyard and are trying to pass it off as your own handmade cafe racer.
I've been trying a new approach I call CLI first. I realized CLI tools are designed to be used by both humans (command line) and machines (scripting), and are perfect for LLMs because they are a text-only interface.
Essentially, instead of trying to get the LLM to generate a fully functioning UI app, you focus on building a local CLI tool first.
A CLI tool is cheaper and simpler, but still has a real human UX that pure APIs don't.
You can get the LLM to actually walk through the flows and journeys like a real user, end to end, and it will actually see the awkwardness or gaps in the design.
Your command structure will very roughly map to your resources or pages.
Once you are satisfied with the capability of the CLI tool (which may actually be enough on its own, or with just a local UI), you can get it to build the remote storage, then the APIs, and finally the frontend.
All the while you can still tell it to use the CLI to test through the flows and journeys, against real tasks that you have, and iterate on it.
I did this recently for pulling some of my personal financial data and reporting on it. And now I'm doing it for another TTS automation I've wanted for a while.
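A minimal sketch of the "commands map to resources" idea, in Python with the standard-library argparse module. The tool name `fin`, the `accounts` and `report` subcommands, and the in-memory data are all hypothetical and only illustrate the shape; they are not the commenter's actual tool.

```python
import argparse

# Hypothetical in-memory data standing in for real storage.
ACCOUNTS = {"checking": 1200.50, "savings": 8300.00}


def build_parser() -> argparse.ArgumentParser:
    """Each subcommand roughly corresponds to one resource or page."""
    parser = argparse.ArgumentParser(prog="fin")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("accounts", help="list all accounts")

    report = sub.add_parser("report", help="show the balance of one account")
    report.add_argument("name", choices=sorted(ACCOUNTS))
    return parser


def run(argv):
    """Parse argv and return the text that the CLI would print."""
    args = build_parser().parse_args(argv)
    if args.command == "accounts":
        return "\n".join(sorted(ACCOUNTS))
    return f"{args.name}: {ACCOUNTS[args.name]:.2f}"


print(run(["report", "savings"]))  # prints "savings: 8300.00"
```

Because `run` takes an argument list and returns text, both a human at a shell and an agent driving the tool can exercise the same flows end to end.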
I haven't read any farther than this, yet, but this made me stutter in my reading. Isn't a comparison just a function that takes two arguments and returns a third? How is that different from "+"?
That is the part of the post that stuck with me, because I've also picked up impossible challenges and tried to get Claude to dig me out of a mess without giving up from very vague instructions[1].
The effect feels like the Loss-Disguised-As-Win feeling of the video-games I used to work on at Zynga.
Sure it made a mistake, but it is right there, you could go again.
Pull the lever, doesn't matter if the kids have Karate at 8 AM.
That said, the core value of the software wouldn't exist without a human at the helm. It requires someone to expend the energy to guide it, explore the problem space, and weave hundreds of micro-plans into a coherent, usable system. It's a symbiotic relationship, but the ownership is clear. It’s like building a house: I could build one with a butter knife given enough time, but I'd rather use power tools. The tools don't own the house.
At this point, LLMs aren't going to autonomously architect a 400+ table schema, network 100+ services together, and build the UI/UX/CLI to interface with it all. Maybe we'll get there one day, but right now, building software at this scale still requires us to drive. I believe the author owns the language.
That said, it's a lot of words to say not a lot of things. Still a cool post, though!
However, I fear that agents will always work better on programming languages they have been heavily trained on, so for agent-based development, inventing a new domain-specific language (e.g. for use internally in a company) might not be as efficient as using a generic programming language that models are already trained on and then just living with the extra boilerplate.
Congratulations on getting to the front page ;)
Step #2 is: get real people to use it!
The "more on that later" was unit tests (also generated by Claude Code) and sample inputs and outputs (which is basically just unit tests by a different name).
This is... horrifically bad. It's stupidly easy to make unit tests pass with broken code, and even more stupidly easy when the test is also broken.
These "guardrails" are made of silly putty.
EDIT: Would downvoters care to share an explanation? Preferably one they thought of?
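To make the concern about generated tests concrete, here is a small Python sketch (all names are hypothetical) in which a broken function passes a test, because the test derives its expected value from the code under test.

```python
def average(values):
    """Intended to return the arithmetic mean, but divides by the wrong count."""
    return sum(values) / (len(values) + 1)  # bug: off-by-one denominator


def weak_test():
    """Passes despite the bug: the expected value comes from the code under test."""
    data = [2, 4, 6]
    expected = average(data)
    return average(data) == expected


def strong_test():
    """Fails on the buggy code: the expected value is worked out independently."""
    return average([2, 4, 6]) == 4
```

Here `weak_test()` returns `True` and `strong_test()` returns `False`, so only the second test would catch the bug.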
By what definition? It still matters if I write my app in Rust vs say Python because the Rust version still has better performance characteristics.
So yeah for some things we are already at the point of "I am no longer the coder, I am the architect"... and it's scary.
This is such an interesting statement to me in the context of leftpad.
I mean, they may be right but there is also a big chance of this being Gell-Mann amnesia: "The phenomenon of a person trusting newspapers for topics which that person is not knowledgeable about, despite recognizing the newspaper as being extremely inaccurate on certain topics which that person is knowledgeable about."
Who the hell is going to use it then? You certainly won't, because you're dependent on AI.
fn read_float_literal(&mut self) -> &'a str {
    let start = self.pos;
    while let Some(ch) = self.peek_char() {
        if ch.is_ascii_alphanumeric() || ch == '.' || ch == '+' || ch == '-' {
            self.advance_char();
        } else {
            break;
        }
    }
    &self.source[start..self.pos]
}
Admittedly, I do have a very idiosyncratic definition of floating-point literal for my language (I have a variety of syntaxes for NaNs with payloads), but... that is not a usable definition of float literal.
At the end of the day, I threw out all of the code the AI generated and wrote it myself, because the AI struggled to produce code that was functional to spec, much less code that I could easily extend to the other kinds of operators I knew I would need in the future.
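The quoted loop consumes any run of ASCII alphanumerics, '.', '+' and '-', so an input such as 1+2 would be read as a single "literal". Below is a rough Python sketch of a stricter scanner for comparison. The grammar (digits, optional fraction, optional signed exponent) is an assumption made for illustration; it is not the commenter's language, which also has NaN-payload syntaxes.

```python
import re

# Assumed grammar: digits, optional ".digits", optional exponent with sign.
FLOAT_RE = re.compile(r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")


def read_float_literal(source, pos):
    """Return the longest float literal starting at pos, or "" if none."""
    match = FLOAT_RE.match(source, pos)
    return match.group(0) if match else ""


def read_greedy(source, pos):
    """Python rendering of the quoted Rust loop, for comparison."""
    end = pos
    while end < len(source) and (
        (source[end].isascii() and source[end].isalnum()) or source[end] in ".+-"
    ):
        end += 1
    return source[pos:end]


print(read_greedy("1+2", 0))         # prints "1+2"
print(read_float_literal("1+2", 0))  # prints "1"
```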
There may actually be more value in creating specialized languages now, not less. Most new languages historically go nowhere because convincing human programmers to spend the time it would take to learn them is difficult, but every AI coding bot will learn your new language as a matter of course after its next update includes the contents of your website.
I'm working on a language as well (hoping to debut by end of month), but the premise of the language is that it's designed like so:
1) It maximizes local reasoning and minimizes global complexity
2) It makes the vast majority of bugs / illegal states impossible to represent
3) It makes writing correct, concurrent code as maximally expressive as possible (where LLMs excel)
4) It maximizes optionality for performance increases (it's always just flipping option switches - mostly at the class and function input level, occasionally at the instruction level)
The idea is that it should be as easy as possible for an LLM to write it (especially convert other languages to), and as easy as possible for you to understand it, while being almost as fast as absolutely perfect C code, and by virtue of the design of the language - at the human review phase you have minimal concerns of hidden gotcha bugs.
Roughly: machine code --> assembly --> C --> high-level languages --> frameworks --> visual tools --> LLM-assisted coding. Most of those transitions were controversial at the time, but in retrospect they mostly expanded the toolbox rather than replacing the lower layers.
One workflow I’ve found useful with LLMs is to treat them more like a code generator after the design phase. I first define the constraints, objects, actors, and flows of the system, then use structured prompts to generate or refine pieces of the implementation.
Going into the vault!
I believe we're at a point where it's not possible to accurately decide whether text is completely written by human, by computer, or something in between.
Programming languages function in large part as inductive biases for humans. They expose certain domain symmetries and guide the programmer towards certain patterns. They do the same for LLMs, but with current AI tech, unless you're standing up your own RL pipeline, you're not going to be able to get it to grok your new language as well as an existing one. Your chances are better asking it to understand a library.
On a different but related note, it's almost the same as pairing django or rails with an LLM. The framework allows you to trust that things like authentication and a passable code organization are being correctly handled.
I'm being slightly facetious of course, I still use sequence diagrams and find them useful. The rest of its legacy though, not so much.
Black Mirror did it first https://en.wikipedia.org/wiki/Hang_the_DJ
This latest fever for LLMs simply confirms that people would rather do _anything_ other than program in a (not necessarily purely) functional language that has meta-programming facilities. I personally blame functional fixedness (psychological concept). In my experience, when someone learns to program in a particular paradigm or language, they are rarely able or willing to migrate to a different one (I know many people who refused to code in anything that did not look and feel like Java, until forced to by their growling bellies). The AI/LLM companies are basically (and perhaps unintentionally) treating that mental inertia as a business opportunity (which, in one way or another, it was for many decades and still is -- and will probably continue to be well into a post-AGI future).
If there are millions of lines on github in your language.
Otherwise the 'teaching AI to write your language' part will occupy so much context and make it far less efficient than just using TypeScript.
That's assuming that your new, very unknown language gets slurped up in the next training session which seems unlikely. Couldn't you use RAG or have an LLM read the docs for your language?
How will it "learn" anything if the only available training data is on a single website?
LLMs struggle with following instructions when their training set is massive. The idea that they will be able to produce working software from just a language spec and a few examples is delusional. It's a fundamental misunderstanding of how these tools work. They don't understand anything. They generate patterns based on probabilities and fine tuning. Without massive amounts of data to skew the output towards a potentially correct result they're not much more useful than a lookup table.
If this blog post is unedited LLM output, the blog owner needs to sell whatever model, setup and/or prompt he used for a million dollars, since it's clearly far beyond the state-of-the-art in terms of natural-sounding tone.
Who’s going to use it?
I'm using Claude Code to work on something involving a declarative UI DSL that wraps a very imperative API. Its first pass at adding a new component required imperative management of that component's state. Without that implementation in context, I told Claude the imperative pattern "sucks" and asked for an improvement just to see how far that would get me.
A human developer familiar with the codebase would easily understand the problem and add some basic state management to the DSL's support for that component. I won't pretend Claude understood, but it matched the pattern and generated the result I wanted.
This does suggest to me that a language spec and a handful of samples is enough to get it to produce useful results.
There are languages that are already pretty sparse with keywords, e.g. in Go you can write 'func main() string', no need to define that it's public, or static etc. So combining a less verbose language with 'codegolfing' the variables might be enough.
My language is a step ahead of Rust, but not as strict as Ada, while being easier to read than Swift (especially where concurrency is involved).
I have done exactly the above with great success. I work with a weird proprietary esolang sometimes that I like, and the only documentation - or code - that exists for it is on my computer. I load that documentation in, and it works just fine and writes pretty decent code in my esolang.
"But that can't possibly work [based on my misunderstanding of how LLMs work]!" you say.
Well, it does, so clearly you misunderstand how they work.
I’ve never seen an LLM able to produce this kind of absurdist joke. Or any jokes, really.
Probably if you’re trying to be esoteric and arcane then yeah, you might have trouble, but that’s not normally how languages evolve.
In Go every third line is a noisy if err check.
Over the course of four weeks in January and February, I built a new programming language using Claude Code. I named it Cutlet after my cat. It’s completely legal to do that. You can find the source code on GitHub, along with build instructions and example programs.
I’ve been using LLM-assisted programming since the original GitHub Copilot release in 2021, but so far I’ve limited my use of LLMs to generating boilerplate and making specific, targeted changes to my projects. While working on Cutlet, though, I allowed Claude to generate every single line of code. I didn’t even read any of the code. Instead, I built guardrails to make sure it worked correctly (more on that later).
I’m surprised by the results of this experiment. Cutlet exists today. It builds and runs on both macOS and Linux. It can execute real programs. There might be bugs hiding deep in its internals, but they’re probably no worse than ones you’d find in any other four-week-old programming language in the world.
I have Feelings™ about all of this and what it means for my profession, but I want to give you a tour of the language before I get up on my soapbox.
If you want to follow along, build the Cutlet interpreter from source and drop into a REPL using /path/to/cutlet repl.
Arrays and strings work as you’d expect in any dynamic language. Variables are declared with the my keyword.
cutlet> my cities = ["Tokyo", "Paris", "New York", "London", "Sydney"]
=> [Tokyo, Paris, New York, London, Sydney]
Variable names can include dashes. Same syntax rules as Raku. The only type of number (so far) is a double.
cutlet> my temps-c = [28, 22, 31, 18, 15]
=> [28, 22, 31, 18, 15]
Here’s something cool: the @ meta-operator turns any regular binary operator into a vectorized operation over an array. In the next line, we’re multiplying every element of temps-c by 1.8, then adding 32 to each element of the resulting array.
cutlet> my temps-f = (temps-c @* 1.8) @+ 32
=> [82.4, 71.6, 87.8, 64.4, 59]
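For readers who don't have Cutlet built, the behaviour shown above can be approximated in Python. This is only a sketch of the semantics visible in the REPL output, not Cutlet's implementation; the `vectorize` helper is hypothetical, and its array-with-array branch is an assumption, since the example above only shows an array combined with a scalar.

```python
import operator


def vectorize(op, left, right):
    """Apply a binary operator element-wise, broadcasting a scalar operand."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list and right_is_list:
        # Assumption: two arrays are combined pairwise.
        return [op(a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [op(a, right) for a in left]
    if right_is_list:
        return [op(left, b) for b in right]
    return op(left, right)


temps_c = [28, 22, 31, 18, 15]
# Counterpart of (temps-c @* 1.8) @+ 32
temps_f = vectorize(operator.add, vectorize(operator.mul, temps_c, 1.8), 32)
rounded = [round(t, 1) for t in temps_f]
```

Rounded to one decimal place, the result matches the values printed by the REPL.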
The @: operator is a zip operation. It zips two arrays into a map.
cutlet> my cities-to-temps = cities @: temps-f
=> {Tokyo: 82.4, Paris: 71.6, New York: 87.8, London: 64.4, Sydney: 59}
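In Python terms, the zip-into-a-map step looks roughly like the sketch below. How Cutlet handles arrays of unequal length is not stated here, so Python's behaviour of stopping at the shorter input should not be read as Cutlet's.

```python
cities = ["Tokyo", "Paris", "New York", "London", "Sydney"]
temps_f = [82.4, 71.6, 87.8, 64.4, 59]

# dict(zip(keys, values)) pairs items by position; zip stops at the shorter input.
cities_to_temps = dict(zip(cities, temps_f))
```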
Output text using the built-in say function. This function returns nothing, which is Cutlet’s version of null.
cutlet> say(cities-to-temps)
{Tokyo: 82.4, Paris: 71.6, New York: 87.8, London: 64.4, Sydney: 59}
=> nothing
The @ meta-operator also works with comparisons.
cutlet> my greater-than-seventy-five = temps-f @> 75
=> [true, false, true, false, false]
Here’s another cool bit: you can index into an array using an array of booleans. This is a filter operation. It picks the element indexes corresponding to true and discards those that correspond to false.
cutlet> cities[greater-than-seventy-five]
=> [Tokyo, New York]
Here’s a shorter way of writing that.
cutlet> cities[temps-f @> 75]
=> [Tokyo, New York]
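Boolean-array indexing of this kind can be sketched in Python as follows. The `mask_select` helper is hypothetical, and the length check is an assumption; the examples above do not show what Cutlet does when the two arrays differ in length.

```python
def mask_select(items, mask):
    """Keep the items whose corresponding mask entry is true."""
    if len(items) != len(mask):
        raise ValueError("items and mask must have the same length")
    return [item for item, keep in zip(items, mask) if keep]


cities = ["Tokyo", "Paris", "New York", "London", "Sydney"]
temps_f = [82.4, 71.6, 87.8, 64.4, 59]

hot = [t > 75 for t in temps_f]  # plays the role of temps-f @> 75
print(mask_select(cities, hot))  # prints ['Tokyo', 'New York']
```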
Let’s print this out with a user-friendly message. The ++ operator concatenates strings and arrays. The str built-in turns things into strings.
cutlet> say("Pack light for: " ++ str(cities[temps-f @> 75]))
Pack light for: [Tokyo, New York]
=> nothing
The @ meta-operator in the prefix position acts as a reduce operation.
cutlet> my total-temp = @+ temps-c
=> 114
Let’s find the average temperature. @+ adds all the temperatures, and the len() built-in finds the length of the array.
cutlet> (@+ temps-c) / len(temps-c)
=> 22.8
Let’s print this out nicely, too.
cutlet> say("Average: " ++ str((@+ temps-c) / len(temps-c)) ++ "°C")
Average: 22.8°C
=> nothing
Functions are declared with fn. Everything in Cutlet is an expression, including functions and conditionals. The last value produced by an expression in a function becomes its return value.
cutlet> fn max(a, b) is
... if a > b then a else b
... end
=> <fn max>
Your own functions can work with @ too. Let’s reduce the temperatures with our max function to find the hottest temperature.
cutlet> my hottest = @max temps-c
=> 31
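The prefix form of @ behaves like a left fold. A rough Python equivalent using functools.reduce is sketched below; `larger` is a hypothetical stand-in for the max function defined above. Note that reduce raises TypeError on an empty list when no initial value is given, and what Cutlet does in that case is not covered here.

```python
from functools import reduce
import operator


def larger(a, b):
    """Counterpart of the max function defined in the REPL session."""
    return a if a > b else b


temps_c = [28, 22, 31, 18, 15]

total = reduce(operator.add, temps_c)  # like @+ temps-c
average = total / len(temps_c)         # like (@+ temps-c) / len(temps-c)
hottest = reduce(larger, temps_c)      # like @max temps-c
```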
Cutlet can do a lot more. It has all the usual features you’d expect from a dynamic language: loops, objects, prototypal inheritance, mixins, a mark-and-sweep garbage collector, and a friendly REPL. We don’t have file I/O yet, and some fundamental constructs like error handling are still missing, but we’re getting there!
See TUTORIAL.md in the git repository for the full documentation.
I’m a frontend engineer and (occasional) designer. I’ve tried using LLMs for building web applications, but I’ve always run into limitations.
In my experience, Claude and friends are scary good at writing complex business logic, but fare poorly on any task that requires visual design skills.
Turns out describing responsive layouts and animations in English is not easy. No amount of screenshots and wireframes can communicate fluid layouts and animations to an LLM. I’ve wasted hours fighting with Claude about layout issues it swore it had fixed, but which I could still see plainly with my leaky human eyes.
I’ve also found these tools to excel at producing cookie-cutter interfaces they’ve seen before in publicly available repositories, but they fall off when I want to do anything novel. I often work with clients building complex data visualizations for niche domains, and LLMs have comprehensively failed to produce useful outputs on these projects.
On the other hand, I’d seen people accomplish incredible things using LLMs in the last few months, and I wanted to replicate those experiments myself. But my previous experience with LLMs suggested that I had to pick my project carefully.
A small, dynamic programming language met all my requirements.
make test and make check until there are no more errors”.
Finally, this was also an experiment to figure out how far I could push agentic engineering. Could I compress six months of work into a few weeks? Could I build something that was beyond my own ability to build? What would my day-to-day work life look like if I went all-in on LLM-driven programming? I wanted to answer all these questions.
I went into this experiment with some skepticism. My previous attempts at building something entirely using Claude Code hadn’t worked out. But this attempt has not only been successful, but produced results beyond what I’d imagined possible. I don’t hold the belief that all software in the future will be written by LLMs. But I do believe there is a large subset that can be partially or mostly outsourced to these new tools.
Building Cutlet taught me something important: using LLMs to produce code does not mean you forget everything you’ve learned about building software. Agentic engineering requires careful planning, skill, craftsmanship, and discipline, just like any software worth building before generative AI. The skills required to work with coding agents might look different from typing code line-by-line into an editor, but they’re still very much the same engineering skills we’ve been sharpening all our careers.
There is a lot of work involved in getting good output from LLMs. Agentic engineering does not mean dumping vague instructions into a chat box and harvesting the code that comes out.
I believe there are four main skills you have to learn today in order to work effectively with coding agents:
Models and harnesses are changing rapidly, so figuring out which problems LLMs are good at solving requires developing your intuition, talking to your peers, and keeping your ear to the ground.
However, if you don’t want to stay up-to-date with a rapidly-changing field—and I wouldn’t judge you for it, it’s crazy out there—here are two questions you can ask yourself to figure out if your problem is LLM-shaped:
If the answer to either of those questions is “no”, throwing AI at the problem is unlikely to yield good results. If the answer to both of them is “yes”, then you might find success with agentic engineering.
The good news is that the cost of figuring this out is the price of a Claude Code subscription and one sacrificial lamb on your team willing to spend a month trying it out on your codebase.
LLMs work with natural language, so learning to communicate your ideas using words has become crucial. If you can’t explain your ideas in writing to your co-workers, you can’t work effectively with coding agents.
You can get a lot out of Claude Code using simple, vague, overly general prompts. But when you do that, you’re outsourcing a lot of your thinking and decision-making to the robot. This is fine for throwaway projects, but you probably want to be more careful when you’re building something you will put into production and maintain for years.
You want to feed coding agents precisely written specifications that capture as much of your problem space as possible. While working on Cutlet, I spent most of my time writing, generating, reading, and correcting spec documents.
For me, this was a new experience. I primarily work with early-stage startups, so for most of my career, I’ve treated my code as the spec. Writing formal specifications was an alien experience.
Thankfully, I could rely on Claude to help me write most of these specifications. I was only comfortable doing this because Cutlet was an experiment. On a project I wanted to stake my reputation on, I might take the agent out of the equation altogether and write the specs myself.
This was my general workflow while making any change to Cutlet:
plans/doing/ directory. Sometimes we’d end up with 3-4 plan files for a single feature. This was intentional. I needed the plans to be human-readable, and I needed each plan to be an atomic unit I could roll back if things didn’t work out. They also served as a history of the project’s evolution. You can find all the historical plan files in the Cutlet repository.
sudo access, and ask it to implement my plan.
This workflow front-loaded the cognitive effort of making any change to the language. All the thinking happened before a single line of code was written, which is something I almost never do. For me, programming involves organically discovering the shape of a problem as I’m working on it. However, I’ve found that working that way with LLMs is difficult. They’re great at making sweeping changes to your codebase, but terrible at quick, iterative, organic development workflows.
Maybe my workflow will evolve as inference gets faster and models become better, but until then, this waterfall-style model works best.
I find this to be the most interesting and fun part of working with coding agents. It’s a whole new class of problem to solve!
The core principle is this: coding agents are computer programs, and therefore have a limited view of the world they exist in. Their only window into the problem you’re trying to solve is the directory of code they can access. This doesn’t give them enough agency or information to be able to do a good job. So, to help them thrive, you must give them that agency and information in the form of tools they can use to reach out into the wider world.
What does this mean in practice? It looks different for different projects, but this is what I did for Cutlet:
clang-tidy and clang-format to ensure a baseline of code quality. Just like with tests, the project instructions asked the LLM to run these tools after every major code change. I noticed that clang-tidy would often produce diagnostics that would force Claude to rewrite parts of the code. If I had access to some of the more expensive static analysis tools (such as Coverity), I would have added them to my development process too.
make test-sanitize target that rebuilt the entire project and test suite with ASan and UBSan enabled (with LSan riding along via ASan), then ran every test under the instrumented build. The project instructions included running this check at the end of implementing a plan. This caught memory errors (use-after-free, buffer overflows, undefined behavior) that neither the tests nor the linter could find. Running these tests took time and greatly slowed down the agent, but they caught even more issues than clang-tidy.
ctags and cscope for navigating the source code. I don’t know how useful this was, because I rarely ever saw it use them. Most of the time it would just grep the code for symbols. I might remove this in the future.
--dangerously-skip-permissions enabled and full sudo access. I believe this is the only practical way to use coding agents on large projects. Answering permissions prompts is cognitively taxing when you have five agents working in parallel, and restricting their ability to do whatever they want makes them less effective at their job. We will need to figure out all sorts of safety issues that arise when you give LLMs the ability to take full control of a system, but on this project, I was willing to accept the risks that come with YOLO mode.
All these tools and abilities guaranteed that any updates to the code resulted in a project that at least compiled and executed. But more importantly, they increased the information and agency Claude had access to, making it more effective at discovering and debugging problems without my intervention. If I keep working on this project, my main focus will be to give my agents even more insight into the artifact they are building, even more debugging tools, even more freedom, and even more access to useful information.
You will want to come up with your own tooling that works for your specific project. If you’re building a Django app, you might want to give the agent access to a staging database. If you’re building a React app, you might want to give it access to a headless browser. There’s no single answer that works for every project, and I bet people are going to come up with some very interesting tools that allow LLMs to observe the results of their work in the real world.
Coding agents can sometimes be inefficient in how they use the tools you give them.
For example, while working on this project, sometimes Claude would run a command, decide its output was too long to fit into the context window, and run it again with the output piped to head -n 10. Other times it would run make check, forget to grep the output for errors, and run it a second time to capture the output. This would result in the same expensive checks running multiple times in the course of making a single edit. These mistakes slowed down the agentic loop significantly.
I could fix some of these performance bottlenecks by editing CLAUDE.md or changing the output of a custom script. But there were some issues that required more effort to discover and fix.
I quickly got into the habit of observing the agent at work, noticing sequences of commands that the agent repeated over and over again, and turning them into scripts for the agent to call instead. Many of the scripts in Cutlet’s scripts directory came about this way.
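As an illustration of the kind of wrapper this describes, here is a hedged Python sketch. It is not one of Cutlet's actual scripts, and the function names are hypothetical. It runs an expensive command once, saves the full output to a log file so nothing needs to be re-run, and returns only a short summary.

```python
import re
import subprocess
import tempfile


def run_once(command, max_lines=10):
    """Run command a single time; return (exit code, log path, short summary)."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    # Keep the full output on disk so it never has to be regenerated.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".log", delete=False, encoding="utf-8"
    ) as log:
        log.write(result.stdout)
        log_path = log.name

    lines = result.stdout.splitlines()
    errors = [ln for ln in lines if re.search(r"error", ln, re.IGNORECASE)]
    # Show matching lines if there are any, otherwise the tail of the output.
    shown = errors[:max_lines] if errors else lines[-max_lines:]
    return result.returncode, log_path, "\n".join(shown)


def main(argv):
    """Print the summary and the log location, then return the exit code."""
    code, path, summary = run_once(argv)
    print(summary)
    print(f"exit code {code}; full output saved to {path}")
    return code
```

An agent would then call something like `main(["make", "check"])` and read the saved log only when the summary is not enough, instead of running the check a second time.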
This was very manual, very not-fun work. I’m hoping this becomes more automated as time goes on. Maybe a future version of Claude Code could review its own tool calling outputs and suggest scripts you could write for it?
Of course, the most fruitful optimization was to run Claude inside Docker with --dangerously-skip-permissions and sudo access. By doing this, I took myself out of the agentic loop. After a plan file had been produced, I didn’t want to hang around babysitting agents and saying Yes every time they wanted to run ls.
As Cutlet evolved, the infrastructure I built for Claude also evolved. Eventually, I captured many of the workflows Claude naturally followed as scripts, slash commands, or instructions in CLAUDE.md. I also learned where the agent stumbled most, and preempted those mistakes by giving it better instructions or scripts to run.
The infrastructure I built for Claude was also valuable for me, the human working on the project. The same scripts that helped Claude automate its work also helped me accomplish common tasks quickly.
As the project grows, this infrastructure will keep evolving along with it. Models change all the time. So do project requirements and workflows. I look at all this project infrastructure as an organic thing that will keep changing as long as the project is active.
Now that it’s possible for individual developers to accomplish so much in such little time, is software engineering as a career dead?
My answer to this question is nope, not at all. Software engineering skills are just as valuable today as they were before language models got good. If I hadn’t taken a compilers course in college and worked through Crafting Interpreters, I wouldn’t have been able to build Cutlet. I still had to make technical decisions that I could only make because I had (some) domain knowledge and experience.
Besides, I had to learn a bunch of new skills in order to effectively work on Cutlet. These new skills also required technical knowledge. A strange and new and different kind of technical knowledge, but technical knowledge nonetheless.
Before working on this project, I was worried about whether I’d have a job five years from now. But today I’m convinced that the world will continue to have a need for software engineers in the future. Our jobs will transform—and some people might not enjoy the new jobs anymore—but there will still be plenty of work for us to do. Maybe we’ll have even more work to do than before, since LLMs allow us to build a lot more software a lot faster.
And for those of us who never want to touch LLMs, there will be domains where LLMs never make any inroads. My friends who work on low-level multimedia systems have found less success using LLMs compared to those who build webapps. This is likely to be the case for many years to come. Eventually, those jobs will transform, too, but it will be a far slower shift.
Is it fair to say that I built Cutlet? After all, Claude did most of the work. What was my contribution here besides writing the prompts?
Moreover, this experiment only worked because Claude had access to multiple language runtimes and computer science books in its training data. Without the work done by hundreds of programmers, academics, and writers who have freely donated their work to the public, this project wouldn’t have been possible. So who really built Cutlet?
I don’t have a good answer to that. I’m comfortable taking credit for the care and feeding of the coding agent as it went about generating tokens, but I don’t feel a sense of ownership over the code itself.
I don’t consider this “my” work. It doesn’t feel right. Maybe my feelings will change in the future, but I don’t quite see how.
Because of my reservations about who this code really belongs to, I haven’t added a license to Cutlet’s GitHub repository. Cutlet belongs to the collective consciousness of every programming language designer, implementer, and educator to have released their work on the internet.
(Also, it’s worth noting that Cutlet almost certainly includes code from the Lua and Python interpreters. Claude referred to those languages all the time when we talked about language features. I’ve also seen a ton of code from Crafting Interpreters making its way into the codebase with my own two fleshy eyes.)
I’d be remiss if I didn’t include a note on mental health in this already mammoth blog post.
It’s easy to get addicted to agentic engineering tools. While working on this project, I often found myself at my computer at midnight going “just one more prompt”, as if I was playing the world’s most obscure game of Civilization. I’m embarrassed to admit that I often had Claude Code churning away in the background when guests were over at my place, when I stepped into the shower, or when I went off to lunch. There’s a heady feeling that comes from accomplishing so much in such little time.
More addictive than that is the unpredictability and randomness inherent to these tools. If you throw a problem at Claude, you can never tell what it will come up with. It could one-shot a difficult problem you’ve been stuck on for weeks, or it could make a huge mess. Just like a slot machine, you can never tell what might happen. That creates a strong urge to try using it for everything all the time. And just like with slot machines, the house always wins.
These days, I set limits for how long and how often I’m allowed to use Claude. As LLMs become widely available, we as a society will have to figure out the best way to use them without destroying our mental health.
This is the part I’m not very optimistic about. We have comprehensively failed to regulate or limit our use of social media, and I’m willing to bet we’ll have a repeat of that scenario with LLMs.
Now that we can produce large volumes of code very quickly, what can we do that we couldn’t do before?
This is another question I’m not equipped to answer fully at the moment.
That said, one area where I can see LLMs being immediately of use to me personally is the ability to experiment very quickly. It’s very easy for me to try out ten different features in Cutlet because I just have to spec them out and walk away from the computer. Failed experiments cost almost nothing. Even if I can’t use the code Claude generates, having working prototypes helps me validate ideas quickly and discard bad ones early.
I’ve also been able to radically reduce my dependency on third-party libraries in my JavaScript and Python projects. I often use LLMs to generate small utility functions that previously required pulling in dependencies from NPM or PyPI.
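The post does not say which utilities were replaced, so the following is a hypothetical example of the category: a few lines of Python that split an iterable into fixed-size chunks, the sort of helper that might otherwise be imported from a package such as more-itertools.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


print(list(chunked(range(7), 3)))  # prints [[0, 1, 2], [3, 4, 5], [6]]
```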
But honestly, these changes are small beans. I can’t predict the larger societal changes that will come about because of AI agents. All I can say is programming will look radically different in 2030 than it does in 2026.
This project was a proof of concept to see how far I could push Claude Code. I’m currently looking for a new contract as a frontend engineer, so I probably won’t have the time to keep working on Cutlet. I also have a few more ideas for pushing agentic programming further, so I’m likely to prioritize those over continuing work on Cutlet.
When the mood strikes me, I might still add small features now and then to the language. Now that I’ve removed myself from the development loop, it doesn’t take a lot of time and effort. I might even do Advent of Code using Cutlet in December!
Of course, if you work at Anthropic and want to give me money so I can keep running this experiment, I’m available for contract work for the next 8 months :)
For now, I’m closing the book on Cutlet and moving on to other projects (and cat).