I built a programming language using Claude Code

> While working on Cutlet, though, I allowed Claude to generate every single line of code. I didn’t even read any of the code. Instead, I built guardrails to make sure it worked correctly (more on that later).

Impressive. As a practical matter, one wonders what the point would be in creating a new programming language if the programmer no longer has to write or read code.

Programming languages are, after all, the interface that a human uses to give instructions to a computer. If you’re not writing or reading it, the language, by definition, doesn’t matter.

I've been working on a large codebase that was already significant before LLM-assisted programming, leveraging code I’d written over a decade ago. Since integrating Claude and Codex, the system has evolved and grown massively. Realistically, there’s a lot in there now that I simply couldn't have built in a standard human lifetime without them.

That said, the core value of the software wouldn't exist without a human at the helm. It requires someone to expend the energy to guide it, explore the problem space, and weave hundreds of micro-plans into a coherent, usable system. It's a symbiotic relationship, but the ownership is clear. It’s like building a house: I could build one with a butter knife given enough time, but I'd rather use power tools. The tools don't own the house.

At this point, LLMs aren't going to autonomously architect a 400+ table schema, network 100+ services together, and build the UI/UX/CLI to interface with it all. Maybe we'll get there one day, but right now, building software at this scale still requires us to drive. I believe the author owns the language.

This takes all the satisfaction out of spending a few well thought out weekends to build your own language. So many fun options: compiled or interpreted; virtual machine, or not; single pass, double pass, or (Leeloo Dallas) Multipass? No cool BNF grammars to show off either…

It’s missing all the heart, the soul, of deciding and trading off options to get something to work just for you. It’s like you bought a rat bike from your local junkyard and are trying to pass it off as your own handmade cafe racer.

On the topic of LLMs not doing well with UI and visuals:

I've been trying a new approach I call CLI first. I realized CLI tools are designed to be used by both humans (command line) and machines (scripting), and they are perfect for LLMs because the interface is text only.

Essentially, instead of trying to get the LLM to generate a fully functioning UI app, you focus on building a local CLI tool first.

A CLI tool is cheaper and simpler, but still has a real human UX that pure APIs don't.

You can get the LLM to actually walk through the flows and journeys like a real user, end to end, and it will actually see the awkwardness or gaps in the design.

Your command structure will very roughly map to your resources or pages.

Once you are satisfied with the capability of the CLI tool (which may actually be enough on its own, or with just a local UI), you can get it to build the remote storage, then the APIs, and finally the frontend.

All the while you can still tell it to use the CLI to test through the flows and journeys, against real tasks that you have, and iterate on it.

I did this recently for pulling some of my personal financial data and reporting on it. And now I'm doing it for another TTS automation I've wanted for a while.
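For what it's worth, here is a minimal sketch of what "commands map to resources" could look like in Rust. Everything in it is an assumption for illustration: the `accounts` and `report` resources, the `Store` type, and the `run` function are hypothetical names, and storage is in-memory only. The one design point it tries to show is that the dispatcher returns text instead of printing, so a human at a shell, a script, and an LLM walking through a flow all exercise the same entry point.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the tool's local storage (hypothetical).
#[derive(Default)]
struct Store {
    accounts: BTreeMap<String, i64>, // account name -> balance in cents
}

/// Dispatch one command line. Each top-level word is a resource
/// (`accounts`, `report`); the next word is the action on it.
fn run(store: &mut Store, args: &[&str]) -> Result<String, String> {
    match args {
        ["accounts", "add", name, cents] => {
            let cents: i64 = cents
                .parse()
                .map_err(|_| format!("invalid amount: {cents}"))?;
            store.accounts.insert(name.to_string(), cents);
            Ok(format!("added {name}"))
        }
        ["accounts", "list"] => Ok(store
            .accounts
            .iter()
            .map(|(name, cents)| format!("{name}\t{cents}"))
            .collect::<Vec<_>>()
            .join("\n")),
        ["report", "total"] => {
            let total: i64 = store.accounts.values().sum();
            Ok(format!("total\t{total}"))
        }
        _ => Err("usage: accounts add <name> <cents> | accounts list | report total".to_string()),
    }
}

fn main() {
    // Persistence is left out of this sketch, so each run starts empty.
    let mut store = Store::default();
    let owned: Vec<String> = std::env::args().skip(1).collect();
    let args: Vec<&str> = owned.iter().map(String::as_str).collect();
    match run(&mut store, &args) {
        Ok(out) => println!("{out}"),
        Err(msg) => {
            eprintln!("{msg}");
            std::process::exit(2);
        }
    }
}
```

Later stages (remote storage, APIs, frontend) would presumably sit behind the same command shapes, which is what lets the CLI keep serving as the test harness.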

> More addictive than that is the unpredictability and randomness inherent to these tools. If you throw a problem at Claude, you can never tell what it will come up with. It could one-shot a difficult problem you’ve been stuck on for weeks, or it could make a huge mess. Just like a slot machine, you can never tell what might happen. That creates a strong urge to try using it for everything all the time.

That is the part of the post that stuck with me, because I've also picked up impossible challenges and tried to get Claude to dig me out of a mess without giving up from very vague instructions[1].

The effect feels like the Loss-Disguised-As-Win feeling of the video-games I used to work on at Zynga.

Sure it made a mistake, but it is right there, you could go again.

Pull the lever, doesn't matter if the kids have Karate at 8 AM.

[1] - https://github.com/t3rmin4t0r/magic-partitioning

> The @ meta operator also works with comparisons.

I haven't read any farther than this, yet, but this made me stutter in my reading. Isn't a comparison just a function that takes two arguments and returns a third? How is that different from "+"?
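To make the question concrete, here is a small Rust sketch. It is not Cutlet's implementation, and I am only assuming for illustration that a meta operator lifts a two-argument operator to work pairwise over sequences; `lift` is a hypothetical helper. Under that assumption, `+` and `<` go through exactly the same machinery, and the only visible difference is the result type.

```rust
/// Apply any two-argument function pairwise over two slices.
/// (Hypothetical helper; stops at the shorter of the two inputs.)
fn lift<T: Copy, U>(op: impl Fn(T, T) -> U, xs: &[T], ys: &[T]) -> Vec<U> {
    xs.iter().zip(ys).map(|(&x, &y)| op(x, y)).collect()
}

fn main() {
    let a: [i32; 3] = [1, 5, 3];
    let b: [i32; 3] = [4, 2, 3];

    // Addition: (i32, i32) -> i32
    let sums: Vec<i32> = lift(|x: i32, y: i32| x + y, &a, &b);
    // Comparison: (i32, i32) -> bool, same shape, different result type
    let less: Vec<bool> = lift(|x: i32, y: i32| x < y, &a, &b);

    println!("{sums:?} {less:?}");
}
```

So in a dynamically typed language there may be nothing special to say at all, while a typed one has to account for the operator returning something other than its operand type.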

Claude Code built a programming language using you

Next you can let Claude play your video games for you as well. Gads we are a voyeuristic society aren’t we.

AI-written code with a human-written blog post, that's a big step up.

That said, it's a lot of words to say not a lot of things. Still a cool post, though!

I have been trying this as well, and you can quickly come very far.

However, I fear that agents will always work better on programming languages they have been heavily trained on. For agent-based development, inventing a new domain-specific language (e.g. for internal use in a company) might not be as efficient as using a general-purpose language that models are already trained on and just living with the extra boilerplate.

Using LLMs to invent new programming languages is a mystery to me. Who or what is going to use this? Presumably not the author.

I think we're going to see a lot more of this. I've done a similar thing, hosting a toy language on Haskell, and it was remarkably easy to get something useful and usable, in basically a weekend. If you keep the surface area small enough you can now make a fully fledged, compiled language for basically every single purpose you'd like, and coevolve the language, the code, and the compiler.

I'd say these times will be filled with a lot of tailored-to-you "self"-made software, but the question is, are we increasing the amount of information in the world? I heard that Claude and ChatGPT are getting good at mathematical proofs, which really do add something to our knowledge, but all the other things are neutral to entropy, if not decreasing it. Strange time to live in, strange valuations and devaluations...

Not to discount your experience, but I don't understand what's interesting about this. You could always build a programming language yourself, given enough time. Programming languages' constructs are well represented in the training dataset. I want someone to build something uniquely novel that's not actually in the dataset, and then I'll be impressed by CC.

It’s been a while friend

Congratulations on getting to the front page ;)

The AI age is calling for a language that is append-only, so we can write in a literate programming style and mix prompts with AI output, in a linear way.

Curious how you handled context management as the project grew — did you end up with a single CLAUDE.md or something more structured? I've been thinking about this problem and working on a standard for it.

Does this really test Claude in a useful way? Is building a highly derivative programming language a useful use case? Claude has probably indexed all existing implementations of imperative dynamic languages and is basically spewing slop based on that vibe. Rather than super flexible, super unsafe languages, we need languages with guardrails, restrictions and expressive types, now more than ever. Maybe LLMs could help with that? I'm not sure, it would certainly need guidance from a human expert at every step.

> I’ve also been able to radically reduce my dependency on third-party libraries in my JavaScript and Python projects. I often use LLMs to generate small utility functions that previously required pulling in dependencies from NPM or PyPI.

This is such an interesting statement to me in the context of leftpad.

I rolled a fair dice using ChatGPT.

"Just one more prompt..." I can relate. who else has been affected by this?

I recently tried using Claude to generate a lexer and parser for a language I was designing. As part of its first attempt, this was the code to parse a float literal:

  fn read_float_literal(&mut self) -> &'a str {
    let start = self.pos;
    while let Some(ch) = self.peek_char() {
      if ch.is_ascii_alphanumeric() || ch == '.' || ch == '+' || ch == '-' {
        self.advance_char();
      } else {
        break;
      }
    }
    &self.source[start..self.pos]
  }

Admittedly, I do have a very idiosyncratic definition of floating-point literal for my language (I have a variety of syntaxes for NaNs with payloads), but... that is not a usable definition of float literal.

At the end of the day, I threw out all of the code the AI generated and wrote it myself, because the AI struggled to produce code that was functional to spec, much less code that I could easily extend to the other kinds of operators that I knew I would need in the future.
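To spell out why the generated version is unusable: it accepts any run of letters, digits, `.`, `+`, and `-`, so something like `1+2-3.abc` is swallowed as one "literal". Below is a hedged sketch of a stricter scanner in Rust. It is not the commenter's grammar (it has no NaN payload syntax), and `scan_float_literal` is a hypothetical name; the assumed grammar is simply digits, a dot, digits, and an optional exponent where a sign is only allowed directly after `e` or `E`.

```rust
/// Return the length in bytes of the decimal float literal at the start of
/// `src`, or None if `src` does not start with one.
/// Assumed grammar: digits '.' digits [ ('e' | 'E') ['+' | '-'] digits ]
fn scan_float_literal(src: &str) -> Option<usize> {
    let bytes = src.as_bytes();
    // Count consecutive ASCII digits starting at byte offset `from`.
    let digits = |from: usize| {
        bytes[from..]
            .iter()
            .take_while(|b| b.is_ascii_digit())
            .count()
    };

    // Integer part followed by a mandatory '.'.
    let int_len = digits(0);
    if int_len == 0 || bytes.get(int_len) != Some(&b'.') {
        return None;
    }

    // Fractional part must have at least one digit.
    let frac_start = int_len + 1;
    let frac_len = digits(frac_start);
    if frac_len == 0 {
        return None;
    }
    let mut end = frac_start + frac_len;

    // Optional exponent. If no digits follow, the 'e' is not consumed.
    if matches!(bytes.get(end), Some(b'e' | b'E')) {
        let mut exp_start = end + 1;
        if matches!(bytes.get(exp_start), Some(b'+' | b'-')) {
            exp_start += 1;
        }
        let exp_len = digits(exp_start);
        if exp_len > 0 {
            end = exp_start + exp_len;
        }
    }
    Some(end)
}

fn main() {
    for src in ["3.14", "1.5e-3 + x", "1+2-3.abc", "2.0e"] {
        println!("{src:?} -> {:?}", scan_float_literal(src));
    }
}
```

Returning a length rather than a slice keeps the sketch independent of any particular lexer struct; a real lexer would advance its position by that amount.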

Admittedly I only skimmed this, but I found it interesting that they came to the conclusion that Claude is really bad at (thing they know how to do, and can therefore judge) and really good at (thing they don't know how to do or judge).

I mean, they may be right, but there is also a big opportunity for this being Gell-Mann amnesia: "The phenomenon of a person trusting newspapers for topics which that person is not knowledgeable about, despite recognizing the newspaper as being extremely inaccurate on certain topics which that person is knowledgeable about."

Now anyone can be a Larry Wall, and I'm not sure that's a good thing.

That was step #1.

Step #2 is: get real people to use it!

Nope. You didn't write it. You plagiarized it. AI is bad

The "more on that later" was unit tests (also generated by Claude Code) and sample inputs and outputs (which is basically just unit tests by a different name).

This is... horrifically bad. It's stupidly easy to make unit tests pass with broken code, and even more stupidly easy when the test is also broken.

These "guardrails" are made of silly putty.
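A tiny Rust illustration of the point, with a deliberately broken function (`largest` is a made-up example, not anything from the post): the first two assertions pass even though the implementation is wrong, because the test inputs happen to hide the bug. Checking against an independent oracle over a few orderings exposes it.

```rust
/// Intended to return the largest element; actually returns the first one.
fn largest(xs: &[i32]) -> Option<i32> {
    xs.first().copied() // bug: ignores every element after the first
}

fn main() {
    // A weak test: the largest value happens to come first, so it passes.
    assert_eq!(largest(&[9, 4, 7]), Some(9));
    assert_eq!(largest(&[]), None);

    // Comparing against the standard library's max over several orderings
    // reveals the disagreement.
    let exposed = [[9, 4, 7], [4, 9, 7], [4, 7, 9]]
        .iter()
        .any(|xs| largest(xs) != xs.iter().copied().max());
    println!("bug exposed: {exposed}");
}
```

If the same model writes both the function and the weak test, nothing forces the inputs to be adversarial, which is the worry here.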

EDIT: Would downvoters care to share an explanation? Preferably one they thought of?

Wait. You built a new language, that there's thus no training data for.

Who the hell is going to use it then? You certainly won't, because you're dependent on AI.

The constraints enforced in the language still matter. A language which offers certain correctness guarantees may still be the most efficient way to build a particular piece of software even when it's a machine writing the code.

There may actually be more value in creating specialized languages now, not less. Most new languages historically go nowhere because convincing human programmers to spend the time it would take to learn them is difficult, but every AI coding bot will learn your new language as a matter of course after its next update includes the contents of your website.

In the 90s people hoped Unified Modeling Language diagrams would generate software automatically. That mostly didn’t happen. But large language models might actually be the realization of that old dream. Instead of formal diagrams, we describe the system in natural language and the model produces the code. It reminds me of the old debates around visual web tools vs hand-written HTML. There seems to be a recurring pattern: every step up the abstraction ladder creates tension between people who prefer the new layer and those who want to stay closer to the underlying mechanics.

Roughly: machine code --> assembly --> C --> high-level languages --> frameworks --> visual tools --> LLM-assisted coding. Most of those transitions were controversial at the time, but in retrospect they mostly expanded the toolbox rather than replacing the lower layers.

One workflow I’ve found useful with LLMs is to treat them more like a code generator after the design phase. I first define the constraints, objects, actors, and flows of the system, then use structured prompts to generate or refine pieces of the implementation.

Like everything generated by LLMs though, it is built on the shoulders of giants - what will happen to software if no one is creating new programming languages anymore? Does that matter?

I don’t agree with the idea that programming languages don’t have an impact on an LLM's ability to write code. If anything, I imagine that, all else being equal, a language where the compiler enforces multiple levels of correctness would help the AI get to a goal faster.
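One hedged example of what compiler-enforced correctness can mean in practice, using Rust newtypes (the `Meters`, `Seconds`, and `MetersPerSecond` types and the `speed` function are invented for illustration). Swapping the arguments is a compile error rather than a silent wrong answer, so whoever writes the code, human or model, gets immediate feedback.

```rust
// Each unit gets its own type, so the compiler refuses to mix them up.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Seconds(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct MetersPerSecond(f64);

/// Average speed over a distance. Passing the arguments in the wrong
/// order, e.g. `speed(Seconds(20.0), Meters(100.0))`, does not compile.
fn speed(distance: Meters, time: Seconds) -> MetersPerSecond {
    MetersPerSecond(distance.0 / time.0)
}

fn main() {
    let v = speed(Meters(100.0), Seconds(20.0));
    println!("{v:?}");
}
```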

I have been building a game via a separate game logic library and Unity (which includes that independent library). Let's just say that over the last couple of weeks I have 100% lost the need to do the coding myself. I keep iterating and having it improve things, and there are hundreds of unit tests. I have a Unity MCP and it does 95% of the Unity work for me. Of course the real game will need custom designing and all that, but in terms of getting a complete prototype set up, I am literally no longer the coder. I just did in a week what would have taken me months and months and months to do. Granted, Unity is still somewhat new to me, but even if you are an expert, it can immediately look at all your game objects and detect issues etc.

So yeah, for some things we are already at the point of "I am no longer the coder, I am the architect", and it's scary.

> Impressive. As a practical matter, one wonders what th point would be in creating a new programming languages if the programmer no longer has to write or read code.

I'm working on a language as well (hoping to debut by end of month), but the premise of the language is that it's designed like so:

1) It maximizes local reasoning and minimizes global complexity

2) It makes the vast majority of bugs / illegal states impossible to represent

3) It makes writing correct, concurrent code as maximally expressive as possible (where LLMs excel)

4) It maximizes optionality for performance increases (it's always just flipping option switches, mostly at the class and function input level, occasionally at the instruction level)

The idea is that it should be as easy as possible for an LLM to write it (especially convert other languages to), and as easy as possible for you to understand it, while being almost as fast as absolutely perfect C code, and by virtue of the design of the language - at the human review phase you have minimal concerns of hidden gotcha bugs.
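Since the language itself hasn't been shown yet, here is how point 2 (illegal states impossible to represent) is commonly done in an existing language. This is Rust, not the commenter's language, and the `Connection` type is a hypothetical example: a session id can only exist inside the `Connected` variant, so "disconnected but has a session" cannot be constructed at all.

```rust
/// A connection is in exactly one state; a session id exists only
/// while connected, so there is no invalid combination to check for.
#[derive(Debug, PartialEq)]
enum Connection {
    Disconnected,
    Connected { session_id: u64 },
}

impl Connection {
    /// Connect if disconnected; an existing connection is left unchanged.
    fn connect(self, session_id: u64) -> Connection {
        match self {
            Connection::Disconnected => Connection::Connected { session_id },
            already @ Connection::Connected { .. } => already,
        }
    }

    /// The session id, if there is one.
    fn session(&self) -> Option<u64> {
        match self {
            Connection::Connected { session_id } => Some(*session_id),
            Connection::Disconnected => None,
        }
    }
}

fn main() {
    let conn = Connection::Disconnected.connect(42);
    println!("{:?} -> {:?}", conn, conn.session());
}
```

The exhaustive `match` also helps local reasoning (point 1): adding a new state makes every unhandled site a compile error.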

Saves tokens. The main reason though is to manage performance for what techniques get used for specific use cases. In their case it seems to be about expressiveness in Bash.

> If you’re not writing or reading it, the language, by definition doesn’t matter.

By what definition? It still matters if I write my app in Rust vs, say, Python, because the Rust version still has better performance characteristics.

In principle (and we hope in practice) the person is still responsible for the consequences of running the code and so it remains important they can read and understand what has been generated.

I've been wondering if a diffusion model could just generate software as binary that could be fed directly into memory.