# Published Mar 27, 2026

A full-page ad for The Last One software system, post-processed with various effects by me.
BYTE magazine Aug. 1981, pg. 196
(click through for a more readable version)
I was recently shocked to learn that The Daily WTF is still running after all these years. It seems like such a time capsule now; a write-in blog making fun of the absolute nonsense programmers encounter in the field. Marvel at the 24 nested stringReplace calls! Check out this trilingual ASP.NET SQL query that generates both HTML and Javascript! Watch The PHP God non-deterministically divide by zero! To me it evokes an era of IRC channels, PHP, subversion, and logging into the production server to update the code. Simpler times. These days it’s a lot of very obtuse React.
There is one snippet from that time that really stuck in my brain though. It reached above the nonsense layer and into the philosophical. And that is the story of The Quine Programmer.
See, unlike the typical inanity and wacky code snippets of the site, the Quine Programmer (and the Quine System they create) get at some very fundamental questions about programming. Though it’s not technically a quine - that would require producing the actual program’s source code - the so-called Quine Programmer has built a system that can do just about anything, given the right configuration. And they’ve presumably built it using a programming language, which is a system that can do just about anything, given the right code. So naturally we have to ask:
What is the difference between a “System that Can Do Anything” and, say, Python?
At what point is the user of a system doing capital-P Programming? Is Excel a programming language? What about The Last One? If Scratch is a programming language, does clicking around Unreal Engine count? nginx.conf? What about the command-line flags of find?1
And what is it in concrete terms that makes Python better than The Quine Programmer’s monster?

The Quine Programmer’s monster.
As shown in “The Quine Programmer” on The Daily WTF.
Most folks I know who work on and write about programming languages would say that, broadly speaking, these systems all represent a type of programming.2 There are some design considerations all these systems have in common, and it is a design space I enjoy thinking about.
In some sense, which I hope to make a bit more concrete here, Python and your favourite programming language have more respect for the interaction between the user and the computer, allowing each to do what they do best. Every modern high-level language is built on a mountain of abstractions, but to some extent they actually free you from thinking about it most of the time, allowing you to work with simplified mental models that make the act of programming easier, clearer, and more fun.
So, how do we do better than the Quine Programmer? How can we connect the dots between human user and computer in a way that respects the strengths of both?
For this broader question, I prefer the following extremely general definition of a “computer language”:
A computer language is a computer program whose meaningful inputs are unbounded in complexity.
This is pretty close to the definition from my clojure/west talk 10 years ago, and though there’s many things I’d change about that talk a decade later, I think it’s fairly workable for our purposes. It clearly includes configuration, markup, and styling languages, as well as non-text-based languages quite nicely. It also, crucially, includes the Quine Programmer’s system.3
And for those versed in certain languages, evaluating a tool by the possible shapes of its inputs should feel fairly familiar - the shape of the input determines the utility of the output. The moment you are handling arbitrarily complex input is perhaps the moment you should ask “Am I the Quine Programmer?”4
What I think stands the test of time more strongly5 is the set of general design goals for a computer language that I presented in that talk:
- Interpretation. The computer should be able to interpret valid input and reject invalid input.
- Predictability. The user should be able to understand the way in which the computer will interpret their input.
- Discoverability. The user should understand how to express new goals within the constraints of the system.
The Quine Programmer is quite satisfied with their performance on Interpretation. Give it a valid input, and the system will work just fine! Perhaps they haven’t thought through the feedback loop for invalid input particularly well, and perhaps there are some corners the users would be surprised to know about, but to the Quine Programmer’s mind these are simply part of the spec. Every bug a feature.
As for Predictability, the Quine Programmer has not even considered it. In their mind the implementation is the mental model. But their users, who almost as a rule are not familiar with the implementation, will absolutely struggle to use the system if they cannot reasonably predict its behaviour. Any bozo can write display: flex in a CSS file, but there is no meaning to this action unless you have a mental model of what it will do. Systems with low predictability leave users with a complicated guess-and-check workflow that relies mostly on superstition.

A dated meme, but one that encapsulates the “try it and see if it works” feeling I remember from trying to make a site look right in IE6.
Similarly, Discoverability may have been overlooked by the Quine Programmer, who typically is not known to write extensive documentation. Just read the code! Languages that fail on this point tend to be plagued with copy-paste code, slightly modified each time, a result of users’ fear of venturing off the well-beaten path.
The Quine Programmer has also likely overlooked another important discovery feature: error messages. It is vitally important that the system respond in a helpful way when given invalid input. Users rely on these messages (among other things) to correct mistakes and develop a clear mental model of the language.
In fact, Peter Naur argued in 1985 that the programmer’s mental model of the language is closer to the point of programming than the execution of code:
[…] it is concluded that the proper, primary aim of programming is, not to produce programs, but to have the programmers build theories of the manner in which the problems at hand are solved by program execution.
These design goals all share a common thread, which is enabling a user to communicate and do cool things with a computer in a way that empowers them and allows them to think in higher-level terms, without leaving them lost and confused. They’re principles of user empowerment nearly identical to those that UX designers think about every day.
So that’s a fun little philosophical exercise from 10 years ago, based on a satirical blog post from 15 years ago. But this is 2026, so I think we all know where this is going.
This beef isn’t new for me. I was getting into embarrassing online arguments with AI people on exactly these points as far back as 2014. Perhaps I should have written about them more.
See, the design goals we’ve talked about aren’t just applicable to quine systems or computer languages. I’d argue they apply to any tool that facilitates communication between a human user and a computer. Any AI system that communicates with a user in natural language certainly meets our computer language definition above, and so I feel fairly justified in judging it by the same criteria.6 Also, AI marketing keeps claiming it has invented things7 that compilers, linters, formatters, and all manner of language tools have been doing for decades. If they insist on being programming tools, I think it’s only fair to critique them by the same terms as programming tools: Interpretation, Predictability, and Discoverability.
From my perspective, AI can be seen as an incredibly poorly designed computer language.
Actually, that may not quite be fair, as AI systems excel at discoverability. By design, it is certainly “easy” to use an AI system. Those who believe in the existence of “prompting skill” may have one or two tips and tricks for saving tokens and discouraging certain classes of errors, but ultimately communicating with a machine in natural language takes an extremely minimal amount of knowledge or training. It’s all right there, willing to explain anything you ask, factual accuracy be damned. The issue is that the designers seem to have forgotten the rest of it: what discoverability is for.
Whether an AI system can interpret valid input is a topic of contentious debate (and I don’t think we should accept “sometimes” as an answer here), but it is very clear to me that AI systems are not capable of rejecting invalid input, at least not consistently.8 One could even ask whether there is a distinction between valid and invalid input for an AI system.
Natural language is slippery, full of weird regionalisms and self-negatives and overlapping meanings and context dependence that make the process of interpretation extremely error-prone, even for a theoretically perfect system. As Dijkstra famously argued, the use of formal symbols for logical and technical tasks historically represented a major breakthrough, and is one we should not let slip lightly.
But the complete abject failure of AI systems lies mainly in predictability. The scale of the data set means that there is simply no way for users to develop a mental model of its operation, or to even guess as to the nature of the interpretation of a given input. We have automated the “Peter Griffin Struggles With Venetian Blinds” workflow on our codebases and our users.
To be fair to the users though, they do in fact end up constructing a mental model of their interactions with an AI. The problem is it’s dangerously wrong.
Most computer science folks these days are familiar with ELIZA, the 1960s chatbot that famously convinced users they were talking to a real person, seemingly passing the Turing Test with flying colours by mere social engineering. Fewer, I think, have read ELIZA author Joseph Weizenbaum’s excellent book on the topic, Computer Power and Human Reason. I recently managed to find a hard copy, printed in 1976 - it has one of those old cardboardy hardback bindings and smells like my grandmother’s bookshelf.
The language in it is admittedly a bit dated and certainly dense, but it’s surprisingly prescient for today’s world. Here’s how Weizenbaum describes a user developing a mental model of a natural language system like ELIZA (pg. 15):
So, unless they are capable of very great skepticism (the kind we bring to bear while watching a stage magician), they can explain the computer’s intellectual feats only by bringing to bear the single analogy available to them, that is, their model of their own capacity to think.
The lack of a clear mental model leads to the so-called ELIZA effect: a user’s tendency to project all kinds of intellectual capabilities onto a computer system, in the same way my cats might think that I have magical powers when I turn the lights on and off or summon meat from the refrigerator. The classical example is Weizenbaum’s secretary being moved to tears by a fairly simple word-substitution algorithm, in a conversation beginning “Men are all alike.”
This is not, as is the impression I fear so many walk away with, some problem limited to a gullible secretary complaining to a computer program about her boyfriend.9 We are not as different from her as maybe we would like to imagine. In fact, Weizenbaum cites professional psychiatrists declaring the program to be the future of their field, and even Carl Sagan himself chimed in to offer a rosy vision of a future where therapy was administered coldly through arrays of computer terminals.10 You are not immune to the ELIZA effect!
As a more modern example, from about 2014 to 2017, I ran a Twitter bot trained on my tweets, called @jneebooks, which was a popular trend at the time. I don’t quite remember if I ended up using an off-the-shelf thing or the very naive 3-word Markov model I was tinkering with. But I remember why I stopped running it: a friend of mine from high school thought I was trying to break into the ebooks market, and had an entire conversation with it, mistaking it for me. And then when I told him it was a bot he didn’t believe me.
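For the curious, a naive word-level Markov babbler along those lines can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the original bot’s code; the function names and the `order` parameter are my own invention for illustration:

```python
import random
from collections import defaultdict

def build_markov(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        table[key].append(words[i + order])
    return table

def babble(table, length=20, seed=None):
    """Start from a random key and walk the table, picking a random
    successor at each step, until we hit a dead end or the length cap."""
    rng = random.Random(seed)
    order = len(next(iter(table)))
    out = list(rng.choice(list(table.keys())))
    for _ in range(length):
        successors = table.get(tuple(out[-order:]))
        if not successors:
            break
        out.append(rng.choice(successors))
    return " ".join(out)

# Feed it a corpus and it will produce output that is locally plausible
# but globally meaningless - which, it turns out, is enough to fool people.
table = build_markov("the cat sat on the mat and the cat ran off the mat")
print(babble(table, length=10, seed=1))
```

Even at this scale the ELIZA effect kicks in: the output has no model of anything, yet readers supply the intent themselves.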
That’s a fun comedy of errors, but today, AI’s tendency towards being anthropomorphized is not neutral. It’s directly led to some of its more unsavoury effects on its users. In fact, LLMs are arguably an active cognitohazard. Even presumed subject-matter experts are prone to this - just a short while ago a Meta AI specialist posted herself admonishing OpenClaw for deleting her email, an act that assumes a computer can feel shame and correct its behaviour to avoid it.
I think we should consider this a kind of optical illusion created by having no other reference point to fall back on. Like most illusions, being aware of the problem doesn’t mean your perception is suddenly “fixed” - you and I are equipped with human senses that have a lot of strange flaws and corner cases.
Anyone who works with computers has been here though, far before AI. When something goes wrong, we’re as likely as anyone to assume the computer is cursed, has it out for us specifically, or needs to be appeased in some way. And at that moment, the computer does have an interiority that is hidden from us. There certainly is something we haven’t seen or don’t understand! A good programming tool or computer language, though, is a means for us to find and dissect that interiority, stripping away the illusion. AI actively worsens it instead.11
In the last few years I’ve been getting a bit into shader programming. Despite being decidedly mediocre at it, the act of programming this way has been very therapeutic for me. Demoscene-style shaders are optimized for live-coding in competitive 25-minute demo battles, meaning memorization and quick typing are much more desirable traits in a design than future-proofing or even readability.
In art coding, there is a constant struggle to maintain order over chaotic systems. Good graphics programmers and artists know how to use noise and unpredictability to their advantage, while retaining predictability over the general behaviour. As my dear friend and extremely accomplished shader artist blackle put it, “The only thing bad about glitches is that they take artistic control away.” Noise is tricky, even in simple cases - understanding the distribution or behaviour of a noise source is critical for getting good results.
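To make that concrete, here is the folklore fract(sin(…)) shader hash, ported from GLSL to Python as an illustrative sketch (the magic constants are the traditional ones; nothing about them is load-bearing). The point is that the noise is deterministic for a given input and seed, and its range is known, which is exactly the kind of predictability that keeps it under artistic control:

```python
import math

def hash_noise(x, seed=0):
    """Deterministic pseudo-random value in [0, 1) for a lattice point x.
    Same input, same output: repeatable noise stays under your control."""
    v = math.sin(x * 12.9898 + seed * 78.233) * 43758.5453
    return v - math.floor(v)  # fract(): keep only the fractional part

def value_noise(x, seed=0):
    """Smoothly interpolate between hash values at integer points,
    giving continuous 1D noise whose range is known in advance."""
    i = math.floor(x)
    t = x - i
    t = t * t * (3 - 2 * t)  # smoothstep easing between lattice points
    a, b = hash_noise(i, seed), hash_noise(i + 1, seed)
    return a + (b - a) * t
```

Change the seed and the texture changes; keep the seed and every frame renders identically. The randomness lives in a box whose walls you built yourself.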
The reason I bring this up is twofold. First, because I anticipate objections to “Predictability” based on the fact that programmers use noise and randomness all the time, and I want to establish that predictability is in fact key to working with noise.
The second reason is that tuned noise is legitimately a decent application of AI models (including GPT!). Machine learning models themselves are, in fact, one important subgenre of tuned noise. This isn’t just my opinion - LLM researcher Andrej Karpathy explains in the annotations for microgpt:12
The [GPT] model is a big math function that maps input tokens to a probability distribution over the next token.
And naturally, this is generally the productive use to which not-quite-as-large learning models have been put for some time. I think it’s fair to say that presuming local models, ethical training, and the retention of some manner of creative control,13 there is not much one can object to with this use case.14
But critically: in all these cases, the noise is in the output, rather than the interface. In the case of computer languages, where input is unbounded in complexity, there are already so many natural sources of unpredictability that introducing more sources, especially when their behaviour is not well understood, is an unnecessary sacrifice of creative control. Sure, it can enable you to make more code, but that’s unlikely to be better or more reliable.15
I care deeply about tool usability - there’s much in HCI that seems to be about capturing the attention of the most uninterested user, but in my opinion UX is equally important in the other ways we (even technical users!) interact with computers. When I spelled out these design goals a decade ago, I had some hope that maybe the industry would start taking tool design seriously. The industry’s answer appears to be Claude.
And it’s not like the problem is new. As far back as 1981, tech press was salivating over half-baked Quine Systems like The Last One, claiming such hogwash as “[…] it’s the end of programming as you know it”, and “[…] those in the DP [data processing] industry who fail to adapt to the new approach may find themselves out of work.” By all accounts, The Last One was a monstrous system to work with.
Claude and other natural-language-based programming tools carry the thesis that the failure of The Last One and other older Quine Systems was primarily due to technological failures, and the capabilities of hardware at the time. But how valuable can these advances be if they fail to empower their users? The problem is in the interface!
I think the ultimate fate of AI programming won’t be too far from that of The Last One. When a programming tool is unreliable, completely resists mental-modeling, and is incapable of consistently rejecting invalid input, I think it’s reasonable to say it’s not fit for purpose, and is certainly not the future of programming. We simply cannot develop mental models of AI through traditional means. But we have to remember that just because we don’t understand it doesn’t mean it’s hiding secret insight or power.
But I also think the art of programming survives the coming AI winter. There’s going to be quite a lot of work to do to clean up the mess we’ve made for ourselves. But I have a deep love for a good mess, precisely because it means there’s still work to do.16
Instead, why not learn a new, traditional programming language this year, maybe one that’s very different than you’re used to? Heck, mock one up yourself! The Quine Programmer’s mistake wasn’t in building a computer language, just in doing it badly, and failing to empathize with their audience. Maybe by learning from what’s been done before, you can do better! ∎
Thank you to blackle, Tesselode, AmyZenunim, and tef for helping me turn this mess into something even vaguely presentable. Anything that sucks here is probably a result of me not taking their advice. Yall rock.
Those with just enough CS knowledge to be dangerous will probably be excitedly ready to explain Turing completeness at this moment, a mathematical categorization of programming languages based on their capability, given some infinite amount of resources (classically, a tape). But I’m more interested in the design of languages than their specific application or capabilities, so let’s perhaps say that Turing completeness is an attribute that a language can have, rather than a definition of what one is. It’s an important consideration to be sure! But there are other design considerations that I think apply a bit more generally. ↩
…with the caveat that sometimes “programming” is implied to be in a Turing complete system, or even more specifically an imperative one, and this ambiguity is also part of why I prefer “computer language”. I’m including things like HTML and CSS here, as they also have the same kind of design considerations. ↩
There’s some grey area around creative tools like Photoshop or your average DAW (which these days tend to have programming-like systems built in anyways). ↩
AITQP subreddit when ↩
…with some modifications, which may be a bit unfair of me. ↩
specifically, systems that generate or interpret code with LLMs, based on inputs including code, prompts, and documentation. I do know about systems that e.g. do log pattern-finding, fuzzing, and all manner of other tasks, but I would argue those don’t need the specific bit I take issue with here, which is the natural-language interface with the user. They also would be completely fine running locally on ethically-sourced data, so are usually much less controversial. ↩
some going so far as to claim having invented the concept of abstraction itself! ↩
I swear to god if somebody comes at me with “you just haven’t used the latest model yet”. We’ve heard this before. Non-determinism is inherent to the design. ↩
Even though this is Weizenbaum’s only written example I can find of ELIZA interaction (specifically the DOCTOR module), the fact that this prompt is the example snippet every time ELIZA is brought up says something about our perception and treatment of women, and allows us to be somewhat dismissive of the ELIZA effect in ways that are super uncomfortable to think about! ↩
The article, “In Praise of Robots” from Natural History Jan 1975 is, naturally, paywalled. You can apparently ask a chatbot to hallucinate about it though. But here’s a Forbes article discussing it, which contains the same quote Weizenbaum pulled in his book. As you might expect, this version of Sagan is quoted extensively by AI proponents and AGI17 true believers. ↩
Huge shout-outs to blackle for this line of argument. To those who would say that AI specifically helps them understand the problem better, I would say two things - one, finding patterns in logs is actually probably defensible as a machine-learning use case, in the same way that the best spam filters tend to lean on ML methods. Two, I’ve seen many, many instances where this “understanding” is mostly an illusion, where the AI user has been led severely astray on a problem in ways they were sort of by definition not equipped to notice. While it’s also possible to be led astray like this by traditional documentation or other discovery features in a language, when this happens I think we generally regard it as bad, and recognize it as a problem to be fixed. ↩
Specifically, the noise that is tuned comes from three sources in microgpt (there are more in a production AI system): 1) a uniform shuffle of the training data, 2) a Gaussian distribution for the initial matrix values of each attention head, and 3) the final weighted random choice according to the trained weights during inference. The rest is all tuning! ↩
these are very big presumptions ↩
except maybe the price, SpeedTree is notoriously expensive. ↩
Holy heck it just occurred to me that the code from The Daily WTF snippets is almost certainly in your favourite model’s training set. Terrifying. ↩
It’s also important to note that AGI is pseudoscience. ↩