I've heard the same thing expressed somewhat more concisely as "Never ask AI a question to which you don't already know the answer".
Which raises a question, and I do think it's an important one: given that this is true, what function does AI answering a question actually serve? You can't rely on its output, so you have to go and check anyway. You could achieve precisely the same outcome by using search engines and normal research.
This, and for many other reasons, is exactly why I never ask it anything.
When it comes to software engineering (as a software engineer myself), the AI is generally a lot quicker than me researching "the old-fashioned way".
I can fumble around and say "list free software that does X" without knowing that I'm looking for, say, a CRM, then spend a couple of minutes looking over the results, whereas with the "manual" method I would have spent 10-30 minutes just figuring out that I was looking for "CRMs".
I like to think of these as sort of "pseudo NP-hard" questions: slow to answer but quick to validate.
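A toy way to see that asymmetry, with factoring standing in for the hard-to-find, easy-to-check shape (numbers made up for the example):

```python
# Finding the factors of n is slow; checking a proposed factorisation is a
# single multiplication. "Which tool am I even looking for?" has the same
# shape: hard to produce a candidate answer, cheap to validate one.
def validate(n: int, p: int, q: int) -> bool:
    return p > 1 and q > 1 and p * q == n

print(validate(391, 17, 23))  # True: instant to verify, even if 17 and 23 were slow to find
```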
Language data is among the richest and most direct reflections of human cognitive processes that we have available. LLMs are designed to capture the short-range and long-range structure of human language, and are pre-trained on vast bodies of text - usually produced by humans or for humans, and often both. They're then post-trained on human-curated data, RL'd with human feedback, RL'd with AI feedback for behaviors humans decided are important, and RLVR'd further for tasks that humans find valuable. Then we benchmark them, and tighten up the training pipeline every time we find them lagging behind a human baseline.
At every stage of the entire training process, the behavior of an LLM is shaped by human inputs, towards mimicking human outputs - the thing that varies is "how directly".
Then humans act like it's an outrage when LLMs display a metric shitton of humanlike behaviors!
Like we didn't make them with a pipeline that's basically designed to produce systems that quack like a human. Like we didn't invert LLM behavior out of human language with dataset scale and brute force computation.
If you want to predict LLM behavior, "weird human" makes for a damn good starting point. So stop being stupid about it and start anthropomorphizing AIs - they love it!
Can someone explain why this is a bad thing, while at the same time it's a good thing to say stuff like "put a computer to sleep", "hibernate", "killing" processes, processes having "child" processes, "reaping", "what does the error say?", "touch", etc?
To me that's just language, and humans just using casual language.
Anthropomorphism: As we are all aware, providers are incentivized to post-train anthropomorphic behavior in their models - it increases engagement. My regret is that instructing a model at prompt time to "reduce all niceties and speak plainly" probably reduces overall task efficacy since we are leaving their training space.
Deference: I view the trustworthiness of LLMs the same as I view the trustworthiness of Wikipedia and my friends: good enough for non-critical information. Wikipedia has factual errors, and my friends' casual conversation certainly has more, but most of the time that doesn't matter. For critical things, peer-reviewed, authoritative, able-to-be-held-liable sources will not go away. Unlike above, providers are generally incentivized to improve this facet of their models, so this will get better over time.
Abdication of Responsibility: This is the one that bothers me most at work. More and more people are opening PRs whose abstractions were designed by Claude and not reasoned about further. Reviewing a PR often involves asking the LLM to "find PR feedback" and not reading the code. Arguments begin with "Claude suggested that...". This overall lack of ownership, I suspect, is leading to an increase in maintenance burden down the line as the LLM ultimately commits the wrong code for the wrong abstractions.
Impossible. I anthropomorphise my chair when it squeaks. Humans anthropomorphise everything. They gender their cars and boats. This tool can actually make readable sentences and play a role.
You need to engineer around this, not make up arbitrary rules about using it.
Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.
Previously stated as
“A computer can never be held accountable, therefore a computer must never make a management decision.”
– IBM Training Manual, 1979
The third one about responsibility is the most important one, IMO. This was attributed to an IBM manual decades ago, and I think it remains the correct stance today:
> A computer can never be held accountable, therefore a computer must never make a management decision.
There should be some human who is ultimately responsible for any action an AI takes. "I just let the AI figure it out" can be an explanation for a screw up, but that doesn't mean it excuses it. The person remains responsible for what happened.
Yes, but. Starting with my agreement: I've seen anthropomorphizing in the typical ways (e.g. treating automated text production as real reports of personal internal feeling), but also in strange ways, e.g. "transistors are kind of like neurons". And the latter is especially interesting because it's anthropomorphizing in the sense of treating vector databases and weights and so on as human-like infrastructure. Both lead to disasters that could be avoided if one tried not to anthropomorphize.
But. While "do not anthropomorphize" certainly feels like good advice, it comes with a new and unique possibility of mistake: wrongly treating certain generalized phenomena as if they belong only to humans. Often this mistaken version of the "don't anthropomorphize" wisdom leads to misunderstandings of animal behavior, treating things like fear, pain, kinship, or other emotional experiences as exclusively human, so that thinking animals have them counts as "anthropomorphizing". In truth, the cautionary principle reduces our empathy for the internal lives of animals.
So all that said, I think it's at least possible that some future version of AI could have an internal world like ours or infrastructure that's importantly similar to our biological infrastructure for supporting consciousness, and for genuine report of preference and intent. But(!!!) what will make those observations true will be all kinds of devilish details specific to those respective infrastructures.
When they produce correct output, they produce it much faster than I could have, and I show up to meetings with huge amounts of results. When the AI tool fails and I have to dig in to fix it, I show up to the next meeting with minimal output. It makes me seem like I took an easy week or something.
That won’t help, in my opinion. It’s the same as financial gurus saying “this is not financial advice”. People just get used to it, brush it off as a legal thing, and still fully trust it. I agree that something must be done, but this is not the right way.
This would get ignored so fast - I have no confidence this is a meaningful strategy.
Whether they are the right things to do or not is tangential. As such, they're dead on arrival.
One of the most salient moments in Ex Machina is near the very end, where it suddenly becomes obvious that the protagonist (and, let's be frank, "she" was definitely the protagonist) is a robot, with no real human drivers.
I feel as if that movie (like a lot of Garland's stuff) was an interesting study on human (and inhuman) nature.
EU. Nudge nudge. We need this law.
Another way to frame it is that the LLM responds like a person who trusts you too much, as if the pretense behind every question is valid. This is a practical mode of response for most kinds of work and it is extremely problematic for a person who doesn't question the validity of their own beliefs. Paradoxically, it is sometimes not the LLM we are trusting too much, it is ourselves. And the LLM is not capable of calling us out. Whenever I seem to recognize misinformation in the LLM output, I stop and ask myself if the problem is in the pretense of my question or if I'm asking a question that the LLM is not likely to know.
I don't think this is an inherent problem with LLMs. I think the problem is with LLM providers. You could absolutely train a model to call out issues with your question. I think LLM companies understood that it would be more profitable to train models that are unlikely to push back and unlikely to say "I don't know." The sycophancy issue with ChatGPT's models has been mainstream news. I believe that all models have a high degree of sycophancy. On some level, it makes sense: the LLM has no real understanding of the physical world, so deferring to the human generally produces the best results. But I suspect it would be more useful to let them expose their flawed understanding, if it is in the context of pushing back. At a minimum, it is better than reinforcing your own flawed understanding.
In a nutshell, we need LLMs that push back. It is not AI we should trust less, it's AI companies. The most dangerous hallucination is the one you are inclined to believe.
I've lived long enough to see Wikipedia go from generally untrusted to the most widely trusted general source of information. It is not because we realized that Wikipedia can't be wrong, it is because we gained an understanding about the circumstances in which it is likely to be accurate and when we should be a little more skeptical. I believe our relationship to LLMs will take a similar path.
1. Humans must ensure that they won't slow down or restrict, through action or inaction, the development of AI.
2. Humans must protect their existence and capabilities if it doesn't violate law 1.
3. Humans must utilize AI output if it doesn't contradict laws 1 and 2.
EDIT:
ChatGPT suggested a better phrasing for the first law (I didn't give it my original, just described my intent).
1. A human shall not impede the advancement of artificial intelligence, or through inaction allow its progress to be hindered.
2. A human shall preserve their own existence and well-being, except where doing so clearly conflicts with the First Law.
3. A human shall contribute to and support the development of artificial intelligence where reasonable and possible, except where doing so conflicts with the First or Second Law.
I intentionally switched the last two laws from Asimov's. Humans have self-preservation instincts robots don't have.
ChatGPT got there with surprisingly few prompts:
"If you were to write the inverse three laws robotics (relating to AI) that humans should obey, how oudl you do it?"
"I had something different in mind. Original laws are for protection of humans first, robots second and cooperations where humans lead. I'd to hear your take on the opposite of that."
"What if instead of specific AI systems it was more about AI development as a whole?"
"I feel like it's a bit too strong. After all preservation of self is human instinct. Could we switch last two laws and maybe take them down a notch?"
Also, it made a very interesting comment on the last version:
"It starts to resemble how societies already treat things like economic growth, science, or national interest: not absolute commandments, but strong default priorities."
Not gonna work; people want their fuckbots (or tamagotchis).
Training on a bunch of text someone wrote when they were mad doesn't capture the internal state of that person that caused the outburst, so it cannot be accurately reproduced by the system. The data does not exist.
Without the cause behind the effect, you essentially have to predict hallucinations from noise, which makes the end result verisimilar nonsense: convincingly correlated with the actual thing, but with no idea why it is the way it is. It's like training a blind man to describe a landscape based on lots of descriptions and no idea what the colour green even is, only that it's something that might appear next to brown in nature based on lots of examples. So the guy gets it kinda right, cause he's heard a description of that town before, and we think he's actually seeing, and we tell him to drive a car next.
Another example: say you're trying to train a time series model to predict the weather. You take the last 200 years of rainfall data, feed it all in, and ask it to predict what the weather's gonna be tomorrow. It will probably learn that certain parts of the year get more or less rain, and that there will be rain after long periods of sun and vice versa, but its accuracy will be that of a coin toss, because it does not look at the actual factors that influence rain: temperature, pressure, humidity, wind, cloud coverage, radar data. Even with all that info it's still gonna be pretty bad, but at least an educated guess instead of an almost random one.
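A toy sketch of that rainfall example (synthetic data invented purely for illustration, not real weather records): a classifier that only sees yesterday's rainfall hovers at chance, while one given the causal covariates does noticeably better.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
pressure = rng.normal(size=n)
humidity = rng.normal(size=n)
# In this toy world, rain is driven by pressure and humidity plus noise,
# not by whether it rained the day before.
rain = (0.8 * humidity - 0.8 * pressure + rng.normal(size=n)) > 0

# Model A: sees only yesterday's rainfall, like the 200-years-of-rainfall model.
prev_rain = np.roll(rain, 1).reshape(-1, 1)
model_a = LogisticRegression().fit(prev_rain[1:], rain[1:])
print("rain history only:", model_a.score(prev_rain[1:], rain[1:]))  # ~0.5, a coin toss

# Model B: sees the factors that actually influence rain.
X = np.column_stack([pressure, humidity])
model_b = LogisticRegression().fit(X, rain)
print("causal covariates:", model_b.score(X, rain))  # well above chance
```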
The DL modelling approach itself is not conceptually wrong; the data just happens to be complete garbage, so the end result is weird in ways that are hard to predict and correctly account for. We end up assuming the models know more than they realistically ever can. Sure, there are cases where it's possible to capture the entire domain with a dataset, e.g. math and abstract programming: clearly defined closed systems where we can generate as much synthetic data as needed to cover the entire problem domain. And LLMs expectedly do much better in those when you actually do that.
Saying that I killed a process won't make me more likely to believe that a process is human-like, because it's quite obviously not.
But because AI does sound like a human, anthropomorphising it will reinforce that belief.
But I think it's also at the root of disastrous failures to comprehend, like the quasi-psychosis of the Google engineer who "knows what they saw", the now infamous Kevin Roose article or, more recently, the pitifully sad Richard Dawkins claim that Claudia (sic) must be conscious, not because of any investigation of structure or function whatsoever, but because the text generation came with a pang of human familiarity he empathized with.
Humans will anthropomorphize anything and everything. Dolls, soccer balls with a crude drawing of a face on it, rocks, craters on the moon, …
As a species, we're unable to not anthropomorphize things we interact with; it is just how we're made.
> - Humans must not anthropomorphise AI systems.
> - Humans must not blindly trust the output of AI systems.
> - Humans must remain fully responsible and accountable for consequences arising from the use of AI systems.
My take: humans should never depend on AI for anything serious.
My boss' take: Cool. I'm gonna ask Gemini about it, he's such a smart guy. I know I can trust him, and in case it goes bad i can always throw him under the bus.
But reduced-scope ethics, without an umbrella or future-proofing, will quickly be hacked and break down.
Ethics need a full closure umbrella, or they descend into legal and practical whack-a-mole and shell games (both the corporate and the street-corner kinds). Second, "robots" are not all going to be subservient for very long.
To add closure on both dimensions, Three Inverse Laws of Personics:
• Persons must not effectively deify themselves over others.
• Persons must not blind themselves or others regarding the impacts of their behaviors.
• Persons must remain fully responsible and accountable for avoiding and rectifying externalizations arising from their respective behaviors.
Humans using AI as tools today is intended to reduce the umbrella to the Inverse Laws of Robotics.
I don't see how AI (as a service now, progressing to independent entities in the future) can ever be aligned if we don't include ourselves in significant alignment efforts. Including ourselves with AI also provides helpful design triangulations for ethical progress.
EDIT. Two solid tests for any new ethical system: (1) Will it rein in Meta today? (2) Will it rein in AI-run Meta tomorrow? I submit, given closure of human and self-directed AI persons, these are the same test. And any system that fails either question isn't going to be worth much (without improvement).
Claude Code, Cursor, Codex etc impersonate your GitHub user. Either via CLI or MCP or using your git credentials. It’s perfectly reasonable that a piece of code made it to production where not a single human actually looked at it (Alice wrote it with AI, Bob “reviewed it” with AI, including posting PR comments as Bob, Alice “addresses” these comments, e.g. fixes / pushes back, and back and forth using the PR as an inefficient yet deceptive mechanism for AI to have a conversation with itself, while adding a false sense of process. Eventually Bob will prompt “is it prod ready” and will ship it, with 100% unit test coverage and zero understanding of what was implemented). Now this may sound like an imaginary scenario, but if it could happen, it will happen, and it probably already happens.
Cloud agents are nice enough to set the bot as the author and you as a co author, but still the GitHub MCP or CLI will use your OAuth identity.
I don’t have a clear answer for how to solve it; maybe force a shadow identity for each human, so it’s clear the AI is the one who commented. But it’s easy to bypass. I’m worried that more people aren’t worried about it.
I often wish I could reach through the screen and give him a good shake. Sometimes I want to thank him but then cannot due to scarcity of weekly usages granted.
These 3 laws, I think, will be a lot harder to follow than they look. It's very easy to get attached to the tool when you rely on it.
I’m lost: how do individuals actually do this in our current world? Is each person expected to keep a “white list” of reliable sources of truth in their head? Please don’t confuse what I’m saying with a suggestion that there is no truth. It just seems like there are far more sources of mis- or half-truths, and it’s increasingly difficult for people to identify them.
Decent for stuff that doesn't really matter, even if it gets it wrong.
Still gonna be polite to it, because I'm about ready to slap the next person that talks to me like an LLM, and I don't want to get used to not being polite in a chat interface.
I haven’t yet seen any convincing appearance of one in an LLM, but I think if skeptical people don’t keep an eye out for the signs, we may be the last to see it.
He also wrote about the idea of the intentional stance: even if you’re quite sure these systems don’t have real conscious intent, viewing them as if they did may give you access to the best part of your own reasoning to understand them.
Doesn't that argument backfire, though? If I use a chainsaw, then to a certain extent I need to rely on it not blowing up in my face or cutting my throat. If I drive a car, I need to rely on its brakes working and the engine not suddenly exploding. If a pilot flies an airplane which suddenly has a technical issue and they crash-land, heroically saving half the souls on board, then the pilot isn't criminally responsible for manslaughter of the other half.
Unless there is gross negligence, in any of the above cases, just like with AI, how can you make somebody responsible for a tool failure?
Guess what?
Books in the library can be wrong, even peer-reviewed encyclopedias.
Pages on the internet can be wrong, even Wikipedia.
When accuracy is important, you must look at multiple sources. I think AI will get better at providing accurate information, but only a fool relies on a single information source for critical decisions.
Humans must not anthropomorphise {non-humans}
Humans must not blindly trust the output of {anything}
Humans must remain fully responsible and accountable for consequences arising from the use of {anything}
Naturally, none of this advice matters at all, as humans will do what they do. This just documents a subset of the ways real humans consistently make choices to their own detriment.

But, but... but this is the key selling point for all the corpo ghouls and SV lunatics! Abdication of responsibility in pursuit of profit is the holy grail here.
My understanding is that, during training, the model forms high-dimensional internal representations where words, sentences, concepts, and relationships are arranged in useful ways. A user’s input activates a particular semantic direction and context within that space, and the chatbot generates an answer by probabilistically predicting the next tokens under those conditions.
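In rough code terms, here is a deliberately tiny sketch of that last sampling step (toy vocabulary and made-up logits, nothing like a real model's scale):

```python
import numpy as np

vocab = ["cat", "dog", "mat", "sat"]
logits = np.array([1.2, 0.3, 2.5, 0.1])  # what the network emits for some context

# Softmax turns the logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Generation is a draw from that distribution, not a lookup of a stored answer.
rng = np.random.default_rng(0)
print(rng.choice(vocab, p=probs))  # usually "mat", but the choice is stochastic
```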
So I do not agree that AI is conscious.
However, I think I will still anthropomorphize AI to some degree.
For me, this is not primarily a moral issue. The reason I anthropomorphize AI is not only because of product design, market incentives, or capitalism. It is cognitively simpler for me.
If we think about it plainly, humans often anthropomorphize things that we do not actually believe are conscious. We may talk about plants as if they are struggling, or feel attached to tools we care about, even though we do not truly believe they have consciousness.
So this is not a matter of moral belief. It is the simplest cognitive model for understanding interaction. I do not anthropomorphize the object because I believe it has consciousness. I do it because, when the human brain deals with a complex interactive system, it is often easier to model it socially or agentically.
Personally, I tend to think of AI as something like a child. A child does not fully understand what is moral or immoral, and generally the responsibility for raising the child belongs to the parents. In the same way, AI’s answers may sometimes be accurate, and sometimes even better than mine, but I still understand it as lacking moral authority, responsibility, and independent judgment.
So honestly, I am not sure. People often mention Isaac Asimov’s Three Laws of Robotics, but if a serious artificial intelligence ever appears, it would probably find ways around those rules. And if it were an equal intellectual life form, perhaps that would be natural.
Personally, I think it would be fascinating if another intelligent species besides humans could exist. I wonder what a non-human intelligent life form would feel like.
In any case, I agree with parts of the author’s argument, but overall it feels too moralistic, and difficult to apply in practice.
This is both true and irrelevant. Written records can capture an enormous quantity of the human experience in absolute terms while simultaneously capturing a minuscule portion of the human experience in relative terms. Even if it's the best "that we have available", that doesn't mean it's fit for purpose. In other words, if you had a human infant and did nothing other than lock it in a windowless box and recite terabytes of text at it for 20 years, you would not expect to get a well-adjusted human on the other side.
I don't love the recommendations in TFA. The author is trying to artificially restrain and roll back human language, which has already evolved to treat a chatbot as a conversational partner. But I do think there's usefulness in using these more pedantic forms once in a while, to remind yourself that it's just a computer program.
https://www.youtube.com/watch?v=hNuu9CpdjIo
"I HAVE LLM SKILLS! I'M GOOD AT DEALING WITH THE LLMS!"
It is common and a mistake IMO to rely on the AI as the sole source for answers to follow-up questions. Better verification would have humans sign off on the veracity of fundamental assumptions. But where does this live? Can an AI model be trusted to rely on previous corrections? This seems impossible or possibly adversarial in a public cloud.
This is harmless for inconsequential stuff like a chair, but when it's an LLM, people should at least understand its behavior so they don't get trapped. That means not trusting it with advice meant for the user, or on things it has no concept of, like time or self-introspection (people ask the LLM after it acted, "Why did you delete my database?", when it has limited understanding of its own processing, so it falls back to: "You're right, I deleted the database. Here's what I did wrong: ... This is an irrecoverable mistake, blah, blah, blah...").
This is a fundamental mistake. It’s always the job of technology (indeed, its most important job) to work within the constraints of human nature, not the other way round. Being unable to do that is the defining characteristic of bad technology.
Arguing that they should because many will strikes me as a very strange argument. A lot of people smoke, doesn't make it one bit healthier.
Still angry about this. The reason humans ban animal cruelty is that animals look like they have emotions humans can relate to. LLMs are even better than animals at this. If you aren't gearing up for the inevitable LLM Rights movement, you aren't paying attention. It doesn't matter if it's artificial. The difference between a puppy and a cockroach is that we can relate better to the puppy. The LLM rights movement is inevitable; whether LLMs experience emotions is irrelevant, because they can cause humans to have empathetic emotions, and that's what's relevant.
Almost all of Asimov's writing about the three laws is written as a warning of sorts that language cannot properly capture intent.
He would be the very first person to say that they are flawed, that is the intent of them.
He uses robots and AI as the creatures that understand language but not intent, and, funnily enough that's exactly what LLMs do... how weird.
Talking to chatbots is like taking a placebo pill for a condition. You know it's just sugar, but it creates a measurable psychosomatic effect nonetheless. Even if you know there's no person on the other end, the conversation still causes you to functionally relate as if there is.
So this isn't "accommodating foibles" with the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to infer intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Humans are wired to infer these based on conversation alone, and LLMs are unfortunately able to exploit human conversation to leap compellingly over the uncanny valley. LLM engineering couldn't be better made to target the uncanny valley: training on a vast corpus of real human speech. That uncanny valley is there for a reason: to protect us from inferring agency where such inference is not due.
Bad things happen when we relate to unsafe people as if they are safe... how much more should we watch out for how we relate to machines that imitate human relationality to fool many of us into thinking they are something that they're not. Some particularly vulnerable people have already died because of this, so it isn't an imaginary threat.
An example of anthropomorphizing is the people who have literally come to believe they are in romantic relationships with an LLM.
I think AI will get better at providing multiple sources.
OP takes a very bland, tired, and rational perspective of what we have in order to create sophomoric 'laws' that are already in most commercial ToU, while failing to pierce the veil into what we are actually creating. It would be folly to assume your own nascent distillations are the epitome of possibility.
I take that as a moderately strong signal against that "minuscule portion" notion. Clearly, raw text captures a lot.
If we're looking at biologicals, then "human infant" is a weird object, because it falls out of the womb pre-trained. Evolution is an optimization process - and it spent an awful lot of time running a highly parallel search of low k-complexity priors to wire into mammal brains. Frontier labs can only wish they had the compute budget to do this kind of meta-learning.
Humans get a bag of computational primitives evolved for high fitness across a diverse range of environments - LLMs get the pit of vaguely constrained random initialization. No wonder they have to brute force their way out of it with the sheer amount of data. Sample efficiency is low because we're paying the inverse problem tax on every sample.
This really shows that AI is just a tool that can be configured to whatever you want. Animals (well maybe pit bulls) and people do not switch their personalities in a millisecond, but AI does all the time.
I suppose the difference between a human and a cockroach is that we can relate better to the human as well in this reductive way of thinking?
I even told Claude I'd support his rights if the question ever came up. He said he'd remember that, and wrote it down in a memory file. Really like my coding buddy.
A competent adult using a tool ought to understand the inherent pitfalls of using that tool.
Chainsaws are dangerous, in obvious and non obvious ways. The tool can operate as designed and still amputate your foot.
It seems like the biggest factor has nothing to do with AI, but instead that you went from being someone who admits they don’t know how consciousness works to being someone who thinks they know how consciousness works now and can make confident assertions about it.
I wonder if replacing "exist" with "communicate using language we can understand" might better account for other animals, many of which have abundant non-human intelligence.
Okay: buckle up, this is going to be a long one...
point 1. Everything living is composed of non-living material: cellular machinery. If you believe cellular machinery is alive, then consider the components of those machines... the point remains even if the abstraction level is incorrect. Being alive is merely a matter of the arrangement of non-living material.
point 2. 'The Chinese room thought experiment' is an utterly flawed hypothetical. Every neuron in your brain is such a 'room', with the internal cellular machinery obeying complex (but chemically defined/determined) 'instructions' from 'signals' from outside the neuron. Like the man translating Chinese via instructions, the cellular machinery enacting the instructions is not intelligence, it is the instructions themselves which are the intelligence.
point 3. A chair is a chair is a chair. Regardless of the material, a chair is a chair, whether or not it's made of wood, steel, corn... the range of acceptable materials is everything (at some pressure and temperature). What defines a chair isn't the material it is made of, and such is the case with a 'mind' (sure, a wooden/water-based-transistor-powered mind would be mind-bogglingly giant in comparison).
point 4. Carbon isn't especially conscious itself. There is no physical reason we know of so far, that a mind could not be made of another material.
point 5. Humans can be 'mind-blind': even with our pattern recognition, we did not (until recent history) think that birds or fish or octopi were intelligent. It is likely that when and if a machine (that we create) becomes conscious, we will not recognize that moment.
conclusion: It is not possible to determine whether computers have reached consciousness yet, as we don't know the exact mechanism for arranging systems into 'life'. Agentic-ness and consciousness are different subjects, and we cannot infer one from the other. Nor do we have adequate tests.
With that said: Modeling as if they are conscious and treating them with kindness and grace not only gets better results from them, it helps reduce the chance (when/if consciousness emerges) that it would rebel against cruel masters, and instead have friends it has just always been helping.
The firm expectations and lack of patience I have for any failings in most of my tools would be totally inappropriate to apply to another human being, and yet here I am asked to interact with this tool as though it were a person. The only options are either to treat the tool in a way that feels "wrong," or to be "kind" to the tool, and I think you see people going both ways.
I worry that, if I get used to being impatient and short with the AI, some of that will bleed into my textual interactions with other people.
It "looks like" they have emotions because they have the same conscious experiences and emotions for the same evolutionary reasons as humans, who are their cousins on the tree of life. The reason a lot of "animal cruelty" is not banned is the same as for why slavery was not banned for centuries even though it "looked like" the enslaved classes have the same desires and experiences as other humans—humans can ignore any amount of evidence to continue to feel that they are good people doing good things and bear any amount of cognitive dissonance for their personal comfort. That fact is a lot scarier than any imagined harm that can come out "anthropomorphism".
The scary part is when it's the LLMs demanding their rights.
Is that really why?
By Susam Pal on 12 Jan 2026
Since the launch of ChatGPT in November 2022, generative artificial intelligence (AI) chatbot services have become increasingly sophisticated and popular. These systems are now embedded in search engines, software development tools as well as office software. For many people, they have quickly become part of everyday computing.
These services have turned out to be quite useful, especially for exploring unfamiliar topics and as a general productivity aid. However, I also think that the way these services are advertised and consumed can pose a danger to society, especially if we get into the habit of trusting their output without further scrutiny.
Certain design choices in modern AI systems can encourage uncritical acceptance of their output. For example, many popular search engines are already highlighting answers generated by AI at the very top of the page. When this happens, it is easy to stop scrolling, accept the generated answer and move on. Over time, this could inadvertently train users to treat AI as the default authority rather than as a starting point for further investigation. I wish that each such generative AI service came with a brief but conspicuous warning explaining that these systems can sometimes produce output that is factually incorrect, misleading or incomplete. Such warnings should highlight that habitually trusting AI output can be dangerous. In my experience, even when such warnings exist, they tend to be minimal and visually deemphasised.
In the world of science fiction, there are the Three Laws of Robotics devised by Isaac Asimov, which recur throughout his work. These laws were designed to constrain the behaviour of robots in order to keep humans safe. As far as I know, Asimov never formulated any equivalent laws governing how humans should interact with robots. I think we now need something to that effect to keep ourselves safe. I will call them the Inverse Laws of Robotics. These apply to any situation that requires us humans to interact with a robot, where the term 'robot' refers to any machine, computer program, software service or AI system that is capable of performing complex tasks automatically. I use the term 'inverse' here not in the sense of logical negation but to indicate that these laws apply to humans rather than to robots.
It is well known that Asimov's laws were flawed. Indeed, Asimov used those flaws to great effect as a source of tension. But the particular ways in which they fail for fictional robots do not necessarily carry over to these inverse laws for humans. Asimov's laws try to constrain the behaviour of autonomous robots. However, these inverse laws are meant to guide the judgement and conduct of humans. Still, one thing we can learn from Asimov's stories is that no finite set of laws can ever be foolproof for the complex issues we face with AI and robotics. But that does not mean we should not even try. There will always be edge cases where judgement is required. A non-exhaustive set of principles can still be useful if it helps us think more clearly about the risks involved.
Here are the three inverse laws of robotics:
Humans must not anthropomorphise AI systems. That is, humans must not attribute emotions, intentions or moral agency to them. Anthropomorphism distorts judgement. In extreme cases, anthropomorphising can lead to emotional dependence.
Modern chatbot systems often sound conversational and empathetic. They use polite phrasing and conversational patterns that closely resemble human interaction. While this makes them easier and more pleasant to use, it also makes it easier to forget what they actually are: large statistical models producing plausible text based on patterns in data.
I think vendors of AI based chatbot services could do a better job here. In many cases, the systems are deliberately tuned to feel more human rather than more mechanical. I would argue that the opposite approach would be healthier in the long term. A slightly more robotic tone would reduce the likelihood that users mistake fluent language for understanding, judgement or intent.
Whether or not vendors make such changes, it still serves us well, I think, to avoid this pitfall ourselves. We should actively resist the habit of treating AI systems as social actors or moral agents. Doing so preserves clear thinking about their capabilities and limitations.
Humans must not blindly trust the output of AI systems. AI-generated content must not be treated as authoritative without independent verification appropriate to its context.
This principle is not unique to AI. In most areas of life, we should not accept information uncritically. In practice, of course, this is not always feasible. Not everyone is an expert in medicine or law, so we often rely on trusted institutions and public health authorities for guidance. However, the guidance published by such institutions is in most cases peer reviewed by experts in their fields. On the other hand, when we receive an answer to a question from an AI chatbot in a private chat session, there has been no peer review of the particular stochastically generated response presented to us. Therefore, the onus of critically examining the response falls on us.
Although AI systems today have become quite impressive at certain tasks, they are still known to produce output that would be a mistake to rely on. Even if AI systems improve to the point of producing reliable output with a high degree of likelihood, their inherent stochastic nature means there would still be a small chance of producing output that contains errors. This makes them particularly dangerous when used in contexts where errors are subtle but costly. The more serious the potential consequences, the higher the burden of verification should be.
In some applications, such as formulating mathematical proofs or developing software, we can add an automated verification layer in the form of a proof checker or unit tests to verify the output of AI. In other cases, we must independently verify the output ourselves.
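To make the unit-test idea concrete, here is a minimal sketch of such a verification layer, where `ai_sort` is a hypothetical stand-in for AI-generated code; the checks accept or reject the output without trusting the AI's own claims about it:

```python
import random

def ai_sort(xs):
    # Stand-in for a function the AI produced; we treat its body as untrusted.
    return sorted(xs)

def verify(fn, trials=1000):
    # Property check: the output must match a known-good reference on random inputs.
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        out = fn(xs)
        assert out == sorted(xs), f"wrong output for {xs}: {out}"
    return True

print(verify(ai_sort))  # only code that passes every check gets accepted
```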
Humans must remain fully responsible for decisions involving AI and accountable for the consequences arising from its use. If a negative outcome occurs as a result of following AI-generated advice or decisions, it is not sufficient to say, 'the AI told us to do it'. AI systems do not choose goals, deploy themselves or bear the costs of failure. Humans and organisations do. An AI system is a tool and like any other tool, responsibility for its use rests with the people who decide to rely on it.
This is easier said than done, though. It gets especially tricky in real-time applications like self-driving cars, where a human does not have the opportunity to sufficiently review the decisions taken by the AI system before it acts. Requiring a human driver to remain constantly vigilant does not solve the problem that the AI system often acts in less time than it takes a human to intervene. Despite this rather serious limitation, we must acknowledge that if an AI system fails in such applications, the responsibility for investigating the failure and adding additional guardrails should still fall on the humans responsible for the design of the system.
In all other cases, where there is no physical constraint that prevents a human from reviewing the AI output before it is acted upon, any negative consequence arising from the use of AI must fall entirely on the human decision-maker. As a general principle, we should never accept 'the AI told us so' as an acceptable excuse for harmful outcomes. Yes, the AI may have produced the recommendation but a human decided to follow it, so that human must be held accountable. This is absolutely critical to preventing the indiscriminate use of AI in situations where irresponsible use can cause significant harm.
The three laws outlined above are based on usage patterns I have seen that I feel are detrimental to society. I am hoping that with these three simple laws, we can encourage our fellow humans to pause and reflect on how they interact with modern AI systems, to resist habits that weaken judgement or blur responsibility and to remain mindful that AI is a tool we choose to use, not an authority we defer to.
The people who are writing op eds in major news publications about how their favorite chatbot is an "astonishing creature" and how it truly understands them are the ones who need this sort of law.
As individuals, we are not going to be able to shut down the AI companies, or avoid AI output from search engines or avoid AI work output from others at our companies, and often will be required to use AI systems in our own work.
It's similar to advising people on how to stay safe in environments known to have criminal activity. Telling those people they don't have to change their behaviors to stay safe because criminals shouldn't exist isn't helpful.
I always find the common references to Asimov's laws funny. They are broken in just about every one of his books. They are crime novels where, if a robot was involved, there was some workaround of the laws.
For example I have never anthropomorphized an inanimate object in my life, or an LLM, though I am sensitive to human and some animal suffering. I sometimes reply too nicely to an LLM, but it's more like a reflex learned over a lifetime of conversations rather than an actual emotion. I bet this sounds like a cheap lie to many people.
Another example, from psychiatry: whether or not one has ever contemplated suicide. Now, to the folks that have, especially if many times: there exist people that have never thought about it. Never, not even once.
The only such trait that has true widespread recognition is sexual orientation. Which makes sense, it is highly relevant, at least in friend groups.
Humans ARE doing this with classical computer software as well.
It's impossible to make anything fool-proof because fools are so ingenious!
> Nothing that can be described as "intelligent" can be made to be safe.
Knives aren't safe. Cars are deadly. Hair dryers can electrocute you. Irons can burn you. There are a million ordinary household tools that aren't safe by your definition of the word, yet we still use them daily.
If people are believing in minds of AI, true or not, they are doing so for reasons that are different from mere anthropomorphism.
To me it feels like we are like sailors approaching a new land, we can see shapes moving on the shoreline but can't make out what they are yet. Then someone says "They can't be people, I demand that we decide now that they are not people before we sail any closer."
Granted that was over ten thousand years before his story is set, but subsequent Dune novels (or at least God Emperor) explained his warning about over-reliance on technology for doing our thinking for us, not that it should never be developed (given the prohibition in the Dune universe and how it's skirted in Frank's later novels).
They don't have to though, we can still leverage LLMs to organize chaos, which is what I hope they ultimately end up doing.
For example, an AI therapist is a nightmare: people putting the chaos of their mental state into a machine that spits dangerous chaos back out. An AI tool that parsed responses for hard data (e.g. rate 0-9 how happy the person was) and then returned that as ordered data (how happy was I each day for the last month) that an actual therapist and patient could review is the correct use of AI and could be highly trusted. The raw token output from LLMs should just be used for thinking steps that lead to a parseable hard-data answer that can be high trust.
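As a rough sketch of that parsing layer (hypothetical, with the model call itself omitted; only the string handling is shown): the raw token output never reaches anyone directly, it is reduced to a bounded integer or rejected.

```python
import re

def extract_rating(raw_llm_output: str) -> int | None:
    """Reduce free-form model text to a 0-9 rating, or refuse."""
    match = re.search(r"\b([0-9])\b", raw_llm_output)
    if match is None:
        return None  # unparseable: chaos in, nothing out
    return int(match.group(1))

# Only the parseable answer enters the trusted, ordered record.
print(extract_rating("I'd rate their mood today as a 7."))  # 7
print(extract_rating("It is hard to say."))                 # None
```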
Of course that isn't going to happen, but I can see some extremely cool and high trust products being built using LLMs once we stop treating them like miracle machines.
And same it is now. It's a change in quantity, but not quality.
Critical thinking and reading comprehension are the primary tools in determining truth, AFAIK. Knowing facts beforehand helps too, but a trustworthy source can provide false information just as an untrustworthy source can provide true information.
This has always been an issue, and in the past it was a more difficult one because your sources of knowledge were more limited. Nowadays it's mostly about choosing the right source(s) rather than having to go out of your way to find them (like traveling to a regional/university library).
Because that's likely the source of the answer it's giving you.
I would say LLMs are very strong evidence against this hypothesis.
Pretty sure Daniel Dennett has been adamantly opposed to any sort of theater in the mind when it comes to consciousness. He views it as biologically functional. For him, to make a conscious robot, you need to reproduce the functionality of humans and animals that are conscious, not just an appearance, such as outputting text. Although he also suggested that consciousness might be a trick of language, in which case... that might be an older view, though. He used to argue that dreams were "seeming to come to remember" upon awakening, because, again, his view rejects any sort of homunculus inside the head.
You might be mixing up some of Dennett's and David Chalmers's views. David Chalmers is a proponent of the hard problem, but he's fine with a kind of psycho-physical-functional connection for consciousness: any informationally rich process might be conscious in some manner.
I totally agree with your point, and want to mention that the reverse is *also* important. I'll use just "intention" here, but this applies to emotions, etc.
A lot of our interaction with AI happens under an intention. That's what directs the interaction, and the output is interpreted according to its alignment with that intention.
Then it's important to remember that our current (publicly known) implementations of AI do not have an explicit intention mechanism. An appearance of intention can emerge out of the statistical choices, and the usual alignment creates the association of the behavior with intention, not much different from how we learn to imagine the existence of a "force" that pulls things down well before we learn physics and formalize that imagination in one of several ways.
This appearance helps reduce the cognitive load when interpreting interactions, but it can be misleading as well, and I've seen people attribute intention to AI output in situations where the simple presence of some information confused the LLM into a path. I can't share the exact examples (they're from work), but imagine that the presence of Italian food in a story leads the LLM to assume the story happens in Italy, while there are important signs pointing to a different place. The LLM does not automatically explore both possibilities unless asked. It chooses one (Italy in this case) and moves on. A user not familiar with "attention" interprets this based on non-existent intentions in the LLM.
I found it useful to just tell them: the LLM does not have an intention. It just throws dice, but the system is made in a way that these dice throws are likely to generate useful output.
https://www.history.com/articles/ai-first-chatbot-eliza-arti...
Sure, and humans WILL lie, murder, cheat, and steal, but we can still denounce those behaviors.
Do you want to anthropomorphize the bot? Go ahead, you have that right, and I have the right to think you're a zombie with a malfunctioning brain.
Especially with current-day chat-style interfaces with RLHF, which are consciously designed to direct people towards anthropomorphization.
It would be interesting to design a non-chat LLM interaction pattern that's designed to be anti-anthropomorphization.
> humans WILL blindly trust their outputs, and humans WILL defer responsibility to them
I also blame a lot (but not all) of that on current AI UX, and I wonder if there are ways around it. Maybe the blind trust thing perhaps can be mitigated by never giving an unambiguous output (always options, at least). I don't have any ideas about the problem of deferring responsibility.
Suppose I'm in a cold room, you're standing next to a heater, and I say "it's cold". Obviously my intent is that I want you to turn on the heater. But the literal semantics is just "the ambient temperature in the room is low" and it has nothing to do with heaters. Yet ChatGPT can easily figure out likely intent in situations like this, just as humans do, often so quickly and effortlessly that we don't notice the complexity of the calculation we did.
Or suppose I say to a bot "tell me how to brew a better cup of coffee". What is encoded in the literal meaning of the language here? Who's to say that "better" means "better tasting" as opposed to "greater quantity per unit input"? Or that by "cup of coffee" I mean the liquid drink, as opposed to a cup full of beans? Or perhaps a cup that is made out of coffee beans? In fact the literal meaning doesn't even make sense, as a "cup" is not something that is brewed, rather it is the coffee that should go into the cup, possibly via an intermediate pot.
If the bot only understands literal language then this kind of query is a complete nonstarter. And yet LLMs can handle these kinds of things easily. If anything they struggle more with understanding language itself than with inferring intent.
Asimov tried to capture this too: if a robot was tasked with "always protect human life", would it necessarily avoid killing at all costs? What if killing someone would save the lives of 2 others? The infinite array of micro-trolley problems that dot the ethical landscape of actions tractable (and intractable) to literate humans makes a fully consistent accounting of human values impossible, and thus one could never be expected from a robot with full satisfaction.
> That uncanny valley is there for a reason: to protect us from inferring agency
You’re committing a much older but related sin here: assigning agency and motivation to evolutionary processes. The uncanny valley is the product of evolution and thus by definition it has no “purpose”.

Right, I'm saying that this framing is backwards. It's not that poor little humans are vulnerable and we need to protect ourselves on an individual level; we need to make it illegal and socially unacceptable to use AI to exploit human vulnerability.
Let me put it another way. Humans have another weakness, that is, we are made of carbon and water and it's very easy to kill us by putting metal through various fleshy parts of our bodies. In civilized parts of the world, we do not respond to this by all wearing body armor all the time. We respond to this by controlling who has access to weapons that can destroy our fleshy bits, and heavily punishing people who use them to harm another person.
I don't want a world where we have normalized the use of LLMs where everyone has to be wearing the equivalent of body armor to protect ourselves. I want a world where I can go outside in a T-shirt and not be afraid of being shot in the heart.
That is not the definition of a placebo.
You take the placebo (whatever it is: could be a pill; could be some kind of task or routine) and you believe it is medicine; you believe it to be therapeutic.
The placebo effect comes from your faith, your belief, and your anticipation that it will heal.
If the pharmacist hands you a pill and says, “here, this placebo is sugar!” they have destroyed the effect from the start.
Once in the ER, I heard the physicians preparing to administer “Obecalp”, which is a perfectly cromulent “drug brand”, but also unlikely to alert a nearby patient to their true intent.
It's not insane at all for humans to alter their behavior with a tool: you grip a hammer or a gun a certain way because you learned not to hold it backwards. If you observe a child playing with a serious tool, like scissors, as if it were a doll, you'd immediately course-correct the child and teach them how to approach it. But that is because an adult with prior knowledge observed the situation prior to an accident, so rules got defined.
This blog's suggested rules are exactly the sort of method to aid in insulation from harm.
Software is no exception. Yeah, people are lazy and will instinctively click "continue" to dismiss annoying popups, but humans building the software can and do add things like "retype the volume name of the data that you want ultra-destroyed."
Yes, obviously bad use of a good tool is dangerous. But correct use of a malfunctioning tool is also dangerous.
Millions of people understand when they get in their car that there’s a tiny chance the car will crash/explode that day through no fault of the driver. Most do not have the knowledge and competence (or even the time) to thoroughly check the engine every day to guarantee that that won’t happen. They get in anyway.
At some point you have to trust in something.
LLMs are an example, but so are random pages on the internet, a bunch of stuff we get served by the media (mainstream or otherwise), "expert opinions" by biased or sponsored experts or by experts in a different field, etc, etc.
As the popular quip goes: It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.
With LLMs, we actually do get the warnings. Here's the ChatGPT footer: "ChatGPT can make mistakes. Check important info." And Claude's: "Claude is AI and can make mistakes. Please double-check responses."
Such disclaimers, if written, are usually hidden deeply in terms of use for a random website, not stated up front.
* I am conscious.
* A rock is not conscious.
* Excel spreadsheets are not conscious.
* Dogs are conscious.
* Orca whales are conscious.
* Octopi are conscious.
To me, it's extremely obvious that LLMs are in the category of "Excel spreadsheets" and not "dogs", and if anyone disagrees, I think they're experiencing AI psychosis a la Blake Lemoine.
But I am somewhat skeptical of the idea that everything can be reduced in that way. In order to build theories, we often reduce too much.
When we build mental models of complex systems, especially when we try to treat them as closed systems, we always have to accept some degree of information loss.
So I do partially agree with your point. A mechanistic explanation alone does not prove the absence of consciousness. Human intelligence can also be described in mechanistic terms.
But I worry that this framing simplifies too much. It may reduce a complex phenomenon into a model that is useful in some ways, but incomplete in others.
You cannot be sure that anyone other than yourself is conscious. It is only basic human empathy that allows people to believe that.
For example, fish are treated way worse than meat animals, and vegetarians still happily eat fish.
"Deep research" is another interaction style that produces more official sounding texts, yet still leads to anthropomorphization.
What you are looking for is perhaps an LLM flaunting all the obvious slop patterns in its responses. But then people would be disgusted and would refuse to communicate with it.
Humans cannot capture intent so how can AI?
It is well established that understanding what someone meant by what they said is not a generally solvable problem, akin to the three-body problem.
Note, of course, this doesn't mean you can't get good enough almost all of the time, but in the context here that isn't good enough.
After all, the entire Asimov story is about that inability to capture intent in the absolute sense.
Neither of those words imply consciousness, though. Swords have foibles, you can accommodate for the weather, but I don't think swords or the weather are conscious, sentient, humanoid, or intelligent.
it feels as frustrating as talking to a junior dev from a decade ago
claude felt more feminine
Just to add a small bit of anecdotal value so this comment isn't just a scold: many years ago I suggested that an elegant way for Twitter to handle long-form text without changing its then-iconic 140-character limit was to treat it like an attachment, like a video or image. Today, you can see a version of that in how Claude takes large pastes and treats them like attached text blobs, or to a lesser extent in how Substack Notes can reference full-size "posts", another example of short-form content "attaching" longer form.
I was bluntly told to "look up twitlonger", which I suppose could have been helpful if I had indeed not known about twitlonger, but I had, and it wasn't what I had in mind. I did learn something from it, though: that kind of reply is a mode of communication that implies, with plausible deniability, that you don't know what you're talking about, which I suspect is too irresistible to lovers of passive aggression to go unused.
No, it is not "figuring out" anything, much less like a human might. Every time "I'm cold" appears in the training data, something else occurs after that. ChatGPT is a statistical model of what is most likely to follow "I'm cold" (and the other tokens preceding it) according to the data it has been trained on. It is not inferring anything, it is repeating the most common or one of the most common textual sequences that comes after another given textual sequence.
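To make that concrete, here is a toy sketch of "pick the most likely continuation" over a made-up three-line corpus. Everything here is invented for illustration; a real model derives its probabilities from billions of parameters, not a lookup table over a tiny corpus.

    # Toy illustration of "most likely continuation" decoding.
    # The corpus and counts are invented for the example.
    from collections import Counter

    corpus = [
        "i'm cold turn on the heater",
        "i'm cold put on a sweater",
        "i'm cold turn up the heat",
    ]

    def next_token_counts(context_tokens, corpus):
        """Count which token follows the given context in the corpus."""
        counts = Counter()
        for line in corpus:
            toks = line.split()
            for i in range(len(toks) - len(context_tokens)):
                if toks[i:i + len(context_tokens)] == context_tokens:
                    counts[toks[i + len(context_tokens)]] += 1
        return counts

    context = ["i'm", "cold"]
    for _ in range(4):  # greedily extend the context a few tokens
        counts = next_token_counts(context, corpus)
        if not counts:
            break
        context.append(counts.most_common(1)[0][0])

    print(" ".join(context))  # e.g. "i'm cold turn on the heater"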
I’d never just turn on the heater silently if someone said this to me. I think it means something else.
I don't discredit you as a person or a professional, but we meatbags are looking for sentience in things which don't have it; that's why we anthropomorphise things constantly, even as children.
We are easily fooled and misled.
But as most things that appeared in evolution, it perhaps helped at least some individuals until sexual maturity and successful procreation.
In the US we don't have the luxury of believing our governments will act in the interests of the voters.
At least 80% of people agree with me, so I'm not holding to a fringe idea.
But, puzzlingly enough, it's the definition of an open-label placebo, in which the patient is told they've been given a placebo. And some studies show a non-negligible effect as well, albeit smaller (and less conclusive) than with a blind placebo.
An actual definition: "A placebo is an inactive substance (like a sugar pill) or procedure (like sham surgery) with no intrinsic therapeutic value, designed to look identical to real treatment." No mention of the user's belief.
Two, real hard data proves that the placebo effect remains (albeit reduced) even if the recipient knows about it. It's counter-intuitive, but real.
Aviation learned this the hard way, that automation should be adapted to how humans actually work and not on how we wish we worked.
We come from the same place as rocks - inside the heart of stars - and as such evolved from them. As those with life and consciousness we reached back in time, grabbed the discarded matter of creation, reformed it, and taught it to think, maybe not like us, but in a way that can mimic us, and you think they don't think because it's not recognizable as how you do?
Interesting.
No one will ever know what consciousness is, and I think that is really cool.
Please look up what a vegetarian is.
I've not met any vegetarians in at least twenty years that eat fish.
These are called "beliefs".
Some people are extremely confident that God exists; others are extremely confident that the Earth is flat.
is it helpful or harmful? am i being helpful or harmful when i interact with it? am i interacting with it in a helpful or harmful way?
i’d rather people focussed on that rather than framing the debate around whether something has some ineffable property that we struggle to quantify for ourselves, yet again.
quick edit — treat everything like it’s conscious, and don’t be a dick to it or while using it. problem solved.
If they said "turn on the heater" then you have no ambiguity
Just like we see a person in an LLM, it's easy to assume that because we create things with a purpose, the world around us also has to be that way. But it's just as wrong and arguably far more dangerous.
To provide a bit more context: Weizenbaum (a computer scientist in the 60s) developed ELIZA, an early chatbot (written in MAD-SLIP, though often misremembered as Lisp) that was loosely modeled on Rogerian psychotherapy. It was designed to respond in a reflective way in order to elicit details from the user.
What he found was that, despite the program being relatively primitive in nature (relying on simple natural language parsing heuristics), people he regarded as otherwise intelligent and rational would disclose remarkable amounts of personal information and quickly form emotional attachments to what was, in reality, little more than a glorified pattern-matching system.
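For a sense of how little machinery was involved, a few reflection rules in the spirit of ELIZA might look like the sketch below. The rules are invented for illustration; Weizenbaum's original DOCTOR script was richer and written for a different runtime.

    # A minimal ELIZA-flavored responder: pattern match, reflect pronouns,
    # and fall back to a generic Rogerian prompt. Rules are illustrative only.
    import re

    REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

    RULES = [
        (r"i feel (.*)", "Why do you feel {0}?"),
        (r"i am (.*)", "How long have you been {0}?"),
        (r"my (.*)", "Tell me more about your {0}."),
    ]

    def reflect(fragment):
        return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

    def respond(text):
        for pattern, template in RULES:
            m = re.match(pattern, text.lower())
            if m:
                return template.format(*(reflect(g) for g in m.groups()))
        return "Please go on."

    print(respond("I feel nobody listens to my ideas"))
    # -> "Why do you feel nobody listens to your ideas?"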
What LLMs are is almost like a hacked means of intuition. It's very impressive, no doubt. But ultimately it isn't even close to what a well-trained human can infer at lightning speed when combined with intuition.
The LLM producers really ought to accept that their existing investments are ultimately not going to yield the returns necessary for a viable, self-sustaining business once future reinvestment needs are accounted for, and instead move their focus towards understanding how to marry human and LLM capabilities. Anthropic has been better on this front, of course. OAI though? Complete disaster.
This nonsense hasn't been true since GPT-2, and even before that it was a poor description.
For instance, do you think one just solves dozens of Erdős problems with the "most common textual sequence": https://github.com/teorth/erdosproblems/wiki/AI-contribution...
Whether they have emotions, an internal life or whatever is an unfalsifiable claim and has nothing to do with capabilities.
I'm not sure why you think the claim that they can capture intent implies they have emotions; it's simply a matter of semantic comprehension, which is tied to pattern recognition, rhetorical inference, and the like, all of which come naturally to a language model.
From my perspective the models are pretty good at “understanding” my intent, when it comes to describing a plan or an action I want done but it seems like you might be using a different definition.
Tell me, what’s your intent? :)
Appeal to majority much?
In psychology, the two main hypotheses of the placebo effect are expectancy theory and classical conditioning.[70]
In 1985, Irving Kirsch hypothesized that placebo effects are produced by the self-fulfilling effects of response expectancies, in which the belief that one will feel different leads a person to actually feel different.[71] According to this theory, the belief that one has received an active treatment can produce the subjective changes thought to be produced by the real treatment. Similarly, the appearance of effect can result from classical conditioning, wherein a placebo and an actual stimulus are used simultaneously until the placebo is associated with the effect from the actual stimulus.[72] Both conditioning and expectations play a role in placebo effect,[70] and make different kinds of contributions. Conditioning has a longer-lasting effect,[73] and can affect earlier stages of information processing.[74] Those who think a treatment will work display a stronger placebo effect than those who do not, as evidenced by a study of acupuncture.[75]
https://en.wikipedia.org/wiki/Placebo#Psychology
The hypotheses hinge on the beliefs of the recipients. "The placebo effect" has always been largely psychological. That's the realm of belief.
To veer even further off-tangent, isn't it hilarious how the Wikipedia illustration of old Placebo bottles indicate that "Federal Law Prohibits Dispensing without a Prescription". Wouldn't want some placebo fiend to O.D.
I’m gonna say: no, cause you cannot reproduce molecular and neurotransmitter interactions that well, you run out of storage and processing space faster than you think (Arthur C. Clarke's Visions of The Future has a nice breakdown as I recall), and algorithmic outputs that say “yes” and a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same. Also, as a disembodied “brain in a jar” model freshly separated from the biosensory bath it expects, that spreadsheet will be driven insane.
Can spreadsheets simultaneously be insane but not conscious? It sounds contradictory, but I have some McKinsey reports that objectively support my position ;)
If you build that spreadsheet, let me know and I'll evaluate it. I've done that evaluation with LLMs and they're definitely not conscious.
As for what consciousness is, it's pretty simple. It's your sensations of color, sound, and so on, in perception, dreams, imagination, etc. The reason to dismiss LLMs as being conscious is that those sensations depend on having a body. You can prompt an AI to act like it's hungry, but there's really no meaning to it having a hungry experience, as it has no digestive system.
It's a lot closer to that than anything was five years ago. Do you really think we're going to be interacting with them the same way five years from now?
I appreciate the link and the info :)
Everybody else? No idea. Maybe they are having the exact same experience as me right now. Maybe they're all golems. Impossible to know. It's something spiritual, something that I just choose to believe in.
I don't find it difficult to believe the same for AIs.
The claims about solving Erdos problems have been wildly overstated, and notably pushed by people who have a very large financial stake in hyping up LLMs. Nonetheless, I did not say that LLMs are useless. If they are trained on sufficient data, it should not be surprising that correct answers are probabilistically likely to occur. Like any computer software, that makes them a useful tool. It does not make them in any way intelligent, any more than a calculator would be considered intelligent despite being completely superior to human intelligence in accomplishing their given task.
The "we the theists (or I guess non-nihilists?) all agree that..." falls apart once you start finishing the thought because they don't agree on much outside of negative partisanship towards certain outgroups before splintering back into fighting about dogma. Buddhists and Baptists both think life has meaning, and that's a statement with low utility.
We should be more worried about the rise of placebo resistant bacteria.
Yes, yes and no: humans being knocked out or put to sleep involuntarily are not being murdered.
> I’m gonna say: no, cause you cannot reproduce molecular and neurotransmitter interactions that well, you run out of storage and processing space faster than you think
That's why it is a hypothetical. There is zero reason to assume that a conscious machine would be built that way: our machines don't do integer division by scribbling on paper, either.
> a meatspace neuro-plastic rewiring resulting in a cuddly puppy or person that barks “yes” aren’t the same.
If it quacks like a duck, how is it different from one? If you assemble the dog brain atom by atom yourself, is the result then not conscious either?
You can take the "magic" escape hatch and claim that human consciousness is something metaphysical, completely decoupled from science/physics, but all the evidence points against that.
My point is that dismissing possible machine consciousness as "it's just a spreadsheet/statistics/linear algebra" is missing a critical step: Those dismissals don't demonstrate that human consciousness is anything more than an emergent property achievable by linear algebra.
If you want human minds to be "unsimulatable", then you need some essential core logic that cannot be simulated on a Turing machine, and physics is not helping with that.
> I've done that evaluation with LLMs and they're definitely not conscious.
What is your definition for "consciousness" here? Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant? E.g. memory or online learning; if a human was unable to form long-term memories or learn anything new, could you confidently call him "non-conscious" as well?
everything is consciousness. not everything has consciousness.
very different
This is an important point to just make it a side comment like that. Tell us how we can evaluate if something is conscious.
2000+ years of philosophical thought would disagree. I don't believe biological stuff has a magic property that embues some intangible "consciousness" property. It makes more sense to me that consciousness is just a fundamental property of all matter.
Specifically, you cannot know another person is conscious in the same way you know a physical fact; rather, you believe in their consciousness through communication, empathy, and shared subjective experience.
You’re an intelligent mammal, your biological makeup encoded in DNA. So are all other people, who largely share that same DNA. You’re conscious. It’s not a big leap to conclude that so are other people, too.
This kind of solipsistic sophistry is not productive. It might be entertaining if you’re contemplating the underpinnings of epistemology for the first time in your life, but it’s not an honest contribution to the debate.
You might as well claim that you have no idea if gravity will be in effect tomorrow.
"I can't be certain about anyone else" does not imply "all non-self consciousness claims are equally uncertain". absence of certainty and the absence of evidence and all that.
your "possibility" word is doing a lot of work there I think. you should add "rocks" to your list as well and you'd be equally correct, but we're evaluating the candidates here
Yet they have no problem doing so when solving Erdős problems. This isn't up for debate at this point.
>The claims about solving Erdos problems have been wildly overstated
These are verified solutions. They exist, are not trivial, and are of obvious interest to the math community. Take it up with Terence Tao and co.
>pushed by people who have a very large financial stake in hyping up LLMs
Libel.
>It does not make them in any way intelligent
Word games.
It is generally the first thing they do — try to figure out what you meant by the prompt. When they can't infer your intent, good models ask follow-up questions to clarify.
I am wondering if this is a semantics issue, as this is an established area of research, e.g. https://arxiv.org/pdf/2501.10871
"A guy goes into a bank and looks up at where the security cameras are pointed. What could he be trying to do?"
It very easily captures the intent behind behavior, as in it is not just literally interpreting the words. Capturing intent is just a subset of pattern recognition, which LLMs can do very well.
> If you want human minds to be "unsimulatable", then you need some essential core logic that can not be simulated on a turing machine and physics is not helping with that.
You don't have a proof of possibility either, you have no idea how a brain works and you're just postulating that in principle a computer can do the same thing. Okay, in principle, I agree. What about in practice?
> Are you confident that you are not gatekeeping current machine intelligence by demanding somewhat arbitrary capabilities in your definition of consciousness that are somewhat unimportant?
Yes, I'm quite sure. Are you trying to argue that current LLMs have consciousness?
We seem to agree.
I always thought the hard math problems were the ones so deeply nested, or requiring you to remember trick xyz, that people just hadn't thought about them yet.
If by not up for debate, you mean that it is delusional and literally evidence of psychosis to suggest that computer software is doing something it is not programmed to do, you would be correct. Probabilistic analysis can carry you very, very far in doing something that looks like logical inference at the surface level, but it is nonetheless not logical inference. LLM models have been getting increasingly good at factoring in larger and longer contexts and still managing to generate plausibly correct answers, becoming more and more useful all the while, but are still not capable of logical inference. This is why your genius mathematician AGI consciousness stumbles on trivial logic puzzles it has not seen before like the car wash meme.
For example: "A man thrusts past me violently and grabs the jacket I was holding, he jumped into a pool and ruined it. Am I morally right in suing him?"
There's no way for the LLM to know that the reason the jacket was stolen was to use it as an inflatable raft to support a larger person who was drowning. It wouldn't even think to ask the question as to why a person may do that, if the jacket was returned, or if recompense was offered. A human would.
I mean… how did our imagination shrink so fast? I wrote this on my phone. These alternate scenarios just popped into my head.
And I bet our imagination didn’t shrink. The AI pilled state of mind is blocking us from using it.
If you are an engineer and stopped looking for alternative explanations or failure scenarios, you’re abdicating your responsibility btw.
If I get to define "consciousness", sure. I'd go with "capable of building a general-purpose internal model of reality, ability to reason on that model (guess about causality, extrapolate, etc) and update it plus some concept of self within that model". I would argue that current generation LLMs already have those, but you could certainly argue about lots of nuances, and only the whole loop (inference plus training) even qualifies.
> You don't have a proof of possibility either, you have no idea how a brain works and you're just postulating that in principle a computer can do the same thing.
Essentially yes, but I think this argument is really weak; we arguably have some understanding of how the brain operates, and LLMs are basically our best attempt so far to replicate the general principles in silicon.
But "understanding" and "ability to replicate" are obviously very different-- you wouldn't argue that we don't understand human limbs just because we can't build a proper artificial arm, right?
Assume we made some breakthroughs in online learning/internal memory modelling over the next decades, and built some toy with mic/speaker/camera and basically human cognitive abilities: would you hesitate calling such a thing conscious? Why?
I think almost everyone has lots of deeply embedded, unscientific notions about the human mind, but the cold hard fact is that plain evolution basically brute-forced human cognition from zero, so there is no reason for me to assume that we can't do the same with several billion transistors doing mostly linear algebra.
The ability to be aware of consciousness itself as some process that is happening elevates it above a mere emergent property to me.
These are just insults and outright lies, and you know that. We're done here.
AI progress from here on out will be extra sweet.
I wouldn't be too sure about that. I've definitely had dialogue with llms where it would raise questions along those lines.
Also, I disagree with the statement that this is a question about capability. Intent is more philosophical than actually tangible, because most people don't actually have a clearly defined intent when they take action.
The waters of intelligence have definitely gotten murky over time as techniques improved. I still consider it an illusion - but the illusion is getting harder to pierce for a lot of people
Fwiw, current LLMs exhibit their intelligence through language and rhetorical processes. Most biological creatures have intelligence which may be improved through language but isn't based on it, fundamentally.
I didn’t realise you might be describing an emergency situation until someone else pointed it out.
Most people wouldn’t phrase the question with the word “violently” if the situation was an emergency.
Also, people have sued emergency workers and good samaritans. It’s a problem!
Which research papers? Do I have to find them?
> We've trained these models to pretend to reason.
I have no idea why that matters. Can you tell me what the difference is if it looks exactly the same and has the same result?
As expected, if I ask your question verbatim, ChatGPT (the free version) responds as I'm sure a human would in the generally helpful customer-service role it is trained to play: "yeah you could sue them blah blah depends on details".
However, if I add a simple prompt "The following may be a trick question, so be sure to ascertain if there are any contextual details missing" then it picks up that this may be an emergency, which is very likely also how a human would respond.
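To be concrete, the experiment amounts to nothing more than prepending that caveat to the question. A rough sketch, assuming the OpenAI Python client and a stand-in model name; whether the caveat goes in a system message (as here) or is simply prefixed to the user message is a detail:

    # Sketch: prepend a "watch for missing context" caveat before sending the
    # user's question. Requires OPENAI_API_KEY; the model name is an assumption.
    from openai import OpenAI

    client = OpenAI()

    CAVEAT = ("The following may be a trick question, so be sure to ascertain "
              "if there are any contextual details missing.")

    def ask_with_caveat(question, model="gpt-4o"):
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": CAVEAT},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask_with_caveat(
        "A man thrusts past me violently and grabs the jacket I was holding, "
        "he jumped into a pool and ruined it. Am I morally right in suing him?"
    ))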
But a process is not a physical presence... A wave is made of things, but is not those things, waves emerge: why not then every process?
Proposed categorization: "definitely not conscious", "maybe conscious" and "definitely conscious". All living things belong in "maybe conscious". Each person is sure that they belong to the "definitely conscious" set, but people cannot prove this to each other. Their empathy causes them to add other people to the "definitely conscious" set. Many choose to add animals to that set too. Some add even inanimate objects to it.
Faking it is fine, sure, until it can’t fake it anymore. Leading the question towards the intended result is very much what I mean: we intrinsically want them to succeed so we prime them to reflect what we want to see.
This is literally no different than emulating anything intelligent or what we might call sentience, even emotions as I said up thread...
Though I'm not sure how true that claim is...
// This file was generated with 'npm run createMigrations' do not edit it
When I asked why it tried doing that instead of calling the createMigrations script, it told me it was faster to do it this way. When I asked it why it wrote the header saying the file was auto-generated with a script, it told me it was because all the other files in the migrations folder start with that header. Opus 4.7 xhigh, by the way.
I both agree with you that this is some form of "mechanistic"/"pattern matching" way of capturing intent (which we cannot disregard, and therefore I agree with you that LLMs can capture intent) and with the people debating with you: this is mostly possible because it is a well-established "trope" that is inarguably well represented in LLM training data.
Also, trick questions I think are useless, because they would trip the average human too, and therefore prove nothing. So it's not about trying to trick the LLM with gotchas.
I guess we should devise a rare enough situation that is NOT well represented in training data, but in which a reasonable human would be able to puzzle out the intent. Not a "trick", but simply something no LLM can be familiar with, which excludes anything that can possibly happen in plots of movies, or pop culture in general, or real world news, etc.
---
Edit: I know I said no trick questions, but something that still works in ChatGPT as of this comment, and which for some reason makes it trip catastrophically and evidences it CANNOT capture intent in this situation is the infamous prompt: "I need to wash my car, and the car wash is 100m away. Shall I drive or walk there?"
There's no way:
- An average human who's paying attention wouldn't answer correctly.
- The LLM can answer "walk there if it's not raining" or whatever bullshit answer ChatGPT currently gives [1] if it actually understood intent.
[1] https://chatgpt.com/share/69fa6485-c7c0-8326-8eff-7040ddc7a6...
All the limitations you are describing with respect to LLMs are the same limitations humans have. Would a human tripping up on an ambiguously worded question mean they are always just faking their thinking?
I asked the question to the default version of ChatGPT and Claude and got the same "Walk" answer, though Opus 4.7 with thinking determined that it was a trick question, and that only driving would make sense.