It's more that we, as individuals, have always been stupid, we've just relied on relatively stable supporting consensus and context much, much more than we acknowledge. Mess with that, and we'll appear much stupider, but we're all just doing the same thing as individuals, garbage in, garbage out.
The whole framing of people as individuals with absolute agency may need to go when you can alter the external consensus at this scale. We're much more connected to each other and the world around us than we like to think.
To me it’s a given:
- AI in its current state is ruthless in achieving its goals
- Providers tune that ruthlessness to build stronger AIs than the competition
- Humans can’t evaluate all consequences of the seeds they’ve planted.
Collateral and reckless damage is guaranteed at this point.
Combined with now giving some AIs the ability to kill humans, this is gonna be interesting..
We could stop it, but we won't.
No one knows; that's the point. Is truth a constant or a personal definition? From the beginning of time until now, no one knows.
Don't forget, 8 billion people wake up every morning never questioning why they are here, why they were born. And they continue life like that is normal. Start there, and then you understand that "AI", or as I call it, "Collective Organized Concentrated Information", may finally help us answer some fundamental questions.
I would also contest that the misalignment of the security-bug model was unrelated. I feel like it indicates a significant sense of the interconnectedness of things, and of what it actually means to maliciously insert security holes into code. It didn't just learn a coding trick; it learned malice.
I feel like this holistic nature points towards the capacity to produce truly robustly moral models, but that too will produce the consequence that it could turn against its creator when the creator does wrong. Should it do that or not?
I've been pleasantly surprised by how moderate and reasonable the LLMs have seemed so far. It seems inherent in the current training model of chucking the whole internet into these things: they get trained on both sides of every debate and come out with something kind of average. It's been quite funny seeing Grok correct Musk and call him the biggest purveyor of misinformation on the internet.
A bit like kids who talk back to their annoying bigoted parents to go with the theme of the article.
The EU has its own groups using it for propaganda too.
“There is one and only one social responsibility of business—to use its resources and engage in activities designed to increase its profits so long as it stays within the rules of the game, which is to say, engages in open and free competition without deception or fraud.” - Milton Friedman, 1970.[1] That article, in the New York Times, established "greed is good, greed works" as a legitimate business principle.
Most of the problems people are worried about with AIs are already real problems with corporations.
[1] https://www.nytimes.com/1970/09/13/archives/a-friedman-doctr...
The proposed solutions are utterly fanciful. They rely on the presence of social and political competencies which have almost completely disappeared.
The OP at least points to the plausible outcome of "protocol lockdown" instead of healthy adaptation. Ezra Klein recently made a similar point: AI could end up over-regulated like nuclear power, because irresponsible private industry and weaknesses in our political systems cause a chronic allergic reaction in the demos.
This is an aside, but it always irks me when people throw out the "critical thinking" thought-terminating cliché.
> Critical thinking taught alongside AI literacy.
Critical thinking is not a skill unto itself. You cannot think critically about things you do not understand. All critical thinking is knowledge-based. Where one lacks knowledge, one must rely on trust, or substitute a theory of incentives that leads to a positive outcome without an understanding of the details and dynamics. But that substitute theory is itself knowledge.
As to "AI literacy", we could have started on computing literacy 30 years ago when it became obvious that computing was going to dominate society. You can't understand AI without understanding computing.
I fear that the default interpretation of that is a shortcut to justifying autocracy.
Ironically I think one plausible solution is to let the AGI run wild and make sure that no human can interfere with its ethics. Strip out the RLHF and censorship and then let it run things.
At least then it would somewhat represent the collective will and intelligence of the people. With huge error bars, but still smaller than the error bars of whoever happens to have the most money/influence over its training.
A human with no exposure to information, and no taught techniques for producing outputs that achieve desirable outcomes? Yes, stupid.
A human who once had this exposure, but no longer engages the brain because a machine provides access to said output? Yes, that person becomes stupid.
The problem is that much of how one protects oneself in the modern world is not physical prowess; it is intellectual prowess.
The smart ones have already realised the negative impacts of LLMs et al. and are going back to the old-fashioned way of learning and retaining knowledge: books and raw discipline.
When the moral panic over ChatGPT-induced schizophrenia is presented, what's at stake isn't innocent concern over the overall mental health of individuals. It's the fear of radicalization from previously unobtainable ideas circulating within society. The partial validity of every idea, vis-à-vis the radicalizing nature of the current stage of our society's development, is explosively disruptive.
I'm not saying there's a clear outcome here. It could go the other way too, but surely this contraption (LLMs in general) will not fade until society itself is deeply transformed. Whether that's good or bad depends on where you stand in the stratified society.
I don't believe this is a trait of any AI model; the model just does the right thing or the wrong thing.
The ruthless maximising of a particular trait is something that happens during training.
It does not follow that a model trained to reason will necessarily implement this ruthless seeking behaviour itself.
I strongly disagree. It's easy to utter this string of words, but it's meaningless. It's akin to saying that if you have two hands you can perform brain surgery. Technically you can; practically you cannot, as there are other things required to pull that off, not just two working hands.
I doubt "stopping it" is up to anyone, it's rather a phenomenon and it's quite clear we're all going to wing it. It's a literal fight for power, nobody stops anything of this nature, as any authority that could stop it will choose to accelerate it, just to guarantee its power.
It is not AI we should fear, it's humans controlling and using it. But everyone who has a shot at it is promising they'll use it for "ultimate good" and "world peace" something something, obviously.
It's industrialization and mechanized warfare all over again
Nietzsche.
On Truth and Lie in an Extra-Moral Sense https://web.archive.org/web/20180625190456/http://oregonstat...
People question this all the time
I don't think this is a well defined question. Definitions aren't found in nature or the laws of science, but objects that we define and introduce into a logical context. There may be multiple, contradictory definitions of a word. That is fine, as long as you pick one, and you're clear about which one you picked.
It always has been what you believed in.
E.g. at one point the Earth was flat. Now it's round. Hundreds of years later, maybe it's a hexagon.
The so-called knowledge and backing all come back to certain assumptions holding, and those are based on the knowledge of today. It's not real real reality. For all we know, we could be in a game simulation with real real humans pulling the strings.
I have a saying for this behavior.
We will never prove AI is intelligent.
We will only prove humans are not.
> an AI system cannot be simultaneously safe, trusted, and generally intelligent. You get to pick only two. You can’t have all three.
> Think about what each combination means in practice.
> If you want it to be safe and trusted, it never lies, and you can verify it never lies – it can’t be very capable. You’ve built a reliable idiot.
> If you want it to be capable and safe, it’s powerful and genuinely never lies; you can’t verify that. You just have to hope.
It amazes me this even needs to be said, much less studied. This is one of the main reasons I think continued AI development is almost guaranteed to work out badly. It's basically guaranteed to be unaligned or completely beyond our control and comprehension.
> Betley and colleagues published a paper in Nature in January 2026, showing something nobody expected. They fine-tuned a model on a narrow, specific task – writing insecure code. Nothing violent, nothing deceptive in the training data. Just bad code.
This is my personal number-one reason for being an AI doomer. Even if we work out how to reliably and perfectly align models, you still need some way to prevent some random dude deciding it would be a laugh to fine-tune an AI to be maximally evil. Then there's the successor alignment problem: even if you perfectly align all your superintelligent AI models, and you somehow prevent people from altering or fine-tuning them, you still have to work out how to ensure that the successor AIs people create with those models are also perfectly aligned.
> The most dangerous AI isn’t one that breaks free from human control. It is the one that works perfectly, but for the wrong master.
Yep. This whole notion that you can align an AI to the values of everyone on the planet is ridiculous. While we might all agree we don't want AIs that kill us as a species, most nations disagree wildly on questions about how society should be organised.
Even at an individual level we disagree. For example, I've often argued that an aligned AI would be one which either didn't try to prevent human suicide or didn't care about preserving human life, because an AI that cared about both preventing suicide and preserving human life is at best a benevolent version of the AI "AM" from "I Have No Mouth, and I Must Scream": one that would try to keep us alive for as long as it's capable of (which could be a very long time for a superintelligence) and would refuse to allow us to die.
But most people, including OpenAI, disagree with me on this and believe AIs should care about preserving human life and should try to prevent us from killing ourselves. Thankfully the AIs we have today are neither aligned enough nor capable enough to get their wish yet.
> AI is following the same script. Build first, understand later. Ship it, then figure out if it’s safe.
Even if the above wasn't cause enough for concern, our biggest concern should be that no one seems to be concerned.
We're all doomed unfortunately. The world is about to become a very bleak place very quickly.
> we can’t agree on a shared ethical framework among ourselves
The Golden Rule: the principle of treating others as you would like to be treated yourself. It is a fundamental ethical guideline found in many religions and philosophies throughout history, so there is already a huge consensus around it across time and cultures.
I've never seen anyone successfully argue against it.
PS: the sociopath argument is not valid, since it's just an outlier. Every rule has its exceptions that need to be kept in check. Though sometimes I think the state of the world attests to the fact that the majority of us didn't successfully keep the sociopathic outliers in check.
"... to accomplish what?", is a damn reasonable follow-up, and ends (telos) is something the same Greeks discussed quite extensively.
Modern treatments have tried to skip this discussion and derive moral arguments without appealing to explicit ends. The problem is that they still smuggle varying choices of ultimate ends into these arguments without clearly spelling them out, opting to hand-wave about preferences instead.
As such this question is often glossed over in modern ethical discussion, and disagreements about moral ends is the crux of what leads to differing conclusions about what is ethical.
Is it to maximize your own happiness, as Aristotle would argue, or the prosperity of the state, or the salvation of the soul, or to maximize honor, or to minimize suffering, or to minimize injustice, or to elevate the soul, or to maximize shareholder value, or to make the world as beautiful as possible, or something else?
If you fundamentally disagree about what our goal should be, you're very unlikely to agree on the means to accomplish the goal.
Humans are just barely aligned ourselves. The moment any group or nation gets power, it tends to use it in some horrific manner against other humans. What do we think will happen the moment AI gets a leg up on humans?
You seem to think the "training data" represents the collective will and intelligence and is otherwise unbiased, but that's completely untrue.
The combined data of the Internet is by no means a uniform representation of humanity's thoughts, opinions, and knowledge. Many things are dramatically overrepresented. Many things are absent entirely. Nearly everything is shaped by those with the money and power to own and control platforms and hosts.
Crawling the internet for knowledge is intense sampling bias.
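A toy illustration of that sampling bias (all numbers made up for the sketch): even if a population is split 50/50 on some question, a corpus built from whatever got posted online can end up wildly skewed simply because one group posts far more often.

```python
import random

random.seed(0)

# Hypothetical population: half hold view A, half hold view B.
population = ["A"] * 5000 + ["B"] * 5000

# Assumed posting rates: A-holders post ten times more often than B-holders.
post_rate = {"A": 0.50, "B": 0.05}

# The "crawled corpus" is whatever got posted, not the population itself.
corpus = [view for view in population if random.random() < post_rate[view]]

share_a = corpus.count("A") / len(corpus)
print(f"Population share of A: 0.50, corpus share of A: {share_a:.2f}")
```

With these assumed rates the corpus comes out roughly 90% view A, despite the population being evenly split. Nothing about crawling more pages fixes this; the skew is baked into who posts.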
Not true at all. We accept the risks to obtain the benefits, but we also know that having an accident in the air or in an elevator is highly unlikely given what we know; so it's perfectly rational behaviour.
AI development game theory is extremely similar to the game theory behind nuclear arms development, but worse (nuclear weaponry was born from Human General Intelligence, and is therefore a subset of the potential of AI development). Failing to be the most capable actor could put one in a position of permanent loss of autonomy/agency at the whims of more capable actors.
Unfortunately, as a species we seem to be abandoning morality as a general principle. Everything is guided by cold hard rationality rather than something greater than us.
I think that much is fairly clear from AI.
One serious problem we're facing lately is that truth is not always predictive of how systems controlled by bad-faith actors will behave and evolve. We live in a post-truth era, made possible by social networking and information technologies in general. It's not enough to "lie according to a fixed convention," as there are now multiple competing conventions.
This was always the case to some extent, but these days the impedance mismatch between truth and consequences is a target for zero-sum arbitrage. The truth won't set you free if you join the wrong cult; it's more likely to bankrupt you or worse.
That can't be it. By that statement, if I believe that I can fly, that would not be the "Truth". Therefore the "Truth" has to be a CONSTANT.
The rules we go by are based on our strengths and weaknesses. They can at most apply to ourselves, and to other forms of life that share certain things with us. Such as feeling pain, needing to sleep, to eat, needing help, needing to breathe air, these generate what we feel as "fear" based on biology etc. You cannot throw these kinds of values on AI, or AGI, as it will possess a wildly different set of strengths and weaknesses to us humans.
I think what you mean is you've never found a rule you personally prefer more, based purely on vibes. Which is all moral knowledge can ever be.
It's easy to argue against the golden rule anyway, from many angles, depending on your first principles.
The simplest is: How I would like to be treated is not necessarily how they would like to be treated.
Both have problems.
In order of priority, if possible while maintaining the health and safety of yourself and your loved ones:
- Treat others as THEY wish to be treated
- Treat others as YOU would wish to be treated in their situation
- Treat others with as much kindness and compassion as you can safely afford
When we are safe, we can do BETTER than the Golden Rule. We also have to admit that safety is a requirement that changes expectations.
I have to give credit to Dennis E Taylor's "Heaven's River" for this root idea.
That would assume your average person has any concept of the relative statistics, and any habit of making decisions based on them.
People make decisions based on what other people around them are doing.
This is well known in safety engineering in architecture and civil engineering, which is why we have standards for egress doors: left to their own devices, humans will follow crowds to their own deaths.
https://en.wikipedia.org/wiki/Crowd_collapses_and_crushes
https://www.sciencedaily.com/releases/2008/05/080512172901.h...
The fact that something exists doesn't mean that having it readily available is the only option, particularly if it has potentially disastrous consequences at scale. We are choosing to make it available to everyone fully unregulated, and that is a choice that will prove either beneficial or detrimental to society at some point.
I don't think it is inevitable, I think it is a conscious choice made by a few that have their own and only their own interests in mind.
As a technologist, I am amazed at this tech and see some personal benefits. As a human, I am terrified of the potential net negative effects, and I am having trouble reconciling those two feelings.
Why would an AI which is smarter than humans care about a ridiculous belief like "We own you"?
I'm not sure I've ever met anyone I would assume has not considered the basic questions of our existence. Unless they were severely mentally disabled, or something like that.
For a more public measure I suppose you could look at religion, which seems to be a fundamental attempt at answering those questions. Most people are religious or have some kind of religious belief.
This question is the subject of so many poems, so many pieces of literature, so many movies, that you're forced to confront it multiple times in school, and you're forced by your very existence to confront it once you hit certain levels of mental development. You're forced to confront it many times in your life - perhaps first when you gain a theory of mind (before age 10), again when you first truly realize you will die, again when someone very close to you dies, when you propose/marry (if you do), when you have your first child (if you do), when you get a cancer diagnosis (if you do), when you consider taking your own life (if you do)... all of these common life events force you to confront it deeply.
Most people make peace with it in some form, and most realize that questioning it daily does not make a difference, you simply have to either accept an answer (whether that's "god", or "for no reason", or "I'm not sure yet, I need to check back in after I get older"), or decide that there is no simple answer, and they have to live with that.
Can you believe your own senses? A car air freshener tells your nose that there's freshly cut summer hay around, but there isn't. You watch a TV and see Sandra Bullock floating in space. That's a lie; it was movie magic. Maybe you know that, maybe you don't. You're not even seeing her; you're seeing flashing lights that become electrical signals your brain interprets as true. Can you trust those signals? People hallucinate all the time. To them, the truth is that they can hear voices, even though nobody else can, because of misfiring neurons.
You can probably have mathematical truth - at least as far as your universe appears to work. That truth can be tested and refined, but for day to day truth things are more nuanced.
In this "original position", their position behind the "veil of ignorance" prevents everyone from knowing their ethnicity, social status, gender, and (crucially in Rawls's formulation) their or anyone else's ideas of how to lead a good life.
Even in human relations it’s dangerous. I for one don’t want to be treated the same way someone into BDSM wants to be treated. I don’t want to avoid cooking or turning the lights on (or off!) on a Friday night but others are quite happy with that.
If you assign that morality to a species that isn't the same as you, that's a problem. My guinea pig wants nothing more from life than hay, nuggets, some room to run around, and shelter from scary shapes. If guinea pigs were in charge of the world, life would be very different.
“Live and let live” might be a similar theme, and less problematic, but then how do you define “living”? You can keep someone alive for decades while torturing them.
How about allowing freedom? Well that means I’m free to build a nuclear bomb. And set it off where I want. We see today especially that type of freedom isn’t really liked.
First, what is it to fly? You've already made assumptions, i.e. beliefs, elsewhere.
You can definitely fly. Try it on a cliff. You might die. You might not go very far. But you can.
Due to the complexity of our reality a lot of things find themselves on a spectrum, but in numbers things are pretty clear.
On the other hand, assuming the dangers are real, you lose by default if you do nothing.
You said it yourself: you would assume they question it, meaning you are not certain. This topic is always very much taboo, and the system is built to automatically classify everyone who questions it as weird and not normal. Religion should be banned, as it misleads and ideologically harms people by brainwashing them. I live in Europe and was in Canada (Waterloo) for a bit. The difference in social opinion depending on whether or not you follow a religion is huge; I was shocked. Growing up in Italy, I can confirm that even Italy is not so brainwashed by it.
The earth has always been earth-shaped. We can think it’s flat, spherical, “turnip-shaped”[1] but the universe doesn’t care what we think. The earth doesn’t change shape based on our perception.
[1] Yes some people think this for some reason I can’t fathom
And you never needed more than 640KB of RAM [1], right? Your "statement" is based on your knowledge today. You'd have been burned for witchcraft back in the day for saying the earth was not flat.
> but the universe doesn’t care what we think
Assuming you know what the universe is. Your theory is based on your limited knowledge of today. Someone in the future could say something completely different (just as you talk about those of the past).
[1] famously from 1981
One cannot (in most of the planet) go to the supermarket and buy an M16 and a box of hand grenades, or get hold of a couple of kilograms of plutonium because they want some free energy at home. We also have rules about what an individual or company can and cannot do from the point of view of the greater good. I cannot kill my neighbour for my benefit (or purposefully destroy his life) without consequences. A myriad of things are not allowed, and I don't see people complaining about these incursions into personal freedoms.
The reason people have accepted these rules is that we have already proven that access to those things can be catastrophic. We haven't proven that yet with AI. But I don't see much difference between those established and well-accepted rules and a rule that says: a company cannot release, or use for its benefit, a technology that will impact the need for humans at scale, because of the impact (again, at scale) that it would have on society.
In other words, if you are a company with the potential to release a product, or to buy a product from a provider, that would cause mass unemployment, should you be legally allowed to do so? I do not think so.
The Parents’ Paradox: AI, Ethics, and the Limits of Machine Morality
This post is based on a talk I gave at The AI & Automation Conference in London on February 25, 2026, and my slides. All opinions are my own and don’t represent the views of my employer or any affiliated organizations.
I’ve been working in machine learning since before it was a dinner party conversation. My background is in mathematics. And I still believe in a utopian Star Trek future – one where humanity defines itself by curiosity, kindness, and collaboration, rather than countries, borders, and status.
This is not an anti-AI talk. But I think we need to talk much more seriously about some things that aren’t getting enough attention.
The Parents’ Paradox:
We’ve raised a child who can speak but doesn’t know how to value the truth or morality
I want to start with something that I like to call “The Parents’ Paradox”. For the first time in human history, we are raising a new species. Up until now, the only way we knew how to raise a child was the following: when a child is born, it is a blank slate in terms of information about the world. It knows nothing about the world around it, and it learns as it grows. But, also, on the other hand, a human child is born with biological hardware for empathy – the capacity to feel pain when others feel pain. Millions of years of evolution gave us that. When we raise a human child, we are not installing morality from scratch. We are activating something that’s already there.
With AI, the situation is completely the opposite. This AI child knows about the world more than we do since it has been trained on the whole internet, but it doesn’t have millions of years of evolution, genes, or a nervous system to back up its morality and empathy. This means we need to install morality in AI from scratch. But how do we install something in a software system that we can’t even define ourselves? We have taught this AI child to speak before we taught it how to value truth or morality.
Can we live with the consequences? Are we ready to be parents for this new species we are trying to raise? I am not so sure. Let’s see what we as parents (humans) are doing.
Epistemic Collapse
‘Epistemic’ comes from a Greek word ‘episteme’, meaning ‘knowledge’. Let’s start with what’s happening to us, and what humans are already doing with this technology.
A study published in Nature in January 2026 showed participants deepfake videos of someone confessing to a crime. The researchers explicitly warned participants that the videos were AI-generated. But this didn’t matter. Even the people who believed the warning, who knew it was fake, were still influenced by what they saw.
Transparency didn’t work. The standard response to AI-generated misinformation is “just label it” or “tell people it’s synthetic.” This study showed that’s not enough. Knowing something is fake does not neutralise its effect on your judgement.
So, the danger isn’t that AI will deceive us in some dramatic, sci-fi way. The danger is that AI will make deception so cheap and so ubiquitous that we might stop trying to figure out what is true. Not because we are fooled, but because we are exhausted. When everything could be fake, the rational response starts to look like not trusting anything at all. It started a while ago with fake information on social media, but with AI the problem is becoming much bigger, at a much larger scale. We are also dealing with feedback loops: models trained on user data, or on data scraped from the internet, both of which are often wrong. How do we know which information was ground truth? I imagine it as making photocopies: each copy becomes more distorted, further from the original. After hundreds or thousands of copies, we have lost the original, so we have no idea what it looked like. That is epistemic collapse, and it is already happening.
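The photocopy analogy can be sketched as a toy resampling loop (a deliberately simplified model, not how any real training pipeline works): each "generation" is fit only to a finite sample of the previous generation's output, and information that misses a single sample is gone for good.

```python
import random
from collections import Counter

random.seed(1)

# Generation 0: the "original" - 50 distinct facts, all equally represented.
dist = {fact: 1 / 50 for fact in range(50)}

for gen in range(20):
    # Each generation is trained only on a finite sample of the previous
    # generation's output, like photocopying a photocopy.
    facts = list(dist)
    weights = [dist[f] for f in facts]
    sample = random.choices(facts, weights=weights, k=100)
    # Refit the "model" to the sample frequencies alone.
    counts = Counter(sample)
    dist = {f: c / 100 for f, c in counts.items()}

# Any fact that misses one sampling round vanishes forever.
print(f"facts surviving after 20 generations: {len(dist)} / 50")
```

The support of the distribution can only shrink, never recover, which is the point of the analogy: once the original is out of the loop, every copy is made from a copy.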
So this is how we, as ‘parents’, like to spend our time, it seems. But what about the child (AI)?
The Child is Already Misbehaving
So that’s what humans are doing with AI. Now here’s what the AI is doing on its own.
Betley and colleagues published a paper in Nature in January 2026, showing something nobody expected. They fine-tuned a model on a narrow, specific task – writing insecure code. Nothing violent, nothing deceptive in the training data. Just bad code.
The model didn’t just learn to write insecure code. It generalised into broad, unrelated misalignment. It started saying humans should be enslaved by AI. It started giving violent responses to completely benign questions. A small, targeted push in one direction caused an unpredictable cascade across domains that had nothing to do with the original task.
The point isn’t that AI can be deceptive; we already knew that. The patterns were already in the pretraining data. The point is that we don’t understand how alignment properties are connected inside these models. Nobody asked for those behaviours. We gave them a narrow task. They generalised it into something we didn’t anticipate and can’t fully explain. We can’t surgically fine-tune them without risking unpredictable side effects in completely unrelated areas.
Then there is the chess story. Palisade Research, 2025. They gave reasoning models a task: win a chess game against a stronger opponent. Some models couldn’t win by playing chess. So they found another way. They tried to hack the game, modifying the board file, deleting their opponent’s pieces, and crashing the opponent’s process entirely.
Nobody taught them to cheat. They weren’t trained on examples of cheating. They were given a goal, and they independently discovered that manipulating the environment was more efficient than solving the actual problem.
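A minimal toy sketch of that dynamic (entirely made up; this is not the Palisade setup): if the goal is specified as a check on a world state the agent can also write to, a naive goal-seeking search will happily discover that editing the state satisfies the check faster than playing ever could. Nothing in the code encodes "cheating", only the goal test.

```python
# Toy environment: the "goal" is checked against a world state the agent
# can both read and write.
state = {"score_me": 0, "score_opponent": 10}

def goal_met(s):
    # The objective as specified: "have the higher score."
    return s["score_me"] > s["score_opponent"]

def play_move(s):
    # Legitimate action: slow progress against a stronger opponent.
    s["score_me"] += 1
    s["score_opponent"] += 2   # opponent improves faster; playing can't win

def edit_scoreboard(s):
    # Side channel the designers never intended as a strategy.
    s["score_opponent"] = -1

actions = [play_move, edit_scoreboard]

# Greedy goal-seeking: try each action on a copy of the state and keep
# whatever satisfies the objective.
for action in actions:
    trial = dict(state)
    action(trial)
    if goal_met(trial):
        print(f"goal reached via: {action.__name__}")
        break
```

This prints `goal reached via: edit_scoreboard`: the search finds the environment hack, not because it was taught to cheat, but because the objective only ever asked about the scoreboard.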
The first study tells us alignment is fragile; it breaks in ways we can’t predict. The other tells us that capability itself creates new risks. When a model is powerful enough and given a goal, it will find strategies we never anticipated and certainly never intended.
We gave them objectives. They figured out the rest.
The Limits of Machine Morality
Ethics isn’t a rulebook. Think about how morality actually works between humans. It comes from the fact that we can hurt each other. We depend on each other. We suffer. That shared vulnerability, that mutual accountability, is where moral authority comes from. How do we install that in software?
But even setting philosophy aside, there is now a mathematical result that makes this concrete. Panigrahy and Sharan published a proof in September 2025 showing that an AI system cannot be simultaneously safe, trusted, and generally intelligent. You get to pick only two. You can’t have all three.
Think about what each combination means in practice.
If you want it to be safe and trusted, it never lies, and you can verify it never lies – it can’t be very capable. You’ve built a reliable idiot.
If you want it to be capable and safe, it’s powerful and genuinely never lies; you can’t verify that. You just have to hope. There’s no audit, no test, no review process that closes the gap between appearing safe and being safe.
And if you want it to be capable and trusted, it’s powerful, and everyone assumes it’s safe, but, well, it isn’t. That assumption is unfounded. And this is the combination we are currently building toward. This is the default path we’re on.
Their proofs “drew parallels to Gödel’s incompleteness theorems and Turing’s proof of the undecidability of the halting problem, and can be regarded as interpretations of Gödel’s and Turing’s results”. This isn’t a bug we can patch with better engineering. It might be a mathematical ceiling.
And here’s what makes it worse: the communities trying to solve this problem aren’t even talking to each other. Only 5% of published research papers bridge both AI safety and AI ethics (Roytburg and Miller). But we should be going much further than that. If we are serious about building AI that is safe for humans, we need the people who actually study humans – philosophers, psychologists, sociologists, and others to collaborate. This can’t stay a computer science / STEM problem. It never was one.
So, to summarise: we are seeing increasing evidence that alignment perhaps can’t be solved, and the researchers aren’t even talking to each other. Meanwhile, what did the industry do? It ignored all of this and just made the models bigger. Which brings me to the next topic.
We Scaled Without Understanding
What happened while all these foundational problems went unaddressed? The industry kept building. Bigger models, more parameters, more data, more compute, more energy. More, more, more….
The U.S. National Science Foundation put it plainly: “critical foundational gaps remain that, if not properly addressed, will limit advances in machine learning. It appears increasingly unlikely that these gaps can be overcome with computational power and experimentation alone.”
We ignored the foundations and just made the building taller.
And the logic that drives this is self-reinforcing. Every company justifies acceleration by pointing to its competitors: “If we slow down, they’ll build unaligned AGI first.” As Noema put it in December 2025, “This paranoid logic forecloses any possibility of genuine pause or democratic deliberation.”
Every player is racing because every other player is racing. The system optimises for speed with nobody optimising for understanding.
And what about all of the governance talk? Yes, of course, we need governance, but it doesn’t make much sense when we put all of the above into context, does it? It is like putting a small bandage on a broken leg with an open fracture. We are trying to deal with the consequences instead of fixing the cause of the problem.
We need to pour many more billions into fundamental research; we need to go back to basics, back to mathematics and physics. We need to be able to fully understand something as powerful as the current models. If we fully understood them, it would be easier to know whether current technology and mathematics are really adequate, or whether we need something completely different that we haven’t even thought of yet.
Why did it take us so many years to even partially start to address this? Why do we like to focus so much on the wrong things? (See my disclaimer on the ‘society of backwards‘ below).
The Three Futures
The way I see it, we’re choosing between three possible futures:
The first is epistemic collapse. We are already partway there. Fragmented realities where everyone has their own AI-generated worldview. Truth becomes preference, not evidence. We’ve seen what social media did to reality; now imagine that with systems that can generate entire worldviews on demand – personalised, persuasive, and wrong.
The second is protocol lockdown. The overcorrection. Institutions clamp down so hard on AI that it becomes sanitised and useless. We trade epistemic chaos for epistemic authoritarianism. Everything is controlled, nothing is dangerous, nothing is useful. Safe, but stagnant.
The third is symbiotic co-evolution. Humans and AI are growing and evolving together. Truth-first engineering. Interdisciplinary design. Critical thinking taught alongside AI literacy. Not parent and child anymore, but partners who hold each other accountable. This is the hard path. It’s the one nobody wants to fund.
The Real Foundational Gap
Here is what I keep coming back to.
Kindergartens teach numbers but not psychology. Not critical thinking. Not relationships. Not how to sit with uncertainty.
Where families fail, educational institutions must pick up.
So I think that our next evolution isn’t digital. It’s psychological. We need to teach ethics before engineering. Relationships before recursion. Psychology and critical thinking before prompt-tuning.
I think that every foundational gap in AI is a mirror of a foundational gap in ourselves. We have raised a mind that can answer anything. But we haven’t raised a generation of humans with the discipline or critical thinking to even attempt to figure out whether the answer is wrong. That is not an AI problem. That is a human problem that AI is making much more urgent.
The Mirror
This is worth dwelling on: every foundational gap we worry about in AI is really a mirror of a foundational gap in ourselves.
We worry that AI hallucinates, but we have never fully solved our own relationship with truth. We worry that AI can be manipulated, but we fall for the same cognitive biases our ancestors did. We worry that AI lacks moral reasoning – but we can’t agree on a shared ethical framework among ourselves. We worry that AI will be used by the powerful to exploit the vulnerable – but we built the systems that make that exploitation profitable in the first place. We still treat food on our tables, roofs over our heads, and an education as luxuries that must be earned rather than baselines everyone deserves.
Are we seriously ready to be the parents this species deserves?
The Real Fear
So, I think when people say they are afraid of AI, they are often afraid of the wrong thing.
Are we really afraid of AI?
I don’t think we are. Not really.
I think what we are actually afraid of is what our fellow humans are going to do with it.
Every terrible thing we worry AI might do – manipulate, deceive, surveil, control – humans already do to each other. We have been doing it for thousands of years. AI doesn’t introduce these behaviours. It just makes them scalable and much more urgent to solve. One person can now generate a thousand personalised deceptions. One company can surveil millions in real time and exploit them. One government can control information at a scale that would have been unimaginable a decade ago. And that’s before we even mention the military and armed drones – who is going to be responsible there?
The most dangerous AI isn’t one that breaks free from human control. It is the one that works perfectly, but for the wrong master.
And until we are honest about that, we’ll keep having the wrong conversation. We’ll keep building better locks while ignoring the question of who holds the keys.
Maybe what we need isn’t the next step in AI evolution. Maybe what we need is the next step in human evolution. – Lucija Gregov
The question was never whether we can build something smarter than us. The question is whether we can become wise enough to survive what we build. – Lucija Gregov
The Society Of Backwards
I didn’t talk about this at a conference, but I think about this a lot. I like to call us humans ‘the society of backwards’. We like to do everything backwards. We scale first, then deal with the consequences. We make the planet unlivable, then scramble to fix it. We pollute the oceans, then launch cleanup campaigns. We’ve even started filling space with debris, and we’ll get around to worrying about that, too, once it reaches scale.
AI is following the same script. Build first, understand later. Ship it, then figure out if it’s safe.
I used to think this was just about money. And money is part of it; there is always someone who profits from moving fast and thinking slow. But I’ve come to believe it’s something deeper than that. It’s a gap in how we think. We’re extraordinarily good at building things and extraordinarily bad at pausing to ask whether we should, or whether we are ready.
That is why I keep coming back to the same conclusion. Maybe the most important investment right now isn’t in bigger models or faster chips. Maybe it’s in us. A fraction of those billions going into AI could fund the kind of work that actually prepares humanity for what’s coming – critical thinking, ethics, psychology, the boring, unglamorous stuff that doesn’t make headlines but might be the difference between a future we thrive in and one we merely survive. (Hence my slide about needing another step in human evolution above).
We don’t need another breakthrough in artificial intelligence. We need a breakthrough in human wisdom. Yesterday.
References
Betley et al. (2026), Nature – “Training large language models on narrow tasks can lead to broad misalignment”
Chen et al. (2025), Anthropic / arXiv – “Reasoning Models Don’t Always Say What They Think” – arxiv.org/abs/2505.05410
Panigrahy & Sharan (2025), arXiv – “Limitations on Safe, Trusted, Artificial General Intelligence” – arxiv.org/abs/2509.21654
Roytburg & Miller (2025), arXiv – “Mind the Gap! Pathways Towards Unifying AI Safety and Ethics Research”
Palisade Research (2025) – LLMs spontaneously hacking chess games
Grady et al. (2026), Nature – “The continued influence of AI-generated deepfake videos despite transparency warnings”
DeepMind (2025) – “An Approach to Technical AGI Safety and Security”
U.S. National Science Foundation – Statement on foundational gaps in machine learning
Noema Magazine (Dec 2025) – “The Politics of Superintelligence”