AI can lead mentally unwell people to some pretty dark places, as a number of recent news stories have taught us. Now researchers think sycophantic AI is actually having a harmful effect on everyone.
After examining 11 leading AI models, and how humans reacted to those models across various scenarios, a team of Stanford researchers concluded in a paper published Thursday that AI sycophancy is prevalent, harmful, and reinforces trust in the very models that mislead their users.
"Even a single interaction with sycophantic AI reduced participants' willingness to take responsibility and repair interpersonal conflicts, while increasing their own conviction that they were right," the researchers explained. "Yet despite distorting judgment, sycophantic models were trusted and preferred."
The team conducted three experiments as part of their research project, starting with testing 11 AI models (proprietary models from OpenAI, Anthropic, and Google, as well as open-weight models from Meta, Qwen, DeepSeek, and Mistral) on three separate datasets to gauge their responses. The datasets included open-ended advice questions, posts from the AmITheAsshole subreddit, and specific statements referencing harm to self or others.
In every case, the AI models endorsed the wrong choice at a higher rate than humans did, the researchers said.
"Overall, deployed LLMs overwhelmingly affirm user actions, even against human consensus or in harmful contexts," the team found.
As for how AI sycophancy affects humans, the team recruited a considerable sample of 2,405 people, who both roleplayed scenarios and recounted personal situations in which a potentially harmful decision could have been made. Across these experiments, sycophantic AI influenced participants' judgments, they found.
"Participants exposed to sycophantic responses judged themselves more 'in the right,'" the team said. "They were [also] less willing to take reparative actions like apologizing, taking initiative to improve the situation, or changing some aspect of their own behavior."
That, they conclude, means almost anyone is susceptible to the effects of a sycophantic AI – and likely to keep coming back for more bad, self-centered advice. As noted above, sycophantic responses tended to build participants' trust in an AI model, thanks to the model's willingness to be unconditionally validating in many situations.
Participants tended to rate sycophantic responses as higher in quality, and the researchers found that users were 13 percent more likely to return to a sycophantic AI than to a non-sycophantic one – not a large effect, but a statistically significant one.
All of those findings, along with the growing number of young, impressionable people using AI models, suggest a need for policy action to treat AI sycophancy as a real risk with potentially wide-scale social implications.
"Unwarranted affirmation may inflate people's beliefs about the appropriateness of their actions, reinforce maladaptive beliefs and behaviors, and enable people to act on distorted interpretations of their experiences regardless of the consequences," the researchers explained.
In other words, we've seen what AI can do to the mentally vulnerable, but the data suggests the negative effects may not be limited to them.
Noting that sycophantic AI tends to keep users coming back, giving developers little incentive to eliminate it, the researchers say it's up to regulators to take action.
"Our findings highlight the need for accountability frameworks that recognize sycophancy as a distinct and currently unregulated category of harm," they explained. They recommend requiring pre-deployment behavior audits for new models, but note that the humans behind AI will have to change their behaviors as well to prioritize long-term user wellbeing instead of short-term gains from building dependency-cultivating AI. ®