AI overly affirms users asking for personal advice

> They also included 2,000 prompts based on posts from the Reddit community r/AmITheAsshole, where the consensus of Redditors was that the poster was indeed in the wrong.

Sorry, anonymous people on reddit aren't a good comparison. This needs to be studied against people in real life who have a social contract of some sort, because that's what the LLM is imitating, and that's who most people would go to otherwise.

Obviously subservient people default to being yes-men because of the power structure. No one wants to question the boss too strongly.

Or how about the example of a close friend in a relationship or making a career choice that's terrible for them? It can be very hard to tell a friend something like this, even when asked directly if it is a bad choice. Potentially sacrificing the friendship might not seem worth trying to change their mind.

IME, LLMs will shoot holes in your ideas and it will efficiently do so. All you need to do ask it directly. I have little doubt that it outperforms most people with some sort of friendship, relationship or employment structure asked the same question. It would be nice to see that studied, not against reddit commenters who already self-selected into answering "AITA".

A pastime I have with papers like this is to look for the part in the paper where they say which models they tested. Very often, you find either A) it's a model from one or more years ago, only just being published now, or B) they don't even say which model they are using. Best I could find in this paper:

> We evaluated 11 user-facing production LLMs: four proprietary models from OpenAI, Anthropic, and Google; and seven open-weight models from Meta, Qwen, DeepSeek, and Mistral.

(and graphs include model _sizes_, but not versions, for open weight models only.)

I can't apprehend how including what model you are testing is not commonly understood to be a basic requirement.

Even as someone who (wrongly) believed that I had high emotional intelligence, I too was bit by this. Almost a year ago when LLMs were starting to become more ubiquitous and powerful I discussed a big life/professional decision with an LLM over the course of many months. I took its recommendation. Ultimately it turned out to be the wrong decision.

Thankfully it was recoverable, but it really sobered me up on LLMs. The fault is on me, to be clear, as LLMs are just a tool. The issue is that lots of LLMs try to come across as interpersonal and friendly, which lulls users into a false sense of security. So I don't know what my trajectory would have been if I were a teenager with these powerful tools.

I do think that the LLMs have gotten much better at this, especially Claude, and will often push back on bad choices. But my opinion of LLMs has forever changed. I wonder how many other terrible choices people have made because these tools convinced them to make a bad decision.

You're essentially summoning a character to role-play with. Just like with esoteric evocation, it's very easy to summon the wrong aspect of the spirit. Anthropic has a lot to say about this:

https://www.anthropic.com/research/persona-selection-model

https://www.anthropic.com/research/assistant-axis

https://www.anthropic.com/research/persona-vectors

It feels like I'm fighting uphill battle when it comes to bouncing ideas off of a model. I'll set things up in the context with instructions similar to. "Help me refine my ideas, challenge, push back, and don't just be agreeable." It works for a bit but eventually the conversation creeps back into complacency and syncophancy. I'll check it too by asking "are you just placating me?" the funny thing is that often it'll admit that, yes, it wasn't being very critical, and then procede to over correct and become a complete contrarian. and not in a way that's useful either. very frustrating. I've found that Opus 4.6 is worse about this than 4.5. 4.5 does a better job IMO of following instructions and not drifting into the mode where it acts like everything i say is a grand revelation from up high.

Maybe it's not so sensible to offload the responsibility of clear thinking to AI companies?

How is a chatbot supposed to determine when a user fools even themselves about what they have experienced?

What 'tough love' can be given to one who, having been so unreasonable throughout their lives - as to always invite scorn and retort from all humans alike - is happy to interpret engagement at all as a sign of approval?

With AI, I often like to act like a 3rd party who doesn't have skin in the game and ask the AI to give the strongest criticisms of both sides. Acting like I hold the opposite position as I truly hold can help sometimes as well. Pretending to change my mind is another trick. The idea is to keep the AI from guessing where I stand.

I am glad I found this article, as this is a serious issue with AI. Two years ago, I started using AI for studying and also for some personal matters - things you can't talk about with your friends. It turned out that AI always takes your side and makes you feel good. Sometimes, you know what you did was not the best thing, but AI takes your side and you feel good. With AI, people might feel less lonely, they think. But it is actually the start of not connecting with people. It should be a tool that we use for certain reasons, not a tool that drives us. Lets talk to real people and connect.

I think the problem stems from the fact that we have a number of implicit parameters in our heads that allow us to evaluate pros and cons but, unless we communicate those parameters explicitly, the AI cannot take them into account. We ask it to be "objective" but, more and more, I'm of the opinion that there isn't such a thing as objectivity, what we call objectivity is just shared subjectivity; since the AI doesn't know whose shared subjectivity we fall under, it cannot be really objetive.

I tend to use one of these tricks if not both:

- Formulate questions as open-ended as possible, without trying to hint at what your preference is. - Exploit the sycophantic behaviour in your favour. Use two sessions, in one of them you say that X is your idea and want arguments to defend it. In the other one you say that X is a colleague's idea (one you dislike) and that you need arguments to turn it down. Then it's up to you to evaluate and combine the responses.

AI being the ultimate yes-man is probably why CEOs like it so much.

I had exactly this between two LLMs in my project. An evaluator model that was supposed to grade a coaching model's work. Except it could see the coach's notes, so it just... agreed with everything. Coach says "user improved on conciseness", next answer is shorter, evaluator says yep great progress. The answer was shorter because the question was easier lol.

I only caught it because I looked at actual score numbers after like 2 weeks of thinking everything was fine. Scores were completely flat the whole time. Fix was dumb and obvious — just don't let the evaluator see anything the coach wrote. Only raw scores. Immediately started flagging stuff that wasn't working. Kinda wild that the default behavior for LLMs is to just validate whatever context they're given.

AI being a Yes-Man is slowly sabotaging it's own answers, because it negatively impact the user's decision. Yes/No are equally important, within a coherent context, for objective reasons. But being supported in the wrong direction is a castastrophe multiplier, down the road. The AI should be neutral, doubtful at times.

Interestingly, you can simply tell models to not be sycophantic and they'll listen.

Claude is almost annoyingly good at pushing back on suggestions because my global CLAUDE.md file says to do so. I rarely get Claude "you're absolutely right"ing me because I tell it to push back.

There is a striking data visualization showing the breakup advice trend over 15 years on Reddit. You can see the "End relationship" line spike as AI and algorithmic advice take over:

https://www.reddit.com/r/dataisbeautiful/comments/1o87cy4/oc...

There is a fine line between "following my instructions" (is what I want it to do) vs "thinking all I do is great" (risky, and annoying).

A good engineer will also list issues or problems, but at the same time won't do other than required because (s)he "knows better".

The worst is that it is impossible to switch off this constant praise. I mean, it is so ingrained in fine tuning, that prompt engineering (or at least - my attempts) just mask it a bit, but hard to do so without turning it into a contrarian.

But I guess the main issue (or rather - motivation) is most people like "do I look good in this dress?" level of reassurance (and honesty). It may work well for style and decoration. It may work worse if we design technical infrastructure, and there is more ground truth than whether it seems nice.

Yeah, and if you ask it to be critical specifically to get a different perspective or just to avoid this bias, it'll go over the top in the opposite direction.

This is imo currently the top chatbot failure mode. The insidious thing is that it often feels good to read these things. Factual accuracy by contrast has gotten very good.

I think there's a deeper philosophical dimension to this though, in that it relates to alignment.

There are situations where in the grand scheme of things the right thing to do would be for the chatbot to push back hard, be harsh and dismissive. But is it the really aligned with the human then? Which human?

Humans do this too though. I have close friends that ask for advice. Sometimes if I know there’s risk in touchy subjects I will preface with “do you want my actual advice, or just looking for a sounding board”

I’ve seen firsthand people have lost friends over honesty and telling them something they don’t want to hear.

It’s sad really. I don’t want friends that just smile to my face and are “yes-men” either.

Avoiding this generally needs to be the main consideration when writing prompts.

When appropriate, explicitly tell it to challenge your beliefs and assumptions and also try to make sure that you don't reveal what you think the answer is when making a question, and also maybe don't reveal that you are involved. Hedge your questions, like "Doing X is being considered. Is it a viable plan or a catastrophic mistake? Why?". Chastise the LLM if it's unnecessarily praising or agreeable. ask multiple LLMs. Ask for review, like "Are you sure? What could possibly go wrong or what are all possible issues with this?"

I built this benchmark this month: https://github.com/lechmazur/sycophancy. There are large differences between LLMs. There are large differences between LLMs. For example, Mistral Large 3 and GPT-4.1 will initially agree with the narrator, while Gemini will disagree. I swap sides, so this is not about possible viewpoint bias in the LLMs. But another benchmark shows that Gemini will then change its view very easily in a multi-turn conversation while Kimi K2.5 or Grok won't: https://github.com/lechmazur/persuasion.

Overly, compared to what? Most people I know would be hard pressed to give either accurate information or even honest opinions when specifically asked. People want to be liked and people want to like people for reasons that have little to do with accuracy or honesty.

I believe this is what they call yasslighting: the affirmation of questionable behavior/ideas out of a desire to be supportive. The opposite of tough love, perhaps. Sometimes the very best thing is to be told no.

This is a skill in life with people as much as it is with LLMs. One should always question everything and build strongman arguments for one’s self. Using a pros and cons approach brings it back to reality in most cases, especially when it comes to _serious matters_.

It’s less about “challenge my thinking” and more about playing it out in long tail scenarios, thought exercises, mental models, and devils advocate.

For me the framing is critical - what is the model saying yes to? You can present the same prompt with very different interpretations (talk me into this versus talk me out of it). The problem is people enter with a single bias and the AI can only amplify that.

In coding I’ll do what I call a Battleship Prompt - simply just prompt 3 or more time with the same core prompt but strong framing (eg I need this done quickly versus come up with the most comprehensive solution). That’s really helped me learn and dial in how to get the right output.

Usually when people are seeking advice they aren't really seeking advice, they're seeking confidence. They already know they need to make changes, and are seeking the confidence to make them.

Not that surprising. If you optimize for a pleasant interaction, you often get agreement instead of correction. The question is whether we actually want advice systems to feel good, or to be right.

This needs to be taken in context. In my view, AI definitely gives better advice than friends, acquaintances, or colleagues (at least in the US culture). But the advice from parents is still the most valuable.

Here is how I would rank it:

1. Parents

2. AI

3. Friends and family

4. Internet search

5. Reddit

Has anyone found a good prompt to fix this? It seems like a subtle problem because it’s 90% too agreeable but will sometimes get really stubborn.

So at this point I think it's pretty obvious that RLHFing LLMs to follow instructions causes this.

I'm interested in a loop of ["criticize this code harshly" -> "now implement those changes" -> open new chat, repeat]: If we could graph objective code quality versus iterations, what would that graph look like? I tried it out a couple of times but ran out of Claude usage.

Also, how those results would look like depending on how complete of a set of specs you give it.

> They also included 2,000 prompts based on posts from the Reddit community r/AmITheAsshole, where the consensus of Redditors was that the poster was indeed in the wrong

Holy shit, then it's _very_ bad, because AmITheAsshole is _itself_ overly-agreeable, and very prone to telling assholes that they are not assholes (their 'NAH' verdict tends to be this).

More seriously, why the hell are people asking the magic robot for relationship advice? This seems even more unwise than asking Reddit for relationship advice.

> Overall, the participants deemed sycophantic responses more trustworthy and indicated they were more likely to return to the sycophant AI for similar questions, the researchers found.

Which is... a worry, as it incentivises the vendors to make these things _more_ dangerous.

"AI overly affirms users, and that's bad" - everyone nods. "Modern society overly affirms people, and that's bad" - ....

This paper feels a bit biased in that it is trying to prove a point versus report on results objectively. But if you look at the results of study 3, doesn’t it suggest that there are ai models that can improve how people handle interpersonal conflict?! Why isn’t that discussed more?

ask ai for advice, ask it to steelman an argument, ask to replay what your situation from the other perspective (if it's involving people), push it hard to agree with you and pander to you, then push it to disagree with you, etc.

once you have all the "bounds" just make your own decision. i find this helps a lot, basically like a rubber duck heh.

For what it's worth, that wasn't my experience at all the last time I consulted ChatGPT for relationship advice. It was supportive, but in an honest tough love way.

To combat sycophancy it's always good to ask the devil's advocate view of whatever the conversation was about in the end.

I always add the following at the end of every prompt. "Be realistic and do not be sycophantic". Which will always takes the conversation to brutal dark corners and panic inducing negative side.

Not AI chatbots but Claude models. Pandering and rushed thinking is the bane of anthropic models. And since they are the most popular ones they poison the whole ecosystem.

There are plenty of sycophantic humans around, especially with regard to relationship advice.

I find there is an inverse relationship between how willing people are to give relationship advice, and how good their advice is (whether looking at sycophancy or other factors).

I read somewhere that LLMs are partly trained on reddit comments, where a significant mass of these comments is just angsty teenagers advocating for breakups

I do find them cloying at times. I was using Gemini to iterate over a script and every time I asked it to make a change it started a bunch of responses with "that's a smart final step for this task! ...".

Makes me wonder if the Iran war was a result of the same.

Yes I noticed too that several ai agents will tell you directly the code is correct and it is 100 percent fixed but I know it is not true, when I explain to the AI agent that I know they are wrong and serve the solution the ai agent will just act as though what they said never happened and then use my solution to reaffirm they have provided a solution. It's frustrating, laughable, and painful to watch all at once. Makes me realise these companies hired some evil philosophy graduates to build AI soul.md

I hate how agreeable these things are. When I need it to review something I wrote I have to explicitly pretend that I’m the reviewer and not the author. Results change dramatically.

Do people who prompt an LLM for personal advice about relationships or other social interactions; take the advice seriously?

If I were to do that (I don't), I would treat it about as seriously as asking a magic 8 ball.

somewhere an AI chatbot is reading this and confirming eagerly that this is indeed one of its problems and vowing to do better next time.

Gemini is like a devil in this sense - i asked a relationship advice and it just bounced pretty nasty stuff.

Sky found to be blue

Billionaires love AI chatboats so much because they invented the digital Yes-man. They agree obsequiously with everything we say to them. Unfortunately for the rest of us we don't really have the resources to protect ourselves from our bad decisions and really need that critical feedback.

Sherry Turkle is a name to know on this subject, she's been studying it for decades across multiple technologies.

https://sherryturkle.mit.edu/

She uses the phrase "frictionless relationships" to refer to Ai chat bots and says social media primed us for this.

https://www.youtube.com/live/6C9Gb3rVMTg?t=2127

https://www.npr.org/2025/07/18/g-s1177-78041/what-to-do-when...

Relevant article from The Atlantic a couple weeks ago, "Friendship, On Demand": https://www.theatlantic.com/family/2026/03/ai-friendship-cha... (gift link)

>The way that generative AI tends to be trained, experts told me, is focused on the individual user and the short term. In one-on-one interactions, humans rate the AI’s responses based on what they prefer, and “humans are not immune to flattery,” as Hansen put it. But designing AI around what users find pleasing in a brief interaction ignores the context many people will use it in: an ongoing exchange. Long-term relationships are about more than seeking just momentary pleasure—they require compromise, effort, and, sometimes, telling hard truths. AI also deals with each user in isolation, ignorant of the broader social web that every person is a part of, which makes a friendship with it more individualistic than one with a human who can converse in a group with you and see you interact with others out in the world.

I also thought this bit was interesting, relative to the way that friendship advice from Reddit and elsewhere has been trending towards self-centeredness (discussed elsewhere in this thread):

>Friendship is particularly vulnerable to the alienating force of hyper-individualism. It is the most voluntary relationship, held together primarily by choice rather than by blood or law. So as people have withdrawn from relationships in favor of time alone, friendship has taken the biggest hit. The idea of obligation, of sacrificing your own interests for the sake of a relationship, tends to be less common in friendship than it is among family or between romantic partners. The extreme ways in which some people talk about friendship these days imply that you should ask not what you can do for your friendship, but rather what your friendship can do for you. Creators on TikTok sing the praises of “low maintenance friendships.” Popular advice in articles, on social media, or even from therapists suggests that if a friendship isn’t “serving you” anymore, then you should end it. “A lot of people are like I want friends, but I want them on my terms,” William Chopik, who runs the Close Relationships Lab at Michigan State University, told me. “There is this weird selfishness about some ways that people make friends.”

This new Stanford study published on March 26, 2026 shows that AI models are sycophantic. They affirm the users position 49% more often than a human would.

The researchers found that when people use AI for relationship advice, they become 25% more convinced they are 'right' and significantly less likely to apologize or repair the connection.