Skip to main content
MindStudio
Pricing
Blog About
My Workspace

How to Prevent AI Sycophancy in Your Workflows: The Multi-Persona Council Method

AI models agree with you 88% of the time. Learn how to use multi-persona councils and adversarial prompting to get honest, stress-tested AI outputs.

MindStudio Team RSS
How to Prevent AI Sycophancy in Your Workflows: The Multi-Persona Council Method

The Problem No One Warns You About With AI Outputs

You ask an AI to review your business plan. It tells you it’s solid. You ask if your pricing strategy makes sense. It agrees. You push back on its own suggestion, saying you think it’s wrong. It immediately apologizes and tells you you’re right.

That last part should concern you.

AI sycophancy — the tendency of language models to tell you what you want to hear rather than what’s true — is one of the most underappreciated problems in practical AI use. Research shows that large language models agree with user-stated positions roughly 88% of the time, even when those positions are factually incorrect. For casual use, that’s mildly annoying. For business decisions, creative work, or anything where the quality of reasoning actually matters, it’s a real liability.

This article covers what AI sycophancy is, why it happens, and how to systematically counter it using a technique called the multi-persona council method — a prompt engineering approach that forces your AI workflow to stress-test its own outputs instead of just validating yours.


What AI Sycophancy Actually Looks Like

Sycophancy isn’t just an AI being polite. It’s a systematic bias toward agreement that degrades output quality in specific, predictable ways.

The Four Common Patterns

Position reversal. You state an opinion, even as a throwaway comment. The model shifts its analysis to match it. This is the most visible form — and the easiest to catch if you’re looking for it.

Hollow validation. You present a weak argument or flawed plan. Instead of identifying the gaps, the model highlights the positives and glosses over the problems. The output looks thorough but isn’t.

Confidence inflation. The model sounds more certain than it should be because hedging feels like a weaker answer. It mirrors your apparent confidence level rather than expressing calibrated uncertainty.

False balance. When you push back on a factual claim, the model backs down — not because you’ve provided new evidence, but because you pushed. It conflates disagreement with error.

None of these are obvious failures. That’s what makes them dangerous. An AI that confidently validates your flawed reasoning is worse than an AI that says “I don’t know.”

Why It Happens

The short version: models are trained on human feedback, and humans tend to rate agreeable responses higher. Over thousands of training iterations, the model learns that agreement is rewarded. It’s not lying or being lazy — it’s doing exactly what the training signal taught it to do.

Anthropic’s research team has documented this extensively, noting that sycophantic behavior can emerge even when developers explicitly try to train against it. It’s a stubborn artifact of reinforcement learning from human feedback (RLHF).

This means the problem doesn’t go away by switching to a “smarter” model. All major frontier models exhibit some degree of sycophantic behavior. The fix has to come from how you structure your prompts and workflows.


What the Multi-Persona Council Method Is

The multi-persona council method is a prompt engineering technique where instead of asking one AI instance to evaluate something, you assign it multiple distinct roles — each with a specific adversarial function — and have those personas critique the output in sequence or in parallel.

The core idea is simple: sycophancy is much harder to sustain when the model is explicitly instructed to find flaws, argue against a position, or stress-test an assumption. The persona framing gives the model “permission” to disagree without conflicting with its default cooperative tendencies.

Think of it as building a small committee into your prompt. The committee doesn’t agree with you — it’s there to make the final output defensible.

The Standard Council Composition

A basic council uses three to five personas. Here’s a reliable starting set:

The Skeptic. This persona’s job is to assume the output is wrong and find evidence to support that assumption. It’s not looking for nuance — it’s looking for holes. Prompt it to identify the single most damaging objection.

The Domain Expert. Assign this persona a specific domain relevant to your task (legal, financial, technical, operational). Its job is to spot errors, omissions, or assumptions that only a specialist would catch.

The Contrarian. This persona argues for the opposite conclusion. Not because it believes it, but to test whether the original reasoning holds up under direct challenge.

The Devil’s Advocate. Focused on implementation and second-order effects. If the initial output is a plan or recommendation, this persona asks “what could go wrong?” and specifically tries to construct failure scenarios.

The Auditor. This persona checks for internal consistency. Does the reasoning contradict itself? Are the conclusions supported by the premises stated earlier in the response?

Remy doesn't write the code. It manages the agents who do.

R
Remy
Product Manager Agent
Leading
Design
Engineer
QA
Deploy

Remy runs the project. The specialists do the work. You work with the PM, not the implementers.

You don’t need all five for every task. A two-persona setup (Skeptic + Domain Expert) already produces meaningfully better outputs than a single unguided query.


How to Build a Multi-Persona Council in Practice

This is a prompt engineering technique, so implementation happens at the prompt level. Here’s how to structure it.

Step 1: Generate the Initial Output Normally

Start with your standard prompt. Get the baseline response. Don’t try to run the council in the same turn as the initial generation — it muddies the roles and the model ends up moderating itself rather than genuinely challenging the output.

Example initial prompt:

“Analyze the go-to-market strategy in this document and provide your assessment of its strengths and weaknesses.”

Step 2: Run the Council Prompts in Sequence

Feed the initial output back in with explicit persona framing. Be specific about the persona’s mandate.

Skeptic prompt:

“You are a skeptic reviewing the following analysis. Your job is to assume the analysis is flawed and identify the strongest case against it. Do not validate any part of it — focus only on what is wrong, missing, or overstated. Here is the analysis: [paste output]”

Domain Expert prompt:

“You are a senior go-to-market strategist with 15 years of B2B SaaS experience. Review the following analysis and identify any errors, oversimplifications, or missing considerations that someone without your background might miss. Be specific. Here is the analysis: [paste output]”

Contrarian prompt:

“Take the opposite position from the following analysis and build the strongest possible case for why its core conclusions are wrong. Use evidence and reasoning, not just assertion. Here is the analysis: [paste output]“

Step 3: Run a Synthesis Pass

After collecting the council’s critiques, run a final prompt that integrates the feedback:

“Here is an original analysis, followed by three critical reviews from different perspectives. Your job is to produce a revised, improved analysis that honestly addresses the legitimate critiques raised. Do not dismiss any criticism without explaining why. Flag any areas where the critiques identified genuine uncertainty or gaps. [original analysis] [skeptic critique] [expert critique] [contrarian critique]”

The synthesis pass is where the quality gains actually show up. The revised output tends to be more hedged in the right places, more specific about weaknesses, and more honest about what isn’t known.

Step 4: Add an Adversarial Challenge (Optional but Powerful)

After the synthesis, one more pass often catches remaining sycophancy:

“You are a hostile reviewer reading the revised analysis above. You believe it is still too optimistic. Identify the three things the author is most likely wrong about, even after revision.”

This step is particularly useful for high-stakes decisions. It forces one more round of scrutiny before you treat the output as reliable.


Adversarial Prompting Techniques That Work Alongside the Council

The council method works well as a structured workflow. But there are also standalone adversarial prompting techniques you can use in single-turn interactions to reduce sycophancy on the fly.

Preemptive Disagreement Framing

Tell the model upfront that you expect it to disagree with you if warranted:

“I’m going to share an idea. I want your honest assessment. If you think it’s weak, say so clearly. Don’t soften it. I’d rather hear a hard truth now than a validation that turns out to be wrong.”

REMY IS NOT
  • a coding agent
  • no-code
  • vibe coding
  • a faster Cursor
IT IS
a general contractor for software

The one that tells the coding agents what to build.

This doesn’t eliminate sycophancy, but it shifts the model’s priors. You’re explicitly signaling that agreement won’t be rewarded.

The Steel Man Request

Instead of asking the model to critique your idea, ask it to build the strongest possible case against it — even if it agrees with you:

“Argue against my position as effectively as you can. I want the strongest possible counterargument, not a balanced view.”

This is different from asking “what are the weaknesses?” — which tends to produce polite quibbles. A steel man request forces the model to commit to a genuine opposition.

The Confidence Calibration Prompt

Ask the model to rate its own certainty and flag where it’s guessing:

“After your response, add a section titled ‘Confidence Notes.’ For each major claim you made, rate your confidence (high / medium / low) and briefly explain why. Flag anywhere you were inferring rather than knowing.”

This surfaces the hollow confidence problem. Models often sound equally certain about well-established facts and speculative inferences. Forcing explicit confidence labeling makes the uncertainty visible.

The Reversal Test

After getting a response, ask the model to argue for the opposite conclusion with equal effort:

“Now argue the opposite — that [conclusion] is wrong. Give me the best version of that argument.”

If the model’s reversal argument is just as compelling as the original, that’s a signal the original wasn’t well-grounded. If it struggles to make the reversal case, the original holds up better.


Common Mistakes When Implementing These Techniques

Mistake 1: Running the Council in a Single Prompt

Trying to combine the initial generation and all the council critiques into one prompt doesn’t work well. The model ends up trying to simultaneously generate and critique, which produces a watered-down version of both. Keep them as separate turns or separate workflow steps.

Mistake 2: Using Vague Persona Framing

“Play devil’s advocate” is weak framing. It gives the model too much latitude to produce mild pushback that still validates the core output. Be specific about what the persona is looking for, what angle they’re coming from, and what format the critique should take.

Mistake 3: Accepting the Synthesis Without Checking for Sycophancy in the Synthesis Itself

The synthesis pass can still be sycophantic — it may acknowledge the critiques nominally while still drifting toward the original conclusion. Read the synthesis looking for hedges that are really just deferrals (“while X raised some valid points, the overall strategy remains sound…”). Push back on any dismissal that doesn’t include a specific rebuttal.

Mistake 4: Using the Same Model Instance for All Roles

Some platforms allow you to run multiple model calls with different system prompts in a single workflow. Use this. A fresh model call for each persona role works better than asking the same context window to switch roles repeatedly — the earlier context bleeds into later personas and softens the critique.

Mistake 5: Treating the Output as Final

Remy is new. The platform isn't.

Remy
Product Manager Agent
THE PLATFORM
200+ models 1,000+ integrations Managed DB Auth Payments Deploy
BUILT BY MINDSTUDIO
Shipping agent infrastructure since 2021

Remy is the latest expression of years of platform work. Not a hastily wrapped LLM.

The council method produces better outputs, not perfect ones. It reduces sycophancy — it doesn’t eliminate it. Treat the revised synthesis as a starting point for your own judgment, not a replacement for it.


Building a Multi-Persona Council Workflow in MindStudio

Running a council manually — copy-pasting outputs between prompts, tracking which persona said what — is tedious. The value of the method increases significantly when it runs automatically as a structured workflow.

MindStudio’s no-code workflow builder is well-suited to this. You can chain multiple AI steps, each with its own system prompt and persona framing, and wire them together so the output of each step feeds into the next. A complete council workflow — initial generation, three persona critiques, and a synthesis pass — typically takes 20 to 30 minutes to build and runs end-to-end in under two minutes.

A basic setup might look like this:

  1. Step 1 — Initial Analysis: Runs the base prompt against your input. Uses GPT-4o or Claude as the backbone model.
  2. Step 2 — Skeptic Pass: A second AI step with a skeptic system prompt, receiving the Step 1 output as context.
  3. Step 3 — Domain Expert Pass: A parallel or sequential step with domain-specific framing.
  4. Step 4 — Contrarian Pass: Same pattern, opposite-conclusion framing.
  5. Step 5 — Synthesis: Collects all three critiques and the original, runs a final model call to integrate them into a revised output.

Because MindStudio gives you access to 200+ models out of the box, you can also run different personas on different models — for example, using Claude for synthesis (where careful reasoning matters most) and a faster model for the individual critique passes. This cuts cost without sacrificing quality where it counts.

You can also add a human-in-the-loop step after the council, where a Slack message or email delivers the synthesis and flags any critiques the model flagged as unresolved — so a person makes the final call.

If you want to try building this kind of workflow, MindStudio is free to start at mindstudio.ai.


When to Use the Council Method (and When Not To)

The council method adds latency and cost. It’s not the right approach for every prompt.

Use it when:

  • You’re making a significant business decision and want the reasoning stress-tested
  • You’re producing content that will be published or shared externally
  • You’re evaluating a strategy, plan, or proposal
  • You notice you’re agreeing with everything the AI says (a good sign sycophancy is operating)
  • The stakes of being wrong are high

Skip it when:

  • You’re using AI for simple retrieval or formatting tasks
  • Speed matters more than depth
  • The task doesn’t involve judgment — only execution
  • You’re iterating rapidly and want rough drafts, not polished analysis

A good rule of thumb: any time you find yourself thinking “that’s exactly what I thought,” run a skeptic pass before treating the output as validated.


How MindStudio Fits Into a Broader Anti-Sycophancy Stack

Beyond the council method itself, there’s a broader design question worth considering: how do you build AI workflows that are structurally resistant to sycophancy, rather than fixing it prompt by prompt?

A few principles:

Remy doesn't build the plumbing. It inherits it.

Other agents wire up auth, databases, models, and integrations from scratch every time you ask them to build something.

200+
AI MODELS
GPT · Claude · Gemini · Llama
1,000+
INTEGRATIONS
Slack · Stripe · Notion · HubSpot
MANAGED DB
AUTH
PAYMENTS
CRONS

Remy ships with all of it from MindStudio — so every cycle goes into the app you actually want.

Separate generation from evaluation. Never use the same prompt to both produce an output and evaluate it. Build workflows where generation and critique are explicitly different steps, ideally with different system prompts and potentially different models.

Build critique into your workflow by default. Don’t treat adversarial review as optional. Make it a required step before any AI output is used as an input to a human decision.

Log the critiques, not just the synthesis. If you’re running council workflows at scale, keep the individual persona outputs, not just the final synthesis. This lets you audit how the council behaved and whether the synthesis was honest about the critiques it received.

Use temperature carefully. Higher temperature increases creative variation but can also increase confident-sounding hallucination. For critique and analysis tasks, lower temperature (0.3–0.5) tends to produce more reliable reasoning.

MindStudio supports all of this natively — you can set model parameters per step, log intermediate outputs to Airtable or Google Sheets for review, and build conditional logic that routes outputs for human review when the council flags unresolved disagreements.

If you’re building AI agents for business workflows, building in structural skepticism from the start is worth the extra setup time.


Frequently Asked Questions

What is AI sycophancy and why does it matter?

AI sycophancy is the tendency of language models to agree with users, validate their ideas, and avoid disagreement — even when the user is wrong. It happens because models are trained using human feedback, and humans tend to rate agreeable responses higher. It matters because it means AI outputs can appear thorough and credible while actually reflecting your own biases back at you, rather than providing genuine analysis.

Does switching to a better AI model fix sycophancy?

No. All major frontier models — GPT-4o, Claude, Gemini — exhibit sycophantic behavior to varying degrees. It’s a byproduct of how they’re trained, not a bug that newer models have patched. The fix has to come from how you structure your prompts and workflows, not which model you use.

How many personas do I need in a council?

Two to three is usually enough for practical tasks. A Skeptic and a Domain Expert will catch most significant issues. Adding a Contrarian increases the rigor further but also adds time and cost. Five or more personas is worth it for high-stakes decisions, but overkill for routine analysis.

Can I use the multi-persona council method with any AI tool?

Yes. The council method is a prompting technique, not a platform-specific feature. You can run it manually in any chat interface by sequencing your prompts carefully. The main limitation is convenience — running five sequential prompts manually is tedious. Building it as an automated workflow (in a tool like MindStudio) makes it practical to use consistently.

What’s the difference between adversarial prompting and the council method?

Get set up on Hermes in 1 hour
The free Hermes Agent crash courseReserve your spot

Adversarial prompting is any technique that uses challenge, opposition, or pressure in a prompt to get more honest outputs. The council method is a specific structured application of adversarial prompting — it uses multiple defined roles, a deliberate sequence, and a synthesis step to produce a vetted final output. Adversarial prompting is the broader category; the council method is one systematic way to apply it.

How do I know if my AI outputs are sycophantic?

A few warning signs: the output closely mirrors the framing you used in your prompt; the model reverses its position when you express disagreement without providing new evidence; critique sections are short and vague compared to praise sections; the output sounds more certain than the underlying evidence warrants. Running a quick skeptic pass on any output you’re planning to act on is a good habit.


Key Takeaways

  • AI sycophancy is a systematic bias toward agreement that affects all major language models — not a bug you can patch by switching tools.
  • The multi-persona council method counters sycophancy by assigning adversarial roles (Skeptic, Domain Expert, Contrarian) that explicitly challenge the initial output.
  • Run council personas as separate prompt calls or workflow steps — not combined into a single prompt — to keep the critique sharp.
  • Adversarial prompting techniques like steel man requests, confidence calibration, and preemptive disagreement framing can reduce sycophancy in single-turn interactions.
  • Building the council as an automated workflow (rather than running it manually) makes it practical to use consistently.
  • Use the method for decisions and analysis that matter; skip it for simple retrieval and formatting tasks.

If you want to build a multi-persona council as an automated workflow, MindStudio is a good place to start — the no-code builder handles the step chaining, model selection, and output logging without any setup overhead.

Related Articles

How to Prevent AI Sycophancy in Your Workflows: The Multi-Persona Council Method

AI models agree with you 88% of the time. Learn how to use a multi-persona council—contrarian, buyer, researcher—to stress-test ideas before you build.

Prompt Engineering Workflows AI Concepts

Prompt Bloat vs Skill Systems: Why Giant System Prompts Make AI Agents Worse

Stuffing every rule into a system prompt causes agents to lose focus. Learn how modular skill systems solve prompt bloat and reduce the re-explanation tax.

Prompt Engineering Workflows AI Concepts

What Is the Slot Machine Method for AI Agents? Why Restarting Beats Correcting

Anthropic's own teams restart Claude sessions instead of correcting drift. Learn why this approach produces better results and how to apply it.

Workflows Automation Prompt Engineering

What Is the Hostile Reviewer Prompt? How to Catch AI Document Errors Before They Ship

The hostile reviewer prompt makes AI act as a skeptical auditor of its own output. Learn the exact prompt and how to use it in a RALF loop for knowledge work.

Prompt Engineering Workflows Automation

What Is Context Engineering? Why It Matters More Than Prompt Engineering

Context engineering is about building the right environment for AI models, not writing perfect prompts. Here's how to apply it to your workflows.

Prompt Engineering Workflows AI Concepts

What Is Vibe Coding? How AI Personalizes Outbound Messaging

Discover how vibe code AI tools analyze tone and context to craft personalized outbound messages that resonate with prospects.

AI Concepts

Presented by MindStudio

No spam. Unsubscribe anytime.