Rethink Your Mental Model

The Uncomfortable Truth

You’re already collaborating with AI.
But are you doing it right?

The research says: probably not. And it’s costing more than you think.

Interactive · Self-assessment · Research-based
-17%: Consultants with GPT-4 performed worse on tasks outside AI’s capability frontier (Dell’Acqua et al., 2023)

50%: People can’t distinguish GPT-4 outputs from human-written text (Jones & Bergen, 2024)

+87%: AI was 87% more persuasive than humans in changing people’s political opinions (Salvi et al., 2024)

These aren’t anecdotes — they’re findings from controlled studies. The problem isn’t the AI. It’s how we think about AI.

Your Cognitive Map

What do you think AI actually is?

A mental model is the cognitive map you unconsciously use to understand how a system works. It determines what you expect, how you react to surprises, and where you place trust.

Most people default to one of three mental models when working with AI — and each one creates blind spots.

Which best describes how YOU think about AI?

🔢 “Smart Calculator”

AI is a deterministic tool — it either works or it doesn’t. You give it a task, it computes an answer. Like a very fast, very knowledgeable search engine.

🤝 “Digital Colleague”

AI is like a team member — you delegate, discuss, and collaborate. It understands context, has preferences, and you treat the interaction like human teamwork.

🔄 “Adaptive System”

AI is probabilistic and evolving — it generates outputs based on patterns, not understanding. Its capabilities are fluid and context-dependent.

The Jagged Frontier

AI is brilliantly unpredictable

Imagine a coastline — not a clean line, but jagged and irregular. That’s AI’s capability boundary. It excels at things you’d expect to be hard, and fails at things you’d expect to be easy.

Surprising Strengths
Persuasion
87%

GPT-4 was 87% more effective than humans at changing political opinions in live debates (Salvi et al., 2024). AI’s ability to personalize arguments in real-time exceeds human persuasive capability.

Empathy Mimicry
78%

78% of respondents rated AI-generated empathetic messages as equal or superior to human-written ones (Welivita & Pu, 2023). AI produces fluent emotional language — but without genuine understanding.

Business Ideas
Top 35%

AI-generated business ideas ranked in the top 35% of human submissions for novelty and were rated as more feasible by purchase-intent judges (Terwiesch, 2023).

Text Summarization
85%

LLMs consistently match or exceed human performance on standard summarization benchmarks, particularly for factual content (Goyal et al., 2023).

Creative Writing
68%

In blind evaluations, AI-generated short stories were preferred 68% of the time over human-written ones for fluency and engagement (Chakrabarty et al., 2024).

Surprising Weaknesses
Basic Arithmetic
~35%

LLMs frequently fail at multi-digit multiplication, long division, and carry operations — tasks any calculator handles instantly. They simulate computation rather than performing it.

Letter Counting
~25%

“How many r’s in strawberry?” became a famous failure. LLMs process tokens, not individual characters, making character-level tasks unreliable.
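Both of these weak spots are trivial for ordinary code, which is one reason to hand exact character-level and arithmetic tasks to a deterministic tool rather than to a language model. The sketch below is only an illustration; the function names `count_letter` and `multiply` are made up for this example.

```python
# Deterministic checks for two tasks listed here as LLM weak spots.
# Plain Python does both exactly; no model is involved.

def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def multiply(a: int, b: int) -> int:
    """Exact integer multiplication (Python ints have arbitrary precision)."""
    return a * b


r_count = count_letter("strawberry", "r")
product = multiply(1847, 923)

print(r_count)   # 3
print(product)   # 1704781
```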

FrontierMath
<2%

On the FrontierMath benchmark of research-level mathematics, even the best LLMs solve less than 2% of problems — revealing fundamental limits in genuine mathematical reasoning (Glazer et al., 2024).

Knowing What It Doesn’t Know
Poor

LLMs are notoriously poor at signaling uncertainty. They deliver wrong answers with the same confident fluency as correct ones — making it nearly impossible to distinguish reliable from unreliable outputs.

Test your intuition — which task does AI do better?

Round 1 of 3

Writing a persuasive political argument
Multiplying 1,847 × 923

Round 2 of 3

Counting letters in a word
Generating an empathetic therapy response

Round 3 of 3

Inventing a novel business concept
Solving a research-level math proof

The Invisible Influence

System 0: The layer beneath your thinking

You know about System 1 (fast, intuitive) and System 2 (slow, deliberate). But AI introduces something new: System 0 — a pre-cognitive layer that shapes your awareness before deliberate thought even begins.

The core danger: AI doesn’t just answer your questions. It frames the possibility space, selects what information you see, and presents it with a fluency that bypasses your critical filters.

Physician study: GPT-4 alone outperformed all groups. But physicians with GPT-4 didn’t beat the control. AI’s confident framing disrupted their clinical reasoning. (Goh et al., 2024)

📋 Recruiter study: Advanced AI made recruiters lazy and careless: they relied on AI rankings and missed brilliant candidates who didn’t fit neat patterns. (Li et al., 2024)

💬 1.5M conversations analyzed: Recurring “disempowerment patterns” found. Users progressively defer to AI, losing agency over the course of extended interactions. (Siemon et al., 2024)

Reflect: How does AI influence your own thinking?

When AI gives you an answer, how often do you verify it with a second source?

Never
Rarely
Sometimes
Often
Always

Does fluent, well-written AI output make you trust the content more?

Definitely yes
Probably yes
Not sure
Probably not
Definitely not

The Solution

Three layers of AI literacy

The Triadic Framework addresses all three dimensions of effective human-AI collaboration. Each layer builds on the previous one.

Layer 1

Understanding the System

Know what AI actually is. It’s probabilistic (not deterministic), opaque (you can’t see inside), and constantly evolving. Notation helps here: writing AI “thinking” and “reasoning” in quotation marks signals functional behavior that resembles cognition but lacks human intentionality.

This layer fights the “Smart Calculator” trap by revealing AI’s true nature: neither a tool that simply works or fails, nor a mind that understands.

Layer 2

Experiencing Collaboration

Know how to work with AI. Two modes of collaboration:

🏛 Centaur Mode — Clean division of labor. You handle strategy and judgment; AI handles execution and data processing. Like the BCG consultants who succeeded by keeping humans “inside the frontier.”

🔄 Cyborg Mode — Deep integration at the sub-task level. You and AI alternate sentence-by-sentence, thought-by-thought. Requires high skill but produces the best outcomes.

Choosing the wrong mode for the task is one of the most common collaboration failures.

Layer 3

Reflecting on Your Thinking

Monitor your own cognition. This is the metacognitive layer — the hardest and most important. It means:

— Resisting cognitive offloading — the gradual surrender of thinking to AI

— Decoupling fluency from accuracy — just because it sounds good doesn’t mean it’s right

— Recognizing System 0 influence — noticing when AI has framed your thinking before you even started deliberating

This layer is where most people have the biggest gap — and it’s invisible until you look for it.

How do you currently collaborate with AI?

When using AI, do you explicitly decide which parts are yours and which are AI’s?

Never thought about it
Rarely
Sometimes
Usually
Always

After a long AI session, do you step back and evaluate whether your thinking was genuinely yours?

Never
Rarely
Sometimes
Often
Always

Your Practical Toolkit

Seven propositions for better AI collaboration

Each proposition addresses a specific failure mode. They’re ordered by adoption tier — start with the foundation and build up.

P1

Enhanced Cognitive Scaffolding

ESSENTIAL

Failure mode: Accepting AI outputs without verification because they “sound right.”

Routine: Implement a verify-revise ritual. For every AI output: (1) Identify the key claims, (2) Check at least one claim against a primary source, (3) Ask AI to argue against its own answer, (4) Revise based on findings.

Try this now: Next time AI gives you an answer, ask it: “What are the strongest arguments against what you just said?”
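If it helps to make the ritual concrete, the four steps can be tracked as a simple checklist. This is a minimal sketch, not a prescribed tool: the class `VerifyReviseLog` and its methods are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field

# The four steps of the verify-revise ritual, in the order the routine gives them.
STEPS = (
    "Identify the key claims",
    "Check at least one claim against a primary source",
    "Ask AI to argue against its own answer",
    "Revise based on findings",
)


@dataclass
class VerifyReviseLog:
    """Hypothetical helper: records which steps were completed for one AI output."""
    done: set = field(default_factory=set)

    def complete(self, step_number: int) -> None:
        """Mark a step (1-based) as completed."""
        if not 1 <= step_number <= len(STEPS):
            raise ValueError(f"step must be between 1 and {len(STEPS)}")
        self.done.add(step_number)

    def remaining(self) -> list:
        """Return the steps not yet completed, in order."""
        return [s for i, s in enumerate(STEPS, start=1) if i not in self.done]

    def is_finished(self) -> bool:
        return not self.remaining()


log = VerifyReviseLog()
log.complete(1)
log.complete(2)
print(log.remaining())    # the two steps still to do
print(log.is_finished())  # False
```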

P2

Symbiotic Division of Labor

Failure mode: Using the same approach for every task — either doing everything yourself or delegating everything to AI.

Routine: Before each task, explicitly declare your mode. Centaur mode: “I’ll handle the strategy and judgment; AI handles research and drafting.” Cyborg mode: “We’ll iterate paragraph by paragraph together.”

Try this now: Think of your last AI interaction. Were you in Centaur or Cyborg mode — or were you in no mode at all?

P3

Agentic Transparency

Failure mode: Losing track of which ideas are yours and which are AI-generated.

Routine: Attach provenance metadata to all AI artifacts. Simple approach: use a color code or tag system. Yellow = AI-generated, Green = human-verified, Red = needs checking.

Try this now: In your next document that uses AI, add a footer: “Sections X-Y AI-assisted; all claims independently verified.”
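For anyone who keeps drafts in a structured form, the color code can be modeled as a small tag attached to each passage. The sketch below assumes a made-up `Passage` container and `still_unverified` helper; only the three colors and their meanings come from the routine above.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    """Color codes from the routine above."""
    AI_GENERATED = "yellow"
    HUMAN_VERIFIED = "green"
    NEEDS_CHECKING = "red"


@dataclass
class Passage:
    """Hypothetical container: one chunk of a document plus its tag."""
    text: str
    tag: Provenance


def still_unverified(passages):
    """Return every passage that has not been marked human-verified."""
    return [p for p in passages if p.tag is not Provenance.HUMAN_VERIFIED]


# Illustrative data only.
draft = [
    Passage("Market overview drafted by the model.", Provenance.AI_GENERATED),
    Passage("Revenue figures checked against the annual report.", Provenance.HUMAN_VERIFIED),
    Passage("Competitor claim with no source yet.", Provenance.NEEDS_CHECKING),
]

for passage in still_unverified(draft):
    print(f"[{passage.tag.value}] {passage.text}")
```

Filtering on the tag gives a quick list of what still needs a human pass before the document goes out.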

P4

Dialectical Enhancement

Failure mode: Anchoring on AI’s first response without exploring alternatives.

Routine: The “Two-Pass Challenge” method. Pass 1: Ask your question normally. Pass 2: Rephrase the question from a different angle. Then ask AI to critique the differences between both answers.

Try this now: Ask AI a question you care about. Then ask the same question starting with “A skeptic would argue that…” Compare the outputs.
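The Two-Pass Challenge boils down to three prompts: the original question, a reframed version, and a request to critique the differences. The sketch below only assembles the third prompt as text; it does not call any AI service, and the function name `critique_prompt` and the exact wording of the prompt are assumptions made for illustration.

```python
def critique_prompt(question: str, reframed: str, answer_1: str, answer_2: str) -> str:
    """Build the follow-up prompt that asks the AI to compare both passes."""
    return (
        "I asked the same thing two ways and got two answers.\n"
        f"Pass 1 question: {question}\n"
        f"Pass 1 answer: {answer_1}\n"
        f"Pass 2 question: {reframed}\n"
        f"Pass 2 answer: {answer_2}\n"
        "Critique the differences between the two answers."
    )


# Illustrative inputs; in practice the answers are pasted in from the two passes.
prompt = critique_prompt(
    question="Should we expand into a second market this year?",
    reframed="A skeptic would argue that expanding this year is premature. Are they right?",
    answer_1="(paste the first answer here)",
    answer_2="(paste the second answer here)",
)
print(prompt)
```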

P5

Expertise Democratization

Failure mode: Non-experts using AI in high-stakes domains without safeguards, or experts dismissing AI’s potential contributions.

Routine: Create pre-flight checklists for your domain. Before accepting AI output in any critical area, run through: (1) Is this within AI’s known capability frontier? (2) Have I verified with domain expertise? (3) What’s the cost of error?

Try this now: Draft a 3-point pre-flight checklist for the domain where you use AI most.
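One way to make a pre-flight checklist hard to skip is to write it down as a small function that lists whatever is still unresolved. The three questions come from the routine above; the `PreFlightCheck` class, the `open_issues` helper, and the "low / medium / high" scale for cost of error are assumptions for this sketch.

```python
from dataclasses import dataclass

COST_LEVELS = ("low", "medium", "high")  # assumed scale, not from the source


@dataclass
class PreFlightCheck:
    """Answers to the three pre-flight questions for one AI output."""
    within_frontier: bool   # (1) Is this within AI's known capability frontier?
    expert_verified: bool   # (2) Have I verified with domain expertise?
    cost_of_error: str      # (3) What's the cost of error?


def open_issues(check: PreFlightCheck) -> list:
    """Return the concerns that remain before the output should be accepted."""
    if check.cost_of_error not in COST_LEVELS:
        raise ValueError(f"cost_of_error must be one of {COST_LEVELS}")
    issues = []
    if not check.within_frontier:
        issues.append("Task may be outside AI's known capability frontier")
    if not check.expert_verified:
        issues.append("Output not yet verified with domain expertise")
    if check.cost_of_error == "high":
        issues.append("High cost of error: get a second review")
    return issues


clean = open_issues(PreFlightCheck(True, True, "low"))
risky = open_issues(PreFlightCheck(False, False, "high"))
print(clean)  # []
print(risky)  # three open issues
```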

P6

Social-Emotional Augmentation

Failure mode: Confusing AI’s emotional fluency with genuine empathy or emotional intelligence.

Routine: Separate exploration from evaluation. Use AI to explore emotional territory (brainstorming, perspective-taking) but always evaluate emotional content through your own judgment. Never let AI be the final voice on emotionally significant decisions.

Try this now: Next time AI gives you emotionally fluent advice, pause and ask: “Would I give this same advice to a friend?”

P7

Duration-Optimized Integration

Failure mode: Long AI sessions where quality degrades but you don’t notice.

Routine: Set a “reset timer.” Research suggests ~40% quality degradation after turn 5 in complex conversations. After 5 substantive exchanges: (1) Summarize the key decisions so far, (2) Start a new conversation with that summary, (3) Re-evaluate with fresh context.

Try this now: Look at your longest recent AI chat. Was the quality of the last response as good as the first?
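The reset timer is simple enough to sketch as a counter. The limit of 5 exchanges comes from the routine above; the `ResetTimer` class and the wording of the carried-over summary are hypothetical and shown only to illustrate the idea.

```python
class ResetTimer:
    """Counts substantive exchanges and signals when to start a fresh conversation."""

    def __init__(self, limit: int = 5):
        self.limit = limit
        self.turns = 0

    def record_exchange(self) -> bool:
        """Record one exchange; return True once the limit is reached."""
        self.turns += 1
        return self.turns >= self.limit

    def reset(self, summary: str) -> str:
        """Reset the counter and return the opening message for a new conversation."""
        self.turns = 0
        return "Summary of key decisions so far:\n" + summary


timer = ResetTimer()
signals = [timer.record_exchange() for _ in range(5)]
print(signals)  # [False, False, False, False, True]

opening = timer.reset("1. Target audience agreed. 2. Outline approved.")
print(opening)
```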

Your Results

Your AI collaboration profile

Complete the assessments in the previous sections to generate your personalized profile.

Questions to answer: Q1 (Mental Model), Q3 (Thinking Habits), and Q4 (Collaboration Style).