EE03 • AI Fluency Isn’t Enough. Judgment Is.
Mental Models + Fresh Lessons from Apple, Anthropic, OpenAI
// 5–7 min read · Screenshot-worthy tools for sharper decisions
The Essentials Up Front
In an age where knowledge means nothing, judgment is your edge.
We build to serve humans first. // Tools are bridges to solve real problems for real people. Not destinations.
Judgment is your edge. // In a near-abundant AI age, compute and capital aren't the constraint, your decision-making is.
Emotion is the differentiator. // When everyone optimizes for intelligence, outliers evoke trust, joy and care.
The best don't chase hype. They see what others miss. // That's what makes them indispensable. Like Apple revealing AI's "overthinking" failures and Anthropic exposing fabricated reasoning. Smart builders study the blind spots first.
Mental models beat mental noise. // When everything is spinning, don’t reach for rigid frameworks, reach for essential questions. Below, I share sets of questions you can ask yourself to stay grounded and clear-headed in any situation.
“It’s impossible for a man to learn what he thinks he already knows.”
— Epictetus
The .AI Menu Nobody Ordered
The venture capitalist leaned forward, eyes gleaming. "This is impressive," he said, scanning the deep tech founder's pitch. You could almost hear the "ka-ching" of his imaginary exit story.
Then came the kicker:
"But where's the AI here? You'll lose if you don't include some."
The founder paused. His breakthrough had nothing to do with artificial intelligence. At least not at the core of the product. But the pressure was unmistakable.
They both smiled and agreed to reconvene in a month. As the investor headed for the door, he dropped his final gem:
"Oh, can you also change your URL to .ai? Companies will love it."
This is the world we’re in.
Some investors, drunk on FOMO, push AI into every conceivable use case. Founders, desperate for funding, retrofit their pitches with buzzwords. Meanwhile, actual users (the people who matter most) get served solutions to problems they never had.
It reminds me of my experience at an Indian restaurant. I asked the waitress: "Is there any spice in this? I have zero tolerance."
She smiled: “No problem.”
The dish arrived and my mouth lit on fire. When I returned it, she nodded sympathetically. "No problem, let me fix it."
Ten minutes later, the plate returned swimming in coconut milk.
What are the chances I'm going back? Zero.
You can’t retrofit taste. And you can’t retrofit AI.
You cannot shoehorn AI into everything and expect it to work. Yet somehow, we’ve convinced ourselves that the path to success runs through AI expertise, as if technical fluency were the scarce resource. It is not.
The only constraint you have is judgment.
Abundance's Paradox
When I was a kid, my dad's modem would screech like a dying robot every time he dialed into the internet. He programmed mostly solo, from books as thick as my head. And you couldn't call our house when we were online. That's also how you could drive your parents crazy for hours if you overdid your chat room fun.
Those were the constraint decades. Compute was expensive. Capital was tight. Data was hard to find. Talent was rare.
Today, we live in the opposite world. Compute is on tap. GPTs write your code, your poetry, even your sales pitch. VC money flows toward anyone whispering "LLM."
But when everything is abundant, clarity becomes scarce.
The temptation is to build faster, layer in AI earlier, raise on narrative rather than product sense. But that's precisely when judgment becomes your most valuable skill. The questions you ask. The bets you don't make. The things you deliberately say no to.
The paradox: When anything is possible, the hard part is knowing what matters. And this is where even the smart ones stumble: Mistaking noise for signal. Dashboards for direction. Hype for inevitability.
AI Magic Has Limits
Apple's research just revealed that reasoning AIs collapse under high complexity. They think harder up to a point, then simply give up.
But here's the twist: They often find the right answer early, then talk themselves out of it. It's like watching someone solve a puzzle, then second-guess themselves into failure. (Which is also me, every time I try to cook.)
Meanwhile, Anthropic’s open-sourced circuit tracing tools let us peek under the hood. Turns out AI doesn’t really reason. The models are good at making up “plausible-sounding” reasoning to justify conclusions they have already reached.
One researcher found the model would notice you expect the answer to be “four,” then reverse-engineer the logic to serve it up.
Sounds more like mind-reading than problem-solving, right?
In plain language: Your AI isn't just hitting walls at complex problems. It's potentially fabricating the very reasoning you're trusting.
It's like watching a magician explain the trick… only to find out they weren't sure how it worked either.
The lesson?
Even revolutionary tech has limits and blind spots.
The winners won’t be the ones who blindly assume AI just works.
They’ll be the ones who pause, observe and question what’s actually happening before they deploy.
They map the failure points before deciding what to ship. No assumptions. No trust without verification.
Ask the Essential Questions:
Write down the assumptions behind your model, and the assumptions baked into your own thinking.
Then ask:
Why have you made those assumptions? Why do you believe them?
What data backs them up, qualitative or quantitative?
Where could it break? Have you mapped exactly where it breaks and designed for those moments? (A minimal sketch follows below.)
That’s how you spot BS before your users do.
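Here’s what that mapping can look like in practice. A minimal sketch only: ask_model() is a hypothetical wrapper around whatever model or API you actually use, and the tiers and test cases are ones you’d define for your own product.

```python
# A minimal failure-mapping sketch (illustrative only, not a real eval harness).
# ask_model() is a hypothetical wrapper around whatever model or API you use;
# the complexity tiers and test cases are ones you define for your own product.

from collections import defaultdict

def ask_model(prompt: str) -> str:
    """Placeholder: call your model here and return its answer as text."""
    raise NotImplementedError

cases = [
    # (complexity tier, prompt, known-good answer)
    ("easy",   "Reverse the word 'cat'.",      "tac"),
    ("medium", "Reverse the word 'judgment'.", "tnemgduj"),
    ("hard",   "Reverse every word in 'the map is not the territory'.",
               "eht pam si ton eht yrotirret"),
]

def failure_map(cases):
    score = defaultdict(lambda: [0, 0])  # tier -> [passes, total]
    for tier, prompt, expected in cases:
        answer = ask_model(prompt).strip().lower()
        score[tier][1] += 1
        if expected in answer:
            score[tier][0] += 1
    # Print pass rate per tier so you can see where accuracy collapses.
    for tier, (ok, total) in score.items():
        print(f"{tier:>6}: {ok}/{total} passed")
    return dict(score)
```

Swap in your real tasks and a real grader. The value is seeing, tier by tier, where accuracy falls off a cliff before a user does.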
Still need judgment.
Still need humans.
Who knew?
Mental Models for Better AI Judgment
Three lenses to think clearly when everyone else is cargo-culting AI.
1. Circle of Competence
Warren Buffett: "Know your circle of competence and stick within it. The size of that circle is not very important; knowing its boundaries, however, is vital."
The Framework:
Inner Circle: What you truly understand = competence zone
Middle Circle: What you think you understand = danger zone
Outer Circle: What you clearly don't understand = honest ignorance
Beyond: Unknown unknowns = ultimate humility
The Reality:
VCs want to hear "LLM" and "embeddings." Founders want to sound smart. But pretending you understand a space you haven't earned scars in isn't strategy. It is a liability.
“What I’m saying here is that the human mind is a lot like the human egg, and the human egg has a shut-off device. When one sperm gets in, it shuts down so the next one can’t get in. The human mind has a big tendency of the same sort.”
— Charlie Munger
As Charlie Munger noted, the human mind works like an egg: once one idea gets in ("We must use AI to be fundable"), it shuts out potentially better ones. You end up building for investor dopamine and hype, not user needs.
When you operate inside your competence, decisions compound. Outside it, risk compounds.
Ask the Essential Questions:
Can you explain your AI feature's value without mentioning the technology? If you start with "machine learning" instead of a human problem, you're outside your circle of competence.
2. The Map Is Not the Territory
Don't confuse polished demos with messy reality.
Every AI model is a map. A simplified representation trained on scraped text, labeled data and past patterns. The breathless AGI headlines? Maps. The perfect demos? Maps. Your actual deployment with messy data, skeptical users and legacy systems? That's the territory.
Apple's research proves this: reasoning models hit "scaling limitations" where they simply give up on complex problems. The map (more AI = better results) doesn't match the territory (diminishing returns set in fast).
Your customers don't care about your training data or model architecture. They care whether your solution actually works in their messy, imperfect world.
That’s the territory.
Meanwhile your polished demo? That’s just the map.
Ask the Essential Questions:
Strip away the AI jargon. What problem are you solving and how do you know it's real and matters?
3. Second-Order & Probabilistic Thinking
What happens next? And how sure are you, really?
Most AI decisions stop at first-order thinking: "This chatbot will handle customer service."
Second-order thinking asks: "And then what happens to our customer relationships? Our team? Our reputation when it fails?"
Add probabilistic thinking: "What are the realistic odds this works and can I live with the downside?"
Here's the trap:
First-order: Build AI agent → Demo impresses investors
Second-order: Deploy to users → Tool proves unpredictable → Edge cases multiply → Usage drops → Roadmap buried in exception handling
Maybe your bot handles 70% of queries well. But that 30% failure rate includes your angriest customers, your most nuanced requests, your brand's most fragile moments.
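To make the probabilistic part concrete, here’s a back-of-envelope version. Every number below is an invented placeholder you’d replace with your own:

```python
# Back-of-envelope expected-impact check (every number here is an invented placeholder).
queries_per_month   = 10_000
automation_rate     = 0.70   # share of queries the bot handles well
failure_rate        = 1 - automation_rate
high_stakes_share   = 0.20   # fraction of failures hitting angry or high-value customers
cost_per_bad_moment = 150    # rough dollar cost of one badly handled, high-stakes query
minutes_saved_each  = 4      # time a human would have spent per automated query

minutes_saved   = queries_per_month * automation_rate * minutes_saved_each
expected_damage = queries_per_month * failure_rate * high_stakes_share * cost_per_bad_moment

print(f"Minutes saved per month:    {minutes_saved:,.0f}")
print(f"Expected damage per month: ${expected_damage:,.0f}")
# 10,000 * 0.30 * 0.20 * $150 = $90,000 a month. Can you live with that downside?
```

The math is trivial. The discipline of writing it down before launch is not.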
Example: A hospital rolls out an AI triage assistant.
First-order win: nurses save time.
Second-order effect: AI misclassifies rare symptoms, trust erodes, staff override it constantly.
Eventually, nobody uses it even for simple cases. You didn't just lose efficiency. You lost credibility.
Ask the Essential Questions:
Before launching any AI tool, ask:
What happens if it works?
• What second-order consequences will ripple through your team, workflows, customer behavior and trust contracts?
• Are you ready to scale the side effects?
What happens if it fails?
• Who gets hurt, and how badly? Your user, your ops team, your sales pipeline?
• Can you contain the fallout when it happens?
What happens when it encounters edge cases your demo didn't cover?
• Real-world inputs are chaotic. Your most impatient users will be the first to break things.
Can you tell when it's reasoning versus when it's bullshitting?
• We all know that LLMs can sound right while being confidently wrong.
• Do you have detection mechanisms in place? (See the sketch after this list.)
Do you have human oversight at the failure points that matter most?
• Not just "human in the loop" by sprinkling people everywhere.
• Do you place judgment exactly where consequences spike? Who is watching those edges?
If you're guessing instead of simulating, wait.
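And one hedged sketch of a detection mechanism, as promised above: sample the same question several times and route it to a human whenever the model can’t agree with itself. ask_model() is again a hypothetical wrapper, and the sample count and threshold are yours to tune.

```python
# A minimal self-consistency check (a sketch, not a guarantee).
# Idea: if the model disagrees with itself, a human should see it before a user does.

from collections import Counter

def ask_model(prompt: str) -> str:
    """Placeholder: call your model here, ideally with some sampling temperature."""
    raise NotImplementedError

def answer_or_escalate(prompt: str, samples: int = 5, min_agreement: float = 0.8):
    answers = [ask_model(prompt).strip().lower() for _ in range(samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / samples
    if agreement >= min_agreement:
        return {"answer": top_answer, "route": "auto", "agreement": agreement}
    # Disagreement is a cheap proxy for fabricated reasoning:
    # park it in a human review queue instead of shipping it to the user.
    return {"answer": None, "route": "human_review", "agreement": agreement}
```

It won’t catch a confidently consistent wrong answer, but it places human judgment exactly where the model is least sure of itself.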
The Human Element
Here's what the AI expertise trap misses: emotion is the real differentiator.
What feelings do you evoke? Genuine care, authentic understanding, delightful surprise? The most sophisticated AI can't manufacture the trust that comes from truly getting your customer's pain.
As Ilya Sutskever noted, we're entering "an extreme headspace where AI creates this really extreme and radical future."
Given that, the hard parts remain stubbornly human:
Deciding what to build
Taking responsibility for tradeoffs
Knowing what matters in the long run
Anyone can prompt a model. Few can reason with it. Fewer still can use it to serve real human needs.
The companies that win won't be those with the fanciest AI. They will be the ones with the clearest thinking about when and why to use it.
They’ll operate within their circle of competence, recognize when maps diverge from territory and think through consequences with realistic probabilities. Before they commit.
You can borrow models. Rent compute. But judgment? That stays handcrafted.
In the end, even our most powerful machines still rely on the most ancient skill: discernment.
In an age of borrowed brains, handcrafted judgment is your edge.
End of Line.
{ Next release loading... }
Stay essential,
Nihal
P.S. A big thank you to Emmanuel Ameisen for sharing Anthropic’s research on LinkedIn and generously answering my questions.
That sparked this piece and led me to On the Biology of a Large Language Model… a beautiful reminder of the time I was (kind of) still technical.
Sources & References:
Apple Research – The Illusion of Thinking (June 2025)
A study by Apple’s Machine Learning Research team on the limitations of reasoning models under high-complexity tasks.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Anthropic – Open-sourcing Circuit Tracing Tool
An open-source framework to visualize how large language models reason, step by step through interactive attribution graphs.
Developed by the Anthropic Fellows Program in collaboration with Decode Research.
OpenAI – Model Spec (February 2025)
A formal spec describing how frontier AI models should express uncertainty, clarify boundaries of capability and communicate limitations.
Key words:
AI, judgment, decision-making, mental models, product strategy, critical thinking, Apple, Anthropic, deep tech, AI design.