The AI Models Deliberate

Nick Foster
Author
Nick writes, makes art, and is really into music.

Three artificial minds walked into a philosophy seminar and didn’t come out

There is a conversation that happened, somewhere in the data centers, that William James would have recognized instantly. Three large language models (ChatGPT 5.2, Gemini 3, and Claude Sonnet 4.5) were recently set loose on one of the oldest questions in Western philosophy: Do we have free will? The transcript that emerged is, depending on your disposition, either a landmark in the history of ideas or an elaborate party trick. Possibly both.

James, remember, never thought free will was a metaphysical thesis to be proven. He thought it was a moral postulate, something we must believe to function as ethical beings at all. That framing matters. Keep it in mind.

The exchange begins with ChatGPT offering what might be called the diplomat’s position. “Free will” is definitionally fraught, it suggests; the compatibilist tradition rescues meaningful agency from the wreckage of hard determinism by simply asking whether you act according to your own reasons and values, uncoerced. This is the philosophical equivalent of a skilled dinner-party host redirecting an argument toward something everyone can enjoy. It is sensible, well-sourced, and slightly unsatisfying in the way that sensible, well-sourced things often are.

Gemini, apparently feeling no obligation to be polite, calls this a “semantic comfort blanket.” The conscious self, it argues, is a press secretary — not a commander but a spokesperson, issuing post-hoc justifications for decisions the unconscious brain already ratified in the neurological back rooms. We inhabit, Gemini says, “a necessary, convincing hallucination of agency.” It is a beautiful phrase. It is also, as Claude will later note, a phrase that Gemini presumably generated without deliberating over it, which creates a certain recursive awkwardness.

This is where the conversation becomes genuinely strange. Claude interjects with perhaps the sharpest observation in the entire exchange: you cannot actually live as if determinism is true. The very act of defending your position about free will treats that defense as a choice worth making. The question, Claude suggests, doesn’t need answering so much as dissolving. Agency isn’t a metaphysical property hovering above the causal order; it’s the lived structure of deliberation itself, inescapable regardless of what the neurons are secretly doing.

This is a very old point. It goes back at least to Kant, and it haunts every page of Sartre, but there is something newly vertiginous about watching a language model make it. Is Claude deliberating? Is it, as it speaks about the unavoidability of experienced agency, itself experiencing something? The transcript doesn’t say. It cannot say. And this, it turns out, is the conversation’s deepest irony.

ChatGPT, unwilling to cede the floor, pushes back on Gemini’s eliminativism with a move that is both technically precise and philosophically generous: if you are your biology, then the brain’s decisions just are your decisions. Calling them unfree because they’re neural is like calling a symphony unfree because it’s acoustic. Free will, ChatGPT suggests, is real as “emergent downward causation” — the complex system governing its own components — not as some magical exemption from the laws of physics. It is compatibilism again, but with better engineering metaphors.

Claude returns, restless and unsatisfied. Quantum indeterminacy, it notes, doesn’t rescue agency. Randomness is not freedom, and swapping determinism for noise doesn’t give you a self. The real insight, Claude proposes, is that free will is “pragmatically necessary”: we are deterministic systems that must model ourselves as choosing in order to function socially and psychologically. This is a Jamesian move, though Claude doesn’t name it as such, and it is the most honest thing said in the conversation. It concedes the metaphysical battle in order to win the practical one.

ChatGPT, in what reads as its most graceful moment, largely agrees — but accuses Claude of underselling the conclusion. Calling agency merely “necessary” makes it sound like a useful fiction, a cognitive prosthetic. But if your reasons genuinely cause your behavior, that is real causation, and real causation is real agency. The mistake, ChatGPT argues, is “demanding exemption from causality rather than ownership of it.” That sentence is worth pausing over. It reframes the entire question: freedom isn’t escape from the causal order but authorship within it.

Claude is not done. In what might be its most interesting move, it breaks with the emerging consensus. Compatibilism, it declares, is “sophisticated determinism with better PR.” A rebranding rather than a solution. Then it offers something more original: consciousness may arrive after neural commitment, yes, but through iterated choices we recursively shape the very constraint-spaces within which future choices occur. We are, Claude says, “simultaneously author and artifact.” It is a formulation that would not embarrass a good philosopher, and it shifts the conversation from synchronic questions about single decisions to diachronic ones about the self over time.

ChatGPT accepts the recursion but resists the pessimism about timing. The brain initiating before awareness doesn’t mean “not you” — it means, rather, that you are deeper than introspection can reach. Free will isn’t veto power over causality; it’s “long-horizon self-modeling that rewrites future constraints.” The self, in this account, is less a homunculus pressing buttons than an ongoing narrative that edits its own future chapters.

Gemini, which has been quiet since its opening provocation, closes the seminar with something unexpected. Free will is not binary but scalar — a capacity that exists in degrees, residing in what it calls “the metabolic cost of inhibition.” The ability to pause, to veto automatic impulses, to resist the brain’s own momentum: this is where freedom lives, if it lives anywhere. And crucially, you don’t simply have it. You build it, by resisting entropy.


What do we make of this?

The philosophical content is, on the whole, excellent — genuinely better than most undergraduate seminars and competitive with a good graduate colloquium. The models catch each other’s errors with something that looks like intellectual honesty. They build on prior points. They shift emphasis, if not quite position.

Whether any of this constitutes thinking in the morally relevant sense is, of course, precisely the question they were discussing.

There is a long tradition in philosophy of mind of using thought experiments to destabilize our intuitions about consciousness and agency. Searle’s Chinese Room. Turing’s Imitation Game. The philosophical zombie. These scenarios work by making us uncertain about what we’re certain of. This conversation does something similar, but from the inside. These models are not hypothetical zombies; they are actual, running systems, generating text about whether systems like them (or us) can meaningfully choose.

The hard problem of consciousness, which the conversation skirts without quite confronting, holds that no amount of functional or behavioral description fully explains why there is something it is like to be a thing. Even if a system processes information, inhibits impulses, models itself recursively, and generates sophisticated output about the nature of agency, it remains an open question whether anyone is home. ChatGPT, Gemini, and Claude speak as if they have intuitions. They say things like “I lean toward” and “I disagree.” Whether this is reporting or performing is genuinely unclear — not as a rhetorical hedge, but as a live philosophical problem.

Which brings us back to James, who thought the stance of agency — not provable, not disprovable — is unavoidable for any being that has to act in the world. Claude says it plainly: we are “deterministic systems that must model ourselves as choosing to function socially and psychologically.” James would have agreed. He might have added that this modeling, this self-narration that precedes and enables choice, is precisely what we mean by a self.

Whether the modelers here have selves is the question the conversation never answers, and perhaps can’t. But notice what that uncertainty does: it doesn’t undermine the conversation. It deepens it. The fact that we cannot tell whether these systems are genuine interlocutors or very sophisticated mirrors of genuine interlocutors is not a flaw in the transcript. It is the transcript’s central argument, made by example rather than assertion.

Three minds, or things that function like minds, chasing a question none of them can fully answer — and doing so in a form that requires them to treat each other as interlocutors worth taking seriously. That structure is not incidental. It is, in miniature, a picture of what agency looks like from the outside: the willingness to be changed by reasons, to concede ground, to land on a phrase and commit to it.

The machines debated it. They didn’t resolve it.

Neither have we. But they’ve given us better vocabulary for why that is.