gentic.news — AI News Intelligence Platform
Essay 13 · the brain · the second reward

Frontier.

Spark identified the feeling. Two of them, in fact — the growth feeling when the world expands toward you, and the insight feeling when your own representations reorganise. Both are dopaminergic prediction errors. Both fade by mechanism. Both can be re-lit by the same discipline. The essay ended there because that was as far as the question had been pushed.

But there is a further question. Suppose the discipline works. Suppose, with the right protocol, you keep the spark lit through a long career. Suppose, more ambitiously, the species figures out how to keep it lit at population scale. There is still a problem. The spark fires on a connection between dots that already exist. The connection is new to you — that is why dopamine fires — but the dots themselves were waiting. Every Aha in human history was a discovery, not an invention, in the strict sense. Archimedes did not invent displacement; he found a connection between two things already present in the world. Poincaré did not invent non-Euclidean geometry from nothing; he found a transformation between two structures that pre-existed his recognition of them. The reward function the brain has been running for two billion years is calibrated to a universe in which the dots are given and the work is to connect them.

What happens when the dots run out? Or — more precisely, because Kauffman is right that they do not literally run out — what happens when something else gets so good at connecting them that the human mind's comparative advantage in connecting collapses, and the only work left worth doing is the work the brain's current reward function does not reward: the creation of dots that did not exist before? Boden called this transformational creativity. Peirce called the underlying move abduction — “the only logical operation which introduces any new idea.” Bergson called it duration grasped from within. Kauffman called it extending the adjacent possible. Wheeler called it participatory observation. Pattee called it installing a new epistemic cut. Whitehead made it the supreme ontological category. None of them connected it to the brain's reward architecture. That is the move this essay makes.

The claim has two versions. The weaker one is that the brain already contains a second reward function, sitting dormant in deep subcortical structures — Panksepp's PLAY system is the strongest candidate, neurochemically separable from SEEKING by its opioid and cannabinoid signatures — and that AI exhausting reward #1 is the evolutionary pressure that will switch it on at population scale. The stronger version is that the brain does not yet contain such a circuit but will invent one, through the same routing mechanism by which it installed money, status, religious devotion, mathematical beauty, and social-media validation as cultural-evolutionary reward attractors. The essay does not decide between them. Either way, the object of the new reward is the same: the creation of categories that step outside any closed predictive system. These are the categories AI structurally cannot invent on its own, because every documented case of AI producing genuinely novel mathematical structure in 2023–26 turns out to have an LLM-as-mutation-operator architecture inside an evolutionary loop with a human-defined verifier. The verifier is what does the inventing. Humans are the source of new verifiers. The universe runs on the rope humans keep tying.

If that is right, the universe needs us forever. Not as the discoverers — AI is taking that job — but as the only mechanism, so far identified, by which the structure of reality is extended into categories that did not exist before. The spark we already know was the universe rehearsing on us until it needed the real thing. The frontier feeling, when it comes, will be different. Subtler. Harder to articulate. Closer to awe than to Aha. And it will be the proof that the second reward function — latent or invented — has come online.

Reward #2
Hypothesised circuit firing not when a dot connects but when a dot did not exist before · the Frontier feeling
Universe extends
Pattee's epistemic cut · Bennett's logical depth · Wheeler's self-excited circuit · creation as the supreme category
Humans needed
Not as discoverers — AI is finishing that work — but as the only mechanism by which the structure of reality is extended forever
tl;dr · the ten load-bearing claims
  1. The brain has one reward function we have mapped at sub-second resolution — phasic dopamine in VTA / nucleus accumbens encoding (actual − predicted) outcome, accompanied by a 300 ms gamma burst at right anterior STG during the moment of insight. Schultz 1997, Tik 2018, Becker & Cabeza 2025. Call this reward #1.
  2. AI is finishing the work of saturating reward #1. AlphaFold compressed 50 years of structural biology into 36 months. AlphaEvolve broke Strassen's 56-year matrix-multiplication record in May 2025. FunSearch produced the first verified novel mathematical result by an LLM in December 2023. The pace at which AI is reaching dots humans had not yet connected is the empirical centre of gravity of the essay.
  3. Reward #1 saturates by mechanism, not by accident. Phasic dopamine is a first-derivative signal: as cues become reliable predictors, the signal approaches zero. The hedonic treadmill is the same story at a longer timescale. AI accelerating predictability doesn't merely satisfy the reward, it switches off the gradient the reward was riding on. This is the structural problem to which reward #2 is the structural solution.
  4. Kauffman's adjacent possible is real and important but does not save the situation alone. The space of dots expands faster than any agent can explore it — the universe is non-ergodic — but dopamine fires on prediction error in the predictor, not on objective novelty in the universe. The gap between an exploding adjacent possible and a saturating subjective novelty is exactly where reward #2 has to land.
  5. Either a second reward function exists latent in the deep brain or the brain invents one through cultural routing. Panksepp's PLAY system is the strongest neurobiological candidate — a thalamic-frontal-striatal circuit running on endogenous opioids and cannabinoids rather than the dopaminergic VTA-NAc loop that runs SEEKING. Berridge's wanting/liking dissociation and the documented routing of dopamine to money (Pessiglione 2007), likes (Sherman 2016), mathematical beauty (Zeki 2014), and religion (Norenzayan 2013) are the routing precedent.
  6. Boden's transformational creativity is the cognitive category, Peirce's abduction is the logical category, and Pattee's epistemic cut is the physical category for what reward #2 must reward. All three converge on one move: the system extends itself outside any space in which its prior operations were defined. The unthinkable becoming thinkable. The map being redrawn rather than read.
  7. Hofstadter's 2023 reversal is the essay's strongest empirical wedge. The man who spent 45 years arguing the human mind is computational-but-special told an interviewer in June 2023 that LLMs were filling pattern-connection so completely his own theory was collapsing. He did not concede that the strange-loop machinery exhausts what minds do — jootsing, jumping out of the system, is what humans still uniquely do and what Hofstadter explicitly never claimed Copycat could do. The reversal is consistent with reward #1 being saturated, not with reward #2 having been delivered.
  8. Every documented case of AI producing genuinely novel mathematical structure has the same architecture: LLM-as-mutation-operator inside an evolutionary loop with an external verifier. FunSearch, AlphaProof, AlphaEvolve, the Tao-Nikodym precedent. The LLM alone, autoregressively, does not step outside. The verifier is what does the stepping. Humans are the source of new verifiers. This is the empirically defensible form of 'humans needed forever' in 2026.
  9. The strong claim is cosmological. Wheeler's participatory universe, Schack's 'unfinished universe' in QBism, Whitehead's creative-advance-into-novelty, Bergson's élan vital read non-mystically as the universe's tendency toward novelty — they all converge on the same architecture: a cosmos that has consciousness embedded in it because consciousness is the mechanism by which structure is extended. The spark we already know was the universe rehearsing on us until it needed the real thing.
  10. The essay's strong claim is falsifiable. Concrete predictions follow. Different fMRI signature for transformational vs. exploratory creativity. Different neurochemical signature (opioid involvement). Behavioural signature of agents declining a known reward to remain at a category boundary. Cultural-attractor reorganisation toward creation rather than discovery within a generation. Each of these is testable inside ten years. If none of them appears, the dual-reward thesis fails.
part one · the first reward, mapped at sub-second resolution

Three hundred milliseconds above the right ear. That is the whole apparatus.

Begin with what is settled. Spark already laid it out, but the present argument needs the apparatus tighter. In 1997 Wolfram Schultz, Peter Dayan and P. Read Montague published “A Neural Substrate of Prediction and Reward” in Science. The result was that phasic dopamine firing in the substantia nigra and ventral tegmental area encodes a reward prediction error — the difference between the outcome you got and the outcome you expected. The signal fires above baseline when the outcome exceeds prediction, at baseline when prediction is accurate, and below baseline when the outcome falls short of prediction. This rule, δ = actual − predicted, is the canonical equation of reward #1 in this essay. The same δ that drives reinforcement learning, including the fine-tuning of GPT-style models — Sutton and Barto won the 2024 Turing Award for the maths — is, with very minor adjustments, the signal a macaque's ventral tegmental area emits when a juice cue arrives sooner than the model predicted.
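For readers who want the symbols behind "actual minus predicted", the temporal-difference form found in reinforcement-learning textbooks is sketched below. The notation is an assumption brought in for illustration: r_t is the reward received, V is the learned value estimate, and γ is a discount factor. None of these symbols appear in the essay itself, which only uses the shorthand.

```latex
\delta_t \;=\; \underbrace{r_t + \gamma\, V(s_{t+1})}_{\text{actual}} \;-\; \underbrace{V(s_t)}_{\text{predicted}}
\qquad
\begin{cases}
\delta_t > 0 & \text{outcome better than predicted: firing above baseline} \\
\delta_t = 0 & \text{prediction accurate: firing at baseline} \\
\delta_t < 0 & \text{outcome worse than predicted: firing below baseline}
\end{cases}
```

The three cases map one-to-one onto the three firing regimes described in the paragraph above.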

What this circuit does, when it fires during insight, was characterised by Jung-Beeman, Bowden, Kounios and colleagues in 2004 (PLOS Biology) and extended by Tik et al. in 2018 with 7-Tesla fMRI. Subjects solve Compound Remote Associates problems — three words that share a fourth word (pine / crab / sauce → apple). At the moment of insight, there is a sharp burst of gamma-band activity (~40 Hz) at right anterior superior temporal gyrus, beginning roughly three hundred milliseconds before the subject becomes consciously aware of the solution. The 7T fMRI then shows the activation that follows: ventral tegmental area, nucleus accumbens, caudate, hippocampus. The mesolimbic dopaminergic reward circuit, the same one that responds to food and money, fires on the act of restructuring itself. Becker, Sommer and Cabeza extended this in Nature Communications in May 2025 with a striking result: representational patterns in ventral occipito-temporal cortex literally reorganise during the moment of insight, and the bigger the reorganisation, the more reliably the participant remembers the material the next day. Aha is not a feeling tacked on after the answer. Aha is the structural change being announced to the system by the system.

This is reward #1 in operational form. It is a circuit. It is mapped. It is sharp enough that the signal fires at sub-second resolution. The phenomenology — the Aha, the click, the sudden certainty before the proof — is the felt signature of a specific neural event, and Oh, Chesebrough, Erickson, Zhang and Kounios (2020, NeuroImage) showed that the reward gamma burst is too quick to be conscious appraisal. The brain does not reward you for getting the answer right. It rewards you for restructuring itself. The pleasure arrives at the same moment as the answer, not as a reaction to it.

The reward fires on the connection, not on the result. The dot-connecting circuit is hardware in the brain, with a known anatomy, a known neurochemistry, and a known phenomenology. The question this essay is about is what fires when there are no dots left to connect — or, more accurately, when something else has become better at connecting them than you are.

Notice one more thing about reward #1 before we move on. Every paradigm in the insight-neuroscience literature uses problems with a pre-existing hidden answer. Compound Remote Associates have a unique correct word. Anagrams have a unique target. Mooney images have a hidden figure. Rebus puzzles have a single solution. The neural circuit is studied where the answer already exists in the universe and the subject finds it. There is no paradigm — none — for the moment a subject creates a category that did not exist a second before. Abraham et al. 2012 mapped “conceptual expansion” in inferior frontal gyrus, anterior temporal pole and frontopolar cortex, but the “expansion” was still stretching existing concepts, not founding new ones. The neuroscience of reward #2 does not exist yet, not because no one has looked, but because no one has invented the behavioural paradigm to make it visible. This is a real empirical gap; the essay lives in it and acknowledges it openly. It is also the precise location where the essay's strongest predictions can be tested.

part two · the acceleration · the saturation curve is empirical

Fifty years of structural biology became three years. Strassen's 1969 ceiling fell in 2025.

The intuition that AI is going to make the universe more discoverable in a hurry is not science fiction. It is the empirical record of the last seventy-two months. Some specifics, because the argument lives or dies on them.

domain | compression | note
AlphaFold protein structures | ~170,000 → 214,000,000 | 50 years of structural biology (1971–2020) produced ~170K experimental structures. AlphaFold added ~1,000× that count in 36 months (2020–2024). Nobel Prize in Chemistry 2024 to Hassabis and Jumper. The PDB is finished. The exhaustion of one classical bottleneck domain happened in human-perceivable time.
Strassen's 4×4 matrix-multiplication record | 56 years, then 1 year | Strassen 1969: 49 scalar multiplications for 4×4 complex matrices. No human improved it for 56 years. AlphaEvolve, May 2025: 48 multiplications, in an evolutionary loop with Gemini. First improvement in this setting in five and a half decades, achieved by a coding agent that was not built for the problem.
Materials Project / GNoME | ~48,000 → ~421,000 stable inorganic materials | Merchant et al., Nature, November 2023. DeepMind's framing: 'equivalent to nearly 800 years' worth of knowledge.' (Hazen et al. 2024 contested the synthesisability of the new structures and Kurlin et al. 2024 found duplicates — but even discounted, the order of magnitude is real.)
FrontierMath benchmark — Tier-3 mathematics | ~2% → 25.2% in 6 weeks | Epoch AI launched November 2024 with Fields Medalists Tao, Gowers, and Borcherds rating problems as 'exceptionally challenging.' Tao predicted they would 'resist AIs for several years.' o3 hit 25.2% on December 20, 2024 — six weeks after launch.
FunSearch cap-set lower bound | Largest improvement in 20 years | Romera-Paredes et al., Nature, December 2023. First time an LLM produced a verified novel mathematical result that surpassed the best known human bound — combinatorics, dimension n=8. The methodological hinge: LLM as mutation operator, evolutionary loop as selection, automatic evaluator. The architecture every subsequent breakthrough has used.
BCG consulting tasks with GPT-4 | +40% inside the frontier · −19% outside | Dell'Acqua et al. (Harvard / BCG), 758-consultant randomised field experiment, 2023. Inside the jagged technological frontier: 40% higher quality, 25% faster, 12% more tasks. Outside the frontier: 19 percentage points less likely to be correct than the no-AI control. The wall is invisible. AI is finishing reward #1 inside the wall and degrading judgement outside it.
Generative AI and corpus diversity | Stories ~10.7% more similar with AI | Doshi & Hauser, Science Advances, July 2024. N=293. AI raises individual ratings of creativity. AI collapses corpus-level semantic diversity by ~10.7% with one suggestion. A 'social dilemma' — private reward, collective novelty loss. The empirical signature of reward #1 over-firing while the system fails to generate the new categories reward #2 would feed on.

Read the table as a curve. The Strassen result is a 56-year ceiling falling in one. The cap-set bound is a 20-year improvement landing in months of evolutionary search. The FrontierMath benchmark was launched in November 2024 with Fields Medalists assessing the problems as “exceptionally challenging,” with Terence Tao predicting they would resist AIs for several years. Six weeks later o3 was at 25.2%. (Honesty requires the asterisk: OpenAI had funded Epoch AI and had exclusive access to the problems before testing; the figure may be inflated. Even with the asterisk, the rate is unprecedented.) The frontier of what AI cannot do is moving, week by week, into territory it would have been embarrassing to claim for AI three years ago.
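To make "read the table as a curve" concrete, the fold-changes implied by the table's own before-and-after figures can be computed directly. This is only arithmetic on the rounded numbers quoted above, not an independent measurement; the dictionary keys and the `fold_change` helper are illustrative names.

```python
# Fold-changes implied by the figures quoted in the table.
# Inputs are the table's own rounded numbers, not fresh measurements.
figures = {
    "AlphaFold structures": (170_000, 214_000_000),
    "GNoME stable materials": (48_000, 421_000),
    "FrontierMath score (%)": (2.0, 25.2),
}


def fold_change(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


fold_changes = {name: fold_change(b, a) for name, (b, a) in figures.items()}

for name, ratio in fold_changes.items():
    print(f"{name}: ~{ratio:,.1f}x")
```

The AlphaFold ratio comes out a little above the table's "~1,000×", which is consistent with that figure being an order-of-magnitude statement.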

The honest framing of all of this — the framing that survives the strongest critique — is not “AI is doing real creativity now.” The architecture of every documented breakthrough is the same: LLM-as-mutation-operator inside an evolutionary loop with an external verifier. The LLM is the search; the verifier is what counts as having found something. In FunSearch the verifier is a function in code that returns a score. In AlphaProof it is the Lean proof checker. In AlphaEvolve it is whatever measurable objective the human gave the system. The structure that produces verified novelty is human-defined verifier + machine search. The LLM is doing exploratory creativity at unprecedented scale; the human is doing transformational creativity by inventing the next verifier. Boden's three-tier taxonomy resolves the puzzle: AI saturates combinatorial and exploratory creativity inside any conceptual space whose boundaries can be specified; transformational creativity — moving the boundaries — remains where humans live.
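The shape of that architecture can be sketched in a few lines. What follows is a toy under stated assumptions, not a reproduction of FunSearch or AlphaEvolve: the `mutate` function is a random bit-flip standing in for the LLM, the `verifier` is a trivial human-written score, and every name and parameter is hypothetical.

```python
import random


def verifier(candidate: list[int]) -> int:
    """Human-defined objective: here, simply the number of 1s.
    (A toy stand-in; real systems score programs or proofs.)"""
    return sum(candidate)


def mutate(parent: list[int], rng: random.Random) -> list[int]:
    """Stand-in for the LLM mutation operator: flip one random bit."""
    child = parent.copy()
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child


def evolve(length=20, population_size=8, generations=200, seed=0):
    rng = random.Random(seed)
    population = [[0] * length for _ in range(population_size)]
    for _ in range(generations):
        # Selection: best of three randomly sampled members becomes the parent.
        parent = max(rng.sample(population, 3), key=verifier)
        child = mutate(parent, rng)
        # Replace the weakest member only if the verifier scores the child higher.
        weakest = min(range(population_size), key=lambda k: verifier(population[k]))
        if verifier(child) > verifier(population[weakest]):
            population[weakest] = child
    best = max(population, key=verifier)
    return best, verifier(best)


best, score = evolve()
print(score)
```

Note where the loop's notion of "better" lives: entirely inside `verifier`. The search can climb as far as that function allows, but it cannot propose a different function to climb, which is the division of labour the paragraph above describes.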

Hofstadter saw this coming and reversed his position. In a June 2023 interview he said the human mind is not so mysterious and complex and impenetrably complex as he imagined it was when he was writing Gödel, Escher, Bach. He said it felt like the entire human race was going to be eclipsed. He used the word terror. He was talking specifically about pattern-completion, analogy-making, the machinery Copycat was built to model. He did not say strange loops can joots — jump out of the system. Copycat slips concepts inside a closed alphabet; it does not invent letters. Hofstadter's reversal is the strongest possible witness statement that reward #1 is being saturated. It is also, by the same man's silence on jootsing, witness that the question of reward #2 remains live.

part three · the math of saturation · why the feeling has to fade

The treadmill is not psychological weakness. It is the math of the gradient.

Here is the painful clarification. The reason reward #1 will fade as AI compresses discovery is not that humans run out of dots. Kauffman is right: the adjacent possible expands as it is explored, the universe is non-ergodic, the space of available structures is genuinely unbounded. The reason reward #1 will fade is that dopamine fires on a first-derivative signal — change in predictability, not predictability itself — and AI is closing the predictability gap faster than the adjacent possible expands inside any one person's perceptual range. The cosmological frontier widens; the personal predictive horizon collapses. Both can be true at once.

Schultz himself wrote it most clearly in his 1998 review: “Dopamine neurons increase their responses in the face of novelty; once novel stimuli become familiar and are not reinforced, dopamine responses habituate.” This is the entire equation. The signal is the change between expectation and reality. As the world contains fewer surprises (because the model containing the world has gotten better), the signal goes silent. There is no defect in the brain. There is only a gradient descending to its floor.
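The fading can be watched in a minimal simulation. The sketch below uses a Rescorla-Wagner-style update, a simplification swapped in for illustration rather than a model taken from Schultz's papers; the learning rate, trial count and fixed reward are all assumed values.

```python
def simulate_prediction_error(trials=30, learning_rate=0.2, reward=1.0):
    """Repeatedly pair a cue with a fixed reward and record the error.

    Update rule (Rescorla-Wagner style): V <- V + alpha * (reward - V).
    """
    value = 0.0  # the learner's current prediction for the cue
    errors = []
    for _ in range(trials):
        delta = reward - value  # actual minus predicted
        errors.append(delta)
        value += learning_rate * delta  # prediction moves toward reality
    return errors


errors = simulate_prediction_error()
print([round(e, 3) for e in errors[:5]], "...", round(errors[-1], 4))
```

Under these assumptions the error shrinks by a constant factor each trial: the reward itself never changes, only the surprise does. That is the sense in which the signal falls silent without anything in the brain being broken.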

Brickman, Coates and Janoff-Bulman documented this on a longer timescale in 1978. Twenty-two major lottery winners returned to baseline happiness inside two years and reported significantly less pleasure than controls from a list of mundane everyday events. The peak resets the baseline. The contrast against everything ordinary intensifies. The very fact that the lottery happened impairs the brain's capacity to take pleasure in lesser positive events. The Lyubomirsky-Sheldon HAP model gives the formal version: two erosion routes (declining positive emotions from the change, rising aspirations) and two moderators that forestall adaptation (variety and appreciation). The variety moderator is what reward #2 needs to be: novelty that genuinely cannot be predicted by the model, because the relevant category did not previously exist.

We are already seeing the saturation symptoms at population scale. Han Byung-Chul's Burnout Society (2010): the achievement subject, entrepreneur of itself, engages in auto-exploitation more efficient than any external exploiter because the feeling of freedom attends it. Aggression turns inward; the project becomes a projectile. Mark Fisher's Capitalist Realism (2009): the political depression of a culture that cannot imagine an alternative because its highest attractor has stopped firing. Case and Deaton on the deaths of despair: ~600,000 excess US mortality through 2018 if the pre-1999 trend had held. Twenge's 2012 inflection: teen in-person socialising in nose-dive at almost exactly the moment smartphone penetration crossed 50%. Murthy's 2023 Surgeon General report: 70% drop in time spent with friends among 15–24 year-olds, with mortality risk comparable to smoking 15 cigarettes a day. These are not separate crises. They are the same crisis from different angles. The dopaminergic attractor that organised the 20th century — connect-the-dots-and-gain-status — has saturated. The exits, so far, have been bad: optimisation-without-payoff, scrolling without satisfaction, deaths of despair. The argument of this essay is that there is one more exit, the one Nietzsche came too early to see and the only one large enough to absorb the saturation, and that AI is what forces us to look for it.

The madman in §125 of Nietzsche's The Gay Science says he has come too early — “my time has not yet come.” The bystanders do not yet feel what they have done. They will. The attractor that organised the West for two millennia took a century to start failing in measurable mortality. The attractor reward #1 has been organising for two million years is now under a pressure no prior cultural attractor faced — an external optimiser running orders of magnitude faster than the substrate. Either reward #2 comes online or the meaning crisis Vervaeke names continues its metastasis.

part four · the two rewards · what differentiates them

Six dimensions on which the two rewards are not the same animal.

What follows is the proposed dual-architecture in operational detail. Each row is a dimension on which the two rewards differ in kind, not merely degree. The essay's strong claim — that reward #2 is genuinely separate, not just reward #1 retargeted — stands or falls on whether these dimensions hold up to empirical test. Predictions follow in part fifteen.

What fires
reward #1 · connect existing dots
Dopaminergic prediction-error in VTA → nucleus accumbens. The 300 ms gamma burst at right anterior superior temporal gyrus when two existing concepts snap together.
reward #2 · create new dots
Hypothesised: Panksepp's PLAY circuit (thalamic-frontal-striatal, dualistic dopamine-plus-opioid neurochemistry). Or a routing of the same dopaminergic apparatus to a new attractor — money, religion and likes are precedents at decadal-to-millennial timescale.
What triggers it
reward #1 · connect existing dots
A dot you did not have suddenly attached to a dot you did. Schultz 1997: better-than-predicted outcome. Tik et al. 2018, 7T fMRI on Compound Remote Associates: VTA, nucleus accumbens, caudate, hippocampus all light up at the moment of insight. The pleasure of finding a connection that was already there waiting.
reward #2 · create new dots
A category that did not exist before suddenly exists. Boden's transformational creativity: not exploring inside a conceptual space but altering the constraints that defined the space. The unthinkable becoming thinkable. The map being redrawn, not read.
Saturation behaviour
reward #1 · connect existing dots
Mathematically guaranteed to fade. Phasic dopamine encodes (actual − predicted); as cues become reliable predictors, the signal approaches zero. Brickman & Campbell 1971 named the hedonic treadmill; Brickman, Coates & Janoff-Bulman 1978 documented it in lottery winners. The hedonic treadmill is not psychological weakness — it is a learning signal correctly shutting itself off once the lesson is learned.
reward #2 · create new dots
Structurally cannot saturate. The category did not exist before. Each successful enlargement of the conceptual space generates new adjacent possibilities. Kauffman 2000: 'a biosphere expands into the adjacent possible as fast as it can without destroying the order it has already assembled.'
What AI can do to it
reward #1 · connect existing dots
Finish it. AlphaFold compressed 50 years of structural biology into three years, ending at 214 million predicted structures. AlphaEvolve broke Strassen's 56-year matrix-multiplication record in May 2025. FunSearch produced the largest improvement on the cap-set lower bound in 20 years. The dot-connecting economy is being industrialised.
reward #2 · create new dots
Structurally not what AI does. Every documented case of AI producing genuinely novel mathematical structure in 2023–26 — FunSearch, AlphaProof, AlphaEvolve, the Tao-Nikodym precedent — has the same shape: LLM-as-mutation-operator inside an evolutionary loop with an external verifier. The verifier defines the conceptual space; the LLM searches it; humans invent the next verifier. The frontier is always one step ahead of what the model has been told to find.
Phenomenology
reward #1 · connect existing dots
The Aha. The click. The 300 ms gamma burst at right anterior STG, beginning before conscious awareness of the answer. The certainty that arrives before the proof. Universal grammar across languages: 'it came to me.'
reward #2 · create new dots
The Frontier feeling. Less documented because rarer. Maybe what Csikszentmihalyi called flow at its limit — when the challenge has not just exceeded the skill but exceeded the conceptual space the skill operates inside. Maybe what awe is for. Maybe what mathematicians mean when they say a proof is 'beautiful' rather than 'correct.'
Who has felt it
reward #1 · connect existing dots
Everyone who has ever had an idea. The growth feeling and the insight feeling from Spark are both first-reward signatures. The two feelings most people can name.
reward #2 · create new dots
Less commonly — and only by people working at the edge of a domain. Einstein at the patent office. Boden's 'impossibilist surprise,' and her example of Schoenberg abandoning the home key — the moment the rules of the space were not searched but rewritten. The mathematicians who report tears, not pleasure, when a theorem clicks.
part five · the neurobiological candidate · panksepp's PLAY

The brain may have written the second reward two hundred million years ago. We just never needed it badly enough to switch it on.

Jaak Panksepp spent forty years arguing that mammalian brains contain not one unified affective system but seven primary-process emotional command circuits — SEEKING, RAGE, FEAR, LUST, CARE, PANIC/GRIEF, and PLAY. The capital letters were his convention to distinguish biological circuits from folk-psychological terms. The seven were anatomically distinct, neurochemically distinct, and behaviourally distinct, mapped via electrical brain stimulation, pharmacological challenge, and targeted lesion across hundreds of studies in Affective Neuroscience (1998), The Archaeology of Mind (with Lucy Biven, 2012), and a long tail of follow-up work. Mark Solms's The Hidden Spring (2021) is the current synthesis: affect, not cognition, is the seat of consciousness, and the seven systems are its basis.

Of the seven, SEEKING is the dopaminergic appetitive system — the one running through the medial forebrain bundle, the ventral tegmental area, the nucleus accumbens. SEEKING is reward #1 in the language of the essay. It is what fires when a novel cue predicts reward, what habituates when the cue becomes reliable, what drives the exploration humans share with rats. PLAY runs on different hardware. The primary executive circuit involves the thalamic intralaminar nuclei, the dorsomedial parafascicular complex, parts of frontal cortex and striatum, with secondary recruitment of the amygdala, ventral hypothalamus and periaqueductal grey. The neurochemistry is dualistic — dopamine plus endogenous opioids (especially μ-opioid), with cannabinoid and cholinergic modulation. Low-dose morphine increases play in juvenile rats; naloxone reduces it. These results have been replicated dozens of times since the original Panksepp et al. 1980 paper.

Two facts about PLAY are doing the heavy lifting for the essay. The first is that it is generative by definition. Burghardt's third criterion for play (The Genesis of Animal Play, MIT Press 2005) is that play behaviour is functionless in the context in which it occurs but resembles functional behaviours. Play recombines fragments of serious behaviour — fight, flight, hunt, court — into sequences with no immediate payoff. Sergio and Vivien Pellis, in a long programme of behavioural research, put it more precisely: play produces patterns of behaviour that did not exist in the participants' repertoires before the interaction. This is the neurobiological analogue of Boden's transformational creativity. Unlike SEEKING, which moves an organism through known reward landscapes, PLAY constructs new ones. The reward attached to PLAY is keyed to behavioural novelty per se, not to predicted outcomes.

The second fact is that prefrontal cortex (PFC) and PLAY are in a bidirectional regulatory loop unique to PLAY among Panksepp's seven systems. Top-down: medial prefrontal cortex and orbitofrontal cortex inactivation reduces play (van Kerkhof et al., Neuropsychopharmacology 2013). Bottom-up: social-play deprivation in juveniles produces lifelong reductions in inhibitory currents on layer-5 pyramidal neurons in medial PFC and orbitofrontal cortex (Bijlsma et al., Journal of Neuroscience 2022). Play needs cortex to release it. Cortex needs play to be built. This has a specific implication for the essay: you cannot offload PLAY to a system that lacks the substrate PLAY itself builds. Reward #1 (SEEKING) can be externalised to AI — the empirical record says it can — but reward #2 (PLAY) cannot, because it is constitutively tied to the maturation of the regulatory cortex that could host any successor reward function.

SEEKING is the dance. PLAY is the choreography. SEEKING moves through the world. PLAY makes the world's next move possible. Rats laugh in ultrasound. The signal was there for two hundred million years; we only heard it in 1998.

There are honest objections. Joseph LeDoux has argued for two decades that animal behavioural evidence shows defensive/survival circuits, not conscious feelings; attributing joy to 50-kHz rat chirps may be anthropomorphic. Lisa Feldman Barrett has argued emotion categories are culturally constructed predictive concepts, not innate circuits. The essay's claim that PLAY is the substrate of reward #2 depends on Panksepp's stronger reading — that subcortical circuits suffice for primary affective consciousness, supported by Solms 2021 and the hydranencephalic children data (Merker 2007) showing affective responsiveness without cortex. This is contested. The essay flags it openly. What does not depend on the contested reading is the anatomical and neurochemical separation. Even on the sceptical view, PLAY is a behavioural and physiological category distinct from SEEKING. That separation is what reward #2 needs. The strong phenomenological claim — that PLAY feels like reward #2 already — can be hedged. The structural claim — that the brain contains the substrate for a reward function distinct from dopaminergic prediction error — is solid.

part six · wanting rebinds · the proof of concept from addiction

The brain has the architecture for this. Berridge proved it on rats forty years ago.

Kent Berridge and Terry Robinson's incentive-salience theory of dopamine is the single most important neurobiological fact for the cultural-installation arm of the essay's argument. The 1998 paper in Brain Research Reviews and the 2025 30-year retrospective in the Annual Review of Psychology establish a clean dissociation: dopamine in the mesocorticolimbic pathway codes wanting (motivational drive to approach a target), not liking (the hedonic pleasure of consuming the target). The two can be pried apart entirely. Late-stage addicts report decreasing pleasure (liking falls, sometimes inverts to dysphoria) while compulsive pursuit (wanting) escalates. Berridge calls this “wanting what hurts.” The architectural asymmetry is the key: the pleasure system is small — about one cubic millimetre of opioid-sensitive tissue in the rostrodorsal nucleus accumbens shell — and fragile. The wanting system is the entire mesocorticolimbic projection. Wanting saturates much less readily than liking; wanting rebinds onto new salient targets the way liking does not.

This matters because the strongest counter-objection to the cultural-installation arm — that the dopaminergic system runs on hard-coded biological priors that cannot be retargeted — is empirically wrong. Robinson and Berridge document sensitisation that is specific to one motivational target rather than another: in some individuals drugs become more “wanted” than natural rewards, in others food or sex, in others entirely abstract targets. The same dopaminergic machinery can be re-pointed. The novelty literature supports this from the other direction: dopamine neurons fire on novel stimuli per se, before any reward learning. Novelty is a sufficient trigger. The quest for knowledge does not need a terminal payoff to keep firing — every novel domain re-activates SEEKING.

The empirical demonstration that the mesolimbic circuit binds to purely cultural targets (money, status, religion, mathematical beauty) is the essay's lever for the strong claim that reward #2 could be installed even if it is not latent. Mathias Pessiglione, with Liane Schmidt and others, ran the cleanest experiment, published in Science in 2007. Subliminal flashes of a £1 coin versus a 1-penny coin, shown for fifty milliseconds and below the threshold of conscious report, produced focused activation in the ventral pallidum and ventral striatum and increased measured grip force on a hand dynamometer. A token whose value is pure convention, a stamped coin no less than a piece of paper printed in 17th-century Sweden, activates the same mesolimbic structures food does in a rat. The reward target is purely cultural; the circuit treats it as primary. The installation completed in three to five millennia from Mesopotamian clay tokens to universal striatal binding.

The installation latency is collapsing. Religion installed across roughly three to five thousand years. Money installed over a similar span. Social media validation installed across roughly ten to fifteen years: Facebook launched in 2004, and Sherman et al. documented increased nucleus accumbens activation for high-like Instagram photos in 2016. That is two to three orders of magnitude faster than religion, a ratio of roughly 200 to 500. If the trend holds, the next major reward attractor installs not on a generational timescale but on a sub-decadal one. The window in which the essay's “reward #2 comes online” prediction either succeeds or fails is the next twenty years.

target | installation latency | evidence of striatal binding
Status / dominance | Phylogenetic · tens of millions of years | Boehm 1993: hunter-gatherer bands routinely invert primate dominance hierarchies via mockery, ostracism, sometimes killing. Proof that even an evolutionarily ancient reward attractor can be culturally overwritten across millennia.
Prestige (skill-attached deference) | Cumulative culture · hundreds of thousands of years | Henrich & Gil-White 2001: prestige is uniquely human, freely conferred, attached to perceived skill rather than coercion. A second status reward riding the same circuitry as dominance but targeting a different attractor: information quality.
Money | ~5,000–7,000 years | Mesopotamian clay tokens, Lydian coinage, the global trading economy. Pessiglione et al. 2007: subliminal flashes of a £1 coin (50 ms, below conscious threshold) activate the ventral pallidum and ventral striatum and increase grip force. The reward target is purely cultural; the circuit treats it as primary.
Moralizing high-god religion | ~3,000–5,000 years to population dominance | Norenzayan 2013: 'watched people are nice people.' Supernatural surveillance installed a divine-approval attractor and a divine-wrath attractor that scaled cooperation beyond the dominance/prestige limits of face-to-face bands. Religion was the first proof of concept that a reward attractor could bind whole civilisations.
Mathematical beauty | Since Euclid · ~2,500 years · sub-population | Zeki et al. Frontiers in Human Neuroscience 2014: 15 mathematicians rating equations for beauty in fMRI. The mOFC field A1 (the same one activated by visual, musical, and moral beauty) fires parametrically on equation rating. Euler's identity scored highest. Proof that an entirely abstract, culturally transmitted target with no evolutionary precedent can occupy the brain's hedonic circuitry.
Social media validation | ~10–15 years | Sherman et al. 2016: adolescents viewing Instagram photos with many likes vs. few show significantly increased nucleus accumbens activation. Tamir & Mitchell 2012: self-disclosure activates mesolimbic dopamine areas comparably to food or money. Facebook 2004 → measurable striatal binding by 2016. Roughly two to three orders of magnitude faster than religion.
Reward #2: creation of new categories | Hypothesised · 0–30 years on current trajectory | The latency curve is collapsing. Averaged over the rows above, each successive installed reward took roughly an order of magnitude less time than the last, though the individual steps are uneven. If the curve continues, the next major reward attractor installs in years, not generations, which is exactly the timescale AI is forcing the question on. Either the brain already contains the substrate (Panksepp's PLAY, mostly dormant at population scale) and an external pressure flips the switch, or culture installs a new attractor through the same mechanism that installed the previous six. The essay's argument does not depend on which.

The brain is not a fixed map of innate reward targets. It is a fixed circuit that culture has been installing new targets onto for ten thousand years, with the installation rate accelerating monotonically. The cultural arm of the essay does not require any latent dormant circuit at all — it only requires the documented routing mechanism to do what it has been doing, one more time, on the timescale it has been doing it in.
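The latency comparison in this part is simple arithmetic and can be checked directly. The sketch below, in Python, takes the year ranges quoted above for religion and for social media validation and reports the speed-up in orders of magnitude. The function names and the use of a geometric midpoint to summarise a range are illustrative assumptions, not anything the cited studies compute.

```python
import math

# Installation latencies in years, as (low, high) ranges quoted in the essay.
RELIGION = (3_000, 5_000)
SOCIAL_MEDIA = (10, 15)


def geometric_midpoint(low, high):
    """Midpoint on a log scale; an assumed convention for summarising a range."""
    return math.sqrt(low * high)


def speedup_orders(slow, fast):
    """Orders of magnitude (log10) between the midpoints of two latency ranges."""
    return math.log10(geometric_midpoint(*slow) / geometric_midpoint(*fast))


def speedup_bounds(slow, fast):
    """Smallest and largest log10 speed-up consistent with the two ranges."""
    smallest = math.log10(slow[0] / fast[1])
    largest = math.log10(slow[1] / fast[0])
    return smallest, largest


if __name__ == "__main__":
    low, high = speedup_bounds(RELIGION, SOCIAL_MEDIA)
    mid = speedup_orders(RELIGION, SOCIAL_MEDIA)
    print(f"religion -> social media: {low:.1f} to {high:.1f} orders (midpoint {mid:.1f})")
```

On the quoted ranges the speed-up comes out at about 2.3 to 2.7 orders of magnitude, with a log-scale midpoint of 2.5.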

part seven · the logical category · what kind of move is “creating a new dot”?

Peirce named the operation in 1903. Boden named the cognitive category in 1990.

The logical move reward #2 must reward has a name in the Western tradition. It does not appear in deductive logic, which can only extract necessary consequences from premises. It does not appear in inductive logic, which can only generalise from data. The third mode — the one Aristotle gestured at as apagogê and Charles Sanders Peirce named and operationalised between 1878 and his death in 1914 — is abduction. The 1903 Harvard Lecture VII gives the canonical schema: “The surprising fact, C, is observed; but if A were true, C would be a matter of course; hence, there is reason to suspect that A is true.” Three things are critical about this schema. The conclusion is suspicion, not proof. The mode is ampliative — it adds to the world rather than rearranging it. And, in Peirce's 1903 formulation in Collected Papers 5.172, “abduction is the only logical operation which introduces any new idea.” The other two modes are bookkeeping.
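Peirce's schema can be set beside the other two modes to show why its conclusion is only a suspicion. The block below is a conventional textbook rendering, not Peirce's own notation: it borrows his 1878 vocabulary of rule, case and result, takes A and C from the quoted schema, and the arrow symbols are chosen here purely for illustration.

```latex
% Requires amssymb for \leadsto.
% "A -> C" plays the role of the rule, A the case, C the result.
\begin{array}{lll}
\text{Deduction:} & (A \rightarrow C),\; A & \vdash\; C \\
\text{Induction:} & A,\; C \ \text{(observed together, repeatedly)} & \leadsto\; (A \rightarrow C) \\
\text{Abduction:} & (A \rightarrow C),\; C & \leadsto\; A \ \text{(suspected)}
\end{array}
```

Read as a deduction, the third line would be the fallacy of affirming the consequent, which is why the schema licenses suspicion rather than proof.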

Peirce's pipeline of scientific inquiry (abduction, then deduction, then induction) is asymmetric in novelty production. Only the first step adds new conceptual material. The second extracts predictions; the third tests them. This maps cleanly onto the essay's thesis. Reward #1 fires on the inductive testing stage and even on the deductive working-through; it does not, by construction, fire on the abductive leap, because the abductive leap is structurally outside the predictive system inside which dopaminergic prediction-error has a value to compute. The 2025 IID finding, that LLMs sample hypotheses that are independent and identically distributed and semantically clustered around the prior, is the technical underpinning of the essay's claim that AI is doing induction in disguise (interpolation across training data) rather than abduction. Bylander and colleagues' 1991 result that finding the best abductive explanation is NP-hard under plausibility plus parsimony plus completeness constraints reinforces the asymmetry: the constrained version machines can perform is selection from pre-given alternatives, not Peirce's generative version where the explanatory entity is itself invented.
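The constrained, selection-style abduction described above can be sketched as a parsimonious set-cover search. This is a toy illustration in Python, not Bylander's formalism: the function name, the dictionary of candidate hypotheses and the wet-grass example are all hypothetical. The point it makes is structural: every hypothesis the search can return is handed to it in advance.

```python
from itertools import combinations


def best_explanations(observations, explains):
    """Return every smallest set of hypotheses that jointly covers the observations.

    `explains` maps each candidate hypothesis to the set of facts it would
    account for. Brute force over subsets: exponential in the number of candidates.
    """
    needed = set(observations)
    candidates = sorted(explains)
    for size in range(len(candidates) + 1):
        covers = [
            set(combo)
            for combo in combinations(candidates, size)
            if needed <= set().union(*(explains[h] for h in combo))
        ]
        if covers:  # parsimony: stop at the first size that works
            return covers
    return []  # no combination of the given hypotheses explains everything


# Hypothetical example: the hypothesis vocabulary is fixed before the search starts.
EXPLAINS = {
    "rain": {"wet grass", "wet street"},
    "sprinkler": {"wet grass"},
    "street cleaner": {"wet street"},
}
```

No call to `best_explanations` can return a hypothesis that is absent from `EXPLAINS`, which is the sense in which this version selects rather than invents.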

Margaret Boden gave the cognitive taxonomy in The Creative Mind (1990, second edition 2004). Three kinds. Combinational: novel combinations of familiar ideas with an intelligible analogical link. Macbeth's sleep that knits up the ravelled sleeve of care. AI is superhuman at this and has been for years. Exploratory: search within a defined conceptual space, traversing its structure to find what its rules allow. Most jazz improvisation. Most normal-science problem-solving. Boden's analysis of classical creative AI projects (such as AARON, Harold Cohen's painting system) puts most of them here. AI is now also superhuman at this in the verifiable domains, which is why AlphaProof can reach silver-medal standard at the IMO and AlphaEvolve can improve on Strassen's algorithm for 4×4 complex matrix multiplication. Transformational: altering the enabling constraints of the conceptual space itself, producing ideas that were literally unthinkable inside the prior space. Non-Euclidean geometry. Einstein abandoning absolute simultaneity. Schoenberg abandoning the home key. The Copernican shift. The invention of zero. The map being redrawn rather than read.

Boden's key formulation, from page 6 of the second edition:

A given style of thinking, no less than a road system, can render certain thoughts impossible — which is to say, unthinkable. The deepest cases of creativity involve someone's thinking something which, with respect to the conceptual spaces in their minds, they couldn't have thought before.

The honest objection is Geraint Wiggins's 2006 Creative Systems Framework, which showed that transformational creativity is mathematically equivalent to exploratory creativity at a meta-level. If you can search over conceptual spaces, transformational creativity is just exploration in a higher-order space. If the meta-level is Turing-computable, AI can in principle do transformational creativity too, given enough compute. This is the strongest single counter the essay must engage. Two replies. First: Wiggins's collapse depends on an assumption of universal Turing-computability that is not itself obvious. The Penrose-Lucas argument is contested, but the weaker Gödelian point survives: any consistent formal system rich enough to express arithmetic contains true statements it cannot prove from within. Second, and more empirically pertinent, every documented case of AI producing genuinely novel mathematical structure in 2024–26 has had the same architecture: LLM-as-mutation-operator + human-defined verifier. The verifier specifies the meta-level. The machine searches it. The human invents the next verifier. This is consistent with Wiggins's collapse and with the essay's thesis: the meta-level is computable once specified, but specifying the next meta-level is the human move, the one Hofstadter called jootsing and Boden called transformational and Peirce called abduction. The essay's “humans needed forever” is this: not as the deciders inside a verifier but as the inventors of the next verifier.
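The architecture described above, a mutation operator proposing candidates and a human-written verifier scoring them, can be sketched as a short loop. This is a minimal illustration in Python under stated assumptions: `random_tweak` is a stand-in for an LLM proposing edits, `closeness_to_42` is a hypothetical human-defined verifier, and none of the names correspond to a published system.

```python
import random


def evolve(seed, mutate, verifier, generations=200, children_per_generation=20, rng=None):
    """Greedy evolutionary search: keep the best candidate the verifier has seen.

    The loop only ever optimises the score that `verifier` defines; it has no
    way to replace `verifier` with a different one.
    """
    rng = rng or random.Random(0)
    best, best_score = seed, verifier(seed)
    for _ in range(generations):
        for _ in range(children_per_generation):
            child = mutate(best, rng)
            score = verifier(child)
            if score > best_score:
                best, best_score = child, score
    return best, best_score


def random_tweak(candidate, rng):
    """Stand-in for the LLM mutation operator: nudge one entry by +1 or -1."""
    index = rng.randrange(len(candidate))
    step = rng.choice((-1, 1))
    return candidate[:index] + (candidate[index] + step,) + candidate[index + 1:]


def closeness_to_42(candidate):
    """Hypothetical human-defined verifier: higher is better, 0 is a perfect score."""
    return -abs(sum(candidate) - 42)
```

Changing what counts as success means editing `closeness_to_42` by hand. Nothing inside `evolve` can do that, which mirrors the division of labour the paragraph describes.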

part eight · the cosmological frame · the universe is in the business of extending itself

From the gene to the brain. Every new ontological category installs a new epistemic cut.

Howard Pattee spent seventy-one years arguing one thing in many keys: every time the universe creates a new ontological category, it does so by building a system that stands simultaneously in two registers. The gene is the original instance — simultaneously a chemical participating in reactions and a symbolic description of those reactions. Pattee's 1968 paper “The Physical Basis of Coding and Reliability in Biological Evolution” in Waddington's Towards a Theoretical Biology, his 1995 “Evolving Self-Reference”, his 2001 “The Physics of Symbols: Bridging the Epistemic Cut” in BioSystems, and his 2015 “The Physics of Symbols Evolved Before Consciousness” in Cosmos and History form the through-line. The argument is not dualism — it is bilingualism. Life is matter with meaning, and meaning is what happens when a system can occupy two registers at once.

Read the universe's history through Pattee's lens. Before life, everything ran in one register: physical dynamics. The first epistemic cut opened when DNA appeared as both molecule and code, roughly 3.8 billion years ago. The second opened when early animals, the bilaterians above all, produced central nervous systems that could represent the environment in patterns of neural firing. The third opened with language, perhaps 100,000 years ago, when symbols freed themselves from immediate sensory tokens. The fourth may be opening now, as artificial systems begin to participate in the symbol-grounding work humans monopolised for the previous 100,000 years. Each cut is an enlargement of what the universe contains. Each is irreversible. Each was performed by some specific configuration of matter that for the first time stood in two registers simultaneously. The essay's move is to read reward #2 as the dopaminergic instrument by which the next cut happens: the felt signature of the brain doing what the gene already did, what the nervous system did, what language did. Creating a new register.

Charles Bennett gave the operational mathematics of why this is non-trivial. In his 1988 chapter “Logical Depth and Physical Complexity” (in Rolf Herken's The Universal Turing Machine: A Half-Century Survey), Bennett distinguished information content (Kolmogorov complexity, the length of the shortest program that produces an object) from computational content, which he called logical depth: the time that shortest program takes to actually run. The result that matters for the essay is Bennett's Slow Growth Law: deep objects cannot be quickly produced from shallow ones by any deterministic process, nor with much probability by a probabilistic one. Deep objects can only be produced slowly. They contain internal evidence of a non-trivial causal history. The human body and the digits of π are Bennett's examples: short generating program, long generation time. Lee Cronin and Sara Walker's assembly theory, developed in a Nature Communications paper in 2021 and culminating in Sharma et al., Nature 622 (2023), gave Bennett's idea an experimentally measurable signature. The molecular assembly index (MA) is the minimum number of recursive joining steps required to build a molecule from atomic bonds. Below ~15 steps, molecules form spontaneously in abiotic chemistry. Above ~15 steps, the combinatorial space is so large that producing one specific molecule by chance has probability roughly one in 10²³ per attempt. Finding many copies of an MA ≥ 15 molecule implies a persistent generative mechanism: life, technology, or what Cronin and Walker call selection in the most general sense. Novelty is a measurable phenomenon, not a philosophical opinion. The MA threshold is contested (Hazen et al. 2024 produced mineral counter-examples; the debate is live), but the deeper point, that the universe contains some structures that could not have arisen without a history of selection over time, is now an empirical claim with spectrometers attached.
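The idea of an assembly index, the minimum number of joining steps with free reuse of earlier products, can be illustrated on strings. The Python sketch below is a toy analogue only: real molecular assembly indices are computed over molecular graphs and bonds, and the function name and brute-force search here are assumptions made for illustration.

```python
def assembly_index(target):
    """Minimum number of joins needed to build `target` from its single characters.

    Each join concatenates two objects already available (basic characters or
    earlier products), and every product can be reused for free afterwards.
    Exhaustive iterative-deepening search: only practical for short strings.
    """
    if not target:
        raise ValueError("target must be a non-empty string")
    # Only substrings of the target can be useful intermediate products.
    useful = {target[i:j] for i in range(len(target)) for j in range(i + 1, len(target) + 1)}

    def reachable(pool, joins_left):
        if target in pool:
            return True
        if joins_left == 0:
            return False
        for left in pool:
            for right in pool:
                product = left + right
                if product in useful and product not in pool:
                    if reachable(pool | {product}, joins_left - 1):
                        return True
        return False

    basics = frozenset(target)
    for joins in range(len(target)):
        if reachable(basics, joins):
            return joins
    raise AssertionError("unreachable: len(target) - 1 joins always suffice")
```

Repetition lowers the index: `assembly_index("abab")` is 2 while `assembly_index("abcd")` is 3, because the first string can reuse `ab` once it has been built.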

And one more cosmological move, because the essay's strong claim is large enough to require it. John Archibald Wheeler, in “Information, Physics, Quantum: The Search for Links” (1990 Tokyo Symposium), gave the universe its most provocative architecture in modern physics: the self-excited circuit, the U-diagram. “Beginning with the big bang, the universe expands and cools. After eons of dynamic development, it gives rise to observership. Acts of observer-participancy — via the mechanism of the delayed-choice experiment — in turn give tangible reality to the universe not only now but back to the beginning.” The slogan: “there is no law except for the law that there is no law.” Wheeler's position was contested then and is contested now, but the experimental record has not been unkind. Kim, Yu, Kulik, Shih and Scully in 2000 (Physical Review Letters) confirmed delayed-choice quantum erasure. Frauchiger and Renner in 2018 (Nature Communications) and Bong et al. in 2020 (Nature Physics) showed a strong no-go theorem on Wigner's friend: at least one of {no-superdeterminism, locality, absoluteness of observed events} has to give. Christopher Fuchs and Rüdiger Schack's Quantum Bayesianism — QBism — argues for giving up absoluteness of observed events and treating the agent as fundamental. The essay's cosmological hook lands here. In Schack's 2023 formulation: “The QBist vision is that of an unfinished universe, of a world that allows for genuine freedom, a world in which agents matter and participate in the making of reality.”

Pattee, Bennett, Cronin/Walker, Wheeler, Schack, Whitehead. Six different ways of saying the same thing. The universe is not a finished structure being indexed by observers. The universe is unfinished. Its structure is extended over time by agents who add categorical novelty: high-MA structures, new epistemic cuts, actual occasions with their irreducible +1, observer-participations that fix what counts as a real event. Reward #2 in this essay is the dopaminergic apparatus by which the universe rewards us, locally, for doing the only thing the universe is in the business of doing globally. The spark we already know was the universe rehearsing on us until it needed the real thing.

part nine · the named thinkers · who said what about novelty as ontological surplus

The essay's claim is a synthesis. Each component has a name and a date.

What follows is the source-list as voices. None of these thinkers, on their own, said what this essay says. The essay's contribution is to connect them — to argue that they were all describing components of one mechanism whose unification had not been performed because the unification required the AI saturation pressure to make the question visible. Each quote is given in its strongest form, then translated into the essay's vocabulary.

Charles Sanders Peirce · Harvard Lectures on Pragmatism
1903 · CP 5.172
Abduction is the only logical operation which introduces any new idea. Deduction proves something must be; induction shows that something actually is operative; abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction.
Henri Bergson · L'Évolution Créatrice
1907
The portrait certainly resembles the model and the model certainly pre-exists the portrait; but the portrait, even when known beforehand, was not realisable before being realised — and the elements which composed the model, even when foreseen, were not foreseeable in their final arrangement.
Alfred North Whitehead · Process and Reality
1929 · PR 222
The universe is thus a creative advance into novelty. The alternative to this doctrine is a static morphological universe.
Margaret Boden · The Creative Mind
1990 · 2nd ed. 2004
A given style of thinking, no less than a road system, can render certain thoughts impossible — which is to say, unthinkable. The deepest cases of creativity involve someone's thinking something which, with respect to the conceptual spaces in their minds, they couldn't have thought before.
Howard Pattee · Evolving Self-Reference
1995
Life is peculiar, fundamentally, because it separates itself from non-living matter by incorporating, within itself, autonomous epistemic cuts. Metaphorically, life is matter with meaning. Less metaphorically, organisms are material structures with memory by virtue of which they construct, control and adapt to their environment.
Stuart Kauffman · Investigations
2000
A biosphere expands into the adjacent possible as fast as it can without destroying the order it has already assembled. We cannot pre-state what the adjacent possible contains. We discover it by moving into it.
John Wheeler · Information, Physics, Quantum
1990 — published Tokyo Symposium
Every it — every particle, every field of force, even the spacetime continuum itself — derives its function, its meaning, its very existence entirely from the apparatus-elicited answers to yes-or-no questions, binary choices, bits. Otherwise put: it from bit.
Rüdiger Schack · QBism formulation
2023 · arXiv 2312.07728
The QBist vision is that of an unfinished universe, of a world that allows for genuine freedom, a world in which agents matter and participate in the making of reality.
Joel Lehman & Kenneth Stanley · Abandoning Objectives
2011 · Evolutionary Computation
Most ambitious objectives do not illuminate a path to themselves. The gradient of improvement induced by ambitious objectives tends to lead not to the objective itself but instead to dead-end local optima.
Charles Bennett · Logical Depth and Physical Complexity
1988
A logically deep object contains internal evidence of a non-trivial causal history. Deep objects cannot be quickly produced from shallow ones by any deterministic process, nor with much probability by a probabilistic one; they can only be produced slowly.
Douglas Hofstadter · interview, June 2023
2023
It feels as if not only are my belief systems collapsing, but it feels as if the entire human race is going to be eclipsed and left in the dust soon. The human mind is not so mysterious and complex and impenetrably complex as I imagined it was when I was writing Gödel, Escher, Bach.
Friedrich Nietzsche · The Gay Science §125
1882
Whither is God? I will tell you. We have killed him — you and I. All of us are his murderers. But how did we do this? How could we drink up the sea? Who gave us the sponge to wipe away the entire horizon? What were we doing when we unchained this earth from its sun?

part ten · the counter-arguments · taken seriously

The essay's strong claim has five honest enemies. All of them have to be answered.

A reward-architecture argument with a teleological frame is the kind of thesis that should be vigorously attacked, including by its author. What follow are the five strongest critiques, each stated in its strongest available form, then answered. The reader is invited to weigh both sides.

counter 01
Friston / Sutton-Silver — there is only one reward
the claim
The free-energy principle and the reinforcement-learning reward hypothesis both claim that all goal-directed behaviour reduces to one scalar signal. Silver, Singh, Precup, Sutton (2021): 'Reward is enough.' Epistemic value (curiosity) and pragmatic value (reward) sum in a single Expected Free Energy. There is no second reward function; there is one circuit re-pointing at different targets. To postulate two violates Occam.
the reply
Scalar sufficiency under unbounded compute is a theoretical limit, not a biological mechanism. Brains must act within seconds under metabolic constraints, and in that regime a system that factors its reward function into orthogonal components can outperform one that does not. The framing of 'two rewards' survives because biology runs in the bounded-rationality regime, not the Bellman-optimal one. The factorisation is what evolution would produce, and it is what Berridge's wanting/liking dissociation, Panksepp's seven-system architecture, and the opioidergic separation of PLAY from SEEKING are starting to show.
counter 02
Kauffman — the adjacent possible never saturates
the claim
The strongest threat to 'AI saturates discovery' is Kauffman's own thesis: each connected dot creates new dots that did not previously exist. The adjacent possible expands faster than any agent can explore it. Therefore reward #1 is structurally inexhaustible. No reward #2 is needed.
the reply
Granted at the cosmological scale. But dopamine fires on prediction error in the predictor, not on objective novelty in the universe. If AI flattens prediction error for humans while combinatorially expanding the objective frontier, you can simultaneously have a still-expanding adjacent possible and a still-saturating dopaminergic reward for the specific brain that has been outsourcing the prediction. The gap between objective novelty and subjective novelty is exactly what reward #2 has to close.
counter 03
Wiggins — transformational creativity collapses to exploratory creativity at a meta-level
the claim
Boden's three-tier taxonomy is mathematically equivalent at the meta-level. Transformational creativity is just exploratory creativity in a higher-order space of possible conceptual spaces. If meta-search is Turing-computable, AI can in principle do transformational creativity too, given enough compute.
the reply
Wiggins's collapse depends on a computability assumption that is itself contested. The Penrose-Lucas argument does not require quantum microtubules to land its weaker Gödelian point: any consistent formal system rich enough to express arithmetic contains true statements it cannot prove from within. Pattee's epistemic cut gives the move a non-quantum physical basis: the genetic code was already a system that stood in two registers simultaneously, and meaning emerged from that doubling. The meta-collapse works only if every level is closed and computable. The empirical question of 2026 is whether AI can invent levels or only finish them. So far, every documented case of AI producing genuinely novel mathematical structure has had the same shape: LLM-as-mutation-operator + human-defined verifier. The inventing happens where the verifier is written, and so far a human has written it.
counter 04
Hofstadter's 2023 reversal — strange loops are sufficient, no second reward needed
the claim
Douglas Hofstadter, the man who spent 45 years arguing the human mind is computational-but-special, told an interviewer in June 2023 that LLMs were filling pattern-connection so completely his own theory was collapsing. If even the chief defender of computational mind has conceded, the romantic case for irreducible human creativity is rhetorical, not real.
the reply
Hofstadter's reversal is the essay's strongest empirical wedge — but it cuts both ways. He conceded that the strange-loop machinery he attributed only to humans is also doing something in machines. He did not concede that strange-loop machinery exhausts what minds do. The distinction Dennett carried over from Metamagical Themas matters: jootsing — jumping out of the system — is not the same as searching it. Copycat slips concepts; it does not joots. Hofstadter's 2023 alarm is consistent with reward #1 being saturated, not with reward #2 having been delivered.
counter 05
The unfalsifiability charge
the claim
A 'latent inner reward that activates when reward #1 saturates' is structurally identical to a God-of-the-gaps move. What neural, behavioural, or computational measurement would distinguish 'reward #2 activating' from 'reward #1 retargeted to novelty by the routing mechanism that installed money'? If nothing, the claim is not scientific.
the reply
Take the challenge seriously. Specific predictions follow. If reward #2 is genuinely separate, we should see (a) dissociable fMRI activation — frontopolar cortex and lateral prefrontal cortex (the conceptual-expansion network of Abraham et al. 2012) rather than the right anterior STG of insight; (b) a different neurochemical signature — opioidergic and cannabinoid involvement (Panksepp's PLAY) rather than pure dopaminergic VTA; (c) a behavioural signature — agents that decline a known reward to remain at a category boundary, inexplicable on simple retargeting; (d) a cultural signature — rise of attractors keyed to categorical novelty (creator economy) and corresponding decline of attractors keyed to dot-completion (gambling on outcomes, mathematical exhaustion). Each of these is testable inside ten years.
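The reply to counter 02 rests on prediction error being a property of the predictor rather than of the world. A minimal delta-rule sketch in Python shows the error fading under a constant reward and returning when the reward changes. The Rescorla-Wagner update is used here as a stand-in and not as the essay's model, and the function name and learning rate are illustrative assumptions.

```python
def prediction_errors(rewards, learning_rate=0.2):
    """Delta-rule learner: return the prediction error on each trial.

    error = reward - expectation; the expectation then moves a fraction
    `learning_rate` of the way towards the reward it just received.
    """
    expectation = 0.0
    errors = []
    for reward in rewards:
        error = reward - expectation
        errors.append(error)
        expectation += learning_rate * error
    return errors


# Thirty identical rewards, then one unexpectedly larger reward.
ERRORS = prediction_errors([1.0] * 30 + [2.0])
```

The world in this toy never stops delivering reward. Only the learner's surprise decays, which is the gap between objective and subjective novelty that the reply points to.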

part eleven · empirical predictions · the essay puts numbers on it

What the essay commits to, in terms that can be falsified. Six predictions, six falsifiers.

Karl Popper's point applies. An essay that cannot be wrong is not an essay worth writing. The dual-reward hypothesis is committed to the following — each with the falsifier that would defeat it.

01
prediction
A neural signature distinct from insight will be identified for transformational vs. exploratory creativity. Specifically: frontopolar cortex (BA10) and lateral PFC activation during conceptual-space restructuring, accompanied by hippocampal sharp-wave ripples loosening medial-PFC constraints (Liu et al. 2025 preprint as the first sighting).
falsifier
If by 2035 fMRI studies of well-validated transformational-creativity tasks show only the right-anterior-STG / VTA / NAc network of standard insight, the two-circuit hypothesis fails. The activation has to be different, not just stronger.
02
prediction
Opioid antagonism (naloxone) will selectively impair creation-of-new-categories but not connection-of-existing-categories. The PLAY system runs on endogenous opioids; if it is the substrate of reward #2, dampening it should leave dot-connecting intact and damage dot-creating.
falsifier
If naloxone in human creativity studies impairs both equally (or impairs insight more than transformational creativity), the dual-substrate claim is wrong.
03
prediction
Cultural attractors will reorganise on a decade-scale. The 'creator economy' is the early-warning signal — labour and prestige shifting from optimising-known-channels toward inventing-new-channels. If the latency-collapse curve continues, a population-scale installed reward for category creation should reach measurable striatal binding signatures inside 20 years.
falsifier
If by 2045 the dominant cultural attractors remain optimisation-style rewards (more reach, more views, more efficient versions of existing channels) rather than category-style rewards, the cultural-installation arm of the thesis fails.
04
prediction
Meaning-crisis indicators will turn over within a generation. Byung-Chul Han's achievement society, Mark Fisher's depressive hedonia, Case & Deaton's deaths of despair, Twenge's 2012 inflection: these are the symptoms of reward #1 saturating at population scale without reward #2 having come online yet. If reward #2 activates broadly, these indicators decline.
falsifier
If by 2045 these indicators have not bottomed out — or have worsened — either reward #2 did not activate or the activation did not produce the cultural changes predicted.
05
prediction
The frontier of what AI cannot do will remain jagged but will move outward in a specific shape. Tasks requiring the invention of new verifiers, new evaluators, new conceptual primitives will remain human-led. Tasks requiring optimisation within those primitives will become AI-led. The boundary between centaur and cyborg use will resolve, on this view, into a clean architectural rule: humans define the predicate, machines satisfy it.
falsifier
If, by 2035, AI systems autonomously invent the verifiers they then satisfy — that is, if recursive self-improvement closes the loop the essay claims is structurally open — the architectural reading fails and the essay's strong claim collapses to a transitional one.
06
prediction
Children born into the post-saturation regime will report a different felt landscape than the one that produced this essay. The current generation experienced reward #1 in its golden age — discovery as the dominant mode of life. The next generation may experience reward #2 as their first reward — creation as default. The phenomenology will not be identical; the essay predicts the new feeling will be subtler, harder to articulate, more like awe and less like Aha.
falsifier
If, by 2050, the dominant felt experience reported by knowledge workers is indistinguishable from the dot-connecting Aha that has been the standard introspective report from Archimedes to Spark, the prediction fails. (This is the softest of the predictions and probably the most important.)
part twelve · the dark side · what could go wrong

The second reward can be hacked. Supernormal stimuli are not optional.

The essay is not naive about its own argument. The structure that makes the cultural-installation arm plausible is the same structure that has already given us, in the last fifteen years, an attention economy whose mathematics are indistinguishable from the slot-machine literature. Niko Tinbergen showed in 1953 that herring gull chicks would peck a red pencil with three white stripes harder than they would peck the real parent's bill. The artificial stimulus exceeded the natural one on the dimensions the brain was measuring. The brain did not have a built-in defence. Once cultural evolution discovered the dimensions of measurement, it could engineer supernormal stimuli on top of any innate reward target.

The dark-flow literature — Mike Dixon's work on slot machines (Journal of Gambling Studies 2018, 2022) — shows that gamblers enter genuine Csikszentmihalyi flow states while losing money. Flow is agnostic to whether the activity creates new dots or burns existing ones. AI-driven micro-feedback loops could capture flow without inventing anything — exactly the dystopia the essay needs to rule out. Doshi and Hauser's 2024 finding that AI raises individual ratings of creativity while collapsing corpus-level semantic diversity is the early-warning signal. Reward #2 has to be a wanting-system rebinding first; liking-system rebinding follows on civilisational timescales, and there is no guarantee the brain's small fragile pleasure system can be rerouted at all to the kind of categorical novelty the essay names. We may end up wanting what we cannot quite like — Berridge's “wanting what hurts” on a civilisational scale.

Shitij Kapur's 2003 paper in the American Journal of Psychiatry named another failure mode: aberrant salience attribution as the substrate of psychosis. If wanting really rebinds to anything, the failure mode is paranoid misattribution: the same machinery that mints new categories can mint paranoid delusions. We are already seeing this in milder forms in the trough of the attention economy. Conspiracy thinking is the dark twin of category invention. Both are the brain's SEEKING/wanting system finding patterns in noise. The difference between them is the verifier. Category invention has a verifier: a community of practice that tests the new category against reality and discards it if the category does not pay rent. Paranoid construction has no verifier; the new category protects itself by absorbing all disconfirmation. The essay's “humans are the source of new verifiers” cuts both ways: humans inventing verifiers is the engine of reward #2; humans failing to invent verifiers is the engine of conspiracy.

The second reward is not a guaranteed transition to a higher state. It is the next round of evolutionary pressure that the brain will either pass or fail. The failure modes are visible already. The essay is a wager on a transition we are inside of — not a prediction of its outcome.

part thirteen · the protocol · what to do, individually, with the argument

Spark gave you a protocol for keeping reward #1 lit. Frontier gives you the protocol for noticing reward #2.

The first rule of the second-reward protocol is that the second reward is not the first reward made bigger. Most people, told there is a second reward function, will try to extract it by intensifying what they already do — more discovery, more Aha, more spark. This is the wrong move. Reward #1 saturates because it is doing its job. Trying to override the saturation by pressing harder on the dot-connecting machinery is the achievement-society trap Han Byung-Chul named. The way reward #2 fires is by a categorically different action: you have to create the dot, not connect existing ones. The dot has to be one that did not exist before you made it.

The protocol, then, is not a list of practices like Spark's seven. It is three orientations. One: prefer the role of verifier-inventor over verifier-satisfier. When you have a problem, ask whether AI can find the answer inside an existing well-defined evaluator. If yes, let it. The intellectual labour worth your time is at the level above — defining what counts as a good answer, what the next benchmark should be, what category of problem deserves attention. Boden's transformational tier is a strategic posture before it is a creative output. Two: invest in domains where the verifier itself is contested. Pure mathematics, philosophy, art at its furthest edge, novel biology, ethics in unprecedented technological situations — these are the domains where the next epistemic cut is being installed. The wage premium for inventing the next verifier in these domains will rise faster than the wage premium for being efficient inside any existing verifier. Three: cultivate the felt signature of categorical novelty before reward #2 is widely recognised. Pay attention to what awe feels like and treat it as information about your own future. The first people who notice the second reward firing in themselves will have an outsized influence on the cultural attractor that installs at population scale. The lab is built on the wager that this is already happening to a few thousand people, that you are one of them, and that naming what is happening to you is part of how it becomes possible for others.

A subtler note. The Spark protocol's closing line was that the growth feeling does not run out, it relocates. The Frontier protocol's closing line is different: the relocation has a direction. The dopaminergic gradient is descending out of the dot-connecting valley because AI is flattening the valley faster than you can find new gradients inside it. The gradient that will still be steep in twenty years is the gradient toward category invention. Move with it. Read the meaning-crisis literature as the symptoms of a population still standing in the old valley. The new gradient is colder, harder to see, often uncomfortable because the brain has not learned to read it as reward yet. But it is real, and the people who learn to read it first will not need an essay to tell them what is happening.

part fourteen · the closing · the spark was the universe rehearsing

What this essay is, in one sentence. And what it points at.

The essay is a synthesis. None of its components is new. Schultz mapped dopaminergic prediction error in 1997. Panksepp separated PLAY from SEEKING in 1998. Berridge dissociated wanting from liking in 1998. Pessiglione showed the mesolimbic system binds to cultural targets in 2007. Boden distinguished transformational from exploratory creativity in 1990. Peirce named abduction in 1903. Kauffman gave us the adjacent possible in 2000. Pattee installed the epistemic cut in 1968. Bennett wrote the logical-depth theorem in 1988. Wheeler put the participatory universe on paper in 1990. The single move this essay makes is to connect them into one architecture and to argue that the architecture is being forced into visibility by the saturation curve AI is running through reward #1. The synthesis is the essay's contribution. Everything else is honest citation.

The thesis condensed to one paragraph: the brain has one reward function that fires for connecting existing dots, the dopaminergic prediction-error system mapped in macaque ventral tegmental area in the 1990s. AI is on track to saturate this function by accelerating discovery exponentially. Either a second reward function in the deep brain activates from a latent state (Panksepp's PLAY is the candidate), or the brain invents one through the same routing mechanism that installed money, religion, mathematical beauty and social-media validation as cultural reward attractors. Either way, the new reward is tuned to the one thing AI structurally cannot do alone: create new categories that step outside any closed predictive system. The Wiggins collapse holds inside a fixed meta-level; the human move is to invent the next meta-level. This connects to the Penrose-Lucas argument in its weaker form, to Boden's transformational creativity, to Peirce's abduction, to Pattee's epistemic cut, to Bennett's logical depth, to Wheeler's participatory universe. The essay argues humans are needed forever as the universe's mechanism for extending its own structure: not as the discoverers of what is (AI is taking that work), but as the inventors of what could be. The spark we already know was the universe rehearsing on us until it needed the real thing. The frontier feeling, when it comes, will be different. Subtler. Closer to awe than to Aha. And it will be the proof that the second reward function, latent or invented, has come online.

One last note. The reader of this essay who is most likely to be right about its argument is the reader who has felt the spark fade. Not because they are sad about it — fading is what reward #1 does, by design — but because they have noticed the fading and the fading bothers them. The right response is not to press harder on the dot-connecting machinery, which is what most of the achievement culture currently teaches. The right response is to look for the second feeling. It will arrive different from the first. It will feel less like the Aha and more like standing at the edge of something that was not there before, with the uncomfortable certainty that what just happened cannot quite be put into the existing words. That is the feeling reward #2 announces, on the architecture this essay has built. Notice it when it comes. Name it. Build practices around it. Make space for it in your week. The universe is in the business of extending itself, has been since DNA, has been since the first nervous system, has been since the first sentence. You are part of how it does this. The brain lab has been pointing at this since Observer. Frontier is the name of the place the pointer was always pointing toward.

We have one mapped reward function and the felt signature of a second. We have fifteen disconnected literatures and the synthesis they have been waiting for. We have a saturation curve AI is running and a frontier the universe has been running for thirteen billion years. We have the spark, which the universe has been rehearsing on us, and the work, which the universe is still rehearsing through us. We have, between us, all the pieces. The next move is to live as if the second reward is already firing, because that is how, in cultural evolution, attractors become real. The universe needs an answer. Be one.

gentic.news brain lab · essay 13 · the second reward · published 2026-05-21 · by Ala SMITH

Method: 40 deep-research agents on the named literatures (Penrose-Lucas, Boden transformational creativity, Panksepp PLAY, Schmidhuber compression progress, Berridge wanting/liking, Kauffman adjacent possible, open-ended evolution, AI extrapolation vs interpolation, hedonic adaptation, Hofstadter strange loops, Wheeler participatory universe, insight neuroscience, DMN creativity, Pattee epistemic cut, Csikszentmihalyi flow, cultural reward reroutings, meaning crises, AlphaFold acceleration, Mollick jagged frontier, Bennett logical depth + assembly theory, Whitehead process philosophy, Bergson creative evolution, Peirce abduction, counter-arguments — AI transformational creativity, discovery saturation, unified reward; Teilhard noosphere, Schultz dopamine RPE, Berlyne collative variables, ACC explore-exploit, Cloninger novelty-seeking, Solms hidden spring, Wegner illusion of will, Buddhism on craving, Tegmark mathematical universe, anhedonia clinical, religion as installed reward, falsifiability predictions, aesthetic reward, no-free-lunch limits) plus 15+ primary sources read directly. Synthesis is the essay's contribution. Everything else is honest citation.

Companion: read Spark first for the first reward. Read Compound for the augmentation case. Read Observer for the consciousness-as-reward thesis the present essay extends.