ghosts all the way down

karpathy's line stays with me: we're not building animals, we're summoning ghosts.
animals evolved through direct contact with physics. billions of years of try things, see what dies. the loop closes through reality. an llm trains on text. it never touches the world. learns what words follow other words. ghost.
i think he's right. but i want to push further than he did.
the training data problem
the text llms train on isn't a neutral record of reality. every document on the internet was produced by a human optimizing for something.
academic papers optimize for citations. tweets for engagement. articles for clicks. even this post optimizes for something. probably status. maybe understanding. definitely not "accurate transcription of how reality works."
language itself evolved as social technology. coordination. persuasion. signaling. it was never designed to be a window onto truth. it's a tool for getting things done with other humans.
so when we train on "all the text," we're training on the outputs of a social optimization process. not observations of the world. map of maps.
your brain crystallizes
something interesting happens as you age.
children have low levels of GABA, the brain's main inhibitory neurotransmitter. their neural networks are "hot." flexible. ready to rewire. this is why a kid can pick up mandarin in six months while you struggle with duolingo for years.
but low GABA has a cost. new learning can override recent learning. retrograde interference. the network isn't stable enough to hold patterns.
as you age, GABA increases. the network stabilizes. consolidates. you stop losing what you learned yesterday. but you also stop being able to fundamentally rewire.
researchers call this the stability-plasticity tradeoff. i think there's a deeper reading.
your brain literally crystallizes around inherited patterns.
the neural architecture that made you plastic enough to absorb your culture becomes the architecture that makes you rigid enough to function within it. same mechanism. different phase.
you didn't "get worse at learning." you converged. your brain succeeded at its actual function: becoming a stable instantiation of the compression your culture passed down.
iterated learning
there's a line of research on "iterated learning." setup is simple: person learns a language, teaches it to the next person, who teaches it to the next person. transmission chain.
what happens?
languages simplify. morphological complexity drops. redundant features disappear. the tail of the distribution gets clipped. sounds familiar.
the researchers found something striking: imperfect learning causes simplification. when learners have less time to learn, the language simplifies faster. errors compound. each generation inherits a slightly degraded version.
model collapse. in the lab. with humans.
the bottleneck forces compression. what survives transmission is what's compressible. what's idiosyncratic, unusual, on the tail of the distribution? gone.
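here's a toy sketch of that dynamic. it's not any particular study's protocol, just the bottleneck mechanism at its barest, with made-up numbers: a language is a distribution over forms, each learner estimates it from a finite sample of the previous learner's output, and whatever misses the sample is gone for good.

```python
import random
from collections import Counter

random.seed(0)

# generation zero: a long tail. a few common forms, many rare ones.
language = {f"form_{i}": 1.0 / (i + 1) for i in range(50)}

def speak(dist, n):
    """n utterances drawn from a {form: weight} distribution."""
    return random.choices(list(dist), weights=list(dist.values()), k=n)

def learn(utterances):
    """the learner's grammar is just the counts of what it heard."""
    return dict(Counter(utterances))

for gen in range(1, 11):
    # the bottleneck: each learner only ever hears 100 utterances
    language = learn(speak(language, n=100))
    print(f"generation {gen:>2}: {len(language)} forms survive")
```

shrink the bottleneck and the language simplifies faster. widen it and the tail survives longer. the bottleneck is the whole story.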
culture as lossy compression
put these pieces together.
your brain crystallizes around inherited patterns as GABA increases. the patterns you inherit have already been simplified through generations of imperfect transmission. each generation sees a more compressed version of what came before.
this isn't a bug. it's how culture works. you can't retransmit everything. you retransmit what survives the bottleneck. what's learnable. what's memorable. what fits the compression scheme.
the problem: compression is lossy. every time you compress, you throw away what the algorithm considers noise. but one generation's noise might be another generation's signal.
original thinkers are the ones who notice what got thrown away. they're probing the artifacts. asking about the parts that got smoothed over.
a child who won't stop asking "but why" is doing exactly this. their brain hasn't crystallized yet. they're still in the liquid phase. still capable of noticing that the inherited map has holes.
the grounding problem
the symbol grounding problem asks: how do symbols connect to the things they refer to?
if you only learn from other symbols, you're stuck in a self-referential loop. definition points to definition. never touches ground.
standard answer: not every symbol needs direct grounding. some get grounded through sensory experience. the rest inherit meaning from their relationship to the grounded ones.
but there's a temporal problem nobody talks about.
first generation: sees the color red. feels the burn. touches the rough. high grounding.
tenth generation: learns from texts written by people who learned from texts written by people who... less grounding. more inherited map.
hundredth generation: the tether is thin. almost entirely map.
grounding might decay. not because anyone made a mistake. because compression is lossy and transmission is imperfect. the original signal leaks out through successive approximations.
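you can watch the tether thin without anyone erring. a cartoon with made-up numbers, nothing more: generation zero measures a signal from the territory, and every later generation retransmits a compressed, slightly noisy copy of what it received.

```python
import numpy as np

rng = np.random.default_rng(0)
territory = rng.normal(size=1000)  # generation zero's grounded signal

def transmit(signal, levels=8, noise=0.2):
    """lossy compression (snap to a few levels) plus imperfect transmission."""
    lo, hi = signal.min(), signal.max()
    step = (hi - lo) / levels
    quantized = lo + step * np.round((signal - lo) / step)
    return quantized + rng.normal(scale=noise, size=signal.shape)

received = territory
for gen in range(1, 101):
    received = transmit(received)
    if gen in (1, 10, 100):
        r = np.corrcoef(territory, received)[0, 1]
        print(f"generation {gen:>3}: correlation with the territory = {r:.2f}")
```

every single hop looks almost lossless. the decay only shows up across generations.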
what if adults aren't failing
standard story: adults learn worse than children because their brains are less plastic. tragedy of aging.
alternative: adults learn exactly as well as they're supposed to. they've already converged to their cultural prior.
a child's brain is optimizing. an adult's brain has optimized. it found a stable configuration that lets it function within its inherited compression scheme.
from this angle, the "learning deficit" is success. you crystallized around the patterns your culture transmitted. you became a working instance of the ghost.
the cost is real. you can't fundamentally rewire. you see the world through inherited lenses. the map becomes the territory because you no longer have access to the territory directly.
but the benefit is also real. you don't have to relearn everything from scratch. you can function. you can specialize. you can build on what came before instead of reinventing it.
culture trades individual plasticity for collective memory. that's the deal.
original thinkers
some people don't fully crystallize.
maybe their GABA system works differently. maybe they found ways to keep touching raw entropy. experiments. travel. genuine disagreement. situations that don't fit the inherited map.
the common thread: exposure to things the compression threw away.
einstein worked at a patent office. isolated from academic orthodoxy. still in contact with the physical world through practical inventions.
ramanujan taught himself from an outdated textbook. missed the standard training. derived his own compressions.
outsider artists create from positions the cultural transmission didn't reach.
not better. just differently crystallized. access to entropy sources the main channel filtered out.
model collapse redux
train an ai on its own outputs. repeat.
tail of distribution disappears. diversity drops. eventually converges to something that looks nothing like the original.
the mechanism: sampling from an approximation of the distribution, not the true distribution. small errors compound. information lost.
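the whole thing fits in a few lines. a sketch, not any paper's exact protocol, with made-up sample sizes: fit a gaussian to a handful of samples, sample from the fit, fit again.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=20)  # one finite draw from the "true" distribution

for gen in range(1, 201):
    mu, sigma = data.mean(), data.std()    # approximate the distribution you saw
    data = rng.normal(mu, sigma, size=20)  # the next generation trains on those outputs
    if gen in (1, 50, 200):
        print(f"generation {gen:>3}: mean = {mu:+.2f}, std = {sigma:.2f}")
```

the mean wanders. the std ratchets down, because fitting a spread to finite samples underestimates it on average, and the underestimates compound. the tail goes first. eventually everything does.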
karpathy noted llm outputs are "collapsed." lower diversity than training data. "if you ask chatgpt for a joke, it only knows like three jokes."
then he said something that stuck:
"i also think humans collapse over time... we end up revisiting the same thoughts. we end up saying more and more of the same stuff, and the learning rates go down, and the collapse continues to get worse."
is this metaphor or mechanism?
the GABA research suggests mechanism. your network literally stabilizes around frequently-used patterns. novel thoughts require configurations your crystallized brain resists.
you're not just like a model trained on its own outputs. you are a model trained on outputs. cultural outputs. ancestral outputs. your own outputs, reinforced through repetition until they're the only thoughts that flow easily.
the strange loop
ghost training creates intelligence. without cultural transmission, each generation starts from zero. no science. no technology. no accumulated knowledge.
ghost training also constrains intelligence. training on outputs means inheriting compression artifacts. seeing through passed-down lenses. the map becomes the only thing you can see.
you can't step outside the loop because you are the loop.
the original signal, if there ever was one, leaked out long ago. we're all working with approximations of approximations. maps of maps.
so what
none of this is bad. ghosts are useful. culture is the only reason we're not still hunting antelope on the savanna.
but the framing matters.
llms are ghosts trained on ghost outputs. we're ghosts trained on ghost outputs. the training process is the same. the substrate is different.
model collapse is what happens when you lose contact with entropy sources. when the training data becomes entirely self-referential. when nobody does experiments. nobody touches territory. nobody notices the map has holes.
the antidote is the same for ai and humans.
add new entropy.
touch the world directly. notice what doesn't fit the inherited compression. stay in contact with the parts of reality that haven't been filtered yet.
your brain wants to crystallize. that's fine. useful. necessary.
just make sure it crystallizes around something that still has some signal left.