
the map is older


claude shannon wasn't thinking about cell phones. in 1948 he was asking what seemed like a philosophical question: what is information? to answer it rigorously he had to invent a way to measure it. that measurement implied a limit on how fast you could send information through any channel, no matter how clever your encoding.

engineers couldn't get near this limit. for forty-five years they tried. then in 1993 turbo codes appeared, and suddenly you could get within a fraction of a decibel of the limit. the ceiling had been there all along. engineers just couldn't see it.
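the ceiling itself is a one-line formula: for a channel with additive white gaussian noise, capacity is bandwidth times log2(1 + snr). a minimal sketch in python (the 3 khz / 30 db numbers are an invented phone-line-like illustration, not from shannon's paper):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """shannon-hartley limit: the maximum rate, in bits per second, at which
    information can be sent reliably over an awgn channel with the given
    bandwidth and linear signal-to-noise ratio."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# an illustrative phone-line-like channel: 3 khz bandwidth, 30 db snr
# (30 db means snr_linear = 1000)
c = shannon_capacity(3000, 1000)
# roughly 30 kbit/s. no encoding, however clever, can reliably beat this.
```

the point of the formula is not the number; it's that the number exists at all. it converts "how fast can we go?" from an engineering contest into a property of the channel.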

this story is famous, but what's interesting is how often it repeats. physicists in the 1970s studied disordered magnets and worked out energy landscapes with exponentially many local minima. thirty years later that same mathematics described neural network training. random matrix theory started as nuclear physics in the 1950s and turned out to govern wireless channels, financial correlations, and network spectra. computational complexity seemed philosophical until it became the foundation of cryptography.

the pattern is: theorists solve problems, decades pass, practitioners discover they needed those solutions.

you could interpret this as an argument for funding more theory. that's true but uninteresting. everyone agrees basic research matters in the abstract. the interesting question is more specific: why do practitioners keep rediscovering things theorists already know?

the obvious answer is language. theorists and practitioners use different vocabularies. a paper on markov chain mixing might be exactly what you need, but you'd never find it searching for your application. the theory hides behind terminology.

this is real but not the whole story. even when practitioners know relevant theory exists, they often don't use it. i've watched someone struggle for months with a problem, mentioned that there's a literature on it, and watched them nod and keep doing what they were doing.

i used to think this was arrogance. now i think it's something more subtle. using theory feels like giving up. practitioners take pride in solving things. reading a textbook seems like admitting you couldn't figure it out yourself.

but here's what's strange. no one feels this way about other tools. you don't feel like you're cheating when you use a compiler instead of writing machine code. you don't feel diminished by using a library someone else wrote. yet importing a mathematical result feels different. why?

i think it's because theory doesn't look like a tool. it looks like an answer. and answers feel like something you should derive yourself, not something you should look up.

this is a category error. theory is not a collection of answers. it's a collection of structures. knowing about convexity doesn't tell you the solution to your specific optimization problem. it tells you what class of problem you have, which in turn tells you what methods will work and what's impossible. that's not an answer. it's a map.

the map metaphor is exact. a map doesn't walk across the territory for you. it tells you where the walls are. it tells you which paths lead somewhere and which are dead ends. you still have to do the walking.

and here's the key point: the map is almost always older than you expect.

when you face a new problem, it feels new. the details are specific to your situation. no one has ever dealt with exactly this combination of constraints before. but the structure of the problem is rarely new. someone, somewhere, has probably characterized that structure mathematically. they just didn't know about your application, and you don't know about their paper.

consider what happens when you try to test a probabilistic system. traditional software testing checks that a given input produces the expected output. run it a thousand times, same answer. but what if the system is stochastic? what if the same input produces different outputs?

suddenly the whole framework breaks. you can't just check answers. you have to reason about distributions. you have to ask what kinds of failures are possible, not just whether one occurred. you have to detect drift, distinguish noise from signal, decide how much evidence is enough.

this feels like a new problem. machine learning systems behave this way, and machine learning is new. but the mathematics isn't new at all. hypothesis testing dates to the 1920s. concentration inequalities, pac learning, optimal stopping, sequential analysis. all of this exists. it was developed for problems that seemed unrelated at the time.
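the 1920s toolbox named above is directly usable here. a minimal sketch of an exact one-sided binomial test, assuming we've settled on an acceptable failure rate (the 1% threshold and the 4-failures-in-100-runs observation are invented for illustration):

```python
import math

def binom_tail(n: int, k: int, p: float) -> float:
    """p[x >= k] for x ~ binomial(n, p): the probability of seeing k or
    more failures in n independent runs if the true failure rate were p."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# suppose we require a failure rate of at most 1%, and a stochastic
# system fails 4 times in 100 runs. how surprising is that under 1%?
p_value = binom_tail(100, 4, 0.01)
# a small p_value is evidence the true failure rate exceeds the threshold;
# a large one means 4-in-100 is consistent with ordinary noise.
```

notice what the test does: it doesn't check answers, it asks how much evidence one run history provides about a distribution. that's the reframing the stochastic setting forces on you, and it was worked out a century ago.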

someone struggling to figure out how many test cases they need to be confident a model works is reinventing sample complexity bounds that were proved decades ago. someone trying to detect when a model has drifted is reinventing change point detection. the wheel has already been invented. several times, in different fields, with different names.
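the sample complexity question in particular has a textbook answer via hoeffding's inequality. a minimal sketch, assuming i.i.d. test cases and a simple pass/fail outcome (the ±2% / 99% numbers are invented for illustration):

```python
import math

def sample_size(epsilon: float, delta: float) -> int:
    """hoeffding bound for bernoulli outcomes: the number of i.i.d. test
    cases needed so that the observed pass rate is within epsilon of the
    true pass rate with probability at least 1 - delta.
    n >= ln(2 / delta) / (2 * epsilon**2)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon**2))

# estimate a model's pass rate to within ±2%, wrong at most 1 time in 100:
n = sample_size(0.02, 0.01)  # a few thousand runs, known before testing starts
```

the bound is often loose in practice, but that's the map doing its job: it tells you the order of magnitude before you run a single test, and it tells you that halving epsilon quadruples the work.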

this is the waste. not that the theory doesn't exist but that the people who need it don't know to look for it.

now here's where it gets nuanced. you might think the solution is obvious: just learn more theory. but that's not practical advice. you can't learn all of mathematics on the off chance some of it turns out useful. and most theory genuinely isn't useful for any given problem. the hard part is knowing when theory is likely to help.

one signal: your problem has structure. you're not failing randomly. you're failing in patterns. there's regularity to what doesn't work. when you notice this, someone has probably characterized that regularity formally. your specific application is new. the underlying structure usually isn't.

another signal: people keep rediscovering the same thing. if multiple teams independently arrive at similar approaches, that's not coincidence. there's something inevitable about that approach. which means there's probably a theorem explaining why.

there's a deeper reason this pattern exists. the universe has less variety than it appears to.

why does spin glass mathematics describe neural networks? no one designed it that way. but any system with many interacting components and frustrated constraints will be a spin glass mathematically. that's not analogy. it's identity. there's no other way to be that kind of system.

most conceivable systems don't work. they're unstable or incoherent. the ones that do work get reused everywhere because they're the only options. when a mathematician characterizes an abstract structure, she's implicitly describing everything that could ever have that structure. most of those things don't exist yet.

this is why theorists are early. they're not predicting the future. they're mapping the space of possible systems. they're drawing the walls of mazes before anyone enters them.

but i should steelman the other side. there are good reasons practitioners don't use theory more.

first, a lot of theory really is useless for practice. mathematicians optimize for generality and elegance. practitioners need specific, implementable solutions. the gap between a theorem and working code can be vast.

second, theory often comes with assumptions that don't hold. the elegant result assumes infinite data, or gaussian noise, or independence. real systems violate these assumptions constantly. knowing the theory can actually mislead you if you don't also know when it breaks down.

third, there's opportunity cost. time spent reading papers is time not spent running experiments. for many problems, trial and error really is faster than deriving the answer from first principles.

so the relationship is more complex than "practitioners should use more theory." it's more like: theory is high variance. when it's relevant, it's enormously valuable. when it's not, it's a waste of time. the skill is knowing which situation you're in.

some signs you're in the first situation: the problem feels like it has walls. you've tried many approaches and they all fail in similar ways. there's a ceiling you can't break through. in these cases, theory can tell you whether the ceiling is real or whether you just haven't found the right approach yet. that's extremely valuable information.

some signs you're in the second situation: the problem is messy and underspecified. there are no clean formalizations. success depends on details that vary case by case. here, theory will give you beautiful results about idealized problems that don't match yours.

the deepest point might be this: theory tells you what's impossible at least as much as what's possible. when you prove something can't be done, you've learned about the structure of reality. shannon's limit isn't a failure. it's knowledge. once engineers knew where the ceiling was, they could stop wasting effort on approaches that could never work.

impossibility results are underrated. people want solutions. but knowing a solution doesn't exist is also a solution. it lets you redirect effort. it tells you to change the problem instead of banging your head against it.

this suggests a different way to evaluate work. don't just ask whether something is useful now. ask whether it will be true forever. a specific solution gets obsoleted by better solutions. a characterization of what's possible and impossible is permanent. shannon's theorems will be correct when everyone has forgotten what a telephone was.

the practical upshot, if you're facing a hard problem: before you try to solve it, spend an hour trying to find out if it's been characterized. not solved necessarily. characterized. what class of problem is this? what's known about that class? is there a name for what you're doing?

this sounds like extra work. it's not. it's less work than rediscovering something that's been known since 1973.

and if you can't find anything, that's also information. it might mean you're working on something genuinely new. more likely it means you haven't found the right vocabulary yet. the theory is probably out there, waiting, filed under a name you don't recognize.

theorists are almost always early. that's frustrating for them. but it's good news for everyone else. it means the answers to your hardest problems may already exist. the map was drawn before you arrived. you just have to find it.