“Essentially all models are wrong, but some are useful.”
– George E.P. Box
This is an old anecdote that always draws a chuckle. A farmer, a biologist and a theoretical physicist walk into a bar, and end up discussing how cows can produce more milk. The farmer ruminates for a moment, and says, “We need to improve the food content, nutrition and living conditions of cows. Then they’ll give more milk!” The biologist pauses just for a moment and declares emphatically that we should make genetically engineered cows that will produce much more milk. Meanwhile the physicist, who has been furiously thinking, suddenly breaks into a victorious smile, and says, “Assume that the cow is a sphere…”
It’s ridiculously easy to heckle physicists and mathematicians with such jibes. After all, who else can reduce complexity to unknown constants and ignore messy realities for an ideal, impossible scenario? And who else grapples for years with abstract, impractical and seemingly useless problems in order to describe a natural phenomenon? But it is exactly this trait that has helped physicists and mathematicians come up with theories and models that explain complex phenomena from the natural world, and that have transformed our understanding of nature.
A cornerstone of the scientific method is the scientific theory, and at the heart of scientific theories are scientific models. Theory in science is very different from ‘theory’ in commonspeak. To most of us, unfortunately, a theory means an idea or a random thought. On a good day, such a ‘theory’ might amount to a reasonable scientific hypothesis. But in science, a theory is a well-established principle that explains some phenomenon of the natural world.
So scientific theory sits at a pinnacle, providing explanations for phenomena that are built upon observations and evidence. Theories are the most substantial form of scientific knowledge. Famous scientific theories include the theory of gravity, evolution by natural selection and many more. There is also some confusion about the difference between a scientific law and a theory, and a vague belief that a law is somehow superior to a theory. This is not true, because the two are not mutually exclusive. A law only describes how a particular phenomenon will work under certain conditions. A theory aspires to be broader and tries to give us an all-encompassing view of how natural phenomena work. So a theory can contain one or many laws within it, and both laws and theories can be fact.
There are two defining features of a scientific theory: it gives us models that explain data, and it provides predictions that are testable. So all theories can and should be refined by new experiments, and this leads to new, testable predictions. When you properly stick to the scientific method, this becomes a cycle of theory driving experiments driving theory, all leading to new knowledge.
Scientists need those “spherical cows” to develop testable models and theories, and that brings us to the need for assumptions. By common definition, an assumption is something that is accepted without evidence. But just as the meaning of a theory changes from commonspeak to science, so does the meaning of an assumption. First, since you need to start somewhere, a scientific theory needs assumptions, but it uses as few of them as possible. Second, assumptions should be of the kind that have to be made (like the assumption that truth exists), and they need to be built on actual evidence.
So the hallmark of scientific theory is this combination of falsifiability and testability, something we have explored before. But the best part of a theory, and of the models coming from it, is that it opens up whole new worlds of testable possibilities and creates new areas of research. This is what has made the past century such a supercharged era for scientific discovery. And this is what allows the process of discovery to “boldly go where no man has gone before”.
Sticking to stereotype, physicists have rightly delighted in theory and models. Yet in biology (outside of the evolutionary sciences), theory has maintained a lower profile and is perhaps under-appreciated. But just as in physics, theory, and the models emerging from it, have transformed the landscape of biology. There are famous theories and models in biology, the most famous being Darwin’s and Wallace’s theory of evolution by natural selection. That theory relied on a series of observations Darwin and Wallace had made independently, in different parts of the world.
While the ideas suggested by their theory far exceeded what their observations then showed, they stayed within the framework of testable ideas. Over the coming decades, work from disciplines as varied as biology, genomics and geology built on, refined and expanded that framework. There are other famous models in biology that came from limited data but could be proposed because they provided an explanation that fit known rules and could be experimentally tested. They also opened up new possibilities for other research, which could be tested in turn. One famous example is the DNA double helix model by Watson and Crick.
When Watson and Crick got into the race to discover what DNA looked like and how it worked, many things were already known. DNA had been discovered decades earlier, and biochemical giants like Phoebus Levene and Erwin Chargaff had worked out the composition and chemistry of nucleic acids, of which DNA is a type. Oswald Avery and his colleagues had shown that hereditary units, or ‘genes’, were made up of DNA. There was therefore great excitement about understanding how DNA managed to code this information, and the search was on for a model that could explain all of this.
The only real data they had at hand were spots on an X-ray film. Scientists of the time, notably Linus Pauling and Max Perutz, had come up with a way of deciphering how crystals of proteins diffracted X-rays, and of translating that information into bond angles and structure. But DNA’s structure was much harder to decipher, and here Rosalind Franklin’s data and tragic story played a critical role. Still, the spots of DNA diffraction could not easily be built into an understandable model, and Franklin herself was struggling with it. Watson and Crick combined a thorough understanding of DNA chemistry and the physical rules governing it with actual cardboard-and-wire models of the four nucleotides making up DNA, to work out how DNA could assemble. Their physical model, like a jigsaw puzzle, fit perfectly. The experiments needed to test it were self-evident (and the model was quickly borne out). The model also explained lots of existing data that had remained inexplicable.
This moment truly transformed biology.
Now, all these models we’ve talked about come from experimental data and are built by using parsimonious explanations of the data to explain broader natural phenomena. They don’t require a special ability in mathematics as such. But even in biology, there is a special place for the type of theory provided by physicists and mathematicians. At its best, mathematical modelling explains what is and is not possible, and decisively helps rule out what is very unlikely. This is very powerful when deterministic models are used: for a given set of initial conditions, these always produce exactly the same outcome. It is also very powerful in stochastic models, which are really statistical models, where randomness is present and the outcomes are probability distributions.
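The difference between the two kinds of model can be sketched in a few lines of Python. This is only an illustrative toy, not a model taken from any study: the scenario (a population in which each individual leaves one extra offspring per generation with some probability) and all the names and numbers, such as `deterministic_final_size`, `stochastic_final_size`, `simulate_many` and `birth_prob`, are assumptions made up for the example.

```python
import random
import statistics


def deterministic_final_size(n0, birth_prob, steps):
    """Deterministic model: the population grows by the same fraction each step.

    The same inputs always give exactly the same answer.
    """
    n = float(n0)
    for _ in range(steps):
        n *= 1 + birth_prob
    return n


def stochastic_final_size(n0, birth_prob, steps, rng):
    """Stochastic model: each individual independently leaves one extra
    offspring per step with probability birth_prob (a toy assumption).
    """
    n = n0
    for _ in range(steps):
        births = sum(1 for _ in range(n) if rng.random() < birth_prob)
        n += births
    return n


def simulate_many(n0, birth_prob, steps, runs, seed):
    """Repeat the stochastic model many times to see the spread of outcomes."""
    rng = random.Random(seed)  # seeded so that the experiment can be repeated
    return [stochastic_final_size(n0, birth_prob, steps, rng) for _ in range(runs)]


if __name__ == "__main__":
    exact = deterministic_final_size(10, 0.1, 20)
    sizes = simulate_many(10, 0.1, 20, runs=500, seed=1)
    print("deterministic outcome:", round(exact, 2))
    print("stochastic outcomes range from", min(sizes), "to", max(sizes))
    print("average of stochastic outcomes:", round(statistics.mean(sizes), 2))
```

The deterministic function returns a single number, while the stochastic one returns a different number on each run. Only the whole collection of runs, its average and its spread, is a meaningful result, which is what it means for the outcome to be a probability distribution.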
When done rigorously, statistical models make a compelling argument for the most likely explanation of a phenomenon or the most likely outcome of an event. Other types of modelling in biology also rely heavily on aspects of mathematical modelling. These include constructing metabolic or signalling networks in cells or organisms, which tell you how information (whether food, chemical molecules or external stimuli) is transferred within organisms. They also include cellular modelling – understanding how proteins fold and function. And concepts from mathematical modelling are heavily used in assembling big genomes, as well as in making sense of the enormous amounts of information in the metagenomes of multiple organisms.
Theory, in biology, has now almost come full circle, from being prominent a century ago to fading into near obscurity to now coming back into prominence. The importance of theory is only going to increase in the coming years of Big Data. A future in biological discovery for a student with no appreciation of mathematics seems somewhat bleak. So in our current frenetic era of experimental biology, let us raise a small toast of appreciation for theory and models, which help us understand how the natural world works.
Sunil Laxman is a scientist at the Institute for Stem Cell Biology and Regenerative Medicine, studying cellular decision-making. He has a keen interest in the history and process of science and how science influences society.