[tt] Computer Program Self-Discovers Laws of Physics

Eugen Leitl <eugen at leitl.org> on Fri Apr 3 10:12:52 CEST 2009

http://blog.wired.com/wiredscience/2009/04/newtonai.html

Computer Program Self-Discovers Laws of Physics

By Brandon Keim | April 02, 2009 | 5:09:59 PM
Categories: Artificial Intelligence, Science Tools, Systems Biology, Web/Tech

In just over a day, a powerful computer program accomplished a feat that took
physicists centuries to complete: extrapolating the laws of motion from a
pendulum's swings.

Developed by Cornell researchers, the program deduced the natural laws
without a shred of knowledge about physics or geometry.

The research is being heralded as a potential breakthrough for science in the
Petabyte Age, where computers try to find regularities in massive datasets
that are too big and complex for the human mind. (See Wired magazine's July
2008 cover story on "The End of Science.")

"One of the biggest problems in science today is moving forward and finding
the underlying principles in areas where there is lots and lots of data, but
there's a theoretical gap. We don't know how things work," said Hod Lipson,
the Cornell University computational researcher who co-wrote the program. "I
think this is going to be an important tool."

Condensing rules from raw data has long been considered the province of human
intuition, not machine intelligence. It could foreshadow an age in which
scientists and programs work as equals to decipher datasets too complex for
human analysis.

Lipson's program, co-designed with Cornell computational biologist Michael
Schmidt and described in a paper published Thursday in Science, may represent
a breakthrough in the old, unfulfilled quest to use artificial intelligence
to discover mathematical theorems and scientific laws:

    * Half a century ago, IBM's Herbert Gelernter authored a program that
purportedly rediscovered Euclid's geometry theorems, but critics said it
relied too much on programmer-supplied rules.

    * In the 1970s, Douglas Lenat's Automated Mathematician automatically
generated mathematical theorems, but they proved largely useless.

    * Stanford University's Dendral project was started in 1965 and used for
two decades to extrapolate possible structures for organic molecules from
chemical measurements gathered by NASA spacecraft. But it was ultimately
unable to assess the likelihood of the various answers that it generated.

    * The $100,000 Leibniz Prize, established in the 1980s, was promised to
the first program to discover a theorem that "profoundly effects" math. It
was never claimed.

But now artificial intelligence experts say Lipson and Schmidt may have
fulfilled the field's elusive promise.

Unlike the Automated Mathematician and its heirs, their program is primed
only with a set of simple, basic mathematical functions and the data it's
asked to analyze. Unlike Dendral and its counterparts, it can winnow possible
explanations into a likely few. And it comes at an opportune moment —
scientists have vastly more data than theories to describe it.

Lipson and Schmidt designed their program to identify linked factors within a
dataset fed to the program, then generate equations to describe their
relationship. The dataset described the movements of simple mechanical
systems like spring-loaded oscillators, single pendulums and double pendulums
— mechanisms used by professors to illustrate physical laws.

The program started with near-random combinations of basic mathematical
processes — addition, subtraction, multiplication, division and a few
algebraic operators.

Initially, the equations generated by the program failed to explain the data,
but some failures were slightly less wrong than others. Using a genetic
algorithm, the program modified the most promising failures, tested them
again, chose the best, and repeated the process until a set of equations
evolved to describe the systems. Turns out, some of these equations were very
familiar: the law of conservation of momentum, and Newton's second law of
motion.
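
The loop described above (random starting equations, keep the least-wrong
ones, modify them, repeat) can be sketched in a few lines of Python. This is
an illustrative toy, not Lipson and Schmidt's program: the names OPS,
random_expr, mutate, evolve and spring_data are hypothetical, the data is a
synthetic spring oscillator, and the error measure (mean squared error in
predicting acceleration from position x and velocity v) is a simplification
assumed for the example.

```python
import math
import random

# Assumed operator set, echoing the article's "addition, subtraction,
# multiplication, division". Division is protected against zero.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}
VARS = ("x", "v")

def random_expr(rng, depth=3):
    """Build a near-random expression tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return rng.choice(VARS)
        return round(rng.uniform(-5, 5), 2)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, env):
    """Evaluate a tree: tuples are operations, strings are variables."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):
        return env[expr]
    return expr

def error(expr, data):
    """Mean squared error; 'slightly less wrong' means a smaller value."""
    total = 0.0
    for x, v, a in data:
        diff = evaluate(expr, {"x": x, "v": v}) - a
        total += diff * diff
    result = total / len(data)
    return result if math.isfinite(result) else math.inf

def mutate(expr, rng):
    """Replace one randomly chosen subtree with a fresh random one."""
    if not isinstance(expr, tuple) or rng.random() < 0.2:
        return random_expr(rng, depth=2)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def evolve(data, rng, pop_size=200, generations=40):
    """Keep the best quarter each generation and refill with mutants."""
    population = [random_expr(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda e: error(e, data))
        history.append(error(population[0], data))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda e: error(e, data))
    history.append(error(best, data))
    return best, history

def spring_data(n=50, omega=2.0):
    """Synthetic samples (x, v, a) of a spring obeying a = -omega^2 * x."""
    data = []
    for i in range(n):
        t = 0.1 * i
        x = math.cos(omega * t)
        v = -omega * math.sin(omega * t)
        data.append((x, v, -omega ** 2 * x))
    return data

if __name__ == "__main__":
    best, history = evolve(spring_data(), random.Random(0))
    print("best expression:", best)
    print("error, first and last generation:", history[0], history[-1])
```

Because the best candidates always survive into the next generation, the
recorded error can only stay flat or fall. Whether a given run lands on the
exact law a = -4x depends on the random seed and the settings; the sketch
uses mutation alone, where a fuller genetic algorithm would typically also
recombine pairs of candidates.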

"It's a powerful approach," said University of Michigan computer scientist
Martha Pollack, with "the potential to apply to any type of dynamical
system." As possible fields of application, Pollack named environmental
systems, weather patterns, population genetics, cosmology and oceanography.
"Just about any natural science has the type of structure that would be
amenable," she said.

Compared to laws likely to govern the brain or genome, the laws of motion
discovered by the program are extremely simple. But the principles of Lipson
and Schmidt's program should work at higher scales.

The researchers have already applied the program to recordings of
individuals' physiological states and their levels of metabolites, the
small cellular molecules that collectively run our bodies but remain, molecule by
molecule, largely uncharacterized — a perfect example of data lacking a
theory.

Their results are still unpublished, but "we've found some interesting laws
already, some laws that are not known," said Lipson. "What we're working on
now is the next step — ways in which we can try to explain these equations,
correlate them with existing knowledge, try to break these things down into
components for which we have clues."

Lipson likened the quest to a "detective story" — a hint of the changing role
of researchers in hybridized computer-human science. Programs produce sets of
equations describing the effect of rainfall on a desert plateau, of air
pollution on asthma, or of multitasking on cognitive function.
Researchers test the equations, determine whether they're still incomplete or
based on flawed data, use them to identify new questions, and apply them to
messy reality.

The Human Genome Project, for example, produced a dataset largely impervious
to traditional analysis. The function of nearly every gene depends on the
function of other genes, which depend on still more genes, which change with
time and place. The same level of complexity confronts researchers studying
the body's myriad proteins, the human brain and even ecosystems.

"The rules are mathematical formulae that capture regularities in the
system," said Pollack, "but the scientist needs to interpret those
regularities. They need, for example, to explain" why an animal population is
affected by changes in rainfall, and what might be done to protect it.

Michael Atherton, a University of Minnesota cognitive neuroscientist who
recently predicted that computer intelligence would not soon supplant human
artistic and scientific insight, said that the program "could be a great
tool, in the same way visualization software is: It helps to generate
perspectives that might not be intuitive."

However, said Atherton, "the creativity, expertise, and the recognition of
importance is still dependent on human judgment. The main problem remains the
same: how to codify a complex frame of reference."

"In the end, we still need a scientist to look at this and say, this is
interesting," said Lipson.

Humans are, in other words, still important.

Citations: "Distilling Free-Form Natural Laws from Experimental Data." By
Michael Schmidt and Hod Lipson. Science, Vol. 324, April 3, 2009.

"Automating Science." By David Waltz and Bruce Buchanan. Science, Vol. 324,
April 3, 2009.
