
A massive simulation network - protein folding.

The BBC has a story about folding@home (FAH) which has just been recognised by Guinness World Records as the world's most powerful distributed computing network.

The network now delivers more than one petaflop of computing power, which is comparable to IBM's Roadrunner. folding@home has signed up nearly 700,000 PS3s to examine how the shape of proteins affects diseases such as Alzheimer's. (Previously it had 200,000 PCs producing 250 teraflops. The PS3 processor, according to the BBC, runs at ten times the speed of current PC chips; adding 670,000 PS3s brings the network to more than one petaflop.)
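As a rough check on those figures, here is a back-of-envelope sketch in Python. It takes the BBC's numbers at face value and assumes (my assumption, not the BBC's) that every signed-up machine is running flat out at the same moment, so it gives a ceiling rather than a measurement.

```python
# Back-of-envelope check of the figures quoted above.
# Assumption (not from the BBC): every signed-up machine runs flat out at once.

TERA = 10**12
PETA = 10**15


def naive_network_flops(pcs, pc_total_flops, ps3s, ps3_speedup):
    """Return (per-PC, per-PS3, total) FLOPS under the naive assumption."""
    per_pc = pc_total_flops / pcs          # average contribution of one PC
    per_ps3 = per_pc * ps3_speedup         # "ten times the speed" of a PC chip
    total = pc_total_flops + ps3s * per_ps3
    return per_pc, per_ps3, total


per_pc, per_ps3, total = naive_network_flops(
    pcs=200_000, pc_total_flops=250 * TERA, ps3s=670_000, ps3_speedup=10)

print(f"per PC:  {per_pc / 1e9:.2f} gigaflops")
print(f"per PS3: {per_ps3 / 1e9:.2f} gigaflops")
print(f"ceiling: {total / PETA:.3f} petaflops")
```

On these assumptions the ceiling comes out at roughly 8.6 petaflops, so "more than one petaflop" would be consistent with only a fraction of the signed-up consoles being active at any one time.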

The folding@home FAQ explains: "Before proteins can carry out their biochemical function, they remarkably assemble themselves, or 'fold.' The process of protein folding, while critical and fundamental to virtually all of biology, remains a mystery.... when proteins do not fold correctly (i.e. 'misfold'), there can be serious effects, including many well known diseases... We use novel computational methods and large scale distributed computing, to simulate timescales thousands to millions of times longer than previously achieved. This has allowed us to simulate folding for the first time, and to now direct our approach to examine folding related disease."

I've commented before on an earlier massive collaborative distributed simulation project (which went wrong in a very instructive way). This was the BBC Climate Change programme; its site currently says: "By the end of 2006, thousands of people had finished running their model up to the year 2080 and their data contributed to a BBC One programme in January 2007."

SETI@home is another example, set up to look for traces of intelligent activity in radio signals received from space. Wikipedia says: "Over 5 million computer users in more than 200 countries have signed up for SETI@home and have collectively contributed over 19 billion hours of computer processing time. As of December 4, 2006 the Seti@Home grid operates at 257 TeraFLOPS, making it equivalent to the second fastest supercomputer on Earth."

The folding@home FAQ says:
"Why not just use a supercomputer?
Modern supercomputers are essentially clusters of hundreds of processors linked by fast networking. The speed of these processors is comparable to (and often slower than) those found in PCs! Thus, if an algorithm (like ours) does not need the fast networking, it will run just as fast on a supercluster as a supercomputer. However, our application needs not the hundreds of processors found in modern supercomputers, but hundreds of thousands of processors. Hence, the calculations performed on Folding@Home would not be possible by any other means! Moreover, even if we were given exclusive access to all of the supercomputers in the world, we would still have fewer cycles than we do with the Folding@Home cluster!"
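The FAQ's point, as I read it, is that the work splits into pieces which never need to talk to each other while they run. A toy sketch in Python shows the shape of it. This is my own illustration, not Folding@Home's code or protocol: `run_work_unit` is a hypothetical stand-in for a real simulation, and the worker threads stand in for volunteers' machines.

```python
# Toy illustration only: not Folding@Home's actual code or protocol.
import random
from concurrent.futures import ThreadPoolExecutor


def run_work_unit(seed, steps=1000):
    """Hypothetical work unit: an independent random walk, seeded so it is repeatable."""
    rng = random.Random(seed)
    position = 0
    for _ in range(steps):
        position += rng.choice((-1, 1))
    return position


seeds = range(8)

# One after another, as on a single machine.
serial = [run_work_unit(s) for s in seeds]

# Handed out to a pool of workers; no worker ever talks to another.
with ThreadPoolExecutor(max_workers=4) as pool:
    farmed_out = list(pool.map(run_work_unit, seeds))

# Results only meet at the very end, so the answers are identical either way.
assert serial == farmed_out
```

Because nothing is exchanged until each unit is finished, the speed of the link between workers hardly matters, which is why a job of this kind can be spread over home broadband rather than a supercomputer's interconnect.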

There's an interesting article in Dr. Dobb's by Catherine Crawford of IBM which argues that the constraint on supercomputing is currently not hardware but the lack of software able to handle the power becoming available. She says that "Roadrunner is the first rendering of a hybrid computing architecture: multiple heterogeneous cores with a multitier memory hierarchy. It's also built entirely out of commodity parts.... The philosophy is a 'division of labor' approach." I confess that much of her article is beyond me, but it looks as if the supercomputers and the super-networks are going much the same way. Crawford identifies several other uses for this level of power, including:
- Financial services. (Predicting the ripple effect of a stock market change throughout the markets.)
- Digital animation. ("characters and scenarios so realistic that the line will be blurred between animated and live-action movies").
- Information-based medicine. ("Complex 3-D renderings of tissues and bone structures will happen in real-time, with in-line analytics used for tumor detection as well as comparison with historical data and real-time patient data. Synthesis of real-time patient data can be used to generate predictive alerts.")
- Oil and gas exploration seismic analysis.
- Nanotechnology.

