Biomodelling

December 7, 2005

If I understand the goal of the BlueGene project correctly, they are attempting to build a lookup table of folding pathways and kinetics based on protein structures. To accomplish this lofty goal (given the vast number of protein permutations that exist in nature), they're throwing a previously unheard-of petaflop of processing power at the problem.

Now, if I understand correctly, protein folding has been the biggest barrier to building a complete computer model of biomolecular behavior within a cell, being at present so computationally costly to model as to be impractical. With a lookup table of how proteins fold, that cost is paid only once: the computations have already been done, and the answers can be stored for fast retrieval rather than recomputed each time.

This leads me to wonder how long until we see computer simulations of the complete biochemical behavior of multicellular lifeforms. How long until we can model sponge colonies? Insects? Simple vertebrates? Fish? Reptiles? Mammals? Man?

Kurzweil estimates that by 2009 a $1000 personal computer will have reached teraflop performance levels (equivalent to the supercomputer ASCI Red which held the #1 most powerful supercomputer slot for many years after its construction) and supercomputers will have reached the equivalent computational power of the human brain - 20 petaflops, or 20 times as powerful as BlueGene/L.

Imagine a future supercomputer (or Internet network of personal computers similar to BOINC, the Berkeley Open Infrastructure for Network Computing) which is given the input of a fertilized human egg that proceeds to grow a biomolecular model of a human being.

20 petaflops probably isn't going to be enough, but imagine getting half a million teraflop PCs working on the problem in their spare CPU time. Even if you couldn't run the model in realtime, what you'd eventually end up with, maybe a few years of computation down the road (remember that the computation gets exponentially faster thanks to Moore's law), is a fully developed human baby inside of a computer. And right there you have an entire blueprint of consciousness (or what will become consciousness if you let the simulation run longer) sitting inside of a computer.
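The arithmetic behind that can be sketched roughly as follows. The function names are made up, and the 1.5-year doubling time is my assumption for "Moore's law", not a measured figure. The first function just adds up the PCs; the second estimates how long a job takes if the hardware keeps speeding up while the job is running (speed at time t is taken as 2^(t / doubling time), so the work done by time T is the integral of that).

```python
import math


def aggregate_petaflops(num_pcs: int, teraflops_each: float) -> float:
    """Total throughput of the network in petaflops (1 petaflop = 1000 teraflops)."""
    return num_pcs * teraflops_each / 1000.0


def years_to_finish(work_years_today: float, doubling_years: float = 1.5) -> float:
    """Wall-clock years for a job needing `work_years_today` at today's speed,
    assuming speed doubles every `doubling_years` (an assumed figure).

    Solves  integral_0^T 2**(t/d) dt = W  for T, which gives
    T = d * log2(1 + W * ln(2) / d).
    """
    d = doubling_years
    return d * math.log2(1.0 + work_years_today * math.log(2) / d)


network = aggregate_petaflops(500_000, 1.0)  # 500.0 petaflops
ratio_to_brain_estimate = network / 20.0     # 25.0 times the 20-petaflop figure
```

So half a million teraflop PCs running flat out would come to 500 petaflops, 25 times the 20-petaflop figure above, although spare-time use would deliver only a fraction of that. Under the doubling assumption, a job that would take 10 years at today's speed finishes in well under 10 wall-clock years.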

It seems like in a decade or less the computers will be up to the challenge, but what about molecular biology? Suppose BlueGene/L (with the help of future protein folding supercomputer projects) is able to construct a lookup table of how all the proteins we reasonably expect to occur in nature fold. Is our present knowledge of molecular biology then sufficient to create computer models of multicellular organisms that operate on a simplified molecular model (i.e. one that does not require an incremental atom-by-atom calculation of energetics and other properties, such as BlueGene/L is presently doing, for every molecule in every cell)?

Armed with a protein folding lookup table, how much computational power would it take to reasonably model the molecular operation of a single cell?

December 7, 2005

....

This leads me to wonder how long until we see computer simulations of the complete biochemical behavior of multicellular lifeforms. How long until we can model sponge colonies? Insects? Simple vertebrates? Fish? Reptiles? Mammals? Man?

....

Armed with a protein folding lookup table, how much computational power would it take to reasonably model the molecular operation of a single cell?

Because basic cellular processes are highly conserved between eukaryotic cells, as soon as we had one eukaryotic cell mapped it would be very easy to adapt that to humans or other animals. Of course it would require some tuning, considering that human cells are highly differentiated compared to yeast cells, but the basic groundwork would already be there to transfer over.

Also, even if we have computer models to predict protein folding with high fidelity, we would still not be able to map out a 'reasonable' model for the operations of a cell. Why? Simply because what we know about cellular biology is probably just the 'tip of the iceberg'. Regardless, a computer model that can predict protein folding would be an invaluable research tool.

December 7, 2005

A computer program which could predict protein structure would aid researchers greatly in determining the function of all the proteins in the cell. However, the function defines just one aspect of that protein's role in the cell. A protein's full biological role also must take into account the various regulatory pathways which control the transcription of its mRNA, the translation/degradation of that mRNA, its localization within the cell, and any regulation of its activity (e.g. by allosteric regulators or covalent modification). It would take a much more sophisticated system to be able to predict the complex regulatory pathways altering the protein's function.

Furthermore, many biologists believe that alternative splicing accounts for a large amount of the complexity in higher organisms like humans. Since a protein structure prediction program could not predict alternative splicing, it would not completely describe the entire function of the human proteome.

So, just as the advent of high-throughput DNA sequencing technology spurred the Human Genome Project, the creation of a protein structure prediction program would spur a Human Proteome Project, where researchers could rapidly define the binding partners, substrates, and products of most of the expressed sequences in the genome. The next logical step would be to focus on functional genomics, which deals with determining the patterns of expression of our genome. Combining the information on protein function from proteomics with the information about its regulation from functional genomics would likely give enough information to model a reasonably simple organism.

December 7, 2005

If you had a perfect protein folding predictor, you could model a prokaryote pretty accurately. Basically all you need from there are the relative abundances of proteins, the lipid composition of the membrane, some microarray data, and the diffusion constant of the cytosol. The intercellular signaling in eukaryotes makes it much more difficult, but there's no reason to think that you couldn't model a eukaryotic cell in culture. If you know exactly what signals are going into a cell, and you know how, for example, all the spliceosome proteins and ribozymes are folded, there's no reason you can't predict alternative splicing or more complex intracellular phenomena.

December 8, 2005

Author

Also, even if we have computer models to predict protein folding with high fidelity, we would still not be able to map out a 'reasonable' model for the operations of a cell. Why? Simply because what we know about cellular biology is probably just the 'tip of the iceberg'.

I would contend that a lack of a complete mathematical model of cellular behavior is one of the reasons we are so ignorant about it.

Models are obviously flawed; that's what makes them models and not the real deal. But they evolve: discrepancies between modeled behavior and real results for a given set of inputs can be compared, analyzed, and resolved as the model is enhanced.
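As a toy illustration of that compare-and-resolve loop, here is a sketch in Python. Nothing in it comes from real cell biology: the one-parameter decay model, the candidate rates, and the "observations" (which are generated from a made-up true rate of 0.5) are all invented for the example. The model is scored by its squared discrepancy against the observations, and the parameter with the smallest discrepancy is kept.

```python
import math


def model(rate: float, t: float) -> float:
    """Toy model: exponential decay of some quantity over time."""
    return math.exp(-rate * t)


def discrepancy(rate: float, observations) -> float:
    """Sum of squared differences between model predictions and observations."""
    return sum((model(rate, t) - y) ** 2 for t, y in observations)


def refine(observations, candidates):
    """Keep the candidate parameter whose predictions best match the data."""
    return min(candidates, key=lambda r: discrepancy(r, observations))


# Synthetic "measurements" generated from an invented true rate of 0.5.
observations = [(t, math.exp(-0.5 * t)) for t in range(5)]
candidates = [i / 10 for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0

best_rate = refine(observations, candidates)
```

Here `best_rate` comes out as 0.5, because that candidate reproduces the synthetic data exactly. With real measurements the discrepancy would never reach zero, and the leftover mismatch is what tells you where the model needs enhancing.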

December 8, 2005

Models are obviously flawed; that's what makes them models and not the real deal. But they evolve: discrepancies between modeled behavior and real results for a given set of inputs can be compared, analyzed, and resolved as the model is enhanced.

I forget who originally said it, but: "All models are wrong. But some are more useful than others."
