Bioinformatics research project

scientistsahai · March 28, 2006

Hi friends,

Calling all bioinformaticians to help me! I have to do a project using bioinformatics tools. Can anyone please suggest the best topic for me to pursue in my grad studies?

I have to develop a tool (program) to do some basic BI calculations/simulations/interpretations that can achieve a desired level of result. I have already done a project on the analysis of microarray data, so I am inclined towards proteomics/protein chemistry, but am open to any good research topic.

Can some of you help me by describing some good projects that could be done (at grad level)? Please give relevant details on data availability and the conclusions that could be drawn at the end, with some reference to the procedure/algorithm adopted. I will be grateful to everyone, and whoever helps will definitely get a mention in the final project/thesis.

Some broad areas that I am aware of are:

Microarray data analysis
Comparing genomic sequences using Dotplots
Computational Evolutionary genomics
Protein Structure Prediction etc.

Please help me out with as diverse projects as possible and it would be great if anyone could help me with a current/live project.

mattbimbo · March 28, 2006

a very challenging project for you to consider...

given the distances between the Calpha atoms of a protein and the nearest solvent boundary, i.e. Calpha-(nearest-surface-)H2O distances, could you accurately predict the protein structure?

scientistsahai · March 31, 2006

Matt,

That was a bit tough to get; can you please discuss it more? I don't think I have understood what you really wanted to put across.

Let me tell you what I got from that: we want to predict the tertiary structure when the protein is in solvent (water) and the C-alpha to nearest-H2O distances are known. Does it mean that we have to gather the water-carbon interactions and steric energy and find the tertiary structure at a minimum energy level in solvent?

If so, can you please illustrate it more, or if not, then please put it in very basic terminology, so that it makes more sense to me.

mattbimbo · March 31, 2006

ok, first let's get some experimental data.

we have a protein in solution to which we add D2O.

over time deuterium (D) will replace the exchangeable hydrogens (H) of the protein.

then at a certain time we stop the exchange of D and H, ie quench the reaction.

following this we analyse the protein by peptide mass spectroscopy.

this analysis will reveal sites of H-D exchange.

now by looking at the kinetics of H-D exchange on the protein, ie performing a time course, it is possible to identify which protein residues are most solvent exposed.

the question is, with this data, which before i simply called Ca-solvent distances, would it be possible to predict the protein structure?
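a rough sketch of how such a time course could be reduced to one exchange rate per peptide. it assumes single-exponential uptake, D(t) = 1 - exp(-k t), which is a simplification and not taken from any reference; the function names and the numbers are invented for illustration, and fast exchange is treated only as a proxy for solvent exposure.

```python
import math


def estimate_rate(times, fractions):
    """Estimate k in D(t) = 1 - exp(-k t).

    Uses a least-squares line through the origin on the linearised
    form -ln(1 - D) = k t. Points with D outside [0, 1) are skipped
    because the logarithm is undefined there.
    """
    num = 0.0
    den = 0.0
    for t, d in zip(times, fractions):
        if not 0.0 <= d < 1.0:
            continue
        y = -math.log(1.0 - d)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one usable time point with t > 0")
    return num / den


def rank_by_exposure(time_courses, times):
    """Return (name, rate) pairs, fastest-exchanging first."""
    rates = {name: estimate_rate(times, fr) for name, fr in time_courses.items()}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


# Invented time course: quench times in seconds, deuterated fraction per peptide.
times = [10.0, 30.0, 100.0, 300.0]
synthetic = {
    "peptide_A": [1 - math.exp(-0.02 * t) for t in times],   # fast exchanger
    "peptide_B": [1 - math.exp(-0.001 * t) for t in times],  # slow exchanger
}
ranking = rank_by_exposure(synthetic, times)
```

real data would be noisy and often multi-exponential, so the fit above is only a starting point for thinking about the data format.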

scientistsahai · March 31, 2006

On further thought, I can find which AAs lie at the edges of the protein, but then it would be difficult to comment on the structure. And how do the water molecules play a role in the structure determination of the protein?

mattbimbo · March 31, 2006

the old theory goes that the core of a protein tends to contain water-hating, hydrophobic residues, while the surface of the protein contains water-loving, hydrophilic residues.
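to make this concrete, here is a small sketch that scores a sequence with the Kyte-Doolittle hydropathy scale averaged over a sliding window. positive averages suggest a hydrophobic stretch, negative ones a hydrophilic stretch. the example sequence, the window size and the function name are made up for illustration, and a hydropathy profile alone says nothing definite about the fold.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=5):
    """Mean hydropathy of every window of `window` consecutive residues."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Invented toy sequence: hydrophobic start, charged end.
profile = hydropathy_profile("LIVVAKDEKR", window=5)
```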

mattbimbo · March 31, 2006

here is a ref to help you...

but if the link doesn't work put the following into pubmed

>smith dl mass spectroscopy

scientistsahai · April 3, 2006

Matt, I have gone through this exhaustive article (at the link). I have understood most of the paper, but am unable to decide where to start and what the initial data will look like (data, values, format of data, file format, etc.). Also, I have looked at the equations discussed, whereby eq. (6) is the most appropriate and will give the proton-deuterium exchange. This tells us which AAs are around the boundary and which are inside the core of the protein.

But we could also get the hydrophobic and hydrophilic AAs from the sequence itself and move along those lines? And how will this value help us predict the structure of the protein? The higher value depicts the alpha-helix, but this is only for Cyt c, so how can this be generalised? (What threshold value should be taken?)

Please help me further on this. I am really looking forward to this project now.

mattbimbo · April 4, 2006

hi, i reread your initial post

>I have to develop a tool (program) to do some basic BI calculations/simulations/interpretations that can achieve a desired level of result.

as you wrote, knowing where to start is very important. also important is realising what is achievable with all your resources and limitations. i have no idea on the latter. but where to start? break the problem into steps.

a) obtain data (this may be invented or borrowed from a paper. remember that the data will come from the same experiment repeated many times.)

b) prepare data for analysis (this may involve going from residue H-D exchange times into probability distributions that a residue is at the surface)

c) analyse data (does the data make sense? ie, have some expected distribution for what the data should look like. perhaps this kind of analysis may suggest that the protein has multiple domains? can you detect correlations in exchange times between adjacent residues?)

d) build model (algorithms already exist for this, but they will have limitations. perhaps focus on a small part of the algorithm. or do you want to try to incorporate other data, ie protein secondary structure predictions?)

e) display model and stats analysis

in my opinion if you tackled any of the steps a) to e) this would be a good project, or focus even further and tackle a small part of one of the steps.
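steps b) and c) can be sketched in a few lines. the sketch below averages per-residue exchange rates over repeated experiments, maps each mean rate to a "surface probability" and computes the correlation between sequence neighbours. the mapping rate / (rate + k_half) is purely an assumption chosen for simplicity, not an established method, and k_half, the function names and the numbers are all hypothetical.

```python
import math
from statistics import mean, stdev


def summarise_replicates(replicates):
    """Per-residue mean and standard error over repeated experiments.

    `replicates` is a list of experiments, each a list of per-residue rates.
    """
    n = len(replicates)
    per_residue = list(zip(*replicates))
    means = [mean(v) for v in per_residue]
    sems = [stdev(v) / math.sqrt(n) if n > 1 else 0.0 for v in per_residue]
    return means, sems


def surface_probability(rate, k_half):
    """Assumed saturating map: 0.5 when rate == k_half, towards 1 when fast."""
    return rate / (rate + k_half)


def lag1_correlation(values):
    """Pearson correlation between each residue and its next neighbour."""
    x, y = values[:-1], values[1:]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented data: three repeats, four residues, rates in 1/s.
replicates = [
    [0.020, 0.018, 0.002, 0.001],
    [0.022, 0.016, 0.003, 0.001],
    [0.021, 0.017, 0.001, 0.001],
]
means, sems = summarise_replicates(replicates)
probs = [surface_probability(k, k_half=0.01) for k in means]
r = lag1_correlation(means)
```

a positive r would hint that neighbouring residues exchange at similar rates, which is the kind of sanity check step c) asks for.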

the main reason why i suggested this kind of project to you was that i thought it would be educational. think of the challenge! you are going from one-dimensional information to a 3D model. but if you want you could simplify it, go from 1D to a 2D protein. admittedly 2D proteins don't exist but you can invent them (or maybe they have already been invented). in my experience, thinking along these lines, 1D to 2D, is better for testing ideas and easier for programming.
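one way to try the 1D to 2D idea is a toy chain on a square lattice, in the spirit of simple lattice protein models. the sketch below enumerates every self-avoiding fold of a short chain and picks the one that best buries the residues flagged as slow-exchanging. the cost function, the flags and all names are invented for illustration; brute-force enumeration only works for very short chains.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n):
    """Yield all folds of an n-residue chain (n >= 2) on the square lattice.

    The first bond is fixed to point right, which removes rotated copies.
    """
    def extend(path, occupied):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def free_neighbours(path):
    """Number of empty lattice sites next to each residue (its 'exposure')."""
    occupied = set(path)
    return [
        sum((x + dx, y + dy) not in occupied for dx, dy in MOVES)
        for x, y in path
    ]


def burial_cost(path, buried):
    """Assumed cost: exposure of buried residues minus exposure of the rest."""
    free = free_neighbours(path)
    return sum(f if b else -f for f, b in zip(free, buried))


# Invented 1D data: True means the residue exchanged slowly (treated as buried).
buried = [False, True, True, False, False, True, True, False]
best = min(self_avoiding_walks(len(buried)), key=lambda p: burial_cost(p, buried))
```

several folds can share the lowest cost, so `min` just returns the first one found; looking at how many folds tie is itself a useful measure of how much the 1D data constrains the shape.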

it is a project where there is plenty of room for imagination, more so as your use of statistics and probability advances.

scientistsahai · April 12, 2006

Are there any other ideas too?

Seems Matt is the only person who has come up with one!
