cazmantis Posted June 5, 2010 Bit new to bioinformatics and just wondered how useful PREDICTED amino acid sequences derived from the genome are (I have found several in NCBI). If trying to BLAST these sequences to find conserved sequences, are the results going to be of use or is it better to not bother with these sequences at all? Thanks, Caz
Greippi Posted June 5, 2010 Well, of course there are problems knowing whether you've got the right open reading frame, as well as knowing whether you're actually in a gene or not. But other than that, I have never had any problems (I only have limited experience, though).
CharonY Posted June 7, 2010 It depends on what you are looking for and what kind of database you use. Sequences for already well-characterized proteins tend to be useful in most cases. However, because of the automated pipelines that are used nowadays, errors can still be present. Swiss-Prot, for instance, is a better-curated database, though it holds fewer sequences overall. As a rule of thumb, reality-checking against well-characterized reference genomes is helpful, especially with regard to functional assignments. But again, it really depends on what you are looking for (e.g. single protein vs. whole-genome analyses, intergenic regions, etc.).
cazmantis (Author) Posted June 14, 2010 Hiya! Wow, thanks so much for the replies - very useful stuff so far, and it has served to illuminate my own lack of knowledge of the subject! I think I am getting a little confused with these things. The PREDICTED sequences I am looking at are nucleotide sequences, and I am not sure how I would cross-reference them against a genome. It just doesn't seem to make sense to me, so I assume that my lack of experience in the field means I don't have access to all the facts! For example, NCBI accession number XM_001120951 - it says this nucleotide sequence has been predicted from the genomic sequence. I am finding this quite confusing, as surely the nucleotide sequence exists or it doesn't. I want to understand how these predicted sequences are different from (let's say) "normal" sequences. I understand this may be a little in-depth to expect an answer on, but if anyone could perhaps recommend a book which covers this aspect of genomes, that would be just as helpful for me. Thanks so much for your help, Caroline
CharonY Posted June 14, 2010 "Predicted" does not mean that the sequence itself is predicted (it has been sequenced), but that it has been predicted to be an open reading frame. This topic should be covered by most molecular genetics textbooks (e.g. Genes). Again, the function of a locus is predicted, but the sequence itself is based on data (though depending on the source it may be faulty, but that is another issue).
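To illustrate the distinction in the reply above, here is a minimal Python sketch. It is not NCBI's annotation pipeline, which is far more involved (for eukaryotic genes it also has to model introns and splice sites); it only shows the idea that the sequence is taken as given data while the open reading frame is an inference made from it. The function name `find_orfs`, the `min_codons` parameter and the toy sequence are all made up for illustration.

```python
# Illustrative sketch only: a naive forward-strand ORF scan.
# The input sequence is treated as observed data; the ORFs are predictions.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop stretches.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the start codon plus coding codons, not the stop.
    Only the three forward-strand frames are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


toy = "CCATGAAATTTGGGTAACC"  # made-up sequence, not a real gene
print(find_orfs(toy))
```

The same string of bases gives different candidate ORFs depending on the frame and the minimum length chosen, which is why the gene model is labelled a prediction even though the underlying bases were sequenced.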
cazmantis (Author) Posted June 14, 2010 That's great - thanks so much for your help. Making a lot more sense now. Best, Caroline
Chido Posted October 18, 2021 Good day CharonY. Following on from what cazmantis asked you, I would also like to know whether it is appropriate for me to design primers using 'predicted' nucleotide sequences obtained from NCBI. (I am carrying out a project that involves primer design.) Please reply soon, CharonY. Thanks