misterPCR Posted August 29, 2014 (edited)
Hello people, I am investigating candidate CNVs (copy number variations) that could play an important role in the development of some complex diseases. I wonder whether there is any browser or software that could help me find out which of the huge number of genomic regions from different patients fall in exonic regions. Thanks in advance.
hypervalent_iodine Posted August 29, 2014
I may be totally off on this, but would something like BLAST not work? If you have a large amount of data, you will probably want to run it on an HPC system.
misterPCR (Author) Posted August 29, 2014 (edited)
Yes, it could work for one or a few genomic regions, but I guess there should be something that does this automatically for about 200 regions at once: some kind of tool to help me filter all my candidate regions down to just the ones that are in exonic regions. About the HPC systems, that could be, but I know of browsers that can deal with a huge number of regions and check their protein-protein interactions... I just don't know of a browser for differentiating between exonic and intronic regions.
Mr.Zurich92 Posted August 29, 2014 (edited)
Go to dbSNP or the UCSC Genome Browser! Do you know which SNPs are in these CNV sequences? If so, you can look them up in dbSNP or the UCSC Genome Browser and see whether each SNP is in an exon or an intron. Is there a chance that autism or male homosexuality is caused by complex polygenic interactions? What you say sounds interesting! Could you tell me more about this browser that can check a large number of protein-protein interactions?
chadn737 Posted August 29, 2014
I would avoid a browser for working with so many SNPs. What format is your data in? If you happen to have it in a BED file or something similar, this task could be extremely simple and fast.
misterPCR (Author) Posted September 1, 2014 (edited)
Exactly, I do know the SNPs in those regions (or I could get that information easily), but if I had to do that for all the regions, it would take me ages. I have 261 patients with one mental disorder (Tourette syndrome) and about 20,000 genomic regions carrying a deletion or a duplication compared with controls (about 76 per patient). The most relevant are the exonic CNVs, obviously. My data is in Excel format, but I also have the original format from the Affymetrix software ChAS (which is .cychp). I am sure I'm not the first researcher with this problem, so there should be something out there.
chadn737 Posted September 9, 2014
Excel... do not use Excel. Export your data as a tab-delimited or comma-separated text file and use command-line tools to handle it. It's the only way to reliably handle large data sets like this and not go crazy. The information you are seeking could be determined quickly if you learn to use the command line and the great tools available. I highly recommend that you export to a tab-delimited format and then convert it to BED format: http://genome.ucsc.edu/FAQ/FAQformat.html This will enable you to use BEDTools (https://code.google.com/p/bedtools/) to find the regions that intersect with coding regions. Not only that, but once your data is in BED format, you will easily be able to upload it, or convert it into formats that can be uploaded, to a browser for visualization. If you do not know how to work from the terminal and have time, learn how to. UNIX and basic scripting should be a standard part of any graduate-level education in genetics these days. If you don't have the time, try to find somebody with a computer science/bioinformatics background to help you.
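As a rough illustration of the workflow suggested above, here is a minimal Python sketch that turns tab-delimited CNV rows into BED-style records and keeps only the regions that overlap at least one exon. The column names (patient, chrom, start, end, type), the assumption that the exported coordinates are 1-based and inclusive, and the function names are all hypothetical and would need to be adapted to the real export. The exon intervals are assumed to come from an annotation file in BED format. In practice BEDTools does the overlap step far more efficiently, with something along the lines of `bedtools intersect -a cnvs.bed -b exons.bed -u` (file names here are placeholders).

```python
import csv
import io


def to_bed_rows(tsv_text):
    """Convert tab-delimited CNV rows into BED-style tuples.

    Assumes hypothetical columns: patient, chrom, start, end, type,
    with 1-based inclusive coordinates (check this for your export).
    BED uses 0-based, half-open intervals, so only the start is shifted.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for rec in reader:
        chrom = rec["chrom"]
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom  # UCSC-style chromosome names
        start = int(rec["start"]) - 1
        end = int(rec["end"])
        name = f'{rec["patient"]}_{rec["type"]}'
        rows.append((chrom, start, end, name))
    return rows


def exonic_regions(regions, exons):
    """Return the regions that overlap at least one exon.

    regions: iterable of (chrom, start, end, name) in BED coordinates
    exons:   iterable of (chrom, start, end) in BED coordinates
    """
    exons_by_chrom = {}
    for chrom, start, end in exons:
        exons_by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, start, end, name in regions:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < e_end and e_start < end
               for e_start, e_end in exons_by_chrom.get(chrom, [])):
            hits.append((chrom, start, end, name))
    return hits


if __name__ == "__main__":
    # Toy data with made-up coordinates, for illustration only.
    demo_tsv = (
        "patient\tchrom\tstart\tend\ttype\n"
        "P001\t1\t1001\t2000\tLoss\n"
        "P002\tchr2\t5001\t6000\tGain\n"
    )
    demo_exons = [("chr1", 1900, 2100), ("chr2", 6000, 6100)]
    for region in exonic_regions(to_bed_rows(demo_tsv), demo_exons):
        print("\t".join(str(field) for field in region))
```

The simple scan over every exon per region is fine for a few tens of thousands of CNVs against a small exon list, but for a whole-genome annotation a sorted or indexed approach, which is what BEDTools provides, is the better choice.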