Posts: 120
About MedGen
Birthday: 05/16/1984
Contact Methods
Website URL: http://geneticsofamedicalpersuasion.blogspot.com/
Profile Information
Location: Leeds, UK
Interests: Snowboarding, country walking, power kiting, photography, climbing and reading
College Major/Degree: BSc Medical Genetics
Favorite Area of Science: Genetics, pharmacogenetics, molecular biology, cell biology and immunology
Biography: Graduated in July 2010, starting a PhD in pharmacogenetics in October 2010
Occupation: Student without portfolio, i.e. a bum
MedGen's Achievements: Baryon (4/13)
Reputation: 59
-
Fast-Track Gene-ID Method Speeds Rare Disease Search
MedGen replied to Dimension_Traveller's topic in Genetics
-
Arlequin is a population genetics program that accepts several different data types, including whole sequences, SNPs, microsatellites, etc. It's quite computationally heavy if you've got a lot of samples or a lot of variants. It doesn't use R, but rather a much friendlier Java-based front end. Might be worth checking out: http://cmpg.unibe.ch/software/arlequin3/

I've used it in the past for crunching through large amounts of variation data for a number of populations and found it to be reliable. The only downside is that the input formatting is very fickle: every single line in the project file has to be absolutely correct, which can be a pain when dealing with large numbers of samples and/or variants. I found it helped to set the file up in Excel first and then copy and paste it into a plain-text editor.

It does allow you to calculate Fst, but only as population differentiation across a set of variants, not for each variant individually, so it may not be exactly what you're after.
-
You can test your primers (provided they aren't degenerate) using the in silico PCR tool on the UCSC Genome Browser, provided the organism you are amplifying from has had its genome sequenced: http://genome.ucsc.edu/cgi-bin/hgPcr?org=Human&db=hg19&hgsid=167311066 This should also show you the expected size of your product.

I'd also say that when running a temperature gradient it's best to run it from about 3 degrees below the Tm (use the nearest-neighbour Tm if you can, as this takes into account the presence of divalent cations, i.e. Mg2+) up to 72 °C (the optimal Taq temperature).
-
There are a number of techniques that can be used to knock out or knock down genes in recombinant animals. One of these utilises the Cre-loxP system, which relies on homologous recombination occurring during sexual reproduction of selected transgenic animals. http://en.wikipedia.org/wiki/Gene_knockout Genes can also be conditionally knocked out based on their cell-type-specific expression by removing specific promoter and regulatory regions of the gene in question. I think there are other systems that can induce gene silencing, but I'm not too familiar with those; perhaps one of the resident experts can fill in the gaps here.

As to your final question, every somatic cell contains a full complement of the organism's genome, so yes, in humans that is 23 pairs of chromosomes constituting (2x) 3.2 billion base pairs. Germ cells contain only a haploid complement and thus only 3.2 billion base pairs. In addition, the mitochondrial genome contains a further 16.6 kilobases of DNA, which technically also contributes to our complete genome.
-
True, UCSC has its limitations, but I think Apis mellifera is definitely available; not sure about Nasonia, though.
-
If you go to the UCSC Genome Browser, you can perform a BLAT search, which will align your query sequence against their reference builds and thus give you a genomic location for your sequences (provided those areas have been covered and mapped correctly). The best place to start is with your +1 position, to find out where the 5'UTR begins. There are a number of programs that can be used to predict the positions of promoters and regulatory elements:

http://www.gene-regulation.com/pub/programs.html
http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3
http://bip.weizmann.ac.il/toolbox/seq_analysis/promoters.html

To extract the upstream and downstream sequences from NCBI, you ideally need homologous sequences, or you can use the whole genome shotgun (WGS) and genomic sequence options when performing a BLAST query and extract the entire contig, not just the homologous regions. The better way to do it is to use the aforementioned UCSC Genome Browser, which allows you to define how much upstream sequence you want to extract.
-
Oh, to be sure there is an issue regarding the so-called hidden heritability in the findings of most GWAS, especially regarding their limitations, but at the end of the day they still have their uses, for instance in uncovering previously unsuspected pathways that might contribute to disease aetiology and pathogenesis. My personal feeling is that there may be multiple rare variants in linkage with the known associations that only full genome sequencing will uncover. Of course this has its own problems, in that these rare variants may each have different synergistic or additive effects with other variants.

At the end of the day GWAS have only scratched the surface of complex disease genetics; there is still a long way to go in our understanding of how they contribute to disease, and of course of their clinical utility. I think it is a little premature to say that GWAS have failed as a tool for uncovering the genetic basis of disease; they are merely another tool, a stepping stone perhaps, in our understanding. It frustrates me a little that they have come under so much fire, perhaps because of the unwarranted hype that they would ultimately uncover all the dark secrets of the genome. A rather naive view to take, if you ask me.

Any attempt to uncover a few limited biomarkers as prognostic and diagnostic indicators is also going to fall foul of the same mistakes, as you state, because of their complexity. In my opinion, large panels of different serum/blood/bodily fluid biomarkers are going to be required for clinical utility, rather than one or a few of high impact. This has particular ramifications for pharmacogenomics, and of course personalized medicine. I think we are a lot further away from this goal than many people make out.
-
I was thinking more in terms of GWAS for complex common diseases. I think many of the GWAS to date have utilised samples from other disease-based repositories, for instance using cancer patients as controls for autoimmune disease GWAS (I can't recall the exact study now). Though now I recall that the 1958 British Birth Cohort and the UK Blood Service were used as controls for the WTCCC GWAS of seven common diseases. Of course the HapMap and 1000 Genomes projects could provide a certain number of samples, well, if the HapMap populations were also able to provide phenotypic information, that is. Hopefully the Personal Genome Project will also become a repository of control samples that can be used for GWAS; after all, it is only the genotypes themselves that are required, not the physical DNA samples (unless of course there are disparities between genotyping platforms, etc., but that's more an issue of standardising genotype calling methods and techniques).
-
Every GWAS needs suitable controls. Perhaps there should be a consortium for healthy control volunteers to donate to, with all the same informed consent requirements as other clinical samples.
-
I'm pretty sure the current recombinant insulin produced by Eli Lilly (under the name Humulin) is a fully human-sequence protein with increased pharmacological effectiveness (most likely through an increased affinity for the insulin receptors, though don't quote me on that). The reason they use a recombinant, fully human insulin is to avoid the immunogenic reaction that around 5% of the population have to the small number of amino acid differences between human and bovine insulin. It sounds like a bit of an anti-Big Pharma conspiracy to me.
-
The colours on the plots themselves aren't dyes. Every time an event is recorded it is plotted based on its forward scatter and side scatter (the two left-hand panels). Sometimes an event will occur in the same 2D position (that is, at the same X,Y co-ordinates) as another. Every time an event occurs in the same position as another, it changes the colour of the event on the plot. This goes from blue (few events) through light blue, green and yellow, and then to red (many events). So the areas of red are essentially lots and lots of cells all of the same size and granularity, and can be indicative of the purity of the population of that cell type. Not all cells will be exactly the same size or granularity, but they will tend to fall within a given range.

The numbers in the gating appear to indicate the percentage of events within that gate as a fraction of total events. For instance, in the top middle panel the gating has determined that 1.8% of all events occur within that user-defined section (~340 events). This is confirmed by the panel to the right, which is the gated section and tells you that there are 343 events within it. The far-right panels are a measure of the number of events (cells) within the gated sections corresponding to the intensity of that particular antibody's fluorochrome (the x-axis is designated FL1-H).
-
Strachan and Read - Human Molecular Genetics. Your one-stop shop for all things molecular genetics and more. I'd also recommend Roger Miesfeld - Applied Molecular Genetics for a slightly (but not much more) advanced take on recombinant DNA techniques and basics of molecular genetics.
-
I wondered if any of the fine minds here would care to shed light on a little conundrum I've encountered regarding the use of Fst and clustering populations by admixture. I've clustered my populations in question using UPGMA with pairwise Fst values, but I'm not entirely sure if this is the right approach to take. I understand that UPGMA relies on the assumption of similar evolutionary rates and that if that assumption is violated any ultrametric trees derived will have incorrect topology and essentially be a useless waste of time. However, the SNPs I'm using are all synonymous or non-coding (though that doesn't discount linkage with functional untyped variants of course), so am I right in assuming equal evolutionary rates between populations? Or would I be better off using a different statistic to cluster my populations? I essentially just want to show that the population substructure I am seeing for a particular locus is in general agreement with historical models of human evolutionary distance and admixture. Could someone help a poor struggling student?
-
I'd be tempted to speculate that they play roles in structural integrity as well as in chromosome segregation; IIRC that's what the alpha-satellites are involved in, anyway. I'm not sure of the exact role they play, if any, or it may be that different regions of the chromosome have functions whereas others do not and in fact contain many transposable elements. In this latter case it could further be speculated that the constitutive heterochromatin is essentially protecting the genome from potentially detrimental transposition events. Whatever specific marks or sequences induce chromatin condensation in these particular cases, they are probably similar to those involved in other forms of heterochromatin, i.e. hypermethylation of CpGs and extensive deacetylation of histone residues. Someone more informed may be able to shed better light on things, though.
-
I'm currently trying to find a way to perform a hyperbolic regression on a set of data I have, as various transformations (i.e. logarithmic and reciprocal) can't linearise it. Having looked through the literature, it would appear that the data can be described by a hyperbola, which is all well and good, but I need to use it as a standard curve later on. The reason it is described by a hyperbola is that it is the result of antibody binding and enzyme-substrate reactions, which are themselves described by a hyperbola.

I have an article that describes a derivation and the requisite Python script, which spits out the relevant coefficients that can then be fed into their equation to describe the relationship between the dependent and independent variables. They say it is based on a least-squares fit (????); I've tried to find something that describes that, but to no avail. The problem is they don't actually link to the script itself, which can be run in a Python shell, even though they say it is in the supplementary documents (it really isn't).

So this is a bit of a computing and maths problem I've got. I can't write my own program as I don't know how, and my maths is insufficient to derive my own formula for calculating the relevant coefficients to plug into their formula. Help!