Everything posted by MedGen
-
Fast-Track Gene-ID Method Speeds Rare Disease Search
MedGen replied to Dimension_Traveller's topic in Genetics
-
Arlequin is a population genetics program that accepts several different data types, including whole sequences, SNPs, microsatellites, etc. It's quite computationally heavy if you've got a lot of samples or a lot of variants. It doesn't use R, but rather a much friendlier Java-based front end. Might be worth checking out: http://cmpg.unibe.ch/software/arlequin3/ I've used it in the past for crunching through large amounts of variation data for a number of populations and found it to be reliable. The only downside is that the project-file formatting is very fickle: every single line has to be absolutely correct, which can be a bit of a pain when dealing with large numbers of samples and/or variants. I found it helped to set the data up in Excel first and then copy and paste into a plain-text editor (or script it, see the sketch below). It does allow you to calculate Fst, but only as population differentiation across a set of variants, not for each variant individually, so it may not be exactly what you're after.
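If you do end up scripting the project file rather than hand-editing it, something along these lines might save some pain. This is only a sketch: the CSV column names are made up, and the exact .arp section syntax should be double-checked against the Arlequin manual.

```python
import csv

# Minimal sketch: build an Arlequin-style project file (.arp) from a CSV
# exported from Excel. Column names here (population, sample_id, haplotype)
# are hypothetical; the section layout should be verified against the manual.
def write_arp(csv_path, arp_path, title="My project"):
    # Group haplotypes by population
    pops = {}
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            pops.setdefault(row["population"], []).append(
                (row["sample_id"], row["haplotype"])
            )

    with open(arp_path, "w") as out:
        out.write("[Profile]\n")
        out.write(f'  Title="{title}"\n')
        out.write(f"  NbSamples={len(pops)}\n")
        out.write("  DataType=DNA\n")
        out.write("  GenotypicData=0\n\n")
        out.write("[Data]\n  [[Samples]]\n")
        for pop, samples in pops.items():
            out.write(f'    SampleName="{pop}"\n')
            out.write(f"    SampleSize={len(samples)}\n")
            out.write("    SampleData= {\n")
            for sample_id, hap in samples:
                # One record per sample: ID, frequency, sequence
                out.write(f"      {sample_id} 1 {hap}\n")
            out.write("    }\n")

write_arp("haplotypes.csv", "project.arp")
```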
-
You can test your primers (provided they aren't degenerate) using the in silico PCR tool available on the UCSC Genome Browser, also provided the organism you are amplifying from has had its genome sequenced: http://genome.ucsc.edu/cgi-bin/hgPcr?org=Human&db=hg19&hgsid=167311066 This will allow you to see the expected size of your product. I'd also say that when running a temperature gradient it's best to run the gradient from about 3 degrees below the Tm (use the nearest-neighbour Tm if you can, as this takes into account the presence of divalent cations, i.e. Mg2+) up to 72°C (the optimal Taq temperature).
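If you have Biopython installed, the nearest-neighbour Tm (with the divalent cation correction) is a one-liner. The primer sequence and buffer concentrations below are just placeholders; substitute your own:

```python
from Bio.SeqUtils import MeltingTemp as mt

# Hypothetical primer sequence; substitute your own.
primer = "AGCGGATAACAATTTCACACAGGA"

# Nearest-neighbour Tm with salt correction for monovalent and divalent
# cations (Na/Mg/dNTPs in mM, primer concentrations in nM).
tm = mt.Tm_NN(primer, Na=50, Mg=1.5, dNTPs=0.2, dnac1=250, dnac2=0)
print(f"Nearest-neighbour Tm: {tm:.1f} C")
print(f"Suggested gradient: {tm - 3:.1f} C up to 72 C")
```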
-
There are a number of techniques that can be used to knock out or knock down genes in recombinant animals. One of these utilises the Cre-loxP system, in which loxP sites are introduced around the target gene by homologous recombination and the flanked sequence is then excised by Cre recombinase in selected transgenic animals: http://en.wikipedia.org/wiki/Gene_knockout Genes can also be conditionally knocked out based on their cell-type-specific expression by removing specific promoter and regulatory regions of the gene in question. I think there are other systems which can induce gene silencing, but I'm not too familiar with those; perhaps one of the resident experts can fill in the gaps here. As to your final question, every somatic cell contains a full complement of the organism's genome, so yes, in humans that is 23 pairs of chromosomes constituting (2x) 3.2 billion base pairs. Germ cells contain only a haploid complement and thus only 3.2 billion base pairs. In addition, the mitochondrial genome contributes a further 16.6 kilobases of DNA, which technically also forms part of our complete genome.
-
True, UCSC has its limitations, but I think Apis mellifera is definitely available; not sure about Nasonia, though.
-
If you go to the UCSC Genome Browser, you can perform a BLAT search, which will align your query sequence against their reference builds and thus provide you with a genomic location for your sequences (provided those areas have been covered and mapped correctly). The best place to start is with your +1 position, to find out where the 5'UTR begins. There are a number of programs that can be used to predict the positions of promoters and regulatory elements: http://www.gene-regulation.com/pub/programs.html http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3 http://bip.weizmann.ac.il/toolbox/seq_analysis/promoters.html To extract the upstream and downstream sequences from NCBI, you ideally need homologous sequences, or you can use the whole-genome shotgun (WGS) and genomic sequence options when performing a BLAST query and extract the entire contig, not just the homologous regions. The better way is the aforementioned UCSC Genome Browser, which allows you to define how much upstream sequence you want to extract (see the sketch below).
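As a rough sketch, you can also pull a defined stretch of upstream sequence programmatically via UCSC's REST API. The coordinates below are hypothetical, and the endpoint and JSON field names are from memory, so check them against the API documentation (and remember that minus-strand genes need the reverse complement of the downstream region instead):

```python
import requests

# Sketch: fetch upstream sequence for a region via the UCSC REST API
# (https://api.genome.ucsc.edu). Substitute the chromosome and +1 position
# you got from your BLAT hit; the values here are made up.
genome = "hg19"
chrom = "chr7"
tss = 5_570_232      # hypothetical +1 position (0-based)
upstream = 2000      # how much upstream sequence to extract

resp = requests.get(
    "https://api.genome.ucsc.edu/getData/sequence",
    params={"genome": genome, "chrom": chrom,
            "start": tss - upstream, "end": tss},
)
resp.raise_for_status()
seq = resp.json()["dna"]
print(f">{chrom}:{tss - upstream}-{tss}")
print(seq)
```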
-
Oh, to be sure, there is an issue with the so-called hidden heritability in the findings of most GWAS, especially regarding their limitations, but at the end of the day they still have their uses, for instance in uncovering previously unsuspected pathways that might contribute to disease aetiology and pathogenesis. My personal feeling is that there may be multiple rare variants in linkage with the known associations that only full genome sequencing will uncover. Of course this has its own problems, in that these rare variants may each have different synergistic or additive effects with other variants. At the end of the day GWAS have only scratched the surface of complex disease genetics; there is still a long way to go in our understanding of how such variants contribute to disease, and of course of their clinical utility. I think it is a little premature to simply say that GWAS have failed as a tool for uncovering the genetic basis of disease; they are merely another tool, a stepping-stone perhaps, in our understanding. It frustrates me a little that they have come under so much fire, perhaps because of the unwarranted hype that they would ultimately uncover all the dark secrets of the genome. A rather naive view to take, if you ask me. Any attempt to uncover a few limited biomarkers as prognostic and diagnostic indicators is also going to fall foul of the same mistakes, as you state, because of the complexity involved. In my opinion it is going to be the case that large panels of different serum/blood/bodily-fluid biomarkers will be required to be clinically useful, rather than one or a few of high impact. This has particular ramifications for pharmacogenomics and, of course, personalised medicine. I think we are a lot further away from this goal than many people make out.
-
I was thinking more in terms of GWAS for complex common diseases. I think many of the GWAS to date have utilised samples from other disease-based repositories, for instance using cancer patients as controls for autoimmune disease GWAS (can't recall the exact study now). Though now I recall that the 1958 British Birth Cohort and the UK Blood Services were used as controls for the WTCCC GWAS of seven common diseases. Of course the HapMap and 1000 Genomes projects could provide a certain number of samples, well, if the HapMap populations were also able to provide phenotypic information, that is. Hopefully the Personal Genome Project will also become a repository of control samples that can be used for GWAS; after all, it is only the genotypes themselves that are required, not the physical DNA samples (unless of course there are disparities between genotyping platforms, etc., but that's more an issue of standardising genotype-calling methods and techniques).
-
Every GWAS needs suitable controls. Perhaps there should be a consortium of healthy control volunteers to donate to, with all the same informed-consent requirements as other clinical samples.
-
I'm pretty sure the current recombinant insulin produced by Eli Lilly (under the name Humulin) is a fully human-sequence protein that has increased pharmacological effectiveness (most likely through an increased affinity for the insulin receptor, but don't quote me on that). The reason they use recombinant human insulin is to spare the roughly 5% of the population who suffer an immunogenic reaction caused by the three-amino-acid difference between human and bovine insulin. It sounds like a bit of an anti-Big Pharma conspiracy to me.
-
The colours on the plots themselves aren't dyes. Every time an event is recorded it is plotted based on its forward scatter and side scatter (the two left-hand panels). Sometimes an event will occur in the same 2D position (that is, at the same X,Y co-ordinates) as another, and every time this happens it changes the colour of the event on the plot, going from blue (few events) through light blue, green and yellow to red (many events). So the areas of red are essentially lots and lots of cells all of the same size and granularity, and can be indicative of the purity of the population of that cell type. Not all cells will be exactly the same size or granularity, but they will tend to fall within a given range. The numbers in the gating indicate the percentage of events within that gate as a fraction of total events. For instance, in the top middle panel the gating has determined that 1.8% of all events occur within that user-defined section (~340 events). This is confirmed by the panel to the right, which shows the gated section and tells you that there are 343 events within it. The right-most panels are a measure of the number of events (cells) within the gated sections against the intensity of that particular antibody's fluorochrome (the x-axis is designated FL1-H). A toy illustration of how the density colours arise is sketched below.
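For what it's worth, here's a quick sketch (simulated events, not real cytometer data) showing how those density colours arise when many events land on the same co-ordinates:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy illustration of a density-coloured FSC/SSC plot using simulated events.
rng = np.random.default_rng(0)

# Two hypothetical cell populations with different size/granularity.
lymphs = rng.normal(loc=(200, 100), scale=(20, 15), size=(8000, 2))
monos = rng.normal(loc=(450, 300), scale=(40, 40), size=(2000, 2))
events = np.vstack([lymphs, monos])

# hexbin colours each bin by how many events fall inside it, which is
# exactly the blue -> green -> yellow -> red effect described above.
plt.hexbin(events[:, 0], events[:, 1], gridsize=80, cmap="jet", mincnt=1)
plt.xlabel("FSC-H (size)")
plt.ylabel("SSC-H (granularity)")
plt.colorbar(label="events per bin")
plt.show()
```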
-
Strachan and Read - Human Molecular Genetics. Your one-stop shop for all things molecular genetics and more. I'd also recommend Roger Miesfeld - Applied Molecular Genetics for a slightly (but not much more) advanced take on recombinant DNA techniques and basics of molecular genetics.
-
I wondered if any of the fine minds here would care to shed light on a little conundrum I've encountered regarding the use of Fst and clustering populations by admixture. I've clustered my populations using UPGMA on pairwise Fst values (along the lines of the sketch below), but I'm not entirely sure this is the right approach to take. I understand that UPGMA relies on the assumption of similar evolutionary rates, and that if that assumption is violated any ultrametric tree derived will have incorrect topology and essentially be a useless waste of time. However, the SNPs I'm using are all synonymous or non-coding (though that doesn't discount linkage with functional untyped variants, of course), so am I right in assuming equal evolutionary rates between populations? Or would I be better off using a different statistic to cluster my populations? I essentially just want to show that the population substructure I am seeing for a particular locus is in general agreement with historical models of human evolutionary distance and admixture. Could someone help a poor struggling student?
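For anyone wanting to reproduce the clustering step: UPGMA is just average linkage on the condensed pairwise distance matrix. A minimal sketch with made-up Fst values:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform
import matplotlib.pyplot as plt

# Hypothetical pairwise Fst matrix for four populations (made-up values).
pops = ["Pop A", "Pop B", "Pop C", "Pop D"]
fst = np.array([
    [0.00, 0.05, 0.12, 0.15],
    [0.05, 0.00, 0.10, 0.14],
    [0.12, 0.10, 0.00, 0.08],
    [0.15, 0.14, 0.08, 0.00],
])

# UPGMA = average linkage on the condensed form of the distance matrix.
tree = linkage(squareform(fst), method="average")
dendrogram(tree, labels=pops)
plt.ylabel("pairwise Fst")
plt.show()
```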
-
I'd be tempted to speculate that they play roles in structural integrity as well as in chromosome segregation; IIRC that's what the alpha-satellites are involved in anyway. I'm not sure of the exact role they play, though, if any, or it may be that some regions of the chromosome have functions whereas others do not and in fact contain many transposable elements. In this latter case it could further be speculated that the constitutive heterochromatin is essentially protecting the genome from potentially detrimental transposon events. Whatever specific marks or sequences induce chromatin condensation in these particular cases are probably somewhat similar to those involved in other forms of heterochromatin, i.e. hypermethylation of CpGs and extensive deacetylation of histone residues. Someone more informed may be able to shed better light on things, though.
-
I'm currently trying to find a way to perform a hyperbolic regression on a set of data I have, as various transformations (i.e. logarithmic and reciprocal) can't linearise it. Having looked through the literature, it would appear that the data can be described by a hyperbola, which is all well and good, but I need to use it as a standard curve later on. The reason it is described by a hyperbola is that it is the result of antibody-binding and enzyme-substrate reactions, which are themselves described by hyperbolas. I have an article which describes a derivation and a description of the requisite Python script that spits out the relevant coefficients, which can then be fed into their equation to describe the relationship between the dependent and independent variables. They say it is based on a least-squares fit; I've tried to find something that describes that, but to no avail. The problem is they don't actually link to the script itself (which can be run in a Python shell), even though they say it is in the supplementary documents (it really isn't). So this is a bit of a computing and maths problem I've got. I can't write my own program as I don't know how, and my maths is insufficient to derive my own formula for calculating the relevant coefficients to plug into theirs. Help!
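(For reference: this isn't the article's missing script, but a generic nonlinear least-squares fit of a rectangular hyperbola, the same functional form as antibody-binding and Michaelis-Menten curves, takes only a few lines with SciPy. The function form and data below are placeholder assumptions:)

```python
import numpy as np
from scipy.optimize import curve_fit

# Rectangular hyperbola, the shape of antibody-binding and enzyme-substrate
# (Michaelis-Menten) curves: y = a * x / (b + x).
def hyperbola(x, a, b):
    return a * x / (b + x)

# Hypothetical standard-curve data; substitute your own measurements.
x = np.array([0.5, 1, 2, 4, 8, 16, 32])
y = np.array([0.9, 1.6, 2.6, 3.7, 4.5, 5.1, 5.4])

# Nonlinear least squares: finds a, b minimising sum((y - model)^2).
(a, b), cov = curve_fit(hyperbola, x, y, p0=(y.max(), np.median(x)))
print(f"a = {a:.3f}, b = {b:.3f}")

# Invert the fitted curve to read unknowns off the standard curve:
# x = b * y / (a - y), valid for y < a.
y_unknown = 3.0
print("interpolated x:", b * y_unknown / (a - y_unknown))
```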
-
I'm currently doing some immunoblots and quantifying them using integrated densitometry. I've probed for my protein of interest and β-actin as the control housekeeping protein, and I've normalised the bands against the background signal. I want to combine the data from three separate membranes so that I can statistically analyse them (I'm being a little naughty, but I need to be able to confirm a previous finding from this). So I've run the blots in triplicate and normalised the bands to the background. The question is whether I can use the relative ratio of β-actin to protein of interest as a general (i.e. semi-quantitative) measure of expression, or whether I have to retain the integrated density as the measure of expression? Thanks in advance. P.S. The integrated density measurements are done in Adobe Photoshop CS4 using 1200 dpi 16-bit greyscale TIFF scans of the developed membranes.
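For reference, the background correction and ratio calculation described above boil down to something like this (all densitometry numbers are hypothetical):

```python
import numpy as np

# Rows: three replicate blots; columns: (target band, actin band, background).
# All integrated densities here are made-up illustrative values.
blots = np.array([
    [15200.0, 30100.0, 1200.0],
    [14100.0, 28900.0, 1100.0],
    [16800.0, 31500.0, 1300.0],
])

target = blots[:, 0] - blots[:, 2]   # background-corrected target
actin = blots[:, 1] - blots[:, 2]    # background-corrected beta-actin
ratio = target / actin               # relative expression per blot

print("per-blot ratios:", np.round(ratio, 3))
print(f"mean = {ratio.mean():.3f}, SD = {ratio.std(ddof=1):.3f}")
```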
-
The whole definition of a recessive allele is that its phenotypic effect only becomes apparent in the homozygous state, so yes, it can be propagated through multiple generations without the phenotype ever being observed. How long actually depends on recombination, random sampling and how many offspring are in the subsequent filial generations. A heterozygous carrier passes the allele to any given offspring with probability 1/2, so assuming every generation mates with a homozygote for the dominant allele and only has one offspring in each pairing, the odds of the allele still being carried in the F3 generation (great-grandchildren) are (1/2)^3 = 0.125, assuming a biallelic model with no other confounding factors. The phenotype itself can only reappear when two carriers mate, in which case 1/4 of their offspring are expected to be affected.
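A quick Monte Carlo sketch of the transmission argument, for anyone who wants to check the arithmetic:

```python
import random

# A single carrier (Aa) line mates with AA homozygotes for three
# generations, one child per generation. How often is the
# great-grandchild still a carrier?
def carrier_after(generations: int) -> bool:
    carrier = True  # founder is Aa
    for _ in range(generations):
        # A heterozygote transmits the recessive allele with probability 1/2;
        # an AA parent never does.
        carrier = carrier and random.random() < 0.5
    return carrier

trials = 100_000
hits = sum(carrier_after(3) for _ in range(trials))
print(f"P(F3 carrier) ~= {hits / trials:.3f} (expected (1/2)^3 = 0.125)")
```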
-
I've just had a cursory glance at the paper, but the GRR merely reflects how they have treated each genotype as an independent event for the purposes of calculating risk. As for your second question, the numbers are the relative risk (RR) followed by the 95% confidence interval. Relative risks (and odds ratios) are not informative unless the 95% CIs are also stated, as these reflect the level of variation in RRs within the cohort. Where the 95% CI starts below 1.0, you can see that there are some individuals who might carry the risk genotype under that model and not have the disease (in this case colorectal cancer). This likewise applies to RRs below 1.0 (i.e. associated with a protective effect): where the 95% CI rises considerably above 1.0, you will have healthy controls carrying the risk genotype but not suffering from the disease. It is generally a good indicator (though not an accurate one) of some of the within-cohort heterogeneity (this should actually be tested for between the studies to confirm associations are genuine and not down to chance, which they have done with both fixed- and random-effects models). As for the infinity symbol, this is probably a reflection of the low numbers of MM genotypes, meaning they are dividing by zero when calculating the RR under that model in a lot of the studies. Hope that answers your question. A sketch of the arithmetic is below.
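For reference, here is how an RR and its 95% CI fall out of a 2x2 table, and where the division by zero comes from (all counts are hypothetical):

```python
import math

# Relative risk and 95% CI from a 2x2 table; counts are made up.
#                    disease   no disease
# risk genotype         a          b
# reference genotype    c          d
a, b, c, d = 40, 160, 25, 175

rr = (a / (a + b)) / (c / (c + d))

# 95% CI is computed on the log scale, then exponentiated back.
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
ci_low = math.exp(math.log(rr) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
# If c were 0 (e.g. hardly any MM genotypes in a study), calculating the RR
# would involve a division by zero: that's the infinity symbol in the table.
```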
-
Pseudoscience forums?! Bloody April Fools Day. Had me going for a bit then.
-
Thanks for the help. As I said, we've changed the reagents now; still got that large clump of DNA, though. I'm not going to remove the actual pellet, just the massive piece of DNA floating around. I appreciate that it will remove some of the RNA as well; however, the concentration is very high already, so removing a little shouldn't be an issue.
-
I'm not talking about some residual DNA; I'm talking about a visible aggregate as soon as the isopropanol is added, and I mean immediately visible, even after a second phenol extraction. We checked all the reagents: the acid phenol was at pH 4.5, and all the other reagents we made up fresh today; there just appears to be a crapload of DNA in these samples. I'll DNase them and see how they come out afterwards. We've had the idea of physically removing the visible DNA aggregate from the samples (yes, it is that big) by pipette. I know this may result in some RNA loss, but it's worth it for the purity of the isolate.
-
Okay, I'm extracting RNA from whole blood samples. I've done a red cell lysis and extracted my RNA via a modified version of the Chomczynski method. The problem is that I'm co-extracting far too much DNA with the RNA. I ran the samples down a gel and got the 28S and 18S rRNA bands, as well as a lot of genomic DNA contamination. I've run a second extraction against phenol:chloroform. I've just added the isopropanol and there's an aggregate forming instantly, which to me screams DNA contamination. I've not DNase-treated the samples yet because there seemed to be too much contamination to get rid of it all sufficiently. I need to get this extraction really quite homogeneous in terms of RNA:DNA because of the downstream procedures I'm running. I've Nanodropped the samples and they all give OD260/280 ratios of 1.89-1.90. Is there any modification I can make that will reduce this DNA contamination significantly? Thanks in advance.
-
Ahh, the wonderful mess that is biology: every time something gets nailed down, along comes another exception.
-
I believe the only cases of polycistronic RNAs in eukaryotes are the retrotransposons, the LINEs (e.g. L1).
-
Correct, both DNA and RNA are synthesised 5' to 3' by adding nucleotides (dNTPs for DNA, NTPs for RNA) to the free hydroxyl group at the 3' end.