jbn22 Posted January 6, 2017 (edited)
Hey guys, I need some help wrapping my head around a study on the genetics of schizophrenia. Perhaps you could lend me a hand?
1.) How does a Manhattan plot work? I understand that the Y-axis shows significance and the X-axis shows the location on the genome/chromosome, but how do the individual data points add up? I would have expected one value per point on the X-axis (that is, the significance with which a given SNP is associated with an increased risk in the disease group). Alternatively, I could explain up to four values per point on the X-axis (that is, the respective significance of each possible base (G, A, C or T) at that SNP being associated with an increased risk in the disease group). And yet, when I look at the plot, I see far more Y-values for a single point on the X-axis. I am afraid that I may have completely misunderstood GWAS.
Edited January 6, 2017 by jbn22
Function Posted January 6, 2017 (edited)
I might be wrong, but could it mean different haplotypes per gene (set)? I mean, I'm looking at the peak at chromosome 6, and if I'm not mistaken, that's where HLA comes in. Long story short: HLA is an important family of proteins presenting antigens to CD4+/CD8+ T-lymphocytes (MHC class II type HLA (DR, DQ) and MHC class I type HLA (A, B, C, E are the most important ones), respectively). (If you're wondering: MHC = major histocompatibility complex, the complex of genes on chromosome 6 whose transcription and subsequent translation results in HLA.)
Now, you have to know that there are many subtypes of HLA. Every single person has a full set of HLA proteins, and can express 2 types of HLA-A, 2 types of HLA-B, ..., 2 types of HLA-DR, 2 types of HLA-DQ, ... Why 2 types? Your mother and father each have 2 types themselves, and then there's you, resulting from a combination of 2 of those 4 subtypes. With subtype, I mean e.g. HLA-DR*0401, HLA-A2 (with different subtypes *020X, with X being another number) ... I'm not sure whether I should speak of haplotypes, allotypes, ..., because we've never gotten these terms explained well ... While we're at it: could someone explain allotype and haplotype?
Well, altogether: there are about 15 000 possible HLA subtypes. I think that's responsible for the peak. And I think the highest point you see there is a subtype of HLA-A2. If I'm not mistaken, (a subtype of) HLA-A2 is present in about 50% of the population.
A specific example: people who carry a copy of HLA-DR4 with a specific amino acid configuration (L67, Q70, K/R71 instead of e.g. I67, D70, E71) are at higher risk of developing rheumatoid arthritis. That specific HLA-DR4 subtype is capable of presenting citrullinated proteins, which happen to pop up in every single one of us, to CD4+ T-lymphocytes, inducing the whole cellular inflammation cascade. Other factors need to be present as well, though (PTPN22 deficit: the phosphatase won't be able to strip phosphate off of the T-cell receptors (type Trk), leaving them constitutively active for HLA-DR4).
So I think the reason why there's so much variation in the number of Y-axis units here is the possible variation in the presentation of the proteins that the depicted genes code for.
Edited January 6, 2017 by Function
ecoli Posted January 6, 2017
This is probably a trick of the eye. Since there is spatial correlation between SNP-phenotype associations, and you are squeezing a lot of information into a small space, genome locations (x-axis) appear to overlap but actually don't. Did you plot this yourself? To be sure, you can zoom in on a specific location. For example: http://www.gettinggeneticsdone.com/2014/05/qqman-r-package-for-qq-and-manhattan-plots-for-gwas-results.html
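To make the "trick of the eye" concrete, here is a minimal sketch of how the points of a Manhattan plot are laid out, assuming one p-value per SNP. Everything in it is made up for illustration: the two toy chromosome lengths, the random p-values, the 800-pixel plot width, and the helper names `manhattan_points` and `pixel_columns`. It is not the code behind the plot in the study.

```python
import math
import random

def manhattan_points(snps, chrom_lengths):
    """Return one (x, y) point per SNP plus the total genome length.

    x is the position on a concatenated genome (chromosomes laid end to end),
    y is -log10(p), so smaller p-values sit higher on the plot.
    """
    offsets, total = {}, 0
    for chrom in sorted(chrom_lengths):
        offsets[chrom] = total
        total += chrom_lengths[chrom]
    points = [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in snps]
    return points, total

def pixel_columns(points, genome_length, width_px):
    """Group y-values by the pixel column in which each SNP would be drawn."""
    columns = {}
    for x, y in points:
        col = min(int(x / genome_length * width_px), width_px - 1)
        columns.setdefault(col, []).append(y)
    return columns

random.seed(0)
chrom_lengths = {1: 1_000_000, 2: 800_000}  # toy lengths, not real chromosomes
snps = []
for chrom, length in chrom_lengths.items():
    # 5000 distinct positions per toy chromosome, each with one random p-value
    for pos in random.sample(range(1, length + 1), 5000):
        snps.append((chrom, pos, random.uniform(1e-8, 1.0)))

points, genome_length = manhattan_points(snps, chrom_lengths)
columns = pixel_columns(points, genome_length, width_px=800)

distinct_x = len({x for x, _ in points})
busiest = max(len(ys) for ys in columns.values())
print(f"{len(points)} SNPs, {distinct_x} distinct x positions")
print(f"busiest pixel column holds {busiest} SNPs")
```

Every SNP has its own x position and exactly one y-value, but thousands of SNPs share a few hundred pixel columns, so each column looks like a vertical stack of points. Zooming in on a region spreads the same SNPs over more pixels and the stacks dissolve.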
Function Posted January 6, 2017
Quoting ecoli: "genome locations (x-axis) appear to overlap"
Ah, so that was the OP's problem. Seems like I totally didn't get the question. Forget about the HLA thing then.
jbn22 Posted January 7, 2017 Author
Quoting Function: "I might be wrong, but could it mean different haplotypes per gene (set)? [...]"
Thanks a lot! While you're right about the MHC locus, the plot in question highlights an association of schizophrenia with the MHC-III locus. A follow-up study that was published in February showed that, to be specific, expression of the complement component 4 (C4) gene, which is part of the MHC-III locus, seems to correlate directly with schizophrenia risk.
As far as I understood it, SNPs are just single positions on a gene that are used as markers in genome-wide association studies. If my understanding is correct (and please, someone correct or confirm this), a haplotype is defined as a set of genes that correlate with a given SNP (that is, that are in very high linkage disequilibrium with said SNP). But, as I said, I may be wrong. Can someone help?
Your line of thought got me thinking, though. In GWAS, are both sets of alleles (paternal and maternal) analyzed? As such, would you have a data set of two Y-values per SNP? As in, one chromosome has an SNP of "AAAC" at position XYZ that correlates with a risk of x for a given disease, whereas the other chromosome might have an SNP of "AGAC" that correlates with a different risk?
Quoting ecoli: "This is probably a trick of the eye. [...] To be sure you can zoom in on a specific location."
This seems sound. Thanks!
Quoting Function: "Ah so that was the problem of the OP. Seems like I totally didn't get the problem. Forget about the HLA thing then"
Thanks anyway. I never understood those HLA genes anyway. Do you happen to know the difference between HLA-A, HLA-DQ, HLA-DR? Do they all present to CD8/CD4 cells? Or does each subtype have a different function?
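On the question above of whether both the paternal and maternal alleles are analyzed: in a basic allelic association test, both copies from every person are counted, but they are pooled into a single 2x2 table per SNP, which yields one p-value (one Y-value) per SNP. The sketch below illustrates that under stated assumptions: the genotype counts are invented, `allelic_test` is a hypothetical helper name, and real GWAS pipelines typically use regression models with covariates rather than this bare chi-square test.

```python
import math

def allelic_test(case_genotypes, control_genotypes):
    """Chi-square allelic test for one biallelic SNP (alleles called A and G).

    Each argument is a tuple (n_AA, n_AG, n_GG) of genotype counts.
    Returns (chi2, p) with 1 degree of freedom. Assumes every expected
    cell count is nonzero, which holds for the toy numbers used here.
    """
    def allele_counts(genotypes):
        n_aa, n_ag, n_gg = genotypes
        # Every person contributes two alleles, one from each parent.
        return 2 * n_aa + n_ag, 2 * n_gg + n_ag

    table = [allele_counts(case_genotypes), allele_counts(control_genotypes)]
    n = sum(sum(row) for row in table)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = sum(table[i]) * (table[0][j] + table[1][j]) / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Invented counts: allele A is more common among cases than among controls.
chi2, p = allelic_test((300, 500, 200), (200, 500, 300))
print(f"chi2 = {chi2:.2f}, p = {p:.3g}, -log10(p) = {-math.log10(p):.2f}")
```

The single `-log10(p)` value is what would be drawn for this SNP on the Manhattan plot.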
Function Posted January 7, 2017 (edited)
Quoting jbn22: "Do you happen to know the difference between HLA-A, HLA-DQ, HLA-DR? Do they all present to CD8/CD4 cells? Or does each subtype have a different function?"
The difference between HLA-A/B/C is not clear to me, but I also don't feel the need to understand it; it's probably quite irrelevant. The same probably goes for HLA-DQ/DR.
HLA coded by MHC class I (HLA-A, HLA-B, HLA-C and HLA-E being the most important ones) are HLA proteins that present intracellular (that is, non-vesicular) particles (only proteins) to CD8+ T-lymphocytes (cytotoxic T "killer cells").
HLA coded by MHC class II (HLA-DR and HLA-DQ being the most important ones) are HLA proteins that present vesicular particles (again, only proteins) to CD4+ T-lymphocytes (T "helper cells": Th1, Th2, Th17, Tfh, Treg).
---
The reason why I put some emphasis on "only proteins" is that B-cell receptors (immunoglobulins, Ig; and antibodies, Ab, which basically are secreted Igs) can, proteins aside, also bind saccharides.
Edited January 7, 2017 by Function