This pipeline identifies canonical pathway gene sets that overlap significantly with a given gene list, using a hypergeometric test. The gene set database is the GSEA MSigDB Class2: Canonical Pathways collection. For further details about the MSigDB gene sets, please visit The Broad Institute GSEA MSigDB.
For the given gene list, a hypergeometric test was run against 1320 gene sets to find significantly overlapping canonical pathways. Ranked by FDR-adjusted p-value, the top 5 significantly overlapping gene sets are listed below.
- REACTOME_IMMUNE_SYSTEM, REACTOME_CYTOKINE_SIGNALING_IN_IMMUNE_SYSTEM, REACTOME_INTERFERON_GAMMA_SIGNALING, REACTOME_INTERFERON_SIGNALING, REACTOME_ENDOSOMAL_VACUOLAR_PATHWAY
GS(gene set) pathway name | gene.list | GS size (m) | n.NotInGS (n) | Gene universe (N) | n.drawn (k) | n.found (x) | p.value (p(X>=x)) | FDR (q.value) |
---|---|---|---|---|---|---|---|---|
REACTOME IMMUNE SYSTEM | gene.list | 933 | 45023 | 45956 | 41 | 13 | 9.556e-13 | 1.261e-09 |
REACTOME CYTOKINE SIGNALING IN IMMUNE SYSTEM | gene.list | 270 | 45686 | 45956 | 41 | 8 | 1.034e-10 | 6.823e-08 |
REACTOME INTERFERON GAMMA SIGNALING | gene.list | 63 | 45893 | 45956 | 41 | 5 | 2.970e-09 | 1.307e-06 |
REACTOME INTERFERON SIGNALING | gene.list | 159 | 45797 | 45956 | 41 | 6 | 6.346e-09 | 2.094e-06 |
REACTOME ENDOSOMAL VACUOLAR PATHWAY | gene.list | 9 | 45947 | 45956 | 41 | 3 | 5.515e-08 | 1.456e-05 |
REACTOME INTERFERON ALPHA BETA SIGNALING | gene.list | 64 | 45892 | 45956 | 41 | 4 | 3.331e-07 | 7.329e-05 |
REACTOME ADAPTIVE IMMUNE SYSTEM | gene.list | 539 | 45417 | 45956 | 41 | 7 | 4.675e-07 | 8.816e-05 |
REACTOME ANTIGEN PRESENTATION FOLDING ASSEMBLY AND PEPTIDE LOADING OF CLASS I MHC | gene.list | 21 | 45935 | 45956 | 41 | 3 | 8.668e-07 | 1.430e-04 |
KEGG ANTIGEN PROCESSING AND PRESENTATION | gene.list | 89 | 45867 | 45956 | 41 | 4 | 1.260e-06 | 1.848e-04 |
KEGG ALLOGRAFT REJECTION | gene.list | 38 | 45918 | 45956 | 41 | 3 | 5.440e-06 | 7.181e-04 |
KEGG GRAFT VERSUS HOST DISEASE | gene.list | 42 | 45914 | 45956 | 41 | 3 | 7.385e-06 | 8.862e-04 |
KEGG TYPE I DIABETES MELLITUS | gene.list | 44 | 45912 | 45956 | 41 | 3 | 8.509e-06 | 9.360e-04 |
KEGG AUTOIMMUNE THYROID DISEASE | gene.list | 53 | 45903 | 45956 | 41 | 3 | 1.497e-05 | 1.520e-03 |
REACTOME ER PHAGOSOME PATHWAY | gene.list | 61 | 45895 | 45956 | 41 | 3 | 2.288e-05 | 2.157e-03 |
ST FAS SIGNALING PATHWAY | gene.list | 65 | 45891 | 45956 | 41 | 3 | 2.770e-05 | 2.438e-03 |
REACTOME IMMUNOREGULATORY INTERACTIONS BETWEEN A LYMPHOID AND A NON LYMPHOID CELL | gene.list | 70 | 45886 | 45956 | 41 | 3 | 3.461e-05 | 2.855e-03 |
KEGG B CELL RECEPTOR SIGNALING PATHWAY | gene.list | 75 | 45881 | 45956 | 41 | 3 | 4.256e-05 | 3.247e-03 |
REACTOME ANTIGEN PROCESSING CROSS PRESENTATION | gene.list | 76 | 45880 | 45956 | 41 | 3 | 4.428e-05 | 3.247e-03 |
KEGG APOPTOSIS | gene.list | 88 | 45868 | 45956 | 41 | 3 | 6.861e-05 | 4.767e-03 |
REACTOME CLASS I MHC MEDIATED ANTIGEN PROCESSING PRESENTATION | gene.list | 251 | 45705 | 45956 | 41 | 4 | 7.506e-05 | 4.954e-03 |
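
The report does not state which FDR procedure produced the q.value column. As a minimal sketch, assuming the Benjamini-Hochberg procedure applied across all 1320 gene sets, the q-values of the top-ranked rows can be approximated from the rounded p-values in the table with R's p.adjust(). The variable names below are illustrative and not taken from the pipeline.

```r
# Rounded p-values of the five top-ranked rows, copied from the table
p_top <- c(9.556e-13, 1.034e-10, 2.970e-09, 6.346e-09, 5.515e-08)

# Assumption: Benjamini-Hochberg over all 1320 tested gene sets.
# Passing n = 1320 is only valid because these are the smallest
# p-values of the full set, so their ranks are 1 to 5.
n_sets <- 1320
q_top <- p.adjust(p_top, method = "BH", n = n_sets)

# Equivalent rank-based form for these rows: p * n / rank
q_manual <- p_top * n_sets / seq_along(p_top)

print(signif(q_top, 4))
```

Under this assumption the results should agree with the FDR column to about three significant digits; small differences come from the p-values being rounded in the table.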
- Gene set database = c2.cp.v4.0.symbols.gmt
- Input gene list = MutSig2CV.input.genenames.txt
For a given gene list, the pipeline uses a hypergeometric test to assess the significance of its overlap with each pathway gene set. The hypergeometric p-value is obtained with the R function phyper() and is defined as the probability of randomly drawing x or more successes (gene matches) in k total draws (the input genes) from a population of N genes, m of which belong to the gene set.
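
As a hedged illustration of this definition, the sketch below recomputes the upper-tail probability for the REACTOME IMMUNE SYSTEM row of the table (m = 933, n = 45023, k = 41, x = 13). The variable names are illustrative; only phyper() is named in the report, and dhyper() is used here as an independent cross-check.

```r
# Values from the REACTOME IMMUNE SYSTEM row of the table
m <- 933     # GS size: genes in the gene set
n <- 45023   # n.NotInGS: genes in the universe outside the gene set
k <- 41      # n.drawn: number of input genes
x <- 13      # n.found: input genes that fall in the gene set

# P(X >= x): with lower.tail = FALSE, phyper() returns P(X > q),
# so q must be x - 1 to include x itself
p_upper <- phyper(x - 1, m, n, k, lower.tail = FALSE)

# Cross-check: sum the point probabilities from x up to the
# largest possible overlap
p_sum <- sum(dhyper(x:min(m, k), m, n, k))

print(signif(p_upper, 4))
```

The result should come out close to the 9.556e-13 reported in the p.value column for that row.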
- A cumulative p-value is computed using the R function phyper():
- e.g., the probability of seeing at least x genes in the group is defined as p(X>=x) = 1 - p(X<=x-1) = phyper(x-1, m, n, k, lower.tail=FALSE, log.p=FALSE), where n = N - m. The point probability is f(x | N, m, k) = C(m, x) * C(N-m, k-x) / C(N, k), with C(a, b) denoting the binomial coefficient "a choose b".
- The hypergeometric test is identical to the corresponding one-tailed version of Fisher's exact test.
- e.g., the contingency table for Fisher's exact test = matrix(c(n.Found, n.GS-n.Found, n.drawn-n.Found, n.NotGS-(n.drawn-n.Found)), nrow=2, dimnames = list(inputGenes = c("Found", "NotFound"), GeneUniverse = c("GS", "nonGS")))
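
The equivalence with Fisher's exact test can be checked with a short sketch. It assumes the same row of the table as a worked case (REACTOME IMMUNE SYSTEM) and calls fisher.test() with alternative = "greater" as the one-tailed variant; the report does not say that the pipeline itself calls fisher.test(), so this is a verification aid only.

```r
# Values from the REACTOME IMMUNE SYSTEM row of the table
m <- 933     # n.GS
n <- 45023   # n.NotGS
k <- 41      # n.drawn
x <- 13      # n.Found

# 2x2 table laid out as in the text: rows split genes by membership
# in the input list, columns by membership in the gene set
tab <- matrix(c(x, m - x, k - x, n - (k - x)), nrow = 2,
              dimnames = list(inputGenes = c("Found", "NotFound"),
                              GeneUniverse = c("GS", "nonGS")))

# One-tailed Fisher's exact test for over-representation
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Hypergeometric upper tail P(X >= x)
p_hyper <- phyper(x - 1, m, n, k, lower.tail = FALSE)

print(all.equal(p_fisher, p_hyper))
```

The column sums of the table are m and n and the first row sums to k, which is why the two p-values coincide.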
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.