This pipeline identifies canonical pathway gene sets that significantly overlap a given gene list, using a hypergeometric test. The gene set database is the GSEA MSigDB Class 2 (C2): Canonical Pathways collection. For further details about the MSigDB gene sets, please visit The Broad Institute GSEA MSigDB.
For the given gene list, a hypergeometric test was used to find significantly overlapping canonical pathway gene sets. Ranked by FDR-adjusted p-value, the top 5 overlapping gene sets are:
- KEGG_NATURAL_KILLER_CELL_MEDIATED_CYTOTOXICITY, KEGG_PATHWAYS_IN_CANCER, KEGG_PROSTATE_CANCER, BIOCARTA_HER2_PATHWAY, KEGG_ERBB_SIGNALING_PATHWAY
GS(geneset) pathway name | gene.list | GS size (m) | n.NotInGS (n) | Gene universe (N) | n.drawn (k) | n.found (x) | p.value (p(X>=x)) | FDR (q.value) |
---|---|---|---|---|---|---|---|---|
KEGG NATURAL KILLER CELL MEDIATED CYTOTOXICITY | gene.list | 387 | 45569 | 45956 | 22 | 6 | 2.284e-08 | 4.614e-06 |
KEGG PATHWAYS IN CANCER | gene.list | 387 | 45569 | 45956 | 22 | 6 | 2.284e-08 | 4.614e-06 |
KEGG PROSTATE CANCER | gene.list | 387 | 45569 | 45956 | 22 | 5 | 9.659e-07 | 1.301e-04 |
BIOCARTA HER2 PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
KEGG ERBB SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
KEGG RENAL CELL CARCINOMA | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
KEGG ENDOMETRIAL CANCER | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
KEGG GLIOMA | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
KEGG MELANOMA | gene.list | 387 | 45569 | 45956 | 22 | 4 | 3.212e-05 | 1.442e-03 |
BIOCARTA PPARA PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
BIOCARTA PTEN PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
BIOCARTA EIF4 PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
BIOCARTA MET PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG CHEMOKINE SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG ENDOCYTOSIS | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG VEGF SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG FOCAL ADHESION | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG TOLL LIKE RECEPTOR SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG JAK STAT SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
KEGG T CELL RECEPTOR SIGNALING PATHWAY | gene.list | 387 | 45569 | 45956 | 22 | 3 | 8.102e-04 | 9.352e-03 |
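
The FDR (q.value) column is derived from the p.value column by a multiple-testing adjustment. The report does not name the method, so the following is only a sketch under the assumption that the Benjamini-Hochberg procedure is used (this is what R's p.adjust() computes with method = "fdr"). The function name bh_adjust is hypothetical, and the adjustment has to be run over the p-values of all tested gene sets, not only the rows shown in the table.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is the FDR method; the report itself does not say so.
    """
    n = len(pvals)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = n - pos  # 1-based rank in ascending order
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy p-values, not taken from the report.
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Tied p-values receive the same q-value under this procedure, which matches the repeated q-values for tied rows in the table.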
- Gene set database = c2.cp.v3.0-2.symbols.gmt
- Input gene list = sig_genes.txt
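
As an illustration of how the two inputs relate to the table columns, the sketch below parses GMT-formatted text (one gene set per line: name, description, then gene symbols, tab-separated) and counts the overlap with an input gene list. It is a minimal sketch, not the pipeline's code: the function names parse_gmt and overlap_counts and the toy gene sets are hypothetical, and it assumes every input gene counts toward n.drawn, which the report does not confirm.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into {gene_set_name: set of gene symbols}."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


def overlap_counts(gene_sets, gene_list):
    """Return {name: (GS size m, n.drawn k, n.found x)} for each gene set."""
    drawn = set(gene_list)
    return {
        name: (len(genes), len(drawn), len(genes & drawn))
        for name, genes in gene_sets.items()
    }


if __name__ == "__main__":
    # Toy GMT content with hypothetical gene set names.
    toy_gmt = "SET_A\tna\tTP53\tEGFR\tKRAS\nSET_B\tna\tBRCA1\n"
    print(overlap_counts(parse_gmt(toy_gmt), ["EGFR", "KRAS", "MYC"]))
```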
For a given gene list, the pipeline uses a hypergeometric test to assess the significance of each overlapping pathway gene set. The hypergeometric p-value is obtained with the R function phyper() and is defined as the probability of randomly drawing x or more successes (gene matches) in k total draws (the input genes) from a population of N genes, m of which belong to the gene set.
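
The definition above can be checked outside R with exact integer arithmetic. The following is a minimal Python sketch of the upper-tail probability P(X >= x). The function name hypergeom_upper_tail is hypothetical, and the argument names follow the table columns (m = GS size, n = n.NotInGS, k = n.drawn, x = n.found).

```python
from math import comb


def hypergeom_upper_tail(x, m, n, k):
    """P(X >= x) when drawing k genes without replacement from a universe
    of m gene-set genes and n non-gene-set genes."""
    total = comb(m + n, k)  # number of ways to draw k genes from N = m + n
    # Count draws that contain i gene-set genes, for every i >= x.
    favorable = sum(
        comb(m, i) * comb(n, k - i)
        for i in range(max(x, 0), min(k, m) + 1)
    )
    return favorable / total


if __name__ == "__main__":
    # Values from the first row of the table above.
    p = hypergeom_upper_tail(x=6, m=387, n=45569, k=22)
    print(f"{p:.3e}")
```

The printed value can be compared with the p.value reported in the first table row (2.284e-08).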
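- The p-value is the upper tail of the distribution, computed from the cumulative probability returned by phyper() with lower.tail=T:
- Example: the probability of seeing at least 3 genes in the group is p(X>=3) = 1 - p(X<=2) = 1 - phyper(2, m, n, k, lower.tail=T), where n = N - m and the hypergeometric probability mass function is $f(x \mid N, m, k) = \binom{m}{x}\binom{N-m}{k-x} / \binom{N}{k}$
- The hypergeometric test is identical to the corresponding one-tailed version of Fisher's exact test.
- Example: the contingency table for Fisher's exact test is matrix(c(n.Found, n.GS-n.Found, n.drawn-n.Found, n.NotGS-(n.drawn-n.Found)), nrow=2, dimnames = list(inputGenes = c("Found", "NotFound"), GeneUniverse = c("GS", "nonGS"))), which can be passed to fisher.test() with alternative = "greater".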
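
The equivalence between the hypergeometric upper tail and the one-tailed Fisher's exact test can be illustrated numerically. The sketch below builds the same 2x2 table as the R matrix() call and sums the probabilities of all tables with the same margins and at least as many found genes. It is an illustration under stated assumptions, not the pipeline's implementation: the function names are hypothetical, and log-factorials via lgamma are used only to keep the arithmetic in floating-point range.

```python
from math import comb, exp, lgamma


def contingency_table(n_found, n_gs, n_not_gs, n_drawn):
    """Rows: input genes / other genes. Columns: in gene set / not in gene set."""
    return [
        [n_found, n_drawn - n_found],
        [n_gs - n_found, n_not_gs - (n_drawn - n_found)],
    ]


def fisher_greater(table):
    """One-tailed (greater) Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    log_fact = lambda v: lgamma(v + 1)
    # The margins are fixed, so this term is shared by every table.
    const = (log_fact(r1) + log_fact(r2) + log_fact(c1) + log_fact(c2)
             - log_fact(r1 + r2))
    p = 0.0
    for i in range(a, min(r1, c1) + 1):
        cells = (i, r1 - i, c1 - i, r2 - (c1 - i))
        p += exp(const - sum(log_fact(v) for v in cells))
    return p


def hypergeom_upper_tail(x, m, n, k):
    """P(X >= x) by exact counting, for comparison."""
    favorable = sum(comb(m, i) * comb(n, k - i) for i in range(x, min(k, m) + 1))
    return favorable / comb(m + n, k)


if __name__ == "__main__":
    # Values from the first row of the table above.
    tbl = contingency_table(n_found=6, n_gs=387, n_not_gs=45569, n_drawn=22)
    print(fisher_greater(tbl), hypergeom_upper_tail(6, 387, 45569, 22))
```

The two printed numbers should agree up to floating-point rounding, since both sum the same hypergeometric probabilities.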
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.