This pipeline uses various statistical tests to identify mRNAs whose log2 expression levels correlate with selected clinical features.
We tested the association between 18333 genes and 8 clinical features across 400 samples. At thresholds of P value < 0.05 and Q value < 0.3, 6 clinical features were related to at least one gene.
- 1170 genes correlated to 'AGE'.
  - SYT6|148281, ABI1|10006, PRSS35|167681, CTBP2|1488, RIN1|9610, ...
- 6 genes correlated to 'GENDER'.
  - NCRNA00183|554203, CYORF15A|246126, HDHD1A|8226, CYORF15B|84663, NLRP2|55655, ...
- 89 genes correlated to 'KARNOFSKY.PERFORMANCE.SCORE'.
  - EEF1A1|1915, EEF2|1938, UGP2|7360, VAV3|10451, OPN3|23596, ...
- 5516 genes correlated to 'HISTOLOGICAL.TYPE'.
  - AK2|204, STK40|83931, ASAP3|55616, TXNDC12|51060, TXLNA|200081, ...
- 696 genes correlated to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
  - MAMDC4|158056, CSAD|51380, RHOT2|89941, MAN2C1|4123, GOLGA6L9|440295, ...
- 1 gene correlated to 'RACE'.
  - ARMC10|83787
No genes correlated to 'Time to Death' or 'ETHNICITY'.
The complete statistical result table is provided in Supplement Table 1.
Clinical feature | Statistical test | Significant genes | Associated with | N | Associated with | N |
---|---|---|---|---|---|---|
Time to Death | Cox regression test | N=0 | ||||
AGE | Spearman correlation test | N=1170 | older | N=613 | younger | N=557 |
GENDER | Wilcoxon test | N=6 | male | N=6 | female | N=0 |
KARNOFSKY PERFORMANCE SCORE | Spearman correlation test | N=89 | higher score | N=64 | lower score | N=25 |
HISTOLOGICAL TYPE | Kruskal-Wallis test | N=5516 | ||||
RADIATIONS RADIATION REGIMENINDICATION | Wilcoxon test | N=696 | yes | N=696 | no | N=0 |
RACE | Kruskal-Wallis test | N=1 | ||||
ETHNICITY | Wilcoxon test | N=0 |
Time to Death | Duration (Months) | 0-211.2 (median=15) |
censored | N = 324 | |
death | N = 71 | |
Significant markers | N = 0 |
AGE | Mean (SD) | 43.22 (13) |
Significant markers | N = 1170 | |
pos. correlated | 613 | |
neg. correlated | 557 |
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
SYT6|148281 | -0.4409 | 1.869e-20 | 3.43e-16 |
ABI1|10006 | -0.4134 | 6.097e-18 | 1.12e-13 |
PRSS35|167681 | -0.4003 | 7.921e-17 | 1.45e-12 |
CTBP2|1488 | -0.3994 | 9.503e-17 | 1.74e-12 |
RIN1|9610 | 0.3865 | 1.054e-15 | 1.93e-11 |
NOL3|8996 | 0.385 | 1.404e-15 | 2.57e-11 |
NR2E1|7101 | 0.3818 | 2.502e-15 | 4.59e-11 |
RBM17|84991 | -0.3812 | 2.757e-15 | 5.05e-11 |
SFRP2|6423 | -0.3806 | 3.069e-15 | 5.62e-11 |
CNTN3|5067 | -0.377 | 5.904e-15 | 1.08e-10 |
GENDER | Labels | N |
FEMALE | 176 | |
MALE | 224 | |
Significant markers | N = 6 | |
Higher in MALE | 6 | |
Higher in FEMALE | 0 |
Gene | W(pos if higher in 'MALE') | wilcoxontestP | Q | AUC |
---|---|---|---|---|
NCRNA00183|554203 | 3849 | 1.929e-43 | 3.53e-39 | 0.9024 |
CYORF15A|246126 | 9182 | 2.649e-24 | 4.85e-20 | 0.9998 |
HDHD1A|8226 | 8102 | 4.756e-24 | 8.71e-20 | 0.7945 |
CYORF15B|84663 | 5369 | 1.01e-15 | 1.85e-11 | 0.9987 |
NLRP2|55655 | 13181 | 2.659e-07 | 0.00486 | 0.6513 |
GZF1|64412 | 25103 | 2.648e-06 | 0.0484 | 0.6367 |
89 genes related to 'KARNOFSKY.PERFORMANCE.SCORE'.
KARNOFSKY.PERFORMANCE.SCORE | Mean (SD) | 87.93 (12) |
Significant markers | N = 89 | |
pos. correlated | 64 | |
neg. correlated | 25 |
Gene | SpearmanCorr | corrP | Q |
---|---|---|---|
EEF1A1|1915 | 0.3531 | 3.244e-08 | 0.000595 |
EEF2|1938 | 0.3415 | 9.554e-08 | 0.00175 |
UGP2|7360 | -0.3408 | 1.018e-07 | 0.00187 |
VAV3|10451 | -0.3349 | 1.743e-07 | 0.00319 |
OPN3|23596 | -0.3328 | 2.107e-07 | 0.00386 |
RPL15|6138 | 0.3298 | 2.746e-07 | 0.00503 |
RPL3|6122 | 0.3282 | 3.143e-07 | 0.00576 |
HNRNPA1|3178 | 0.3277 | 3.282e-07 | 0.00601 |
IMMT|10989 | -0.3277 | 3.293e-07 | 0.00603 |
RPL13|6137 | 0.3238 | 4.593e-07 | 0.00842 |
HISTOLOGICAL.TYPE | Labels | N |
ASTROCYTOMA | 142 | |
OLIGOASTROCYTOMA | 105 | |
OLIGODENDROGLIOMA | 153 | |
Significant markers | N = 5516 |
Gene | ANOVA_P | Q |
---|---|---|
AK2|204 | 3.02e-31 | 5.54e-27 |
STK40|83931 | 1.078e-30 | 1.98e-26 |
ASAP3|55616 | 1.775e-29 | 3.25e-25 |
TXNDC12|51060 | 1.88e-29 | 3.45e-25 |
TXLNA|200081 | 3.032e-29 | 5.56e-25 |
WDR77|79084 | 5.699e-29 | 1.04e-24 |
NADK|65220 | 9.182e-29 | 1.68e-24 |
TRAPPC3|27095 | 1.335e-28 | 2.45e-24 |
WLS|79971 | 7.442e-28 | 1.36e-23 |
HDAC1|3065 | 1.001e-27 | 1.83e-23 |
696 genes related to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
RADIATIONS.RADIATION.REGIMENINDICATION | Labels | N |
NO | 89 | |
YES | 311 | |
Significant markers | N = 696 | |
Higher in YES | 696 | |
Higher in NO | 0 |
Gene | W(pos if higher in 'YES') | wilcoxontestP | Q | AUC |
---|---|---|---|---|
MAMDC4|158056 | 21237 | 1.457e-14 | 2.67e-10 | 0.7673 |
CSAD|51380 | 21231 | 1.529e-14 | 2.8e-10 | 0.767 |
RHOT2|89941 | 21120 | 3.743e-14 | 6.86e-10 | 0.763 |
MAN2C1|4123 | 21090 | 4.756e-14 | 8.72e-10 | 0.7619 |
GOLGA6L9|440295 | 20969 | 1.238e-13 | 2.27e-09 | 0.7576 |
NSUN5P2|260294 | 20966 | 1.268e-13 | 2.32e-09 | 0.7575 |
NCRNA00105|80161 | 20895 | 2.206e-13 | 4.04e-09 | 0.7549 |
HOOK2|29911 | 20882 | 2.44e-13 | 4.47e-09 | 0.7544 |
CDK10|8558 | 20823 | 3.848e-13 | 7.05e-09 | 0.7523 |
LAT|27040 | 20791 | 4.918e-13 | 9.01e-09 | 0.7511 |
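The W and AUC columns above appear to be linked by the usual Mann-Whitney identity, AUC = W / (n1 x n2). The sketch below is written in Python rather than the R used by the pipeline, and it assumes that W is the Mann-Whitney U statistic for the 'YES' group and that all 311 'YES' and 89 'NO' samples have a value for the gene. The function names and the small sample vectors are hypothetical.

```python
def mann_whitney_u(first, second):
    """Count pairs where a value in `first` exceeds one in `second`.

    Tied pairs contribute 0.5, which matches the usual definition of U.
    """
    u = 0.0
    for a in first:
        for b in second:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc_from_u(u, n_first, n_second):
    """AUC is the fraction of (first, second) pairs ranked in favour of `first`."""
    return u / (n_first * n_second)


# Hypothetical log2 expression values for two small groups.
yes = [7.2, 6.8, 8.1, 7.5]
no = [6.9, 6.1, 7.0]
u = mann_whitney_u(yes, no)
print(u, auc_from_u(u, len(yes), len(no)))

# W for MAMDC4 from the table above, with the group sizes 311 (YES) and 89 (NO).
print(round(auc_from_u(21237, 311, 89), 4))
```

Under these assumptions the last line gives 0.7673, which agrees with the AUC listed for MAMDC4. Rows computed on fewer samples (for example, genes with missing values) would not satisfy the identity with the full group sizes.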
RACE | Labels | N |
AMERICAN INDIAN OR ALASKA NATIVE | 1 | |
ASIAN | 2 | |
BLACK OR AFRICAN AMERICAN | 14 | |
WHITE | 376 | |
Significant markers | N = 1 |
ANOVA_P | Q | |
---|---|---|
ARMC10|83787 | 5.654e-06 | 0.104 |
- Expression data file = LGG-TP.uncv2.mRNAseq_RSEM_normalized_log2.txt
- Clinical data file = LGG-TP.merged_data.txt
- Number of patients = 400
- Number of genes = 18333
- Number of clinical features = 8
For survival clinical features, Wald's test in univariate Cox regression analysis with a proportional hazards model (Andersen and Gill 1982) was used to estimate the P values using the 'coxph' function in R. Kaplan-Meier survival curves were plotted using the four quartile subgroups of patients based on expression levels.
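The quartile split used for the Kaplan-Meier curves can be illustrated with a minimal sketch. It is written in Python rather than R, and the cut-point convention (inclusive quantiles, values equal to a cut point assigned to the lower group) is an assumption, since the report does not state one. The function name and the expression values are hypothetical.

```python
from statistics import quantiles


def quartile_groups(expression):
    """Assign each patient to quartile subgroup 1-4 by expression level."""
    # quantiles(..., n=4) returns the three cut points between the quartiles.
    q1, q2, q3 = quantiles(expression, n=4, method="inclusive")
    groups = []
    for value in expression:
        if value <= q1:
            groups.append(1)
        elif value <= q2:
            groups.append(2)
        elif value <= q3:
            groups.append(3)
        else:
            groups.append(4)
    return groups


# Hypothetical log2 expression values for eight patients.
levels = [2.1, 5.4, 3.3, 7.8, 6.0, 1.2, 4.9, 8.5]
print(quartile_groups(levels))
```

Each subgroup label would then be used as the stratum for one survival curve.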
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
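As an illustration of the coefficient itself, the following Python sketch computes Spearman's rho as the Pearson correlation of the rank vectors, with tied values given their average rank. It does not compute the P value that 'cor.test' reports, and the function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical ages and log2 expression values for six patients.
age = [25, 31, 40, 52, 60, 67]
expr = [8.1, 7.9, 7.2, 7.4, 6.0, 5.5]
print(round(spearman_rho(age, expr), 4))
```

A negative rho, as in this toy example, corresponds to the "neg. correlated" count in the AGE summary: expression tends to be lower in older patients.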
For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
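The unequal-variance (Welch) form of the t test described in this paragraph can be sketched as follows. This is a Python illustration of the statistic and its Welch-Satterthwaite degrees of freedom only; it does not compute the P value, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical log2 expression values for two clinical classes.
class_a = [5.0, 6.0, 7.0, 8.0]
class_b = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(class_a, class_b)
print(round(t_stat, 4), round(dof, 4))
```

Unlike the pooled-variance t test, the degrees of freedom here are generally not an integer.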
For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between different clinical classes using the 'anova' function in R.
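The one-way analysis of variance described in this paragraph reduces to an F statistic: the between-class mean square divided by the within-class mean square. The Python sketch below computes that statistic and its degrees of freedom for hypothetical data; it does not compute the P value, and the function name is illustrative.

```python
def one_way_anova_f(groups):
    """Return the F statistic with its between- and within-group degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2 expression values for three clinical classes.
classes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(classes))
```

With three classes, as for HISTOLOGICAL.TYPE, the between-group degrees of freedom are 2.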
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
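The Benjamini and Hochberg adjustment named here, followed by the P value < 0.05 and Q value < 0.3 thresholds stated at the top of this report, can be sketched in Python as below. This is an illustration of the standard step-up procedure, not the pipeline's own code; the function names and P values are hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest P value down, keeping a running minimum so that
    # the adjusted values stay monotone in the P value ranking.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank when sorted ascending
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


def significant(p_values, p_cut=0.05, q_cut=0.3):
    """Flag tests passing both the P value and the Q value threshold."""
    q = bh_q_values(p_values)
    return [p < p_cut and qv < q_cut for p, qv in zip(p_values, q)]


# Hypothetical P values for five genes.
p = [0.01, 0.04, 0.03, 0.005, 0.5]
print(bh_q_values(p))
print(significant(p))
```

In this toy example the first four genes pass both thresholds and the last does not.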
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.