This pipeline uses various statistical tests to identify mRNAs whose log2 expression levels are correlated with selected clinical features.
Testing the association between 18333 genes and 8 clinical features across 430 samples, with significance thresholds of P value < 0.05 and Q value < 0.3, 6 clinical features were related to at least one gene.
- 1140 genes correlated to 'AGE'.
  - SYT6|148281, ABI1|10006, PRSS35|167681, CTBP2|1488, CNTN3|5067, ...
- 7 genes correlated to 'GENDER'.
  - NCRNA00183|554203, HDHD1A|8226, CYORF15A|246126, CYORF15B|84663, GZF1|64412, ...
- 160 genes correlated to 'KARNOFSKY.PERFORMANCE.SCORE'.
  - EEF1A1|1915, UGP2|7360, EEF2|1938, HNRNPA1|3178, RPL3|6122, ...
- 5679 genes correlated to 'HISTOLOGICAL.TYPE'.
  - AK2|204, STK40|83931, TXLNA|200081, ASAP3|55616, NADK|65220, ...
- 915 genes correlated to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
  - CSAD|51380, MAMDC4|158056, RHOT2|89941, MAN2C1|4123, NSUN5P2|260294, ...
- 1 gene correlated to 'RACE'.
  - ARMC10|83787
No genes correlated to 'Time to Death' or 'ETHNICITY'.
The complete statistical result table is provided in Supplement Table 1.
Clinical feature | Statistical test | Significant genes | Associated with | N | Associated with | N |
---|---|---|---|---|---|---|
Time to Death | Cox regression test | N=0 | | | | |
AGE | Spearman correlation test | N=1140 | older | N=610 | younger | N=530 |
GENDER | Wilcoxon test | N=7 | male | N=7 | female | N=0 |
KARNOFSKY PERFORMANCE SCORE | Spearman correlation test | N=160 | higher score | N=105 | lower score | N=55 |
HISTOLOGICAL TYPE | Kruskal-Wallis test | N=5679 | | | | |
RADIATIONS RADIATION REGIMENINDICATION | Wilcoxon test | N=915 | yes | N=915 | no | N=0 |
RACE | Kruskal-Wallis test | N=1 | | | | |
ETHNICITY | Wilcoxon test | N=0 | | | | |
Time to Death | Duration (Months) | 0-211.2 (median=15.9) |
censored | N = 352 | |
death | N = 75 | |
Significant markers | N = 0 |
AGE | Mean (SD) | 42.96 (13) |
Significant markers | N = 1140 | |
pos. correlated | 610 | |
neg. correlated | 530 |
Gene symbol | Entrez ID | SpearmanCorr | corrP | Q |
---|---|---|---|---|
SYT6|148281 | -0.4291 | 1.213e-20 | 2.22e-16 |
ABI1|10006 | -0.3959 | 1.51e-17 | 2.77e-13 |
PRSS35|167681 | -0.385 | 1.321e-16 | 2.42e-12 |
CTBP2|1488 | -0.377 | 6.126e-16 | 1.12e-11 |
CNTN3|5067 | -0.3746 | 9.799e-16 | 1.8e-11 |
NOL3|8996 | 0.3689 | 2.838e-15 | 5.2e-11 |
RIN1|9610 | 0.3654 | 5.406e-15 | 9.91e-11 |
PPP2R5A|5525 | -0.3649 | 5.916e-15 | 1.08e-10 |
RBM17|84991 | -0.3645 | 6.338e-15 | 1.16e-10 |
SFRP2|6423 | -0.3642 | 6.62e-15 | 1.21e-10 |
GENDER | Labels | N |
FEMALE | 190 | |
MALE | 240 | |
Significant markers | N = 7 | |
Higher in MALE | 7 | |
Higher in FEMALE | 0 |
Gene symbol | Entrez ID | W(pos if higher in 'MALE') | wilcoxontestP | Q | AUC |
---|---|---|---|---|---|
NCRNA00183|554203 | 4434 | 1.055e-46 | 1.93e-42 | 0.9028 |
HDHD1A|8226 | 9481 | 2.304e-25 | 4.22e-21 | 0.7921 |
CYORF15A|246126 | 10078 | 5.052e-25 | 9.25e-21 | 0.9998 |
CYORF15B|84663 | 6233 | 6.645e-17 | 1.22e-12 | 0.9989 |
GZF1|64412 | 29027 | 1.143e-06 | 0.0209 | 0.6366 |
NLRP2|55655 | 16103 | 2.769e-06 | 0.0507 | 0.6328 |
SPESP1|246777 | 28360 | 1.398e-05 | 0.256 | 0.6219 |
160 genes related to 'KARNOFSKY.PERFORMANCE.SCORE'.
KARNOFSKY.PERFORMANCE.SCORE | Mean (SD) | 87.77 (12) |
Significant markers | N = 160 | |
pos. correlated | 105 | |
neg. correlated | 55 |
Gene symbol | Entrez ID | SpearmanCorr | corrP | Q |
---|---|---|---|---|
EEF1A1|1915 | 0.3676 | 2.561e-09 | 4.7e-05 |
UGP2|7360 | -0.3536 | 1.096e-08 | 0.000201 |
EEF2|1938 | 0.3531 | 1.16e-08 | 0.000213 |
HNRNPA1|3178 | 0.3503 | 1.531e-08 | 0.000281 |
RPL3|6122 | 0.3419 | 3.512e-08 | 0.000644 |
OPN3|23596 | -0.3401 | 4.197e-08 | 0.000769 |
VAV3|10451 | -0.34 | 4.236e-08 | 0.000776 |
RPL15|6138 | 0.3391 | 4.616e-08 | 0.000846 |
RPL13|6137 | 0.3326 | 8.6e-08 | 0.00158 |
RPS23|6228 | 0.3269 | 1.472e-07 | 0.0027 |
HISTOLOGICAL.TYPE | Labels | N |
ASTROCYTOMA | 158 | |
OLIGOASTROCYTOMA | 108 | |
OLIGODENDROGLIOMA | 164 | |
Significant markers | N = 5679 |
Gene symbol | Entrez ID | ANOVA_P | Q |
---|---|---|---|
AK2|204 | 4.18e-34 | 7.66e-30 |
STK40|83931 | 3.954e-33 | 7.25e-29 |
TXLNA|200081 | 2.604e-32 | 4.77e-28 |
ASAP3|55616 | 8.143e-32 | 1.49e-27 |
NADK|65220 | 8.282e-32 | 1.52e-27 |
WDR77|79084 | 2.513e-31 | 4.61e-27 |
TXNDC12|51060 | 7.249e-31 | 1.33e-26 |
HDAC1|3065 | 8.136e-31 | 1.49e-26 |
TRAPPC3|27095 | 1.276e-30 | 2.34e-26 |
VIM|7431 | 1.828e-30 | 3.35e-26 |
915 genes related to 'RADIATIONS.RADIATION.REGIMENINDICATION'.
RADIATIONS.RADIATION.REGIMENINDICATION | Labels | N |
NO | 89 | |
YES | 341 | |
Significant markers | N = 915 | |
Higher in YES | 915 | |
Higher in NO | 0 |
Gene symbol | Entrez ID | W(pos if higher in 'YES') | wilcoxontestP | Q | AUC |
---|---|---|---|---|---|
CSAD|51380 | 23669 | 4.097e-16 | 7.51e-12 | 0.7799 |
MAMDC4|158056 | 23555 | 1.003e-15 | 1.84e-11 | 0.7761 |
RHOT2|89941 | 23490 | 1.663e-15 | 3.05e-11 | 0.774 |
MAN2C1|4123 | 23467 | 1.987e-15 | 3.64e-11 | 0.7732 |
NSUN5P2|260294 | 23342 | 5.181e-15 | 9.5e-11 | 0.7691 |
NCRNA00105|80161 | 23312 | 6.507e-15 | 1.19e-10 | 0.7681 |
HOOK2|29911 | 23247 | 1.063e-14 | 1.95e-10 | 0.766 |
GOLGA6L9|440295 | 23186 | 1.68e-14 | 3.08e-10 | 0.764 |
CDK10|8558 | 23156 | 2.101e-14 | 3.85e-10 | 0.763 |
CCDC154|645811 | 22956.5 | 3.502e-14 | 6.42e-10 | 0.7609 |
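The W and AUC columns in the two-class tables can be related through the standard identity AUC = U / (n1 * n2), where U is the Mann-Whitney statistic and n1, n2 are the class sizes. The sketch below is a Python illustration of that identity, not the pipeline's own code; the function name `mann_whitney_u` and the toy values are assumptions made for the example. Applied to the first row above (W = 23669 with 341 YES and 89 NO samples), the ratio appears consistent with the reported AUC, assuming no samples were dropped for that gene.

```python
def mann_whitney_u(group_a, group_b):
    """Count pairs where a value from group_a exceeds one from group_b.

    Ties contribute one half, the usual convention for the U statistic.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc_from_u(u, n_a, n_b):
    """Area under the ROC curve implied by U for class sizes n_a and n_b."""
    return u / (n_a * n_b)


# Toy expression values, invented for illustration only.
toy_yes = [3.0, 5.0, 7.0]
toy_no = [1.0, 4.0]
toy_u = mann_whitney_u(toy_yes, toy_no)                     # 5 of 6 pairs favor YES
toy_auc = auc_from_u(toy_u, len(toy_yes), len(toy_no))

# First row of the table above: W = 23669, 341 YES and 89 NO samples.
reported_auc = round(auc_from_u(23669, 341, 89), 4)
print(reported_auc)  # 0.7799
```

When the statistic is counted in the other direction, the corresponding AUC is one minus this ratio.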
RACE | Labels | N |
AMERICAN INDIAN OR ALASKA NATIVE | 1 | |
ASIAN | 6 | |
BLACK OR AFRICAN AMERICAN | 14 | |
WHITE | 400 | |
Significant markers | N = 1 |
ANOVA_P | Q | |
---|---|---|
ARMC10|83787 | 4.71e-06 | 0.0863 |
- Expression data file = LGG-TP.uncv2.mRNAseq_RSEM_normalized_log2.txt
- Clinical data file = LGG-TP.merged_data.txt
- Number of patients = 430
- Number of genes = 18333
- Number of clinical features = 8
For survival clinical features, Wald's test in univariate Cox regression analysis with a proportional hazards model (Andersen and Gill 1982) was used to estimate the P values using the 'coxph' function in R. Kaplan-Meier survival curves were plotted using the four quartile subgroups of patients based on expression levels.
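As a rough illustration of the quartile grouping and the Kaplan-Meier estimate mentioned here, the following Python sketch uses only the standard library. It is not the pipeline's R code and does not perform the Cox regression; the function names `quartile_groups` and `kaplan_meier`, the cut-point method, and the toy data are all assumptions made for the example.

```python
from statistics import quantiles


def quartile_groups(expression):
    """Assign each value to quartile group 0 (lowest) through 3 (highest)."""
    cuts = quantiles(expression, n=4)  # three cut points
    return [sum(value > cut for cut in cuts) for value in expression]


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up durations
    events : 1 for death, 0 for censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process every patient sharing this follow-up time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data, invented for illustration only.
toy_expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_groups = quartile_groups(toy_expression)

toy_months = [5, 8, 8, 12, 20]
toy_death = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_months, toy_death)
```

In practice one curve would be estimated per quartile group by passing that group's follow-up times and event indicators to `kaplan_meier`.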
For continuous numerical clinical features, Spearman's rank correlation coefficients (Spearman 1904) and two-tailed P values were estimated using the 'cor.test' function in R.
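To make the rank-based correlation concrete, here is a minimal Python sketch of Spearman's coefficient computed as the Pearson correlation of average ranks. It stands in for, and is not, the R 'cor.test' call; it returns only the coefficient and not the P value. The function names and the toy age and expression values are assumptions made for the example.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data, invented for illustration only.
toy_age = [25, 31, 40, 52, 60, 67]
toy_log2_expr = [9.1, 8.7, 8.9, 7.5, 7.8, 6.9]
toy_rho = spearman_rho(toy_age, toy_log2_expr)  # negative: lower expression in older patients
```

A negative coefficient corresponds to the "younger" direction in the summary table, a positive one to "older".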
For two-class clinical features, a two-tailed Student's t test with unequal variance (Lehmann and Romano 2005) was applied to compare the log2-expression levels between the two clinical classes using the 't.test' function in R.
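The unequal-variance t test described here is commonly known as Welch's t test. The Python sketch below computes only its test statistic and the Welch-Satterthwaite degrees of freedom; turning these into a P value needs the t distribution, which the standard library does not provide. It is an illustration, not the R 't.test' call, and the function name `welch_t` and the toy values are assumptions.

```python
import math
from statistics import mean, variance


def welch_t(class_a, class_b):
    """Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(class_a), len(class_b)
    # Squared standard errors of the two class means.
    se2_a = variance(class_a) / n_a
    se2_b = variance(class_b) / n_b
    t = (mean(class_a) - mean(class_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Toy log2 expression values, invented for illustration only.
toy_class_a = [8.0, 9.0, 10.0, 11.0]
toy_class_b = [6.0, 7.0, 8.0]
toy_t, toy_df = welch_t(toy_class_a, toy_class_b)
```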
For multi-class clinical features (ordinal or nominal), one-way analysis of variance (Howell 2002) was applied to compare the log2-expression levels between different clinical classes using the 'anova' function in R.
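A one-way analysis of variance compares the spread between class means with the spread inside each class. The following Python sketch computes the F statistic and its degrees of freedom for one gene; it is an illustration rather than the R 'anova' call, it stops short of the P value, and the function name and toy values are assumptions. The class labels are borrowed from the HISTOLOGICAL.TYPE table purely as dictionary keys.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic with (between, within) degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy log2 expression values per class, invented for illustration only.
toy_by_class = {
    "ASTROCYTOMA": [1.0, 2.0, 3.0],
    "OLIGOASTROCYTOMA": [2.0, 3.0, 4.0],
    "OLIGODENDROGLIOMA": [5.0, 6.0, 7.0],
}
toy_f, toy_df_between, toy_df_within = one_way_anova_f(list(toy_by_class.values()))
```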
For multiple hypothesis correction, the Q value is the False Discovery Rate (FDR) analogue of the P value (Benjamini and Hochberg 1995), defined as the minimum FDR at which the test may be called significant. We used the 'Benjamini and Hochberg' method of the 'p.adjust' function in R to convert P values into Q values.
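The Benjamini-Hochberg adjustment can be sketched in a few lines of Python. This is a stand-in for R's 'p.adjust' with the BH method, written for illustration; the function names `bh_adjust` and `significant` and the toy P values are assumptions. The default cutoffs mirror the P value < 0.05 and Q value < 0.3 thresholds stated at the top of this report.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    m = len(p_values)
    # Walk from the largest P value down so a running minimum keeps Q monotone.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def significant(p_values, p_cut=0.05, q_cut=0.3):
    """Flag tests passing both the P value and Q value thresholds."""
    q_values = bh_adjust(p_values)
    return [p < p_cut and q < q_cut for p, q in zip(p_values, q_values)]


# Toy P values, invented for illustration only.
toy_p = [0.001, 0.02, 0.04, 0.2, 0.8]
toy_q = bh_adjust(toy_p)
toy_flags = significant(toy_p)
```

In the report's setting the input would be one P value per gene for a given clinical feature, so the adjustment is applied across all 18333 genes at once.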
In addition to the links below, the full results of the analysis summarized in this report can also be downloaded programmatically using firehose_get, or interactively from either the Broad GDAC website or TCGA Data Coordination Center Portal.